97

[深度学习]实现一个博弈型的AI,从五子棋开始(1) - xerwin

 6 years ago
source link: http://www.cnblogs.com/erwin/p/7828956.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

[深度学习]实现一个博弈型的AI,从五子棋开始(1)

好久没有写过博客了,多久,大概8年???最近重新把写作这事儿捡起来……最近在折腾AI,写个AI相关的给团队的小伙伴们看吧。

搞了这么多年的机器学习,从分类到聚类,从朴素贝叶斯到SVM,从神经网络到深度学习,各种神秘的项目里用了无数次,但是感觉干的各种事情离我们生活还是太远了。最近AlphaGo Zero的发布,深度学习又火了一把,小伙伴们按捺不住内心的躁动,要搞一个游戏AI,好吧,那就从规则简单、老少皆宜的五子棋开始讲起。

好了,废话就说这么多,下面进入第一讲,实现一个五子棋。

小伙伴:此处省去吐槽一万字,说好的讲深度学习,怎么开始扯实现一个五子棋程序了,大哥你不按套路出牌啊……

我:工欲善其事必先利其器,要实现五子棋的AI,连棋都没有,AI个锤子!

老罗:什么事?

五子棋分为有禁手和无禁手,我们先实现一个普通版本的无禁手版本作为例子,因为这个不影响我们实现一个AI。补充说明一下,无禁手黑棋必胜,经过比赛和各种研究,人们逐渐知道了这个事实就开始想办法来限制黑棋先手优势。于是出现了有禁手规则,规定黑棋不能下三三,四四和长连。但随着比赛的结果的研究的继续进行,发现其实即使是对黑棋有禁手限制,还是不能阻止黑棋开局必胜的事实,像直指开局中花月,山月,云月,溪月,寒星等,斜指开局中的名月,浦月,恒星,峡月,岚月都是黑棋必胜。于是日本人继续提出了交换和换打的思想,到了后来发展成了国际比赛中三手交换和五手二打规则,防止执黑者下出必胜开局或者在第五手下出必胜打。所以结论是,在不正规的比赛规则或者无禁手情况下,黑棋必胜是存在的。

(1)五子棋下棋逻辑实现

这里用Python来实现,因为之后的机器学习库也是Python的,方便一点。

界面和逻辑要分开,解耦合,这个是毋庸置疑的,并且之后还要训练AI,分离这是必须的。所以我们先来实现一个五子棋的逻辑。

我们先来考虑五子棋是一个15*15的棋盘,棋盘上的每一个交叉点(或格子)上一共会有3种状态:空白、黑棋、白棋,所以先建个文件 consts.py

做如下定义:

from enum import Enum

N = 15

class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2

棋盘的状态,我们先用一个15*15的二维数组chessMap来表示,建一个类 gobang.py

currentI、currentJ、currentState 分别表示当前这步着棋的坐标和颜色,再定义一个get和set函数,最基本的框架就出来了,代码如下:

from enum import Enum
from consts import *

class GoBang(object):
    def __init__(self):
        self.__chessMap = [[ChessboardState.EMPTY for j in range(N)] for i in range(N)]
        self.__currentI = -1
        self.__currentJ = -1
        self.__currentState = ChessboardState.EMPTY

    def get_chessMap(self):
        return self.__chessMap

    def get_chessboard_state(self, i, j):
        return self.__chessMap[i][j]

    def set_chessboard_state(self, i, j, state):
        self.__chessMap[i][j] = state
        self.__currentI = i
        self.__currentJ = j
        self.__currentState = state

这样界面端可以调用get函数来获取各个格子的状态来决定是否绘制棋子,以及绘制什么样的棋子;每次下棋的时候呢,在对应的格子上,通过坐标来设置棋盘Map的状态。

所以最基本的展示和下棋,上面的逻辑就够了,接下来干什么呢,得考虑每次下棋之后,set了对应格子的状态,是不是需要判断当前有没有获胜。所以还需要再加两个函数来干这个事情,思路就是从当前位置从东、南、西、北、东南、西南、西北、东北8个方向,4根轴,看是否有连续的大于5颗相同颜色的棋子出现。假设我们目前落子在棋盘正中,需要判断的位置如下图所示的米字形。

17518-20171113225927046-1792416204.png

那代码怎么写呢,最最笨的办法,按照字面意思来翻译咯,比如横轴,先看当前位置左边有多少颗连续同色的,再看右边有多少颗连续同色的,左边加右边,就是当前横轴上的连续数,如果大于5,则胜利。

    def have_five(self, current_i, current_j):
        #四个方向计数 竖 横 左斜 右斜
        hcount = 1

        temp = ChessboardState.EMPTY

        #H-左
        for j in range(current_j - 1, -1, -1):  #横向往左 from (current_j - 1) to 0
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
#H-右 for j in range(current_j + 1, N): #横向往右 from (current_j + 1) to N temp = self.__chessMap[current_i][j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break hcount = hcount + 1
#H-结果 if hcount >= 5: return True

以此类推,再看竖轴、再看左斜、再看右斜。于是,have_five函数变成这样了:

    def have_five(self, current_i, current_j):
        #四个方向计数 横 竖 左斜 右斜
        hcount = 1
        vcount = 1
        lbhcount = 1
        rbhcount = 1

        temp = ChessboardState.EMPTY

        #H-左
        for j in range(current_j - 1, -1, -1):  #横向往左 from (current_j - 1) to 0
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        #H-右
        for j in range(current_j + 1, N):  #横向往右 from (current_j + 1) to N
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        #H-结果
        if hcount >= 5:
            return True
#V-上 for i in range(current_i - 1, -1, -1): # from (current_i - 1) to 0 temp = self.__chessMap[i][current_j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break vcount = vcount + 1 #V-下 for i in range(current_i + 1, N): # from (current_i + 1) to N temp = self.__chessMap[i][current_j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break vcount = vcount + 1 #V-结果 if vcount >= 5: return True
#LB-上 for i, j in zip(range(current_i - 1, -1, -1), range(current_j - 1, -1, -1)): temp = self.__chessMap[i][j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break lbhcount = lbhcount + 1 #LB-下 for i, j in zip(range(current_i + 1, N), range(current_j + 1, N)): temp = self.__chessMap[i][j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break lbhcount = lbhcount + 1 #LB-结果 if lbhcount >= 5: return True
#RB-上 for i, j in zip(range(current_i - 1, -1, -1), range(current_j + 1, N)): temp = self.__chessMap[i][j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break rbhcount = rbhcount + 1 #RB-下 for i, j in zip(range(current_i + 1, N), range(current_j - 1, -1, -1)): temp = self.__chessMap[i][j] if temp == ChessboardState.EMPTY or temp != self.__currentState: break rbhcount = rbhcount + 1 #LB-结果 if rbhcount >= 5: return True

这样是不是就写完了,五子棋的逻辑全部实现~ 

NO,别高兴得太早,我想说,我好恶心,上面那个代码,简直丑爆了,再看一眼,重复的写了这么多for,这么多if,这么多重复的代码块,让我先去吐会儿……

好了,想想办法怎么改,至少分了4根轴,是重复的对不对,然后每根轴分别从正负两个方向去统计,最后加起来,两个方向,也是重复的对不对。

于是我们能不能只写一个方向的代码,分别调2次,然后4根轴,分别再调4次,2*4=8,一共8行代码搞定试试。

因为有45°和135°这两根斜轴的存在,所以方向上应该分别从x和y两个轴来控制正负,于是可以这样,先写一个函数,按照方向来统计:

xdirection=0,ydirection=1       表示从y轴正向数;

xdirection=0,ydirection=-1     表示从y轴负向数;

xdirection=1,ydirection=1       表示从45°斜轴正向数;

不一一列举了,再加上边界条件的判断,于是有了以下函数:

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5): #除当前位置外,朝对应方向再看4步
            if xdirection != 0 and (j + xdirection * step < 0 or j + xdirection * step >= N):
                break
            if ydirection != 0 and (i + ydirection * step < 0 or i + ydirection * step >= N):
                break
            if self.__chessMap[i + ydirection * step][j + xdirection * step] == color:
                count += 1
            else:
                break
        return count

于是乎,前面的have_five稍微长的好看了一点,可以变成这样:

def have_five(self, i, j, color):
        #四个方向计数 横 竖 左斜 右斜
        hcount = 1
        vcount = 1
        lbhcount = 1
        rbhcount = 1

        hcount += self.count_on_direction(i, j, -1, 0, color)
        hcount += self.count_on_direction(i, j, 1, 0, color)
        if hcount >= 5:
            return True

        vcount += self.count_on_direction(i, j, 0, -1, color)
        vcount += self.count_on_direction(i, j, 0, 1, color)
        if vcount >= 5:
            return True

        lbhcount += self.count_on_direction(i, j, -1, 1, color)
        lbhcount += self.count_on_direction(i, j, 1, -1, color)
        if lbhcount >= 5:
            return True

        rbhcount += self.count_on_direction(i, j, -1, -1, color)
        rbhcount += self.count_on_direction(i, j, 1, 1, color)
        if rbhcount >= 5:
            return True

还是一大排重复的代码呀,我还是觉得它丑啊,我真的不是处女座,但是这个函数是真丑啊,能不能让它再帅一点,当然可以,4个重复块再收成一个函数,循环调4次,是不是可以,好,就这么干,于是have_five就又漂亮了一点点:

    def have_five(self, i, j, color):
        #四个方向计数 横 竖 左斜 右斜
        directions = [[(-1, 0), (1, 0)], \
                      [(0, -1), (0, 1)], \
                      [(-1, 1), (1, -1)], \
                      [(-1, -1), (1, 1)]]

        for axis in directions:
            axis_count = 1
            for (xdirection, ydirection) in axis:
                axis_count += self.count_on_direction(i, j, xdirection, ydirection, color)
                if axis_count >= 5:
                    return True

        return False

嗯,感觉好多了,这下判断是否有5颗相同颜色棋子的逻辑也有了,再加一个函数来给界面层返回结果,逻辑部分的代码就差不多了:

    def get_chess_result(self):
        if self.have_five(self.__currentI, self.__currentJ, self.__currentState):
            return self.__currentState
        else:
            return ChessboardState.EMPTY

于是,五子棋逻辑代码就写完了,完整代码 gobang.py 如下:

#coding:utf-8

from enum import Enum
from consts import *

class GoBang(object):
    def __init__(self):
        self.__chessMap = [[ChessboardState.EMPTY for j in range(N)] for i in range(N)]
        self.__currentI = -1
        self.__currentJ = -1
        self.__currentState = ChessboardState.EMPTY

    def get_chessMap(self):
        return self.__chessMap

    def get_chessboard_state(self, i, j):
        return self.__chessMap[i][j]

    def set_chessboard_state(self, i, j, state):
        self.__chessMap[i][j] = state
        self.__currentI = i
        self.__currentJ = j
        self.__currentState = state

    def get_chess_result(self):
        if self.have_five(self.__currentI, self.__currentJ, self.__currentState):
            return self.__currentState
        else:
            return ChessboardState.EMPTY

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5): #除当前位置外,朝对应方向再看4步
            if xdirection != 0 and (j + xdirection * step < 0 or j + xdirection * step >= N):
                break
            if ydirection != 0 and (i + ydirection * step < 0 or i + ydirection * step >= N):
                break
            if self.__chessMap[i + ydirection * step][j + xdirection * step] == color:
                count += 1
            else:
                break
        return count

    def have_five(self, i, j, color):
        #四个方向计数 横 竖 左斜 右斜
        directions = [[(-1, 0), (1, 0)], \
                      [(0, -1), (0, 1)], \
                      [(-1, 1), (1, -1)], \
                      [(-1, -1), (1, 1)]]

        for axis in directions:
            axis_count = 1
            for (xdirection, ydirection) in axis:
                axis_count += self.count_on_direction(i, j, xdirection, ydirection, color)
                if axis_count >= 5:
                    return True

        return False

小伙伴:大哥,憋了半天,就憋出这么不到60行代码?

我:代码不在多,实现则灵……

明天来给它加个render,前端界面就有了,就是一个简单的完整游戏了,至于AI,别急嘛。

好吧,就这样…

UI部分在这里:

[深度学习]实现一个博弈型的AI,从五子棋开始(2)


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK