ICS 180A, Spring 1997:
Strategy and board game programming

Lecture notes for April 8, 1997
Board representations

In order to operate, any game program needs to be able to store and manipulate two kinds of objects: game positions, and game moves. These representations need to allow the program to perform the following operations: Everything except displaying the board must be fast since it happens in the inner loops of the search routine. (Board display can be slower since it doesn't happen often.)

The internal representation of moves should be very concise (since we don't want to spend too much time generating long lists of moves) and quickly decodable. But (very important) it should also able to represent all possible moves! E.g. for chess, a typical computer move representation is to store the starting square and ending square of the piece being moved; for instance the common beginning move of the king's pawn forward two squares would be represented "e2e4" where e2 is the name for the initial position of the pawn and e4 is the name for its final position. The piece being captured (if any) does not need to be stored as part of the move since it is determined by the final position. In the computer, these positions can be represented as 6-bit values, so the whole move could be stored internally as two bytes. But (even though some programs are based on it) this representation is not quite capable of representing all moves! In castling, two pieces move, a king and a rook, but we can handle this as a special case in which we list only the king movement. More importantly, if a pawn moves from the seventh rank to the eighth, it can be replaced by any of four pieces: queen, rook, knight, and bishop. The representation above doesn't allow us to specify which replacement is happening. So when designing a move representation, one should be careful to make sure that it covers all the special cases that might happen in your game.

The onternal representation of board can be less concise but should still not be too huge. It must represent all relevant information, not just all visible information, but not including irrelevant information. E.g. in chess, we need to know the positions of pieces on the board (the obvious visible information), but we also need to know some invisible information: who's on move, whether either player can castle, whether an en passant capture is possible, and how many moves it's been since the last capture or pawn move. We also need to know something about what positions have occurred in the past (because of triple repetition) but don't need to know the entire list of past moves.

Example of Multiple Representation Possibilities: Chess

There are many possible ways of representing even something with as clearly defined a structure as a chessboard in a computer. Here are some of the methods that have been used by chess programs.

Representation 1: 8x8 array of squares. Within each square, keep a value indicating which piece is present in the square (e.g. enum { empty, wK, wN, wB, wR, wQ, wP, bK, bN, bR, bQ, bP }). Advantages: (1) simple. (2) easy to compute material scores:

    for  (i=0;i<8;i++)
        for(j=0;j<8;j++)
            score += value[square[i,j]];
It's a little messy but not really hard to compute possible moves; you can loop through the squares finding pieces of appropriate color and branch according to piece type:
    for  (i=0;i<8;i++)
        for(j=0;j<8;j++)
            switch (board[i,j]) {
            case wP:
                if (board[i+1,j] empty) generate move to (i+1,j)
                if (i==2 && board[i+1,j] empty && board[i+2,j] empty)
                    generate move to (i+2,j)
                if (j > 0 && board[i+1,j-1] contains black piece)
                    generate capture of (i+1,j-1)
                if (j < 7 && board[i+1,j+1] contains black piece)
                    generate capture of (i+1,j+1)
                break;
            ...
            }
however there are various annoying boundary conditions to check (e.g. a pawn on rook-file shouldn't try to capture to one side) making this code complicated and slower than necessary.

Representation 2: extended array. 10x10, containing extra boundary squares containing a special "boundary" value added to the enum. This simpifies some of the cases (reduces number of conditions in the if-statements above) at the expense of a little space.

Representation 3: 0x88. The name of this representation comes from a trick for testing whether a square is a valid move involving the binary representation of the number 136 (which in hexadecimal is 0x88). We give each square of the board a number (a single byte), of which the high 4 bits are the row and the low 4 bits are the column:

    112 113 114 115 116 117 118 119
    96  97  98  99  100 101 102 103
    80  81  82  83  84  85  86  87
    64  65  66  67  68  69  70  71
    48  49  50  51  52  53  54  55
    32  33  34  35  36  37  38  39
    16  17  18  19  20  21  22  23
    0   1   2   3   4   5   6   7
Then the square left of i is i-1, right is i+1, up is i+16, down is i-16 etc. Then represent the board as an array of 128 squares (of which 64 correspond to actual squares on the board). The advantages of this representation are (1) it speeds up the program a little by using only one index instead of two in the array references, and (2) you can test really quickly and easily whether a move stays on the board: i is a legal board position if and only if (i&0x88)==0. [Work it out, moving off the board either overflows the column giving i&0x08 nonzero, or overflows the row giving i&0x80 nonzero.] This is a pretty commonly used technique.

Representation 4: bitboards. I'll go into this in a lot more length than the other representations because it's probably more unfamiliar, but I think it's also likely to work better. Instead of having an array of squares, each containing a piece types, have an array of piece types, each of which stores a packed array of bits listing the squares containing that piece. Since there are 64 possible squares, each of these packed arrays can be stored in a 64-bit number (two 32-bit words). The big advantage is that you can perform certain evaluation and move generation operations very quickly using bitwise Boolean operations. Think of it as a way of getting your computer to do massively parallel computations by packing things into long words. For example, in the following position:

The bitboard for the white pawns (call this 64-bit value "wP") would consist of the bits

    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 1 0 0
    0 0 0 0 0 1 0 0
    0 0 0 0 1 0 0 0
    0 0 0 0 0 0 0 0
    1 1 1 0 0 0 0 1
    0 0 0 0 0 0 0 0
Then the bitboard squares occupied by black can be computed by a formula
    bOcc = bP | bN | bB | bR | bQ | bK
(where bP etc are bitboards for the different kinds of black pieces). Similarly we can compute the white occupied squares, and or these two bitboards together to get all occupied squares. The bitboard of possible white pawn one-square move destinations can then be computed by a formula:
    single_pawn_moves = (wP << 8) & ~occupied
Let's look at this in slow motion. Shifting wP by 8 produces a bitboard of positions one place in front of each pawn:
    0 0 0 0 0 0 0 0
    0 0 0 0 0 1 0 0
    0 0 0 0 0 1 0 0
    0 0 0 0 1 0 0 0
    0 0 0 0 0 0 0 0
    1 1 1 0 0 0 0 1
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
The negation of occupied gives a bitboard of empty squares:
    0 0 1 1 0 0 1 0
    1 0 1 0 1 0 0 0
    1 1 1 0 0 0 1 1
    1 0 1 1 1 0 1 1
    1 0 1 1 0 1 1 1
    1 0 1 1 1 0 1 1
    0 0 0 1 1 1 1 0
    0 1 0 1 0 0 1 0
The bitwise and of these two bitboards then gives the positions in front of a pawn, that are not already occupied:
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 1 0 0 0
    0 0 0 0 0 0 0 0
    1 0 1 0 0 0 0 1
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
Similarly you can find two-square pawn moves by taking the bitboard of one-square moves, shifting it another 8 bits, anding it with the non-occupied squares again, and anding it with a constant bitboard (shown below) of the squares on the fourth row (the only row onto which pawns are allowed to move two squares):
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    1 1 1 1 1 1 1 1
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
Note that this constant bitboard can be generated at compile time rather than each time we want to generate moves. Pawn captures are similar (shift by seven or nine, and with a constant to eliminate captures off the left and right side of the board, and with bOcc).

The point of this technique is not that your code is simpler when you program with bitboards (it's a little more complicated) but that you generate the pawn moves all at once rather than one at a time. Also, a lot of the intermediate expressions you need (such as bOcc) get used over and over, and only need to be computed once. So bitboards end up being very efficient, and I think would be even better for games other than chess in which there are fewer types of pieces.

One complication arises: it's often important to count the number of nonzero bits in a bitboard, or to find a nonzero bit (e.g. to turn the bitboard of possible pawn moves into an explicit list of moves). Counting can be done one byte at a time, looking up in a 256-entry table the number of nonzero bits in each byte. There's a cute trick for finding a single nonzero bit: x^(x-1) (where the uparrow is C notation for exclusive or) gives a binary number ...000111... where the first one of x^(x-1) is the last nonzero bit of x. If you need to turn this into an actual bit, take the result modulo some carefully chosen number M (for which the numbers ...000111... are all different mod M), and look the result up in a table. As a simple example, the following code finds the index of the last nonzero bit of a byte:

    int T = { -1, 0, 7, 1, 3, -1, 6, 2, 5, 4, -1, -1 };
    int last_bit(unsigned char b) { return T[(b^(b-1)) % 11]; }

How to Undo?

Remember we said our board representation needed to handle undo operations. There are two possible methods: (1) Keep a stack in which each stack item holds a whole board representation; to make a move push it on the stack and to undo a move pop the stack. Probably this is too slow... (2) Keep a stack storing only the move itself together with enough extra information to undo the move and restore all the information in the board position. E.g. in chess you would need to store the identity of a captured piece (if any) and enough information to restore castling and en passant capturing privileges.

Repetition Detection

Some games e.g. Go, Chess have special rules about what happens when the same position is repeated (in chess, third repetition of a position gives the player making the repetition the right to declare a draw). How to tell? Short answer: make a hash function translating the position to a reasonably large number (we'll talk more about this later because this is also very important for speeding up the search). Then keep a list of the hash codes for previous game positions and test if your position shows up in it. Typical hash function: make 64*13 table of large random numbers; when piece x is on position y, look up table[x,y] and add it to hash ignoring overflow [Zobrist]. Note that, when making a move of a piece y from positions x to z, you can update the hash very quickly: just subtract table[x,y] and add table[z,y].
David Eppstein, Dept. Information & Computer Science, UC Irvine, .