ICS 180A, Spring 1997:
Strategy and board game programming
Lecture notes for April 29, 1997
Which nodes to search? Full-width vs. selective search

Alpha-beta tells us how to search, but we still need to know when to
expand a node (search its children) and when to just stop and call the
evaluation function.

Brute Force and Selectivity
Shannon's original paper on computer chess listed two possible
strategies, described below.
Nowadays, neither of these ideas is used in its pure form.
Instead, we use a synthesis of both: selective extension.
We search all lines to some fixed depth, but then extend
some lines deeper than that horizon.
Sometimes we'll also do some pruning (beyond the safe pruning done by
alpha-beta), but this is usually extremely conservative: it's too
hard to pick out only the good moves,
but we can sometimes pick out and ignore really bad moves.
For games other than chess, with higher branching factors,
it may be necessary to use more aggressive pruning techniques.
- The most obvious is what the pseudo-code I've shown you so far does:
a full-width, brute force search to a fixed depth. Just pass a depth
parameter to your program, decrement it by one for each level of search,
and stop when it hits zero. This has the advantage of seeing even weird-looking
lines of play, as long as they remain within the search horizon.
But the high branching factor means that it doesn't search any line very
deeply (bachelor's degree: knows nothing about everything).
Even worse, it falls prey to what's known as the horizon effect.
Suppose, in chess, we have a program searching seven levels deep,
and that it has trapped its knight in the corner of the board (say in
exchange for a rook) in a way that, if it played well, would end up with
the knight captured within the next seven moves. But, by sacrificing a
pawn elsewhere on the board, maybe the program could delay that capture
by another move. That delay would then push the capture past the
program's search horizon, so what it sees along that line of play would
just be the loss of a pawn instead of the loss of a knight. The knight
would still end up being captured, but far enough from the current
position that the program doesn't see it. So, it sacrifices its pawn,
thinking that will save the knight. The very next move, the same
situation occurs, so it sacrifices another pawn, and another, until very
soon it's lost a lot more material than the knight was ever worth, and
will still end up losing the knight anyway.
This sort of tailspin, in which a sequence of moderately bad moves
is used to delay some worse consequence, is known as the "horizon effect"
and it used to be an easy way to win games against computers
(until better algorithms let them avoid this problem).
- The other method suggested by Shannon was selective pruning:
again search to some fixed depth, but to keep the branching factor down
only search some of the children of each node (avoiding the "obviously
bad" moves). This lets it search much more deeply, but there are lines
it doesn't see at all (Ph.D.: knows everything about nothing).
Shannon thought this was a good idea because it's closer to how humans think.
Turing used a variant of this idea, only searching capturing moves.
More typically one might evaluate the children and only expand the
k best of them, where k is some parameter less than the
true branching factor.
Unfortunately, "obviously bad" moves are often not bad at all,
but are brilliant sacrifices that win the game. If you don't find one you
should have made, you'll have to work harder and find some other way to win.
Worse, if you don't see that your opponent is about to spring some such
move sequence on you, you'll fall into the trap and lose.
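As a concrete (hypothetical) sketch of the selective strategy, here is a
minimal negamax search that statically evaluates the children of each node
and expands only the k best of them. The position interface (`evaluate`,
`legal_moves`, `make`, `undo`, `game_over`) is an assumed set of
game-specific helpers, not code from this course:

```python
# Sketch of Shannon's selective strategy: statically score each child,
# then expand only the k best-looking moves at every node.
# `evaluate` returns the score from the side-to-move's point of view,
# so our view of a child position is the negation of its evaluation.

def selective_search(position, depth, k):
    if depth == 0 or position.game_over():
        return position.evaluate()
    # Score each move by a shallow (static) look at the child it leads to.
    scored = []
    for move in position.legal_moves():
        position.make(move)
        scored.append((-position.evaluate(), move))  # our view of the child
        position.undo(move)
    scored.sort(reverse=True)                        # best-looking first
    best = -float('inf')
    for _, move in scored[:k]:                       # prune to k children
        position.make(move)
        best = max(best, -selective_search(position, depth - 1, k))
        position.undo(move)
    return best
```

With k smaller than the true branching factor, this can overlook exactly
the "brilliant sacrifice" lines described above: a move whose static
score looks bad never gets searched at all.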
When to extend?
What is the point of extending?
To get better (more accurate) evaluations.
So, we should extend
- when the current evaluation is likely to be inaccurate, or
- when the current line of play is a particularly important part of
the overall game tree search
(or some combination of both).
How do we know when the eval is likely inaccurate?
- In chess or other games in which there are both capturing and
non-capturing moves (checkers, go, fanorona), if there are captures to be made,
the evaluation will change greatly with each capture.
Quiescence search is the idea of, after reaching the main
search horizon, running a Turing-like search in which we only expand
capturing moves (or sometimes, capturing and checking moves). For games
other than chess, the main idea would be to only include moves which
make large changes to the evaluation. Such a search must also include
"pass" moves in which we decide to stop capturing. Once both players
pass, we stop expanding. That way, the evaluation function is only
called on "quiescent" nodes at which it isn't about to change by
making a capture.
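A minimal sketch of such a quiescence search, in negamax form: the "pass"
option is the stand-pat static evaluation, and only capturing moves are
expanded. The `evaluate` and `captures` helpers (plus `make`/`undo`) are
assumed game-specific code, not something from these notes:

```python
# Sketch of quiescence search: at the main horizon, keep expanding
# capturing moves only, and allow the side to move to "pass" by
# standing pat on the static evaluation. Once both sides would rather
# pass, the static evaluation is finally trusted.

def quiesce(position, alpha, beta):
    stand_pat = position.evaluate()      # the "pass" move
    if stand_pat >= beta:
        return stand_pat                 # opponent won't allow this line anyway
    alpha = max(alpha, stand_pat)
    best = stand_pat
    for move in position.captures():     # only "loud" (capturing) moves
        position.make(move)
        score = -quiesce(position, -beta, -alpha)
        position.undo(move)
        if score > best:
            best = score
            alpha = max(alpha, score)
            if alpha >= beta:
                break                    # beta cutoff
    return best
```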
- If the position has been active in the recent past, maybe we guess
that it should still be
active. So we extend the search depth if the search passes through
an "interesting" move, e.g. a capture. In the alpha-beta pseudocode,
this would be accomplished by replacing the depth-1 parameter to the
recursive call to the search routine by the value
depth-1+extension. You have to be careful not to do this too often,
though, or you could end up with a hugely expanded (even possibly
infinite!) search tree.
One trick helps make sure this extension idea terminates:
only extend by a fraction of a level. Specifically, make the "depth"
counter record some multiple of the number of levels you really want to
search, say depth=levels*24. Then, in recursive calls to alpha-beta
search, pass a value of depth-24+extension. If the extension is always
strictly less than 24, the method is guaranteed to terminate, and you
can choose which situations result in larger or smaller extensions.
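A sketch of this fractional-extension bookkeeping, with the depth counter
holding 24 units per level as suggested above. The `is_capture` extension
policy is a hypothetical example, and the other position helpers
(`evaluate`, `legal_moves`, `make`, `undo`, `game_over`) are assumed:

```python
# Fractional-ply extensions: each recursive call subtracts 24 units
# minus an extension that is always strictly less than 24, so depth
# shrinks by at least one unit per level and the search terminates.

UNITS_PER_PLY = 24

def extension_for(position, move):
    """Hypothetical policy: extend capturing moves by half a ply."""
    return UNITS_PER_PLY // 2 if position.is_capture(move) else 0

def search(position, depth):
    if depth <= 0 or position.game_over():
        return position.evaluate()
    best = -float('inf')
    for move in position.legal_moves():
        ext = min(extension_for(position, move), UNITS_PER_PLY - 1)
        position.make(move)
        best = max(best, -search(position, depth - UNITS_PER_PLY + ext))
        position.undo(move)
    return best

# Call search(root, levels * UNITS_PER_PLY) for a nominal
# `levels`-ply search.
```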
- I haven't seen this third technique done, but maybe it should be.
Include a separate
evaluation function to determine how complicated a position is.
For instance, a position is probably complicated if it has lots of
contradictory evaluation terms, so compute the "complication
evaluation" by taking the normal evaluation function and replacing each
term in it by the absolute value of that term.
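A sketch of this idea, assuming a hypothetical `evaluation_terms` helper
that returns the individual signed terms (material, mobility, king
safety, ...) of the normal evaluation:

```python
# "Complication evaluation": sum the absolute values of the terms of
# the normal evaluation. Large, contradictory terms that cancel in the
# ordinary evaluation then show up as a large complication score.

def evaluate(position):
    return sum(position.evaluation_terms())

def complication(position):
    """Large when terms conflict, even if they cancel in evaluate()."""
    return sum(abs(t) for t in position.evaluation_terms())
```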
How to combine accuracy with importance?
So far, we've just looked at trying to find the points at which the
evaluation may be inaccurate. But maybe we don't care if it's
inaccurate for unimportant parts of the tree, but we really do care for
nodes on the principal variation. How do we take importance into
account when performing selective extensions?
- Don't: let alpha-beta sort out importance, and just extend based on
accuracy.
- Extend lines that are part of (or near)
the principal variation (e.g. singular extensions -- used in Deep Blue
and/or its predecessors -- if there is one move much better than others in
a position, extend the search on that move).
- Moving away from alpha-beta...
conspiracy number search -- what is the minimum number of positions
whose values would have to change to force the program to make a
different move? Search those positions deeper.