ICS 180A, Spring 1997:
Strategy and board game programming

Lecture notes for April 15, 1997
Tuning the evaluation function

Last time I talked about the different types of functions that evaluate features in a position, and how to combine them into an evaluation function by adding the values of many such functions. But where do the numbers come from?

E.g., in Othello, you might have, say, three functions:

f(pos) = material (# of my pieces - # of opponent pieces)
g(pos) = corners (# I control - # opponent controls)
h(pos) = mobility (# moves I have available)

You want to form an evaluation function by combining these (probably with some other terms): eval = a*f + b*g + c*h. For instance, you might try eval = -1*f + 10*g + 1*h. But where do these numbers come from? What combination of numbers gives the best performance?
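
To make the weighted sum concrete, here is a minimal sketch in Python. The names Features, evaluate, and DEFAULT_WEIGHTS are made up for illustration and are not from any particular program; the feature values are assumed to have been computed elsewhere from the position, and the default weights are just the example numbers above, not tuned values.

```python
from dataclasses import dataclass


@dataclass
class Features:
    """Feature values for one position (hypothetical container)."""
    material: int  # f(pos): my pieces minus opponent pieces
    corners: int   # g(pos): corners I control minus corners opponent controls
    mobility: int  # h(pos): number of moves I have available


# Example weights (a, b, c) from the text; these are the numbers to be tuned.
DEFAULT_WEIGHTS = (-1, 10, 1)


def evaluate(features, weights=DEFAULT_WEIGHTS):
    """Return eval = a*f + b*g + c*h for the given feature values."""
    a, b, c = weights
    return (a * features.material
            + b * features.corners
            + c * features.mobility)


if __name__ == "__main__":
    pos = Features(material=4, corners=1, mobility=6)
    print(evaluate(pos))              # example weights
    print(evaluate(pos, (0, 10, 2)))  # a different candidate weighting
```

Keeping the weights in one tuple, separate from the feature code, means that tuning (by hand or otherwise) only has to change those numbers.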

There are various methods for finding numbers by hand:

... and without human intervention (much of this should be review from 171, for those students who've taken 171 already; you probably won't have time to do much more than hand-tweaking):

All of these methods require some method of automatically evaluating the performance of a program.

What has actually been done in automatically learning evaluation weights? A good source for this is Jay Scott's "machine learning in games" web page. He lists two experiments that I think are particularly interesting:
David Eppstein, Dept. Information & Computer Science, UC Irvine.