Monday, June 16, 2008

Give It A Go

While my primary area of expertise is First Person Shooter AI, I still get a number of questions about other AI applications. One of the hardest game AI problems to date, if not the hardest, is the game of Go. What's so surprising is that Go has simpler rules than Chess, but Go AI is several orders of magnitude more complicated than good chess AI. Now consider that the best chess AI to date has only beaten a grand master, Kasparov, when it was specifically trained to counteract Kasparov's play style. In the first set of games, which Kasparov won, he commented that he did better when he started making what he considered the second best move instead of the best move. Apparently the first AI was so hyper-tuned against Kasparov's play style that it defeat other similarly good plays that it did not think Kasparov would make.

Yes, that's the best the world has done for Chess AI: A program that can defeat one really good player. But Go? The best AI programs just rank as decent amateurs. And listen to how simple the rules are:
  • Players take turn placing stones of their color (black or white) on a square board (usually 19x19 but sometimes 9x9).
  • Stones of the same color that are adjacent vertically or horizontally are part of the same group and live or die together.
  • When a group is completely surrounded (no plays exist that add a stone to the group), that group is captured. Each lost stone is -1 point at the end of the game.
  • If a play is made that would cause both the placed stone and an opposing group to be surrounded, the place stone lives and the opposing group is captured.
  • You may not make a play that results in a previous board position.
  • The game is over when both players consecutively pass, opting not to play a stone. Each player scores 1 point for each empty space that is enclosed only by their color of stones.
What makes this problem so hard to solve? Well for starters, the 19x19 board is enormous. There are a lot of possible positions. Moreover, even though the plays are just of single stones, they come together to form different shapes like pixels form a picture. Different shapes have different properties, strengths, and weaknesses. A typical alpha-beta search becomes rapidly unusable when the average branching factor is on the order of 300.

Nevertheless, in the past few years there have been a few significant advancements in the field of Go AI. Rather than trying to solve Go in the same way Chess would be solved, some Monte-Carlo algorithms have shown a lot of promise. In particular, in 2006, "Upper Confidence Bounds applied to Trees" (or UCT) was developed. I'm not an expert on this subject, but in brief, the concept works like this.

While it's very hard to determine which space on a 19x19 board is the best, it's very easy to compute the score of a Go board. Contrast this with Chess where the actual pieces in play are secondary to question of whether or not a King has been captured. So the very rough idea is to try most options (pruning the obviously bad plays) and play a semi-random game for each of them (again, pruning obviously bad plays). Far less time is spent at each stage of the tree. If there are 50 possible moves that aren't terrible, one of them is just selected at random. There's no aggressive search to determine which of those 50 is the best. When the test game is done for a given play, that provides an estimate of how good that particular initial placement is. Other placements are tried and the one with the highest estimated score is selected as the computer's official move.

Of course, the actual algorithm is quite a bit more complicated than that, but that's the general idea behind the innovation. The crucial lesson here for AI designers is this:

Focus on your goal.

In the end, Go is just a score maximization problem. It's very easy to get bogged down in the details of whether a move nets 5 points or 7 points, quickly producing a state where far too many calculations need to be done. Often the best solution involves determining how much processing time you can actually spend, and then working backwards to figure out what methods will give you the most information for the amount of time spent. In Go, calculating the board's score is relatively fast, so latest algorithm that exploited this fact had a major advantage over those that did not.


Anonymous said...

In 2006 the best chess player in the world was defeated in a 6 game set with the computer having 2 wins and 4 ties, 0 losses. That probably rates as a stronger computer than deep blue.

Ted Vessenes said...

Ah, I hadn't heard that. At any rate, the matches against Kasparov show that computing power and algorithms are right now on the cusp of being able to defeat the best Chess players in the world, and they still aren't anywhere near solving the problem of Go. I'm certainly they'll solve it one day, but it's probably a few decades off.

Anonymous said...

I couldn't find it last night, but here :