Monday, July 28, 2008

Mission Impossible

One of the basic realities of life is that if something is easy, it's already been done. In the grand scheme of things, all unsolved problems are hard problems. While that might not have been true 300 years ago, advances in widespread education and information technology have made it easy to find potentially talented individuals and make sure everyone knows about their work. The result is that for almost every technological problem you can think of, either someone has solved it already or it's very difficult.

So what do you do when you need to do something that hasn't been done before and it turns out the problem is very hard? How do you break an impossible problem into something solvable? While there's no right or wrong way to go about this, there are three basic approaches you can use. Assume that a perfect solution is either practically or literally impossible, and you just want the best solution that's actually feasible.

#1: Improve on an existing base

The concept here is to focus on the small, incremental progress you can achieve rather than the large, difficult end objective. This is the approach I took when designing the general high-level goal selection in BrainWorks. Bots must constantly decide what they are most interested in doing. Should they track down a target or look for more players? In capture the flag, should they defend their flag or go for the opponent's? I confess I've added nothing revolutionary to this logic, and the correct choice is something even experienced human players have trouble making. Instead I just added a few pieces of additional logic to help the bots decide. For example, BrainWorks bots will consider tracking down a target they've recently heard but not seen. It's not algorithmically amazing, but it's progress, and that's what matters.
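
To make that concrete, here's a minimal C sketch of the kind of incremental heuristic I mean. The structure, function name, and time window are made up for illustration; this is not the actual BrainWorks goal selection code.

```c
/* Illustrative sketch only -- these names and the time window are
 * hypothetical, not the actual BrainWorks goal selection code. */
typedef struct {
    float last_heard_time;   /* server time the enemy was last heard */
    float last_seen_time;    /* server time the enemy was last seen  */
} enemy_info_t;

/* Consider hunting an enemy that was heard recently but hasn't been
 * seen since, because the sound gives a rough location worth checking. */
static int ShouldHuntHeardEnemy(const enemy_info_t *enemy, float now)
{
    const float HEARD_WINDOW = 3.0f;    /* seconds a sound stays "recent" */

    if (now - enemy->last_heard_time > HEARD_WINDOW)
        return 0;                       /* the sound is stale */
    if (enemy->last_seen_time >= enemy->last_heard_time)
        return 0;                       /* already have visual contact */
    return 1;                           /* go investigate the noise */
}
```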

#2: Break the problem into pieces and only do some of them

I used this approach when designing the item pickup code. Roughly stated, the implemented algorithm estimates how many points the bot would gain from picking up each of the different items and then selects the item worth the most. While the item selection is reasonably good, a lot of corners were cut for the purpose of reducing complexity. How valuable is a personal teleporter anyway? BrainWorks just values it at 0.6 points and calls it a day, but clearly such an item is far more useful in some situations than others. And the estimates of where enemies are likely to be are based on proximity to items, not on actual geographic locations (like being in a specific room or hallway). The estimates aren't remotely perfect, but they are still good enough for the purpose they serve: a rough guess at what item the bot should pick up next. Designing this code wouldn't even have been possible without ignoring all the truly hard parts of item pickup.
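
For illustration, a stripped-down version of that point-based selection might look like the C below. The fields and the points-per-second trade-off are assumptions for the example, not the real BrainWorks implementation, which weighs far more factors.

```c
/* Rough sketch of point-based item selection with made-up fields --
 * the real BrainWorks code estimates these values far more carefully. */
typedef struct {
    const char *name;
    float       base_points;   /* e.g. the flat 0.6 for a teleporter  */
    float       travel_time;   /* seconds of detour to reach the item */
} item_option_t;

/* Choose the item whose estimated point gain, minus the points lost to
 * the time spent fetching it, is highest. */
static const item_option_t *SelectBestItem(const item_option_t *items,
                                           int count, float points_per_sec)
{
    const item_option_t *best = NULL;
    float best_score = -1e30f;
    int i;

    for (i = 0; i < count; i++) {
        float score = items[i].base_points
                    - items[i].travel_time * points_per_sec;
        if (score > best_score) {
            best_score = score;
            best = &items[i];
        }
    }
    return best;
}
```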

#3: Accept defeat and deal with the consequences

This is essentially The Attitude of the Knife. Sometimes things just don't work out and can't be fixed, and all you can do is accept that reality and handle the consequences as best you can. This was the approach I took for the navigation code in BrainWorks. As I've mentioned before, the navigation tables have huge issues on anything other than a flat, two dimensional map. While I'd love to solve this problem and I'm certain I could do it, I just didn't have enough time to tackle it. There were a few clearly bad navigation decisions (such as jumping between ledges) that I wrote code to manually fix, but those were essentially stopgap measures.

While it's not ideal, there's nothing wrong with accepting defeat against hard problems. After all, you aren't the only person in the world who couldn't solve the problem.

Monday, July 21, 2008

Gaming The System

Typically, a game artificial intelligence developer views their job as creating an opponent that will entertain the player. There's nothing wrong with this view. After all, if the player isn't having fun, it doesn't matter what the AI does well. The most realistic play in the world doesn't mean a thing if the player doesn't enjoy playing against it. It's certainly possible to make an AI that plays like that. If a human can play so well it frustrates their opponents, an AI can do that too. But is the problem then the AI design or the game design? I'd argue the game design is at fault. In other words,

AI design is a counterpoint to game design.

Game design really has two parts to it: making the game fun and making the game challenging. Alternatively, you can think of this as making the game fun for the winners and making the game fun for the losers. If a game is too easy for you, then chances are your opponents are getting crushed and aren't having a good time. A good multiplayer game is one that everyone enjoys regardless of whether they win. And a good single player game is one where it's neither too hard nor too easy to win.

That means AI can be a valuable tool in the design feedback loop for a game. If the AI continually exploits a certain unsatisfying strategy, you can expect players will do the same, and the game should be changed. When the AI uses a variety of interesting strategies, that's a good sign your game design is solid.

In the design of BrainWorks, I came across a few instances of AI behavior that commented on the game design of Quake 3. For example, on any level with a reasonably accessible BFG, the bots will do everything in their power to grab and use that weapon. They might even camp BFG ammo just to make it harder for other players to use the gun. While it makes gameplay boring and predictable, they aren't necessarily doing the wrong thing. The AI determined that the BFG was 10 times better than the next best option and made its choices accordingly.

There's also the issue of the machinegun. You might notice that if there are a lot of players on a level, the bots will sometimes flat out skip weapon upgrades and just use the machinegun. This also isn't a mistake. When there are more players on the level, each player's life expectancy decreases. When the bot won't live that long anyway, there's no point in wasting time fetching a new weapon. The starting machinegun deals only slightly less damage and comes with a reasonable amount of ammunition.

In the case of the BFG, the weapon is obviously too powerful, but it's pretty clear that was the point. The machinegun is a different case though. Is it really too powerful or do players just start with too much ammo? Or maybe it's fine that this is a strategy when a level becomes overpopulated, and human players that waste too much time picking up weapons are just playing it the wrong way.

This is just opinion, but I believe the problem is the amount of damage the machinegun deals. The reason I think this is that there are actually two damage values for machinegun bullets. Normally it's 7 damage per hit, but in teamplay mode it's only 5 damage. Apparently id Software determined that in a teamplay game, the standard 7 damage was too much. And they're right, but the problem wasn't the weapon's use in team games. It's the weapon's initial power in games with lots of people, even large free-for-all games. What would be best is if the weapon always dealt just 5 damage. Then freshly respawned players would have an incentive to grab a new weapon as soon as possible. In game design, the simpler solution is usually the best choice.

On a related note, game design is also a hobby of mine. My wife and I have two upcoming board games that will hopefully be released at the end of this year or the beginning of next. If people are interested in more columns explaining the thought process that goes into game design, I'm happy to write more on that subject. In my mind, designing a game is very similar to designing artificial intelligence.

Monday, July 14, 2008

The Power of Spite

In almost any situation where there are limited resources, people are presented with the option of spiteful action. These are actions that penalize someone else while giving you no benefit. Most of the time the spiteful play even has a slight cost, such as additional time. In war, there can be spiteful military decisions, such as General Sherman's sowing of salt during his march to the sea in the American Civil War. Sowing salt makes the land completely unfarmable for a period of several years, and Sherman's choice to do this crippled a large part of Georgia's agriculture.

Of course, one of the cardinal rules of war is "waste nothing" since resources are tight. While these spiteful actions cost Sherman's troops a little time and resources, they had a tremendously detrimental effect on the Southerners, both demoralizing them and making it harder for the local citizens to survive. Without commenting on the ethics of this decision, I will merely note that Sherman's spiteful actions gave an advantage to his own side, the North.

In a first person shooter game like Quake 3, players have countless opportunities to make spiteful plays. Every time they pick up an item they don't need, they are potentially preventing another player from getting that item. A good AI player needs to know when it should pick up an item out of spite. Naturally, it shouldn't pick up every item, or it would waste far too much time. There has to be a method for deciding when the spiteful play is worthwhile and when it's not.

In BrainWorks, the additional time cost is automatically factored into the item selection code, as extra time spent grabbing items means a loss of points from delaying the bot's main goal. But the bots still need to know exactly how much an item pickup costs the opponents. Once that value is known, the bot estimates its relative score gain as the spiteful point loss divided by the number of opponents. In other words, keeping an item from 1 of 2 opponents is a lot more valuable than keeping it from 1 of 10. The fewer opponents the bot has, the more it prefers spiteful pickups. So while these spite values don't matter much in large group games, they can have a significant impact on a bot's item pickup selection in 1-on-1 matchups.
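
As a rough sketch, that relative-gain weighting could look like this in C. The function and variable names are illustrative, not taken from the BrainWorks source.

```c
/* Sketch of the relative-gain weighting described above; the names are
 * illustrative, not taken from the BrainWorks source. */
static float SpiteScoreGain(float opponent_point_loss, int num_opponents)
{
    if (num_opponents <= 0)
        return 0.0f;               /* no one to deny the item to */

    /* Denying an item to 1 of 2 opponents improves the bot's relative
     * standing far more than denying it to 1 of 10. */
    return opponent_point_loss / (float)num_opponents;
}
```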

This still leaves the question of exactly how much a spiteful item pickup costs opponents. Players know it's great to grab armor even when you're already topped off, while grabbing ammo will rarely have an effect. But BrainWorks bots need to derive this fact. The AI does that by predicting how an individual item pickup would increase the score of each of three typical players. There's a freshly spawned player, who generally wants armor and weapons. There's a hurt player, who benefits the most from health. And there's a decked out player, who probably wants powerups. The AI computes the scores for each player with and without each possible item. The item's spite value is defined as the greatest point gain it gives any type of player. The code uses the maximum estimate rather than the average because the players who most need an item are the most likely to pick it up.
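
Put together, a simplified version of that spite estimate might look like the following C sketch. The archetype list matches the three players described above, but the scoring routine it calls is a hypothetical placeholder, not the actual BrainWorks code.

```c
/* Sketch of the spite value: the largest score gain the item would give
 * any of three archetypal players.  ArchetypeScore() is a hypothetical
 * placeholder, not the real BrainWorks scoring routine. */
typedef enum {
    ARCH_FRESH_SPAWN,   /* wants armor and weapons */
    ARCH_HURT,          /* wants health            */
    ARCH_DECKED_OUT,    /* wants powerups          */
    ARCH_COUNT
} archetype_t;

/* Hypothetical: estimated score for an archetype with or without the item. */
extern float ArchetypeScore(archetype_t arch, int item, int has_item);

static float ItemSpiteValue(int item)
{
    float max_gain = 0.0f;
    int a;

    for (a = 0; a < ARCH_COUNT; a++) {
        float gain = ArchetypeScore((archetype_t)a, item, 1)
                   - ArchetypeScore((archetype_t)a, item, 0);

        /* Use the maximum, since the player who most needs the item is
         * the one most likely to grab it. */
        if (gain > max_gain)
            max_gain = gain;
    }
    return max_gain;
}
```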

That means the spite value is really just an estimate of the item's general usefulness. In fact, the bots also use these estimates to isolate the most valuable items on the level and time their respawns, and as a general indicator of the areas other players are likely to be in.

Monday, July 7, 2008

A Simple Solution

As the story ended last week, I had uncovered a very strange bug in BrainWorks. There was up to a 10% accuracy difference between the first bot added to the game and the last bot. No one guessed the correct cause, but this one from cyrri was the closest:
bots occupy client slots in the same order as they are added.
in the very rare cases of two clients killing a third one simultaneously, it is allways the one with the lower slot id that gets the frag, becuase his usercommands are processed first. the other one gets a missed shot.
This actually does happen, but it only accounts for a 1% to 2% accuracy change between the first and last bots. Also, this value increases as bot accuracy increases, since it's more likely that two bots will be lined up for a good shot at the same time.

The real culprit was the mechanism through which the server processes client commands. The server processes each human player's inputs as soon as it receives them (as often as every 3 milliseconds for a human), but the inputs for bots are processed exactly once every 50 milliseconds. Each bot in turn handles its attacks and then moves; then the next bot is processed, and this happens in the order the bots were added to the game. See the problem?

Every bot made its decisions based on where the other bots were located at the end of the last server frame, but only the first bot actually attacked targets that were still in those positions. By the time the last bot aimed, every target had already spent 50 milliseconds moving. The first bot had 0 ms of effective latency and the last bot had 50 ms. Now 50 milliseconds isn't too bad, except the last bot was playing as if its latency were 0. That's why it missed around 10% more.

Since one of my original project constraints was that nothing would change in the server code, at least some bots had to play with 50 milliseconds of latency. There were no changes I could make to reduce this problem. So the solution was to add latency to all the bots. I wrote a feature into the AI core that tracked the positions of all players in the game, bots and humans alike, for roughly the last half second of updates. Then if a bot needed to know where a target was, it looked up the properly lagged position and did basic extrapolation to guess where the target would be at the exact moment of attack.
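
A minimal sketch of that lagged lookup, assuming a small fixed-size buffer of position snapshots, might look like this in C. The structure layout and sample rate are assumptions for the example, not the actual BrainWorks data structures.

```c
/* Minimal sketch of the lagged-position lookup.  The structures, the
 * sample rate, and the assumption of at least one stored snapshot are
 * all simplifications, not the actual BrainWorks data layout. */
#define HISTORY_FRAMES 10          /* ~half a second at 50 ms per frame */

typedef struct {
    float time;                    /* timestamp of this snapshot   */
    float origin[3];               /* player position at that time */
    float velocity[3];             /* player velocity at that time */
} pos_snapshot_t;

typedef struct {
    pos_snapshot_t history[HISTORY_FRAMES];   /* oldest first, newest last */
    int            count;                     /* assumed to be at least 1  */
} pos_history_t;

/* Find the newest snapshot no later than (now - latency) and linearly
 * extrapolate it forward to the moment the shot will land. */
static void LaggedPosition(const pos_history_t *h, float now, float latency,
                           float attack_time, float out[3])
{
    const pos_snapshot_t *snap = &h->history[0];
    float lagged_time = now - latency;
    float dt;
    int i, axis;

    for (i = 0; i < h->count; i++) {
        if (h->history[i].time <= lagged_time)
            snap = &h->history[i];
    }

    dt = attack_time - snap->time;
    for (axis = 0; axis < 3; axis++)
        out[axis] = snap->origin[axis] + snap->velocity[axis] * dt;
}
```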

Implementing this system gave all the bots very similar accuracies (within 1%, due to the issue cyrri pointed out). But now the problem was that all bots had the same accuracy as the worst bot, when they should have had the accuracy of the best bot. It turned out this "basic extrapolation" wasn't good enough. The original Quake 3 AI code used linear trajectories to estimate where a target would end up, nothing more sophisticated than that. So if a bot aimed at a target running down a staircase, the bot would keep aiming into the ground for a while after the target reached the bottom, even though a human would know the target would level out and keep running straight.

I tried some slightly more advanced solutions, doing basic collision checking against walls, but that didn't solve the problem of running down a ledge. I eventually concluded that humans have a learned sense of physics they take into account, and the bots would need that same sense if they were to play like humans. Solving this problem was both straightforward and time consuming.

I made an exact duplicate of the entire physics engine, modified it for prediction, and placed it in the AI code.

Every detail needed to be modeled-- friction, gravity, climbing up and down ledges, movement into and out of water. Even the force of gravity acting on a player on an inclined ledge. Everything had to be duplicated so that the bots could get that extra 10% accuracy in a human-like manner. It was not easy, and I learned far more than I wanted to know about modeling physics in a 3D environment. But in the end, it was worth it to see bots that could aim like humans.

Monday, June 30, 2008

A Peculiar Bug

In my postmortem of BrainWorks, I mentioned that one of the big things I did right was creating a powerful debugging infrastructure directly in the code base itself. In general, software is easiest to test and maintain when as much of the testing as possible is fully automated. That's easy for a program like Excel, where you can create a battery of sheets with strange formulas that might cause errors, and then confirm that the numbers all match up. If the software creates something measurable, it can usually be automatically tested.

Naturally, this means automated testing is far harder for Artificial Intelligence than for other areas of computer science. The goal of BrainWorks is "bots that appear to be human". How do you teach a computer to measure whether a player's decisions appear to be human? That's tantamount to writing a program that can judge a Turing test, which is as hard as passing one. Fully automated testing of the AI isn't an option, but there are certainly things that can assist testing. All you have to do is identify a measurable component and check whether it meets human expectations.

For aiming, BrainWorks already measures accuracies with each weapon in each combat situation. The bot knows that it scores perhaps 35% with the shotgun against an enemy a few feet away and 5% against an enemy a hundred feet away. But it doesn't know if those numbers are reasonable until I tell it what to expect. The vast majority of the testing I did with aiming involved forcing a bot to use a particular weapon, making it play for hours, and then checking if the numbers "looked good". Generally they didn't, and I had to figure out why they weren't matching my expectations. In this testing process, I uncovered a variety of interesting bugs. Large sections of the aiming algorithm were rewritten around a dozen times trying to get things right. I found several errors in how accuracy was even being measured. But the strangest error I encountered was something totally unexpected.
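
For reference, the bookkeeping behind those accuracy numbers can be as simple as the C sketch below. The range buckets and field names are made up for illustration; the real code tracks more combat situations than this.

```c
/* Sketch of per-weapon accuracy bookkeeping; the range buckets and field
 * names are made up for illustration. */
#define RANGE_BUCKETS 4            /* e.g. point blank, near, mid, far */

typedef struct {
    int shots[RANGE_BUCKETS];
    int hits[RANGE_BUCKETS];
} weapon_accuracy_t;

/* Record one shot in the bucket for the target's range. */
static void RecordShot(weapon_accuracy_t *acc, int bucket, int hit)
{
    acc->shots[bucket]++;
    if (hit)
        acc->hits[bucket]++;
}

/* Accuracy percentage for one range bucket, for comparison against the
 * values a human player would be expected to score. */
static float AccuracyPercent(const weapon_accuracy_t *acc, int bucket)
{
    if (acc->shots[bucket] == 0)
        return 0.0f;
    return 100.0f * (float)acc->hits[bucket] / (float)acc->shots[bucket];
}
```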

I was monitoring Railgun accuracy as a way of testing the overall ability to aim. It's an instant-hit weapon with no spread and infinite range, so it's a great initial test case. I loaded up eight bots on a wide open level and forced them all to have exactly the same skill, then ran them for several hours. Curiously, when I checked the results, their accuracies weren't all the same. The best had an accuracy around 75% and the worst was around 65%. Moreover, their scores reflected this.

I activated some mechanisms in the system to modify aiming behavior. First I turned off all errors, so the aiming should have been flawless, like watching a human who never makes mistakes. Their accuracies were still stratified. Then I completely circumvented the entire aiming code, forcing the bots to aim directly where they wanted to. That gave the bots what most people think of as an aim hack, so their aim should have been perfect. But even still, testing showed that some bots would get higher accuracies than others. Sure, there was one bot that would score 99.9% accuracy, but another bot would only score 97%. When a bot has perfect information and reflexes, it should not miss one in 30 shots.

Then one day I noticed that the scores of all eight bots were sorted in alphabetical order. The bot with the first name alphabetically had the highest score (and accuracy), down to the bot with the last name having the lowest. Since the odds of that happening by chance are 1 in 8! = 40,320, I considered it curious but still possibly a coincidence. So I tested again, and each time the bots were sorted alphabetically! That was the final clue I needed to isolate this bug.

The script I used to start testing adds the bots in alphabetical order, so I tried swapping the order in which different bots were added, and their accuracies changed as a result. Each time, the most accurate bot was the one added to the game first and the least accurate bot was the one added last. For some reason, the internal numbering of the bots was affecting their aim.

So why exactly was this the case? I'll let you puzzle over it for the week. Next week I'll explain why this happened and the extreme amount of work that went into solving it.

Monday, June 23, 2008

Abstract Art

As part of a special program in high school, one of my classmates, Chris, applied to work under an orchestral conductor during his senior year. Chris was a brilliant musician, but the interview process with this particular conductor was a grueling two hour endeavor, much more than the typical person in the program would have gone through. The conductor wanted to find out if Chris could do more than play music well. He wanted to know if Chris understood what music was. Chris related that the experience was more of an intellectual sparring match than an interview, and everything revolved around answering one question:

What is art?

Now there are all kinds of meaningless philosophical answers to this question. "Art is whatever you say it is" or "Art is beauty". If you dig into these answers, you just end up with the tautology "Art is art". That's true, mind you, but not very helpful. However, the music conductor had a very good answer to the question, and he accepted Chris' application when Chris understood the answer.

Art is expression.

When I heard the answer, it really resonated with me. That's a real answer, not a tautology! The purpose of art is for the artist to convey a point to those people experiencing the art. That means art can be measured by two factors:
  • How important the point is to the audience
  • How well the audience understands the point
Note that the audience is a crucial part of art. When I heard this answer, I immediately understood why I'd always hated abstract art. It's bad art because it neglects the audience. Generally, abstract art boils down to one of two types.

Type 1: "This black square set against a mess of orange and red circles represents the current triumph of humanity against nature, but the ultimate futility of the struggle." While that might be a good point, it's not particularly novel, and the point isn't conveyed very well. Other artists have done a much better job of creating works that express this point without even intending to. Fundamentally I feel like this art exists so that art snobs can feel elitist, because they "understand" the point and other people don't. But obfuscating your point doesn't make you clever; it makes you a bad artist.

Type 2: "This black square set against a mess of orange and red circles doesn't represent anything because art doesn't need to have a point. It can mean anything and everything!" In that case, the artist isn't expressing anything at all. They are doing an exceptional job of doing nothing. It's not art because it's not expressing a real point.

This doesn't mean art needs to be hyper-realistic. Reality isn't as fuzzy as the impressionistic paintings Monet did of water lilies, yet millions of people from different cultures and backgrounds have all recognized the beauty of that artwork. The true test of art is not whether its point is realistic, but whether the point has been successfully conveyed to the audience.

With that as the framework, I'd like to talk about something different from Artificial Intelligence: the writing of this column. You might notice that I express a lot of stuff in layman's terms, glossing over details, sometimes going back to them and sometimes not. It's my goal that this column will never require more than a first year undergraduate's experience to understand. That's explicitly because it's crucial that people can understand what I'm writing. If I can't explain things simply, then I'm not doing my job as a writer.

Each week I pick a topic that might be interesting. This column is partially just a personal soap box-- I talk about things I find interesting and usually relate them to AI. But I cannot escape the purpose of art: art is only meaningful to the extent that your audience cares. So I want to get a better understanding of what topics and styles you, the reader, appreciate the most. It's my job to write about things you want to read.

I analyzed the articles I've written and they fall into one of four categories:
  • Technical discussions of exactly how an algorithm works. Discusses actual code.
  • Overview discussions of how to approach an algorithm. Discusses situations that would require programming, but not actual code.
  • Reflections on artificial intelligence and computer science in the abstract. Discussion relates to philosophy.
  • Philosophical discussions. Usually relates to intelligence, may or may not relate to artificial intelligence.
I'd love to know what kinds of articles people enjoy the most so I can plan a good mix of them. Please, post your preferences in the comments! And if you're interested in hearing a particular topic covered, also mention that and I'll try to cover it at a later date.

Monday, June 16, 2008

Give It A Go

While my primary area of expertise is First Person Shooter AI, I still get a number of questions about other AI applications. One of the hardest game AI problems to date, if not the hardest, is the game of Go. What's so surprising is that Go has simpler rules than Chess, yet good Go AI is several orders of magnitude harder to build than good Chess AI. Now consider that the best Chess AI to date only beat a grandmaster, Kasparov, when it was specifically trained to counteract Kasparov's play style. In the first set of games, which Kasparov won, he commented that he did better once he started making what he considered the second best move instead of the best move. Apparently that first AI was so hyper-tuned against Kasparov's play style that it had trouble with other similarly good plays it did not expect Kasparov to make.

Yes, that's the best the world has done for Chess AI: A program that can defeat one really good player. But Go? The best AI programs just rank as decent amateurs. And listen to how simple the rules are:
  • Players take turns placing stones of their color (black or white) on a square board (usually 19x19 but sometimes 9x9).
  • Stones of the same color that are adjacent vertically or horizontally are part of the same group and live or die together.
  • When a group is completely surrounded (no plays exist that add a stone to the group), that group is captured. Each lost stone is -1 point at the end of the game.
  • If a play would cause both the placed stone and an opposing group to be surrounded, the placed stone lives and the opposing group is captured.
  • You may not make a play that results in a previous board position.
  • The game is over when both players consecutively pass, opting not to play a stone. Each player scores 1 point for each empty space that is enclosed only by their color of stones.
What makes this problem so hard to solve? Well for starters, the 19x19 board is enormous. There are a lot of possible positions. Moreover, even though the plays are just of single stones, they come together to form different shapes like pixels form a picture. Different shapes have different properties, strengths, and weaknesses. A typical alpha-beta search becomes rapidly unusable when the average branching factor is on the order of 300.

Nevertheless, in the past few years there have been a few significant advancements in the field of Go AI. Rather than trying to solve Go the same way Chess is solved, researchers have turned to Monte-Carlo algorithms, which have shown a lot of promise. In particular, in 2006, "Upper Confidence Bounds applied to Trees" (or UCT) was developed. I'm not an expert on this subject, but in brief, the concept works like this.

While it's very hard to determine which space on a 19x19 board is the best play, it's very easy to compute the score of a Go board. Contrast this with Chess, where the actual pieces in play are secondary to the question of whether or not a King has been captured. So the very rough idea is to try most options (pruning the obviously bad plays) and play a semi-random game for each of them (again, pruning obviously bad plays). Far less time is spent at each stage of the tree. If there are 50 possible moves that aren't terrible, one of them is just selected at random. There's no aggressive search to determine which of those 50 is the best. When the test game for a given play finishes, its result provides an estimate of how good that particular initial placement is. Other placements are tried the same way, and the one with the highest estimated score is selected as the computer's official move.
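
Here's a bare-bones C sketch of that playout idea (not UCT itself, which layers a confidence-bound tree search on top of it). The board type and the move generation, scoring, and copy routines are assumed placeholders, not any particular Go engine's API.

```c
#include <stdlib.h>   /* rand() */

/* Bare-bones playout sketch.  The board type and the move generation,
 * scoring, and copy routines are assumed placeholders. */
typedef struct board_s board_t;

extern board_t *CopyBoard(const board_t *b);
extern void     FreeBoard(board_t *b);
extern int      GenerateReasonableMoves(const board_t *b, int *moves, int max);
extern void     PlayMove(board_t *b, int move);
extern int      GameOver(const board_t *b);
extern float    ScoreForPlayer(const board_t *b, int player);

/* Estimate a candidate move by playing one semi-random game to the end
 * and scoring the final position.  Averaging many such playouts per
 * candidate gives the estimate used to pick the computer's move. */
static float EstimatePlacement(const board_t *start, int move, int player)
{
    board_t *b = CopyBoard(start);
    int moves[512];
    float score;

    PlayMove(b, move);
    while (!GameOver(b)) {
        int count = GenerateReasonableMoves(b, moves, 512);
        if (count == 0)
            break;
        PlayMove(b, moves[rand() % count]);   /* no deep search, just pick one */
    }

    score = ScoreForPlayer(b, player);
    FreeBoard(b);
    return score;
}
```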

Of course, the actual algorithm is quite a bit more complicated than that, but that's the general idea behind the innovation. The crucial lesson here for AI designers is this:

Focus on your goal.

In the end, Go is just a score maximization problem. It's very easy to get bogged down in the details of whether a move nets 5 points or 7 points, quickly producing a state where far too many calculations need to be done. Often the best solution involves determining how much processing time you can actually spend, and then working backwards to figure out which methods will give you the most information for that amount of time. In Go, calculating the board's score is relatively fast, so the latest algorithms that exploited this fact had a major advantage over those that did not.