Monday, September 29, 2008

Anatomy Of The Brain

Last week I wrote in brief about the differences between conscious and non-conscious thought, and how experiments show humans use both. Even the most primitive animal brain is an extremely complicated organ, and the human brain is obviously the most advanced brain we've encountered. Conceptually there are three main sections of the human brain, corresponding to the evolutionary processes that produced the brain.

Pardon the aside, but I wanted to say a word to the Christian readers who believe in Intelligent Design instead of evolution. Feel free to interpret "evolution" as "how God chose to create life". Your personal belief about how humans came to exist won't change the substance of what I have to say, so don't let my choice of language get in the way. Remember, I was a Christian for 30 years. I was taught creationism by my parents. In high school I accepted the theory of evolution and just said "God used evolution to create humans". Now I just think "humans were created through evolution". I still think people who believe in Intelligent Design are wrong given the data, but I see no reason to be condescending or judgmental about it.

As I was saying, there are three main sections of the brain. At the core is the so-called "Reptilian Brain", or more formally the brain stem. This section handles basic reflexive responses such as "fight or flight", mating instincts, and the fear of other species. It also handles exactly one emotion: rage. Next is the Mammalian Brain, or limbic system. This is the area of the brain that handles all other emotions, as well as concepts like family, culture, and attachment. Some aspects of conscious thought and self-identity are handled by the mammalian brain as well. Last is the neo-cortex, which is responsible for higher level thought such as speech, reasoning, imagination, and speculation.

While there's a clear physical boundary between the reptilian and mammalian brains, the division between the mammalian (limbic) brain and the neo-cortex is not as clear, and there's more bleed-over between their functions. For example, some conscious thoughts are handled by the mammalian brain while others are handled by the neo-cortex.

Humans have all three brain sections, as do higher mammals such as other primates and larger mammals. Small mammals such as rodents do not have a neo-cortex, although they do have the limbic system and reptilian brain. And as the name implies, all reptiles have the reptilian brain, but lack the mammalian brain and neo-cortex. So the brain sections correspond to different evolutionary forks. The primary feature that separates mammals from reptiles is not hair or internal gestation, but a more advanced brain.

If that's true, then reptiles don't actually have conscious thought. They simply respond to stimuli in the same fashion every time. But mammals, having memory, can learn from past experiences and modify their behavior. Perhaps this is how mammals outlived the dinosaurs after a natural disaster hit the earth. Dinosaur brains weren't programmed to handle the "a meteor crashed into the earth and all your normal food died" situation, whereas mammals could learn to find new food sources.

In computer science terms, the reptilian brain is analogous to a state machine, or a simple circuit board. It's pure hardware: give it the same set of inputs and it produces the same set of outputs every time. It has no memory. The mammalian brain is more like a simple computer program, in that its responses are based both on sensory input and on past memories and experiences. Running the same computer program multiple times might produce different results. Mammals have the ability to learn throughout their lives while reptiles do not. And since the neo-cortex handles imagination, reasoning, and "what if?" scenarios, it's closer in function to an operating system. An OS can run multiple programs in parallel, similar to the neo-cortex's ability to entertain multiple thoughts at the same time, even conflicting ones.
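
For the programmers reading, here's a toy sketch of that analogy in C. The scenario and names are entirely made up for illustration; the point is just that a reflex is a pure function of its inputs, while a learned behavior also depends on stored memory, so the same stimulus can produce different responses over time.

    #include <stdbool.h>

    /* "Reptilian" reflex: a pure function of its inputs.  The same
     * input always produces the same output -- there is no memory. */
    bool reflex_flee(bool predator_nearby, bool cornered)
    {
        return predator_nearby && !cornered;   /* flee unless trapped */
    }

    /* "Mammalian" response: the output depends on the inputs AND on a
     * memory of past experience, so identical stimuli can lead to
     * different behavior over the animal's lifetime. */
    typedef struct {
        int bad_meals;   /* remembered bad outcomes at this food source */
    } memory_t;

    bool approach_food(const memory_t *memory, bool food_visible)
    {
        return food_visible && memory->bad_meals < 3;   /* learned avoidance */
    }

    /* Learning: a bad experience updates the memory, which changes
     * how approach_food() answers from now on. */
    void remember_bad_meal(memory_t *memory)
    {
        memory->bad_meals++;
    }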

This biological framework provides an interesting context from which to answer the questions "What are emotions?" and "Can a computer have emotions?" Some day I'll write on that, but there's a lot more to say than should be stuffed at the end of this column. In the meantime you'll just have to speculate.

Monday, September 22, 2008

Armchair Neuroscience

I'm really grateful for my undergraduate education at the University of Chicago. I believe it's the best liberal arts school in the country, in large part because it lacks the name recognition of Harvard or Yale while still placing academically in the top ten undergraduate institutions in America. No one goes to the University of Chicago for the fame or connections they might get at Harvard, and as a result the school attracts only those students who really want to learn. At the University of Chicago, they teach you how to think and analyze everything.

The joke in college was that a University of Chicago student didn't need to know anything about a subject to have an opinion on it. This is pretty much true. I distinctly remember getting an A on a paper about The Brothers Karamazov when I had only read the first quarter of the book. The trick was to relate the concepts discussed in class with the material I had actually read and with other texts read in that class. While I'm sure my professor would have been mortified to find this out, it says a lot about learning to analyze abstract concepts and come to conclusions that happen to be right.

In other words, The University of Chicago taught me to bullshit well.

Most of the time when I talk about a topic, I really do understand the subject matter. But I'm not willing to let a lack of expertise or study prevent me from sharing my opinion, so today's topic is neuroscience. As a disclaimer, I've never studied this subject in an official capacity in my life. All the information I have comes from personal reading of other people's studies, most of which are (I think) peer reviewed.

There! That's a more accurate disclaimer than you'll hear from any journalist, and yet they are often no better qualified than I am to write about these topics. In particular, I recently read an article on msnbc.com that hypothesized that your brain could be controlled by an inner zombie.

Yes, that's really their conclusion. And by "conclusion" I mean "hypothesis supported by circumstantial evidence that's phrased as a question so no one can refute your claim because you're not officially making one". Since there's no challenge in ridiculing a claim like this, let me instead give a fair and impartial summary of the article. Then I'll rip it to shreds and propose a hypothesis that better fits Occam's Razor by several orders of magnitude.

I'll admit the term "inner zombie" is just a term chosen to generate interest in the article. And in some sense it worked-- I'm linking to it. All they really mean by an inner zombie is a thought process that's not conscious and not (always) connected to rational decision making. The article cites some studies where people are blinded by deactivating the visual processing section of their brain and then shown a word. The people have no conscious knowledge of the word, but when they are asked to "guess" what word might have been shown, they guess correctly far more often than chance would predict. There's also a study showing that people who are temporarily blinded can still reflexively react to visual information they can't see. And while the article doesn't mention it, there's also a study done on people who are blind because their brain doesn't process images even though their eyes are fine. Compared to other blind people, they do a much better job of dodging obstacles while walking down the street, like other people or sign posts.

I'd like to note that these results are all from real, peer reviewed studies. Scientists aren't questioning the validity of these results. But they also aren't jumping to the same conclusion that MSNBC did. Here's the basic argument the article makes:
  1. People show the ability to reflexively process and react to visual information even when they consciously report blindness.
  2. Therefore an unconscious process has control over their conscious minds.
  3. Perhaps all consciousness is controlled by these unconscious processes and consciousness is a myth.
It's step 3 that doesn't make any sense. Sure, humans have unconscious processes that handle the same information conscious sections of the brain handle. But why does that mean the whole brain is controlled by the unconscious part? Given our daily experience, I'd suggest that the reverse is true. Our general actions are primarily controlled by conscious processes, and only when consciousness shuts off do the unconscious sections take control. There's no reason to suggest consciousness is a myth and a "zombie" is "controlling your brain".

But the brain is a really strange organ. The reasons the brain works this way are rooted in the differences between human brains, standard mammal brains, and the reptilian brain. Next week I'll talk more about these things and explain why I think dogs have consciousness but lizards do not.

Sunday, September 14, 2008

The Last Place You Look

When I was growing up, I always found it ironic how lost things were always found in the last place you looked for them. For some reason, whenever I was looking for a lost toy it was never in my bedroom or the play room. I'd find it outside underneath the deck, in the basement, or inside the sofa. It wasn't until I was older that I realized this wasn't coincidence; it was a mathematical truth. Put another way, when you're looking for something and you've found it, you stop looking. Things are guaranteed to be in the last place you look.

That's not the same as saying all places are equally likely to hold the lost object, of course. If I'm looking for the TV remote, it's less probable that the dog took it as a chew toy than that the remote fell between the sofa cushions. And it's far less probable that the remote is now part of an exhibit in the Smithsonian Institution.

The general purpose search algorithm that humans use for locating lost objects involves prioritizing possible locations from most to least likely and searching them roughly in that order. The actual order will get rearranged to save travel time. For example, if an object is most likely to be on top of your dresser, second most likely to be on the bathroom counter, and least likely to be under the bed, you'll still probably check your dresser first and under the bed second before heading to the bathroom. But in general the search is from most likely to least likely. That also explains why the place you find an object is always the last place you'd think to look. The optimal searching algorithm stipulates that you search in the least likely places last.
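
In code, that human strategy is essentially "sort the candidate locations by likelihood, then check them in order until you find the object". Here's a small sketch in C with made-up names; it ignores the travel-time reordering mentioned above.

    #include <stdbool.h>
    #include <stdlib.h>

    /* A candidate hiding place and our guess at how likely it is. */
    typedef struct {
        const char *name;
        double      probability;
    } location_t;

    /* qsort callback: most likely locations first. */
    static int by_likelihood(const void *a, const void *b)
    {
        double pa = ((const location_t *)a)->probability;
        double pb = ((const location_t *)b)->probability;
        return (pa < pb) - (pa > pb);
    }

    /* Check locations from most to least likely, stopping at the first
     * hit.  found_at() stands in for physically looking in that spot.
     * Wherever it returns true is, by definition, the last place you look. */
    const char *find_object(location_t *locations, int count,
                            bool (*found_at)(const location_t *))
    {
        qsort(locations, count, sizeof(location_t), by_likelihood);
        for (int i = 0; i < count; i++)
            if (found_at(&locations[i]))
                return locations[i].name;
        return NULL;   /* it's really lost */
    }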

So this brings me back to last week's topic of octrees. I described how to build the structure that BrainWorks uses to determine which item a player is nearest. But I haven't actually explained how BrainWorks uses the octree to do this. Consider this two dimensional tree (a quadtree):


Each item in the tree is one of the colored dots. The player is the black dot with the circles around it. Every rectangle refers to a particular subsection of the tree. In this example, the red dot is the root node of the tree. The lower left section of the tree is divided by the brown dot into four sub-regions, and the brown dot's lower left region is further subdivided because it contains another item (the purple dot). The circles represent the boundary of the potential search space as the algorithm progresses.

When the search algorithm first starts, it checks the distance from the root node to the player. This distance defines a circle around the player. If an item is closer to the player than the root node is, it must be inside that red circle. At this point the algorithm doesn't know that the green or blue dots are inside the circle. It doesn't even know that no items in the lower left quadrant are closer than the red (root) item is. But it can mathematically guarantee that nothing in the upper right quadrant could be closer, so that quadrant is ignored.

Because the player is in the lower left quadrant, the algorithm prioritizes searching that section first. This is the quadrant that contains the greatest area of the red circle. The next most likely place to look is the lower right quadrant, as the player is closer to the lower right quadrant than to the upper left one. The upper left quadrant is scheduled third. So the areas to check are, in order:
  • Lower Left
  • Lower Right
  • Upper Left
Then the algorithm merely recurses. First it checks the brown item in the lower left. A quick distance check shows the brown item is too far away, but the search continues on the brown region's subquadrants of upper right, then upper left, then lower right. As none of these regions have any items, the search of the brown region terminates quickly. And at this point in time, the red item is still the closest object to the player.

When the algorithm starts checking the lower right quadrant of the red item, it notices the blue item is closer. So from now on, it compares all potential search areas against the blue circle instead of the red one. No further items exist in the lower right quadrant, so the blue item remains the closest item to the player.

Last, the algorithm wants to check the region to the upper left of the red (root) item. However, when compared to the new blue circle, the search algorithm realizes that the region does not intersect the circle. Since the circle is the area of potentially closer items, nothing in the upper left can be closer than the blue item, so the entire upper left region is ignored. And since all regions were prioritized by distance to the player (from closest to farthest), the search process is done. All potential items have been checked, so the blue item is the closest. The algorithm doesn't even need to check how close the green item is to the player, because that entire upper left quadrant was skipped.

The octrees in BrainWorks function the same way, only in three dimensions instead of two, and using spheres instead of circles. That answers last week's question of why the algorithm might need to search up to 7 subregions to find the optimal solution. The closest item isn't necessarily in the same section of the octree as the player, but the diametrically opposed region can always be ignored. That leaves 7 of the 8 regions to check. In practice, most regions are quickly pruned away once a closer item is found, in the same way the region containing the green item was ignored when the blue item was discovered. Even though the example tree contained seven items, only three of them needed to be checked (Red, Brown, then Blue) before the closest item could be found.
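
For those who want to see the search spelled out, here's a rough sketch in C. This is my own illustration, not the actual BrainWorks source: it visits the octant containing the player first, then considers the other seven, skipping any octant whose region provably cannot intersect the current best sphere. The pruning test here only uses the three dividing planes through each node, which is a conservative (never incorrect) version of the region check described above.

    #include <math.h>
    #include <stddef.h>

    typedef struct { float x, y, z; } vec3_t;

    /* One node of the point octree: the stored item location plus up to
     * eight children, indexed by octant relative to that location
     * (bit 0: +x side, bit 1: +y side, bit 2: +z side). */
    typedef struct octree_node_s {
        vec3_t                point;
        struct octree_node_s *child[8];
    } octree_node_t;

    static float dist_sq(vec3_t a, vec3_t b)
    {
        float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        return dx * dx + dy * dy + dz * dz;
    }

    /* Octant of 'p' relative to the dividing point 'split'.  The >=
     * convention must match whatever was used to build the tree. */
    static int octant_of(vec3_t p, vec3_t split)
    {
        return (p.x >= split.x) | ((p.y >= split.y) << 1) | ((p.z >= split.z) << 2);
    }

    /* Recursive search.  best/best_dist_sq track the closest item found
     * so far -- the shrinking sphere from the walkthrough above. */
    static void nearest_recurse(const octree_node_t *node, vec3_t query,
                                const octree_node_t **best, float *best_dist_sq)
    {
        if (node == NULL)
            return;

        /* Is this node's item closer than anything found so far? */
        float d = dist_sq(node->point, query);
        if (d < *best_dist_sq) {
            *best_dist_sq = d;
            *best = node;
        }

        int home = octant_of(query, node->point);
        for (int i = 0; i < 8; i++) {
            int oct = home ^ i;   /* i == 0 gives the player's own octant */

            /* Lower bound on the squared distance from the query to this
             * octant: sum the squared gaps across each dividing plane the
             * octant sits on the far side of. */
            float gap_sq = 0.0f;
            if (i & 1) { float dx = query.x - node->point.x; gap_sq += dx * dx; }
            if (i & 2) { float dy = query.y - node->point.y; gap_sq += dy * dy; }
            if (i & 4) { float dz = query.z - node->point.z; gap_sq += dz * dz; }

            /* Recurse only if the octant could hold something closer. */
            if (gap_sq <= *best_dist_sq)
                nearest_recurse(node->child[oct], query, best, best_dist_sq);
        }
    }

    /* Returns the node holding the item nearest to 'query', or NULL
     * if the tree is empty. */
    const octree_node_t *octree_nearest(const octree_node_t *root, vec3_t query)
    {
        const octree_node_t *best = NULL;
        float best_dist_sq = INFINITY;
        nearest_recurse(root, query, &best, &best_dist_sq);
        return best;
    }

A complete implementation would also visit the seven non-home octants in order from closest to farthest, as the walkthrough describes, so the best sphere shrinks as early as possible; the simpler loop here relies on the pruning test alone and returns the same answer with at most a few extra checks.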

Monday, September 8, 2008

Dude, Where's My Rocket Launcher?

Back in March I wrote about how the AI conceptualizes item placement. Rather than thinking of position in terms of raw coordinates, bots consider each item (or cluster of nearby items) to be a potential landmark. When statistical tracking is done for areas players are likely to be in, for example, the data is grouped by which item each player is closest to. Statistical tracking is one of the most powerful tools available to AI if you have a sufficiently large and well organized data set. BrainWorks uses this data to analyze which areas of the level a player is most likely to be in. Not only can this information help bots track down enemies; it also helps a hurt bot avoid enemies. The more data it can track, the better. So by extension, the faster it can translate from raw coordinates to the nearest item, the better. However, this is not a simple task.

If a level has N items in it and a player is located at position (X, Y, Z), what's the fastest way to determine which item is closest to that player?

Well, you could just check the distance to every item and use whichever one is closest. If you read last week's column, you'll remember that this search requires linear time. If you double the number of items on the level, it will take twice as long to find the closest one. Certainly we can do better than linear time.
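
As a baseline, the linear scan looks something like this (a sketch with types and names I've made up, not the actual BrainWorks code):

    #include <math.h>

    typedef struct { float x, y, z; } vec3_t;

    /* Brute force: compare the player's position against every item.
     * Linear time -- double the items, double the work. */
    int nearest_item_linear(const vec3_t *items, int num_items, vec3_t player)
    {
        int   best = -1;
        float best_dist_sq = INFINITY;

        for (int i = 0; i < num_items; i++) {
            float dx = items[i].x - player.x;
            float dy = items[i].y - player.y;
            float dz = items[i].z - player.z;
            float d  = dx * dx + dy * dy + dz * dz;
            if (d < best_dist_sq) {
                best_dist_sq = d;
                best = i;
            }
        }
        return best;   /* index of the closest item, or -1 if there are none */
    }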

There are a number of data structures that can solve this problem. Some of them might require more setup time but give slightly faster search time. The structure I chose is an octree. It's a tree that holds some number of three dimensional locations. If you want to learn more, I recommend the Wikipedia article on quadtrees as well. A quadtree is the two dimensional analog of an octree, so it's a bit easier to visualize.

In the example I gave last week of a filing cabinet holding 100 pieces of paper, 1 cabinet held 10 folders and each folder held 10 papers. This is a total of 111 pieces of data stored (100 papers + 10 folders + 1 cabinet). In computer science terms, each data storage location in a tree is called a "node". The node where searching begins is called the "root node". In the filing cabinet example, the filing cabinet itself is the root node. The cabinet and folder nodes split 10 ways, and the pieces of paper do not contain any sub-nodes. Nodes that do not further branch are called "leaf nodes".

Getting back to the octree example, the BrainWorks AI code needs to store the spatial locations of 50 or even 200 items in a tree. Each node in the octree contains one of the stored points, and the node branches 8 ways. Visually, think of each node as dividing space into 8 octants-- upper north west, upper north east, upper south west, upper south east, and four more counterparts for the lower sections. All points that are above, north, and west of the root node are stored underneath the upper north west node, and so on for all other octants.

For example, suppose there are 50 items on a level. One of them is stored in the root-- say it's a rocket launcher. Of the 49 remaining items, 6 of these item locations are above, north, and west of the rocket launcher. The root node will link to a node for one of these six items-- a megahealth. The other five items will be stored underneath that megahealth.

Again, the octant division property applies to all nodes, not just the root. So the megahealth also divides its space into eight octants, and the five remaining items will be appropriately placed underneath it. Keep in mind that each of these items is guaranteed to be above, north, and west of the rocket launcher. But items that are above, north, and west of the megahealth will be stored in a different octant than those above, north, and east of the health. Given that there are eight possible areas and only five remaining items to store, chances are that each of these items will end up as its own direct sub-node of the megahealth. (If two items happen to both be below, north, and east of the megahealth, then the first will be stored under the megahealth, and the second will be stored under the first.)
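
In code, the structure and the octant rule might look something like the sketch below. The names and layout are my own illustration rather than the actual BrainWorks source, and the simple walk-down insertion is only there to show the octant rule in action; the real construction algorithm, described next, chooses its dividing nodes more carefully to keep the tree balanced.

    typedef struct { float x, y, z; } vec3_t;

    /* One node of the item octree: the item's location plus eight child
     * pointers, one per octant of the space this node divides. */
    typedef struct item_node_s {
        vec3_t              origin;     /* e.g. the rocket launcher's position */
        struct item_node_s *child[8];   /* NULL where an octant holds no items */
    } item_node_t;

    /* Which of the eight octants 'point' falls in relative to 'split':
     * bit 0 set means on the +x ("east") side, bit 1 the +y ("north")
     * side, bit 2 the +z ("upper") side. */
    static int octant_of(vec3_t point, vec3_t split)
    {
        return (point.x >= split.x)
             | ((point.y >= split.y) << 1)
             | ((point.z >= split.z) << 2);
    }

    /* Walk down the tree until an empty octant is found and hang the new
     * item there.  The "second item stored under the first" case from the
     * paragraph above falls out of this naturally. */
    static void octree_insert(item_node_t *root, item_node_t *item)
    {
        item_node_t *node = root;
        for (;;) {
            int oct = octant_of(item->origin, node->origin);
            if (node->child[oct] == NULL) {
                node->child[oct] = item;   /* item->child[] assumed all NULL */
                return;
            }
            node = node->child[oct];
        }
    }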

Searching a tree is generally proportional to the depth of the tree-- the greatest distance from the root node to a leaf node. And for a tree of 50 items, the depth is usually 3 or 4. Going up to 100 items might push the depth to 5 for some level layouts. So the octree structure makes answering the question of "which item am I nearest?" much, much faster than checking all 50 or 100 items.

Given that items have fixed locations on the level, the octree needs to be created once when the level starts, and the tree will never be modified during the game. (Ignore items that drop from dead players-- those are handled differently and not useful for statistical tracking anyway.) The actual algorithm is pretty simple, a bit slow, and very likely to produce a well balanced tree. Here's how it works.

Given a set of items:
  • Average together all their positions to find the center
  • Check each item to find which one is closest to the center
  • Add that item as the dividing node
  • Separate all remaining items into the eight octants defined by that dividing node
  • Repeat the process on each octant that contains at least one item
By "a bit slow", I mean this algorithm could take up to half a second on a very, very large map. That's still inconsequential, and well worth the one-time hit to produce a better tree.
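
Here's what that construction might look like in C. As before, this is my own sketch rather than the actual BrainWorks code, but it follows the steps listed above: average the positions, promote the item nearest that center to be the dividing node, bucket the rest by octant, and recurse on each non-empty bucket.

    #include <stdlib.h>

    typedef struct { float x, y, z; } vec3_t;

    typedef struct item_node_s {
        vec3_t              origin;
        struct item_node_s *child[8];   /* assumed NULL before the build */
    } item_node_t;

    static int octant_of(vec3_t point, vec3_t split)
    {
        return (point.x >= split.x)
             | ((point.y >= split.y) << 1)
             | ((point.z >= split.z) << 2);
    }

    static float dist_sq(vec3_t a, vec3_t b)
    {
        float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        return dx * dx + dy * dy + dz * dz;
    }

    /* Build a subtree from 'count' unlinked nodes and return its root.
     * The input array is used as scratch space. */
    static item_node_t *octree_build(item_node_t **nodes, int count)
    {
        if (count == 0)
            return NULL;

        /* Average all the positions to find the geometric center. */
        vec3_t center = { 0.0f, 0.0f, 0.0f };
        for (int i = 0; i < count; i++) {
            center.x += nodes[i]->origin.x;
            center.y += nodes[i]->origin.y;
            center.z += nodes[i]->origin.z;
        }
        center.x /= count;
        center.y /= count;
        center.z /= count;

        /* The item closest to the center becomes the dividing node. */
        int pick = 0;
        for (int i = 1; i < count; i++)
            if (dist_sq(nodes[i]->origin, center) < dist_sq(nodes[pick]->origin, center))
                pick = i;
        item_node_t *divider = nodes[pick];
        nodes[pick] = nodes[count - 1];   /* remove it from the working set */
        count--;

        /* Separate the remaining items by which octant of the divider they fall in. */
        item_node_t **bucket[8];
        int bucket_count[8] = { 0 };
        for (int oct = 0; oct < 8; oct++)
            bucket[oct] = malloc(count * sizeof(item_node_t *));
        for (int i = 0; i < count; i++) {
            int oct = octant_of(nodes[i]->origin, divider->origin);
            bucket[oct][bucket_count[oct]++] = nodes[i];
        }

        /* Repeat the process on each octant that contains at least one item. */
        for (int oct = 0; oct < 8; oct++) {
            divider->child[oct] = octree_build(bucket[oct], bucket_count[oct]);
            free(bucket[oct]);
        }
        return divider;
    }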

Of course, there's still the question of how you search the tree to find the nearest item, which I'll explain next week. Here's some food for thought, though: even if you identify which octant a point would reside in, you still need to search up to seven of the eight octants to guarantee you've found the closest item to your input point. Searching this tree is not done like most tree searches. For those of you who like thinking through these kinds of problems, try to figure out how the search algorithm actually works. I'll give the answer next week.

Given a set of item locations arranged in an octree and one player location, describe an algorithm to find the item nearest the player.

Good luck!

Monday, September 1, 2008

Needle In A Haystack

For a long time, I kept all my important documents in a single box. That would be stuff like my passport, car title, insurance information, and so on. After my wife and I bought a home, we had enough papers to file that our simple box wasn't sufficient. We bought a large filing cabinet, and it's served our needs ever since.

It's not that we needed to look up stuff any more frequently than before. But when we went from 100 pieces of paper to 1000, we didn't want it to take 10 times longer to find anything. To find stuff in the box, I riffled through it until I found the paper I needed, sometimes searching the entire thing. More stuff means the search takes longer.

A filing cabinet solves this problem by providing a natural structure for organizing the papers. For example, all the papers on the home can go in one file folder, all the car information in another, and all the identification documents like passports in a third. Then the lookup process becomes simpler, as I can ignore everything that's not part of the folder that keeps whatever I'm looking for.

This is a real savings in time. Imagine you have 100 different pieces of paper. You can either stuff them all in one box or you can organize them into 10 folders, each containing 10 papers. What's the average number of papers you'll have to look at to find any given piece, assuming you always know what folder should contain that piece of paper?

When looking through the box, you could look at anywhere between 1 and 100 pieces with equal probability. So you'll average (1 + 100) / 2 = 50.5 pieces before you find it. When looking through the folders, you have to find the correct folder from 10 options and then the correct paper of the 10 pieces in that folder. It will take between 1 and 10 tries to find the folder, or 5.5 on average. And similarly it takes on average 5.5 tries to find the paper in the folder. That's 11 tries total for the folder compared to 50.5 for the box. The filing system can be searched roughly five times faster!

Or is it really five times faster? What happens if you have 1000 pieces of paper instead of 100? When you have this much information, you could create folders and subfolders to further organize the papers. If you have 10 main folders, each of which contain 10 subfolders, and each of those contain 10 papers, then it will take 5.5 + 5.5 + 5.5 = 16.5 tries to find any one of the 1000 papers. It would take 500.5 tries if they were in a box. So in this case the filing system is about 30 times faster.
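
If you'd rather have a computer check that arithmetic, here's a tiny C program that computes the same averages, assuming (as above) that every paper is equally likely and every folder splits ten ways:

    #include <stdio.h>

    /* Average number of looks to find one of n equally likely items
     * when you check them one at a time: (1 + n) / 2. */
    static double flat_search(double n)
    {
        return (1.0 + n) / 2.0;
    }

    /* Average looks with 'levels' nested layers of 10-way folders:
     * an average of 5.5 looks to pick the right entry at each layer. */
    static double filed_search(int levels)
    {
        return levels * flat_search(10);
    }

    int main(void)
    {
        printf("100 papers:  box %.1f looks, folders %.1f looks\n",
               flat_search(100), filed_search(2));    /* 50.5 vs 11.0 */
        printf("1000 papers: box %.1f looks, folders %.1f looks\n",
               flat_search(1000), filed_search(3));   /* 500.5 vs 16.5 */
        return 0;
    }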

The more data there is to store, the more efficient the filing system is compared to the box. That's because the filing system isn't just a constant factor faster; it scales in a fundamentally better way. In mathematical terms, searching through the box takes linear time. When you add ten times as much stuff, it takes ten times as long. But searching the filing system takes logarithmic time. Adding ten times as much stuff just means it takes one more search step. Very roughly speaking, search time is proportional to the number of zeros in the size of the data. Searching 100 things takes twice as long as searching 10 things, and searching 1000 things takes three times as long. Want to search 1,000,000 things? That's only six times longer than searching 10 things.

You might be wondering what all of this has to do with artificial intelligence. At the heart of almost all AI is data, and data doesn't mean anything if you can't find it when you need it. That means efficient data storage is the required foundation for artificial intelligence. An algorithm is only as good as the data it has access to, and good AI needs a lot of data. If all the data were just shoved in a box, there's no way the computer could run the AI's thought process fast enough. Indeed in almost all computer science, the way in which data is organized is just as important as the actual data being stored. After all, what good is information if you can't find it? Next week I'll give a concrete example of how BrainWorks puts this concept to use.