Challenges and Approaches Towards Building a Successful StarCraft II AI System

The focus of this paper is on the challenges we currently face in building a successful StarCraft II AI system and the existing approaches to solving them, covering three major topics: decision making, micromanagement, and learning.

Hongze Yu, Blogger

November 23, 2016

16 Min Read

One year ago, Google DeepMind completed the best and most intelligent AI system humans had yet seen: AlphaGo. With its learning and decision-making abilities, AlphaGo was able to defeat the Go world champion, Lee Sedol, 4-1 in a five-game match. Ever since that victory was announced, the computer science community has believed its next challenge would be StarCraft II. StarCraft II, a real-time strategy (RTS) game, is widely regarded as one of the most demanding esports and among the most difficult video games to master. As a StarCraft II player, I have played numerous matches against the current built-in AI, and it does not seem challenging at all. In interviews, StarCraft II professionals have likewise said they do not believe AlphaGo could ever beat humans at this game. However, this past week at BlizzCon in Anaheim, Blizzard Entertainment, the company that developed StarCraft II, announced that it will open StarCraft II up to players and computer scientists as an AI research platform. On the same stage, a representative of Google DeepMind announced a collaboration with Blizzard on StarCraft II, which left everyone, myself included, extremely excited about the possibility of watching an AlphaGo-like system challenge the best human players. Nonetheless, we cannot deny that such a system still has a long way to go before it can actually succeed at this game, and I have become interested in exploring the challenges it still faces.

StarCraft II is said to be the beacon of RTS games, so the first thing to discuss is what makes designing an AI for an RTS game genuinely difficult. The genre's name points to its most important elements: strategy, and the ability to make decisions and revise them immediately, in real time.

From a theoretical point of view, RTS games differ significantly from board games like Go. First, in an RTS game multiple players make decisions and issue moves simultaneously, and their moves strongly influence one another. Second, the real-time element means each player has very little time to choose a move, and the move itself takes time to complete. For example, when one of your structures is under attack, you must decide in a split second whether to try to save it: decide too late and you lose the structure anyway; decide incorrectly and you may lose your troops, and either outcome can lead to defeat. Meanwhile, if you commit to saving the structure, moving your troops takes time, during which the situation on the battleground can change drastically again, and you must react to that change as well. This is far more complex than Go, where you can think for a long time, pick up a stone, and place it to complete your move while your opponent cannot interfere. Go is estimated to have a state space on the order of 10^170, yet RTS games are vastly more complicated still, on the order of 10^1685.[1]

Learning is one of the first problems brought to the table when designing an AI for StarCraft II. In my experience as a player, the current built-in StarCraft AI is incapable of learning. It only has its programmed strategies, with at most a handful of variants on its play style. It cannot react well to the different situations that we human players present, and it sometimes commits to a completely wrong decision simply because its scripted logic tells it to. These AIs also cannot stay up to date: they still run strategies from years ago and have never changed their play style. Since learning has proved to be the key to AI success in games, it is the first element to exploit in development. The main avenue is mining the data sources already available. There is a huge amount of usable data, above all match replays: build orders, ability usage, building locations, unit positioning, and all the other key information can be extracted from replays for learning purposes. The problem lies in how to utilize these data and train AI agents.[2]
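To make this concrete, here is a minimal sketch of pulling a build order out of replay data. The event format and field names are invented for illustration; real replay parsing would go through a library such as sc2reader.

```python
# Illustrative only: a made-up replay event stream and a build-order extractor.
replay_events = [
    {"time": 17, "player": 1, "type": "build", "what": "SupplyDepot"},
    {"time": 39, "player": 1, "type": "build", "what": "Barracks"},
    {"time": 45, "player": 1, "type": "train", "what": "SCV"},
    {"time": 71, "player": 1, "type": "train", "what": "Marine"},
]

def build_order(events, player):
    """Recover the ordered list of structures a given player built."""
    return [e["what"] for e in events
            if e["player"] == player and e["type"] == "build"]

print(build_order(replay_events, player=1))  # ['SupplyDepot', 'Barracks']
```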

Currently, researchers have put forward two major approaches to learning. The first uses the Extended Learning Classifier System (XCS). This approach contains five essential elements. The first is the condition: the situation from which the player, or the AI in this case, must decide its next action; it also describes the game state an action can lead to. The second is the action: the possible moves the player can take given the condition, one of which must be chosen. The third is the prediction: a value the system uses to estimate how much reward each action will bring if executed under the given condition. The fourth is the prediction error, which reflects how far the predicted reward of an action deviates from the reward it actually brings. The fifth is fitness, which describes to what degree the AI has made a correct decision. The flow of the system goes like this: a condition is presented, a list of possible actions is generated, the AI predicts the likely outcome of each action, takes the action it considers best, and proceeds to the next condition. Meanwhile, the AI compares its decisions against the database and evaluates how far its predictions were off and how much its choices deviate from those of professional human players.[3] This allows it to adjust its internal model to make better predictions and more successful action choices, which adds up to "learning".
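As a rough illustration of the idea, here is a minimal XCS-style sketch in Python. The class layout and update formulas are simplified assumptions on my part, not the implementation from the cited paper.

```python
import random

LEARNING_RATE = 0.1  # how quickly the estimates adapt to new rewards

class Classifier:
    """One condition-action rule carrying the five XCS elements."""
    def __init__(self, condition, action):
        self.condition = condition      # game state this rule applies to
        self.action = action            # the move it proposes
        self.prediction = 0.0           # estimated reward of that move
        self.prediction_error = 0.0     # running estimate of |reward - prediction|
        self.fitness = 0.0              # accuracy-based quality of the rule

    def update(self, reward):
        # Widrow-Hoff style updates in the spirit of XCS
        self.prediction_error += LEARNING_RATE * (
            abs(reward - self.prediction) - self.prediction_error)
        self.prediction += LEARNING_RATE * (reward - self.prediction)
        # simplified accuracy-based fitness: lower error -> higher fitness
        self.fitness = 1.0 / (1.0 + self.prediction_error)

def choose_action(population, state, explore=0.1):
    """Pick the rule whose action promises the most reward in this state."""
    match_set = [c for c in population if c.condition == state]
    if not match_set:
        return None                      # no rule matches: cover a new one
    if random.random() < explore:
        return random.choice(match_set)  # occasional exploration
    return max(match_set, key=lambda c: c.prediction * c.fitness)
```

After each game step, the reward observed for the chosen rule is fed back through `update`, which is the adjustment half of the prediction-decision-reward-adjustment loop described above.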

The second approach to learning uses a Bayesian network, and involves imitating the decisions of a human player. The implementation has three steps. In the first step, a human player is given the specific situation the AI should learn about and plays a number of matches under it, say 30-50, with all of their moves recorded into a database. The second step is constructing the Bayesian network with the software GeNIe: it defines a model that imitates the player's decisions, generating nodes for the variables the decision-making process runs through and recording the values that influenced each decision. The relationships between the variables are then identified, and the probability of each outcome of a move is calculated. The third step is to implement the Bayesian network in the AI, allowing it to use these variable relationships to sense changes on the battleground and make decisions the way a real player does.[4]
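Here is a deliberately tiny sketch of the imitation step, assuming a hand-built conditional probability table instead of a full GeNIe network. The variables and probabilities are made up; in the actual approach they are learned from the recorded human games.

```python
# P(action | enemy_in_range, low_hp), as if estimated from replay counts.
cpt = {
    # (enemy_in_range, low_hp): {action: probability}
    (True,  True):  {"retreat": 0.8, "attack": 0.2},
    (True,  False): {"retreat": 0.3, "attack": 0.7},
    (False, True):  {"retreat": 0.5, "attack": 0.5},
    (False, False): {"retreat": 0.1, "attack": 0.9},
}

def imitate(enemy_in_range: bool, low_hp: bool) -> str:
    """Return the action the human player most probably took in this state."""
    dist = cpt[(enemy_in_range, low_hp)]
    return max(dist, key=dist.get)

print(imitate(enemy_in_range=True, low_hp=True))  # -> "retreat"
```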

Thus the current approaches to StarCraft II "learning" comprise the XCS process, with its prediction-decision-reward-adjustment loop, and the Bayesian model, which uses variable relationships and imitation to complete the task. While an AI can be trained on many thousands of replays, ideally it should also record every game it plays against a human, learn from that player's strategic choices, and develop alongside the game and its players. This presents the highest level of challenge for StarCraft II AIs.

Now that we have procedures for learning, we must determine which elements of StarCraft are most worth learning. Since the game contains far too many possibilities and variables to consider them all, we must identify what matters most and direct our learning there.

Researchers have designed an experiment to identify the key elements for StarCraft II AI systems. They selected 20 experienced StarCraft gamers and 6 AI bots. 140 games were played between the players and the AIs, and the players were asked to rate the AIs on six criteria: production, the capability to produce units and buildings massively and efficiently; micromanagement, skill in controlling individual units; combat, skill in controlling armies to win battles; decision making, strategic and tactical decision making under uncertainty; performance, the overall evaluation; and human likeness, how closely the AI's play resembles a human's. The results showed that micromanagement and decision making were the focus points of human players when evaluating StarCraft II AI systems,[5] so the following paragraphs address the challenges AIs face in these two areas.

The first key element to address is micromanagement: the ability to control units individually. For example, a good StarCraft II player can use one unit to kill two of the same unit controlled by a weaker player by exploiting positioning, kiting, and map features. In high-level matches between professionals, a single micro mistake can cost the entire game, so the AI must find a way to control its units properly. The current approach is to use a manager: a class containing all the methods required to give commands to a unit. It exposes a method called "Execute", the process by which the system determines which move to make depending on the situation, as well as basic commands such as attacking a specific enemy unit, moving in a given direction, and casting abilities at a target area. The system also uses many specialized micro managers, separated by StarCraft race and by attack range. This allows the AI to react to situations and perform the correct micro control.[6]
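A minimal sketch of what such a manager hierarchy might look like is below. The method names echo the description above (Execute plus basic movement commands), but the details are my assumptions, not Bober's actual code.

```python
from abc import ABC, abstractmethod

class MicroManager(ABC):
    """Issues commands to one unit; subclasses specialize by race and range."""
    def __init__(self, unit):
        self.unit = unit

    # basic commands every manager exposes
    def attack(self, target):
        print(f"{self.unit} attacks {target}")

    def move(self, position):
        print(f"{self.unit} moves to {position}")

    def cast(self, ability, area):
        print(f"{self.unit} casts {ability} at {area}")

    @abstractmethod
    def execute(self, situation):
        """Inspect the situation and choose which command to issue."""

class RangedMicroManager(MicroManager):
    def execute(self, situation):
        # simple kiting: fire when the weapon is ready, back off otherwise
        if situation["weapon_ready"]:
            self.attack(situation["closest_enemy"])
        else:
            self.move(situation["retreat_point"])

unit_ai = RangedMicroManager("Marine#7")
unit_ai.execute({"weapon_ready": True, "closest_enemy": "Zealot#2",
                 "retreat_point": (10, 42)})
```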

However, two major challenges remain. The first is making the AI move units the way a human actually would. For example, a human player may choose to sacrifice one unit so that the rest can deal damage safely, while the AI's internal logic will almost always pull the injured unit back. A human player also considers sending melee units in front and placing long-ranged units on favorable terrain to maximize damage output. The current best approach to this problem is learning, which brings us back to the challenge of AI learning addressed previously.

The second challenge is putting a limit on the AI's micro control. Here we need the concept of APM, actions per minute. Simply put, the more actions you can take in a minute, the more commands you can give your units, the more precisely they pursue their goals, and the better you perform. Normal human players average around 200 APM; top professionals average around 500, and they injure their fingers and wrists from moving units so intensively, which suggests that roughly 500 APM is about the human ceiling. Computers and AIs have no such limitation: given free rein, they can reach thousands of APM. One StarCraft AI was built with around 2,000 APM, issuing so many commands at once that its units could fire and slip back out of range in a split second, before a human could react; it defeated every single player with ease. So we must cap the AI's APM. Yet a human player's APM is not stable: one second in the game it can be 20, the next it can spike toward 1,000, and then it may settle back around 200, depending on the task the player is performing. Once again, we need to know how much APM each situation actually requires in order to set a reasonable limit, which falls back on the learning process.
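One plausible way to impose such a cap is a token-bucket rate limiter, sketched below. The target APM and burst size are illustrative; as the paragraph notes, the right values would ideally be learned per situation.

```python
import time

class APMLimiter:
    """Caps command throughput at a target APM, allowing short spikes."""
    def __init__(self, target_apm=300, burst=10):
        self.rate = target_apm / 60.0   # tokens (actions) per second
        self.burst = burst              # allows brief human-like bursts
        self.tokens = float(burst)
        self.last = time.monotonic()

    def try_act(self) -> bool:
        """Return True if the agent may issue a command right now."""
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

limiter = APMLimiter(target_apm=300)
if limiter.try_act():
    pass  # issue one game command here
```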

The final challenge to address in this paper is decision making for AI systems in StarCraft II. This concerns strategies and tactics, the highest level of abstraction in the game; it is essentially the answer to the simplest question: "How are we going to win the game?" There are many choices: attack early, play conservatively, or take extra resources and play defensively. After making this decision, players train units, construct buildings, decide whether to attack, and react to the opponent's strategy. For example, if you are playing conservatively and discover your opponent playing greedily, i.e. taking far more resources and technology without building an army, you should switch strategies and attack immediately. The AI likewise needs reactive control, which means exploiting the map and gathering information about the opponent.[7]

This leads to the approach known as the "adaptive strategy decision mechanism".

It is based on the fact that each building and each unit has connections to the others. For example, a Protoss player must build a pylon before they can build most other structures, and a Terran player must build a barracks before they can work their way up to an armory. In terms of combat, the Protoss colossus is very effective against Terran units like the marine, the zerg corruptor is extremely good against the colossus, and marines in turn kill corruptors very easily.

[Figure: a simple connection graph of the relationships between Terran and Protoss structures]
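To make the idea concrete, here is a toy version of such a connection graph in code. The few edges shown come from common game knowledge and are only a small sample, not the contents of the paper's figure.

```python
# structure -> prerequisite structures (tech-tree edges)
requires = {
    "Gateway": ["Pylon"],       # Protoss buildings need pylon power
    "Factory": ["Barracks"],    # Terran build-order dependency
    "Armory":  ["Factory"],     # barracks -> factory -> armory
}

# unit -> units it is strong against (combat counter edges)
counters = {
    "Colossus":  ["Marine"],
    "Corruptor": ["Colossus"],
    "Marine":    ["Corruptor"],
}

def can_build(structure, built):
    """True if every prerequisite of the structure is already built."""
    return all(pre in built for pre in requires.get(structure, []))

def good_against(unit):
    """Units this unit is a strong counter to."""
    return counters.get(unit, [])

print(can_build("Armory", built={"Barracks", "Factory"}))  # True
print(good_against("Corruptor"))                           # ['Colossus']
```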

This connection between units leads to adaptive strategy decision, which means reacting to your opponent. The AIs we currently have for StarCraft II are very straightforward and cannot adjust to situations. Facing this challenge, a new mechanism has been proposed, with five key components: use well-known strategies; use information received about the opponent; use information about units; keep following the overall strategy and tactics; and attempt to prevent the opponent from gaining information. With these five components, the original paper depicts the decision-making process in two flow charts built around a fairly sophisticated algorithm.[8]
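Below is a hedged sketch of what an adaptive strategy switch in the spirit of these components might look like. The rules, field names, and thresholds are invented for illustration and are not Yi's actual mechanism.

```python
def choose_strategy(scouted, current="standard_macro"):
    """React to scouted information while keeping an overall plan."""
    # component 2: use information received about the opponent
    if scouted.get("enemy_bases", 1) >= 3 and scouted.get("enemy_army", 0) < 10:
        return "immediate_attack"   # punish a greedy, armyless opponent
    if scouted.get("enemy_army", 0) > 30:
        return "defend_and_expand"  # respect a large standing army
    # component 4: otherwise keep following the overall strategy
    return current

# a greedy opponent with almost no army triggers the punish branch
print(choose_strategy({"enemy_bases": 3, "enemy_army": 4}))  # immediate_attack
```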

Thus we have a proposed framework for the AI's decision making, but the variables inside it must still be filled in, which once again must be done through the learning process.

Finally, the general process by which an AI should learn and perform in a game of StarCraft is diagrammed by Ontañón et al.[9]

The AI system for StarCraft still faces several major challenges. It must strike a balance in its unit-control efficiency and micromanage at a level comparable to human players. It must also be varied and judicious in choosing strategic and tactical actions, and able to adapt and react to its opponent's behavior and decisions. These challenges all come back to the AI's ultimate ability to learn, through the XCS and Bayesian models presented in this paper. If an AlphaGo-like system is to conquer the world of StarCraft II gaming one day, it must learn from enough games and keep learning from the games it plays, "think" like a human player as it learns, and perform well at both micromanagement and decision making.

Bibliography

1. Bober, Filip Cyrus. On Design of an Effective AI Agent for StarCraft (2014). Poznan University of Technology, Institute of Computing Science.

2. Kim, Man-Je, et al. "Evaluation of StarCraft Artificial Intelligence Competition Bots by Experienced Human Players." CHI'16 Extended Abstracts, May 7-12, 2016, San Jose, CA, USA. ACM 978-1-4503-4082-3/16/05. http://dx.doi.org/10.1145/2851581.2892305.

3. Ontañón, Santiago, et al. "RTS AI: Problems and Techniques."

4. Parra, Ricardo, and Leonardo Garrido. "Bayesian Networks for Micromanagement Decision Imitation in the RTS Game Starcraft." Advances in Computational Intelligence: 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, San Luis Potosí, Mexico, October 27-November 4, 2012, pp. 433-443.

5. Rudolph, Stephan, et al. "Design and Evaluation of an Extended Learning Classifier-Based Starcraft Micro AI." Applications of Evolutionary Computation: 19th European Conference, EvoApplications 2016, Porto, Portugal, March 30-April 1, 2016, Proceedings, Part I, pp. 669-681.

6. Yi, Sangho. "Adaptive Strategy Decision Mechanism for StarCraft AI." EKC 2010, SPPHY 138, pp. 47-57.

[1] Ontañón, Santiago, et al. "RTS AI: Problems and Techniques."

[2] Bober, Filip Cyrus. On Design of an Effective AI Agent for StarCraft (2014). Poznan University of Technology, Institute of Computing Science.

[3] Rudolph, Stephan, et al. "Design and Evaluation of an Extended Learning Classifier-Based Starcraft Micro AI." EvoApplications 2016, pp. 669-681.

[4] Parra, Ricardo, and Leonardo Garrido. "Bayesian Networks for Micromanagement Decision Imitation in the RTS Game Starcraft." MICAI 2012, pp. 433-443.

[5] Kim, Man-Je, et al. "Evaluation of StarCraft Artificial Intelligence Competition Bots by Experienced Human Players." CHI'16 Extended Abstracts. http://dx.doi.org/10.1145/2851581.2892305.

[6] Bober, On Design of an Effective AI Agent for StarCraft.

[7] Bober, On Design of an Effective AI Agent for StarCraft.

[8] Yi, Sangho. "Adaptive Strategy Decision Mechanism for StarCraft AI." EKC 2010, SPPHY 138, pp. 47-57.

[9] Ontañón et al., "RTS AI: Problems and Techniques."
