AI agent learns to play computer games and enjoys rewards
Taking part in game developer contests organised by Unity and Microsoft is part of the research carried out by doctoral students at the School of Computing. Developing an AI in a limited game environment offers insight into deep learning.
Unity is a big name in the game industry. One of the company’s recent game developer contests was geared towards the development of an artificial intelligence (AI) that can play puzzle games.
At first, the game seems simple, but looks are deceiving. A human player wouldn’t take long to figure out how to take the girl through the stone maze. But how about an AI?
“This is a research problem we are trying to solve in a limited game environment,” Senior Researcher Ville Hautamäki from the School of Computing says.
Hautamäki specialises in speech technology, bioinformatics and machine learning. Participation in game developer contests is part of the way his research group works.
“The term ‘artificial intelligence’ was originally coined by John McCarthy, referring to an ‘intelligent machine’. Deep learning is one aspect of artificial intelligence. This contest was particularly interesting, since it was our first effort to develop this type of AI.”
The team, consisting of Hautamäki, Anssi Kanervisto and Janne Karttunen, made it to level 10 in the game. To win, the AI would have had to complete level 20. That was out of reach, because there was not enough time or computing capacity to train the AI.
“In this contest, the aim was to teach the AI agent to function in a simulated, non-structured space. In other words, the agent had to learn to react to changes in its environment. It plays the game independently and, after dozens of repetitions, it learns to utilise the information it gathers,” Hautamäki explains.
“Each new level is a tabula rasa, a fresh new start, as the room changes every time. This means that the agent has to do more than repeat previously learned motion sequences.”
“On level 5, for example, the door is closed, and the agent needs to understand that it needs a key to open it. On level 10, on the other hand, the agent has to move a block of stone to a particular location.”
According to Hautamäki, these tasks are not difficult for a human, but for an AI they are. When developing the agent, it is important to pay attention to how the game character is controlled, so that the agent can perform the moves it needs. In the end, the agent gets a reward when it figures out how to pass through the door by performing a certain sequence of moves in the room.
“In other words, the agent’s code doesn’t instruct it to turn left here or to turn right there,” he explains.
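The reward-driven learning described here can be illustrated with a minimal sketch. It is not the team's agent: the toy corridor, the tabular Q-learning method and all names and parameter values below are assumptions chosen for illustration, whereas contest agents learn from game images with deep neural networks. The point it shows is that nothing in the code says "go left for the key"; the agent only receives a reward when it reaches the door while holding the key.

```python
import random

# Hypothetical toy world: a corridor of 5 cells with a key at one end
# and a locked door at the other. The agent starts in the middle.
N_POSITIONS = 5
KEY_POS, DOOR_POS, START_POS = 0, 4, 2
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    pos, has_key = state
    pos = min(max(pos + action, 0), N_POSITIONS - 1)  # walls at both ends
    has_key = has_key or pos == KEY_POS               # pick up the key
    done = has_key and pos == DOOR_POS                # door opens only with key
    return (pos, has_key), (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {((p, k), a): 0.0
         for p in range(N_POSITIONS) for k in (False, True) for a in ACTIONS}
    for _ in range(episodes):
        state = (START_POS, False)
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(
                    [a for a in ACTIONS if q[(state, a)] == best])
            nxt, reward, done = step(state, action)
            future = 0.0 if done else gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned values without exploration; return visited cells."""
    state, path = (START_POS, False), [START_POS]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state[0])
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

After training, the greedy path first walks away from the door to fetch the key and only then heads for the door, a detour the agent worked out from the reward alone.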
“The speed at which the agent plays through the game is stunning. Its training can be equivalent to up to decades of gaming, as was the case with the winning agent.”
“Some research groups benefited greatly from access to high computing capacity.”
Researchers are developing yet another agent for a Minecraft contest
Currently, Hautamäki’s research group is participating in a Minecraft contest organised by Microsoft, which is scheduled to end by the turn of the year.
“In this game, levels are created automatically. The agent plays the game all the time, and each agent can only be taught for one day,” Hautamäki says.
“The contest organiser checks and runs the code, making the task even more difficult. Computing capacity, however, is limited, so this contest treats all teams more fairly. We finished MineRL Round 1 in the top 10, and our goal is to make it to the top 5 this time,” Hautamäki says.
“At the beginning of the contest, Microsoft shared some examples of how humans have played the game. These can be used when developing the agent. However, we were not able to use the AI we developed for the Unity contest earlier: it did not know how to play this time.”
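One simple way to use recorded human play is to copy it, an approach often called behavioural cloning. The sketch below is an assumption for illustration only, not the team's method and not the MineRL API: the state and action names are hypothetical, and real entries learn from game images with neural networks rather than from a lookup table.

```python
from collections import Counter, defaultdict


def clone_policy(demonstrations):
    """Map each observed state to the action humans chose most often in it."""
    counts = defaultdict(Counter)
    for trajectory in demonstrations:
        for state, action in trajectory:
            counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical demonstrations: each one is a list of (state, action) pairs.
demos = [
    [("tree_ahead", "attack"), ("has_log", "craft_planks")],
    [("tree_ahead", "attack"), ("has_log", "craft_planks")],
    [("tree_ahead", "jump"), ("has_log", "craft_planks")],
]
policy = clone_policy(demos)
print(policy["tree_ahead"])
```

A policy copied this way knows nothing about situations the human players never encountered, which is one reason demonstrations are usually combined with the agent's own trial-and-error learning.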
“We are interested in a more general AI, one that can solve problems. Self-taught agents are considerably more flexible,” Hautamäki notes.
In the future, AI will be used increasingly in robotics, including in the home. One would think that researchers are keen to develop a large, all-encompassing AI.
According to Hautamäki, however, this is not the case. Instead of one large AI, the preference is to develop AIs that are small enough for the task at hand.
“A robot at home doesn’t need more electronics; it needs more intelligence.”
“Would it be possible to develop a method that can give feedback on how an AI should work? It could be a robot that can operate in a specific environment, like robot vacuum cleaners.”
“There is a large market for robotics today. In the future, the significance of robotics will grow greatly in, for example, health care services.”
Photo: The team’s game video on YouTube.
More information and game videos:
- Unity Obstacle Tower: https://blogs.unity3d.com/2019/08/07/announcing-the-obstacle-tower-challenge-winners-and-open-source-release/
- MineRL competition homepage: https://www.aicrowd.com/challenges/neurips-2019-minerl-competition
- Round 1 leaderboard: https://www.aicrowd.com/challenges/neurips-2019-minerl-competition/leaderboards?challenge_round_id=118
- MineRL home page (data set and API): http://minerl.io/
- MineRL competition paper: https://arxiv.org/abs/1904.10079
- Obstacle Tower paper: https://arxiv.org/abs/1902.01378
- Video of Hautamäki team's OTC submission: https://www.youtube.com/watch?v=arQ1I7H8yhI
- MineRL Round 1 winner agent playing: https://twitter.com/wgussml/status/1189641610893709312