The doctoral dissertation in the field of Computer Science will be examined at the Faculty of Science and Forestry online.
What is the topic of your doctoral research?
My doctoral research advances computer programs that act autonomously, dubbed "agents", which are one area of artificial intelligence (AI) research. Agents can be used to automate tasks. For example, an agent could control a power grid to reduce wasted power, or speed up documentation processing by automatically detecting mistakes or unclear information.
However, programming agents in the traditional way can be difficult, as we have to write the rules for how the agent should act in each situation. My research instead focused on training agents, either by demonstrating how the agent should act ("imitation learning") or by letting the agent learn through trial and error which actions are good or bad ("reinforcement learning"). With deep neural networks (deep learning), we can train increasingly intelligent agents and apply them to more complex domains such as competitive strategy games, which were previously out of reach for computer agents.
My work uses video games as a benchmark to measure how well agents learn different tasks and to select changes that improve the agent training procedure. Video games provide a wide range of readily available challenges that agents have not necessarily solved. Improvements made to agents in video games can then be applied to more practical tasks.
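As a rough illustration of imitation learning, the toy sketch below "clones" a demonstrator by copying the action chosen most often in each situation. It is a tabular simplification for illustration only: the research described here uses deep neural networks, and the state names, action names and the `behavioural_cloning` function are hypothetical.

```python
from collections import Counter, defaultdict


def behavioural_cloning(demonstrations):
    """Fit a tabular policy from (state, action) pairs.

    For each state seen in the demonstrations, the policy copies the
    action the demonstrator chose most often in that state.
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical demonstrations recorded from a human player.
demos = [
    ("enemy_left", "turn_left"),
    ("enemy_left", "turn_left"),
    ("enemy_left", "shoot"),
    ("enemy_ahead", "shoot"),
    ("enemy_right", "turn_right"),
]

policy = behavioural_cloning(demos)
print(policy["enemy_left"])  # the action the demonstrator used most here
```

A learned model would also have to handle situations that never appear in the demonstrations, which this lookup table cannot do.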
What are the key findings or observations of your doctoral research?
Agent training methods are commonly evaluated in one or two environments, and the majority of existing research uses the same benchmarks. While this makes results comparable across studies, it can lead to insights that only hold in those environments. My research focused on using multiple, novel environments as benchmarks, which video games offer. With this experimental setup, we demonstrated that a simple imitation learning method, behavioural cloning (mimicking what a human does), can be an effective training method, even though previous research has used it as a baseline result (i.e. at the bottom of the ranking).
However, to reach this effectiveness one has to tune the training algorithm for each environment, and we found that behavioural cloning does not work as a “one size fits all” solution across multiple games. On a practical note, we also demonstrated that imitation learning can be used to augment how humans use computers, specifically by supporting a human player’s mouse control in a video game. For self-learning agents (reinforcement learning), we found that it is better to minimize the number of options an agent can choose from while acting than to give it many different options. Even though having more options led to better performance in some games, the training procedure failed in the majority of the games because the agent had to spend too much time trying out the options.
Related to these results, we also found that an agent trained in a video game can be transferred to a robot with a completely different action space with only a little additional training. In the future, this method can be used to speed up the training of agents in complex scenarios by re-using already trained agents instead of training agents from scratch with zero knowledge.
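The finding about limiting an agent's options can be illustrated with a small sketch. It assumes a hypothetical game with six buttons and compares the number of actions when every button combination is allowed with a hand-picked reduced set. The button names and both functions are illustrative assumptions, since the action spaces actually studied are not described here.

```python
from itertools import product

# Hypothetical buttons of an imaginary game.
BUTTONS = ["forward", "back", "left", "right", "jump", "attack"]


def full_action_space(buttons):
    """Every on/off combination of the buttons: 2 ** len(buttons) actions."""
    return [
        frozenset(b for b, pressed in zip(buttons, mask) if pressed)
        for mask in product([False, True], repeat=len(buttons))
    ]


def reduced_action_space():
    """A hand-picked subset of button combinations (an assumption)."""
    return [
        frozenset(),                     # do nothing
        frozenset({"forward"}),
        frozenset({"left"}),
        frozenset({"right"}),
        frozenset({"forward", "jump"}),
        frozenset({"attack"}),
    ]


full = full_action_space(BUTTONS)
reduced = reduced_action_space()
print(len(full), len(reduced))  # 64 6
```

With fewer actions to try in each situation, a trial-and-error learner spends less time exploring options that rarely matter.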
How can the results of your doctoral research be utilised in practice?
In the immediate future, the results of this research can be used in virtual environments and video games. Game developers can, for example, use the results to design video games in a way that lets agents better learn ideal strategies, or to decide how imitation learning should be used. This can be used to create more interesting non-player characters with more complex behaviour, more dynamic opponents in competitive games like StarCraft II, or supportive players in open-world games like Minecraft. Agents that can play a game can also be used to test it automatically and more thoroughly, which speeds up the development process and ultimately translates to more complex but error-free games.
The same methods could be applied to productivity software as well: for example, an agent can be trained to retrieve important information for a user's query even when it is written in free-form language, making it easier to find information. In the larger scope, the experimental setup and the more robust insights obtained during this research will benefit the AI research field by encouraging researchers to validate their results on multiple benchmarks (e.g., video games) rather than one.
What are the key research methods and materials used in your doctoral research?
The research mainly involved designing the procedures for evaluating different training methods, selecting the benchmarks (video games) and studying the final results. If a training method with a unique modification outperforms the other methods, that modification is deemed beneficial. These experiments were repeated over multiple video games and multiple training algorithms to ensure that the insights generalize to different scenarios (ideally, the same insight should apply to any application from video games to robotics).
The doctoral dissertation of Anssi Kanervisto, MSc, entitled "Advances in deep learning for playing video games", will be examined at the Faculty of Science and Forestry online, on 21 January at 12 noon (UTC+2). The opponent will be Professor Georgios Yannakakis, University of Malta, and the custos will be Senior Researcher Ville Hautamäki, University of Eastern Finland. The language of the public defence is English.
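The comparison described above can be sketched roughly as follows. This is a minimal illustration, assuming each method is trained several times per game and compared on mean final score. The function name, the data layout and all numbers are placeholders, not results or code from the dissertation.

```python
from statistics import mean


def compare_across_environments(scores, baseline, variant):
    """Return the environments where the variant beats the baseline.

    scores[env][method] is a list of final scores, one per training run.
    Methods are compared by their mean score within each environment.
    """
    wins = []
    for env, by_method in scores.items():
        if mean(by_method[variant]) > mean(by_method[baseline]):
            wins.append(env)
    return wins


# Placeholder numbers for illustration only.
scores = {
    "game_a": {"baseline": [10, 12, 11], "modified": [14, 15, 13]},
    "game_b": {"baseline": [5, 6, 7], "modified": [4, 6, 5]},
    "game_c": {"baseline": [20, 22, 21], "modified": [25, 24, 26]},
}

wins = compare_across_environments(scores, "baseline", "modified")
print(f"modification helps in {len(wins)} of {len(scores)} environments")
```

Reporting per-environment outcomes, rather than one pooled average, makes it visible when a modification helps in some games and hurts in others.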
For further information, please contact:
Anssi Kanervisto, firstname.lastname@example.org