Teaching Mario to play with himself: AI, machine learning, and Super Mario Bros.

One of the most challenging aspects of artificial intelligence is teaching a computer how to measure, understand, and react to the world around it. Actions that are second nature to a human must be painstakingly “taught” to a robot. A team at the University of Tübingen in Germany has created a project that turns the concept of a real-world robot AI on its head by tackling a different challenge: teaching Mario to play his own game.

The team has created a video that explains how the system works, step by step, but the high-level overview is that Mario’s various actions and responses can all be quantified as values. The AI appears to start with very basic information about how to navigate the world and where Mario is in relation to other objects. The team also created a means of tracking Mario’s curiosity about the world (he explores his environment more when curious) and his drive to collect coins (represented as hunger).
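To make that concrete, here’s a rough sketch of how drives like curiosity and hunger could be tracked as values. The class, the names, and the numbers are ours for illustration; the team hasn’t published its code in this form:

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """A motivation whose urgency rises over time and falls when satisfied."""
    name: str
    level: float  # 0.0 (satisfied) through 1.0 (urgent)

    def satisfy(self, amount: float) -> None:
        self.level = max(0.0, self.level - amount)

    def grow(self, amount: float) -> None:
        self.level = min(1.0, self.level + amount)


# Illustrative starting values, not the project's actual numbers.
curiosity = Drive("curiosity", level=0.8)  # high curiosity -> explore more
hunger = Drive("hunger", level=0.5)        # hunger -> focus on coins


def choose_goal() -> str:
    """Act on whichever drive is currently the most urgent."""
    return max((curiosity, hunger), key=lambda d: d.level).name


print(choose_goal())  # "curiosity" wins: Mario explores his environment
hunger.grow(0.6)      # time passes without coins; hunger climbs to 1.0
print(choose_goal())  # "hunger" wins: Mario seeks out coins
```

The appeal of a scheme like this is that Mario’s behavior shifts on its own as the drive levels change, with no explicit script telling him when to stop exploring and start collecting.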

As it encounters enemies, the AI notes their existence. When queried with “What do you know about Goomba?” Mario responds with “I do not know anything about it.” After experimental interactions, Mario learns that he can jump or land on Goomba, and that Goomba dies when he does so.
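A toy version of that question-and-answer loop might look like the following. The knowledge store and the exact phrasing are assumptions for illustration, not the project’s actual implementation:

```python
# Maps an entity's name to the facts Mario has observed about it.
knowledge: dict[str, list[str]] = {}


def observe(entity: str, fact: str) -> None:
    """Record what an experimental interaction revealed about an entity."""
    knowledge.setdefault(entity, []).append(fact)


def what_do_you_know_about(entity: str) -> str:
    """Answer the query from stored observations, if any exist."""
    facts = knowledge.get(entity)
    return " ".join(facts) if facts else "I do not know anything about it."


print(what_do_you_know_about("Goomba"))  # "I do not know anything about it."

# Mario experiments: he jumps on a Goomba and records the outcome.
observe("Goomba", "If I jump on Goomba, it maybe dies.")
print(what_do_you_know_about("Goomba"))  # now reports the learned fact
```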

This, then, is translated back into human speech: “If I jump on Goomba, then he certainly dies.” (The phrasing may be an artifact of German-to-English translation and sentence structure.) Mario learns how to navigate his environment, how to jump to higher areas to reach otherwise inaccessible locations, and how to trigger question blocks to grab power-ups and other useful items. The AI has different rules for whether Mario is small or big, and his behavior can vary depending on whether he’s got a fire flower or just a mushroom.
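One plausible way to represent state-dependent rules with a certainty qualifier is sketched below. The (state, action, target, effect, certainty) format is our assumption; the published system may structure its rules quite differently:

```python
from dataclasses import dataclass


@dataclass
class Rule:
    state: str        # "small", "big", or "fire" -- Mario's current form
    action: str
    target: str
    effect: str
    certainty: float  # fraction of trials in which the effect occurred

    def to_speech(self) -> str:
        """Render the rule as the kind of sentence quoted above."""
        qualifier = "certainly" if self.certainty > 0.9 else "maybe"
        return f"If I {self.action} {self.target}, it {qualifier} {self.effect}."


rules = [
    Rule("small", "jump on", "Goomba", "dies", certainty=0.95),
    Rule("fire", "shoot", "Goomba", "dies", certainty=0.95),
]


def options(state: str) -> list[Rule]:
    """Fire Mario has moves that small Mario lacks."""
    return [r for r in rules if r.state == state]


print(rules[0].to_speech())   # "If I jump on Goomba, it certainly dies."
print(len(options("small")))  # 1 -- small Mario can only jump on it
print(len(options("fire")))   # 1 -- fire Mario can shoot it instead
```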

Mario forms hypotheses about how the world works. The first time he successfully jumps on a Goomba, he says, “If I jump on Goomba, it maybe dies.” He then tests this hypothesis on future Goombas, comparing the expected outcome with the actual result.
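Here’s a minimal sketch of that hypothesize-and-test loop, using a simple running frequency as the certainty estimate. The update rule and the three-trial cutoff are invented for illustration; the article doesn’t say how the real system updates its beliefs:

```python
trials = 0
confirmations = 0


def certainty() -> float:
    """Fraction of experiments in which the Goomba actually died."""
    return confirmations / trials if trials else 0.0


def prediction() -> str:
    # Commit to "certainly" only after several consistent observations.
    return "certainly dies" if trials >= 3 and certainty() > 0.9 else "maybe dies"


def jump_on_goomba(goomba_died: bool) -> None:
    """Run one experiment and compare the expectation with the result."""
    global trials, confirmations
    expected = prediction()
    trials += 1
    if goomba_died:
        confirmations += 1
    observed = "died" if goomba_died else "survived"
    print(f"expected: {expected}; observed: {observed}")


jump_on_goomba(True)  # first success: the belief is still tentative
print(f"If I jump on Goomba, it {prediction()}.")  # "... it maybe dies."
jump_on_goomba(True)
jump_on_goomba(True)
print(f"If I jump on Goomba, it {prediction()}.")  # "... it certainly dies."
```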

Mario doesn’t use scripted responses; he parses syntax and understands a wide range of words and phrases. He can be told things like “If you jump on Goomba, Goomba dies,” or he can learn them on his own.
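A heavily simplified version of accepting a taught rule might use pattern matching over a fixed sentence template, as below. The single regular expression and the “If you X on Y, Y Z” template are our assumptions; the real system’s language understanding is far richer than this:

```python
import re

# Assumed template: "If you <action> on <entity>, <entity> <effect>".
RULE_PATTERN = re.compile(r"If you (\w+) on (\w+), \2 (\w+)")


def learn_from_speech(sentence: str) -> tuple[str, str, str] | None:
    """Parse a taught sentence into an (action, target, effect) triple."""
    match = RULE_PATTERN.match(sentence)
    return match.groups() if match else None


print(learn_from_speech("If you jump on Goomba, Goomba dies"))
# ('jump', 'Goomba', 'dies')
print(learn_from_speech("Goombas are brown"))  # None -- not a rule it knows
```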

Humans, of course, apply these principles of learning and communication thousands of times a day, but we learn them when we’re infants. Teaching Mario that jumping on a Goomba will kill it is a fascinating example of AI, particularly since Mario simultaneously learns that jumping into a Goomba can hurt or kill him.

Projects like this might one day be a useful step in teaching more advanced artificial intelligences how to interact with humans. A properly constructed game could teach an AI the most basic rules of interacting with its environment first, then introduce more complex concepts and ideas as the game went on. The best games already use these sorts of rules: many fold a basic tutorial on movement, attacks, and various player capabilities into the game itself, revealing these options as the game progresses and unlocking new abilities as the player demonstrates mastery of previous concepts.
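In code, that kind of mastery-gated curriculum could be as simple as the sketch below. The stage names and the threshold are invented for illustration:

```python
CURRICULUM = ["movement", "attacks", "power-ups", "advanced techniques"]
MASTERY_THRESHOLD = 0.9  # assumed cutoff for "demonstrated mastery"


def next_stage(scores: dict[str, float]) -> str | None:
    """Unlock the first stage the learner hasn't yet mastered."""
    for stage in CURRICULUM:
        if scores.get(stage, 0.0) < MASTERY_THRESHOLD:
            return stage
    return None  # curriculum complete; the full game is open


print(next_stage({"movement": 0.95, "attacks": 0.4}))  # "attacks"
```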

If players can learn these rules within the relatively fixed and simple environment of a game, AIs may be able to learn them as well. The risks of AI have been explored a great deal of late, with multiple scientists calling for caution as research continues. Teaching Mario to play his own game seems relatively tame compared with the risk of societal upheaval and autonomous security drones.