Minecraft bot made by OpenAI paves the way for self-driving cars and humanoid robots


A bot that plays Minecraft as well as high-level human players could be the next milestone in artificial intelligence. What sets it apart is an innovative technique that made it possible to train on 70,000 hours of video. The idea came from OpenAI, the creator of ChatGPT.

Training neural networks, the brain-inspired technology behind recent AI innovations, on video takes a lot of work. Each action shown in the footage needs a description.

For example, a viral video on Twitter of a man clipping a cat’s fur would need dozens of annotations for about ten seconds of footage.

The solution OpenAI researchers found for this problem, first reported by the specialized publication MIT Technology Review, was to feed a neural network 2,000 hours of video recorded by workers from gig-work platforms. They were hired to play Minecraft while their keyboard and mouse actions and their screens were recorded.

With that data, the first neural network learned to tag Minecraft videos with the commands being used. It could then label the 70,000 hours of video that would be used to train the second neural network.
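The two-stage idea described above can be illustrated with a toy sketch. It is not OpenAI's code: the "frames" are just positions on a line, the "models" are simple vote counters, and every name here (`train_labeler`, `label_video`, `train_policy`) is hypothetical.

```python
from collections import Counter, defaultdict

# Small, expensive dataset: hired players whose key presses were recorded.
# Each item is (frame_before, frame_after, action); frames are toy 1-D positions.
labeled = [(0, 1, "right"), (3, 4, "right"), (7, 6, "left"), (2, 1, "left")]

def train_labeler(examples):
    """Learn which action explains the change between two consecutive frames."""
    votes = defaultdict(Counter)
    for before, after, action in examples:
        votes[after - before][action] += 1
    return {delta: c.most_common(1)[0][0] for delta, c in votes.items()}

def label_video(labeler, frames):
    """Attach an inferred action to every frame that has a successor."""
    return [(a, labeler[b - a]) for a, b in zip(frames, frames[1:])]

def train_policy(labeled_frames):
    """Imitation: for each frame, copy the most common labeled action."""
    votes = defaultdict(Counter)
    for frame, action in labeled_frames:
        votes[frame][action] += 1
    return {frame: c.most_common(1)[0][0] for frame, c in votes.items()}

# First model: learns to tag footage from the small recorded dataset.
labeler = train_labeler(labeled)

# Large pile of untagged "videos" (players walking towards position 5).
unlabeled_videos = [[0, 1, 2, 3, 4, 5], [9, 8, 7, 6, 5], [2, 3, 4, 5]]
pseudo_labeled = [pair for video in unlabeled_videos
                  for pair in label_video(labeler, video)]

# Second model: learns to act from the machine-tagged footage.
policy = train_policy(pseudo_labeled)
print(policy[3], policy[8])
```

In this toy, four hand-recorded examples are enough to tag twelve frames of new footage, which is the same leverage the article describes at a much larger scale.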

A second neural network was needed because the first one interprets images after the fact, while the second was trained to act on what it observes in the game, says Eric Aislan Antonelo, professor of automation engineering at UFSC (Federal University of Santa Catarina).

Another distinguishing feature of the model was that it combined two techniques: imitation learning and reinforcement learning.

The first, imitation learning, consists of making the artificial intelligence try to imitate examples it is shown. This technique has already been used to train autonomous cars, autonomous robotic arms and even software that performs tasks on a computer.

In the second, the researchers give a complex instruction and the artificial intelligence tries to execute it by exhaustive trial and error. This is how automated opponents in racing or football video games are trained.
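Trial and error can be shown with a very small sketch. The "game" below is hypothetical: it has four possible actions and only one of them scores. The agent is told nothing in advance; it tries actions at random and keeps track of which one pays off.

```python
import random

def reward(action):
    """Hypothetical game feedback: only action 2 scores a point."""
    return 1 if action == 2 else 0

rng = random.Random(0)   # fixed seed so the run is repeatable
totals = [0, 0, 0, 0]    # reward collected per action
counts = [0, 0, 0, 0]    # how often each action was tried

for _ in range(200):     # trial and error: try something, record what happens
    action = rng.randrange(4)
    totals[action] += reward(action)
    counts[action] += 1

# Average reward per action; untried actions count as zero.
averages = [t / c if c else 0.0 for t, c in zip(totals, counts)]
best_action = averages.index(max(averages))
print(best_action)
```

Real reinforcement learning systems are far more elaborate, but the loop of acting, observing a reward and updating an estimate is the same basic idea.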

The first stage of training, by imitation, made the bot able to perform tasks that require 970 actions in sequence, such as crafting planks and turning them into a table.

The bot's skills were then refined with reinforcement learning, which allowed it to carry out sequences of more than 20,000 commands. This allowed it to build so-called diamond tools, which require 20 minutes of high-speed clicking.

According to Unicamp professor Leonardo Tomazeli Duarte, scientific director of BI0S (Brazilian Institute of Data Science), the initial imitation training narrows the range of possibilities that reinforcement learning has to explore by trial and error. This allows for better results in less time.
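A rough back-of-the-envelope calculation shows why a head start from imitation matters. The numbers below are hypothetical, and the sketch assumes each step is chosen independently, which is a simplification.

```python
def expected_attempts(p_correct_per_step, sequence_length):
    """Average number of full attempts before one gets every step right."""
    p_success = p_correct_per_step ** sequence_length
    return 1 / p_success

NUM_ACTIONS = 4       # hypothetical number of possible actions per step
SEQUENCE_LENGTH = 6   # hypothetical task length, far shorter than in Minecraft

# Starting from scratch: every action is equally likely at every step.
from_scratch = expected_attempts(1 / NUM_ACTIONS, SEQUENCE_LENGTH)

# After imitation: assume the bot already picks the right action 90% of the time.
after_imitation = expected_attempts(0.9, SEQUENCE_LENGTH)

print(from_scratch, round(after_imitation, 2))
```

Under these assumptions, a bot guessing at random needs thousands of attempts on average to complete even a six-step task, while one that has already learned by imitation needs about two. The gap grows rapidly as sequences get longer.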

This method opens up the possibility of using huge video libraries such as YouTube to train different artificial intelligence models. The experts interviewed for this report mention applications in vehicle automation, health and agribusiness.

Although the volume of data available in these sources is sufficient for training, the data must be annotated before it can be used. "This step, which is called 'data labeling', is quite laborious and often requires specialists to do it, which makes this process relatively expensive," says Denis Gustavo Fantinato, professor of electrical engineering at Unicamp.

The pre-training strategy removes much of that human work.

Antonelo, the UFSC professor, points out, however, that there are still technical limitations to taking these techniques beyond the screen. OpenAI researchers were able to turn keyboard and mouse commands into a set of binary, yes-or-no signals. This made the chain of commands simpler.

"When we train a car, for example, the steering commands are continuous variables; they can take on many values. This makes the chain of commands more complex," says the UFSC professor, who tests ways to automate vehicles, including imitation learning techniques.
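The contrast between yes-or-no commands and continuous ones can be sketched as follows. The key names, the steering range and the bin counts are all hypothetical, chosen only for illustration.

```python
from itertools import product

# Binary commands: each key is either pressed (1) or not (0).
KEYS = ["forward", "jump", "attack"]   # hypothetical subset of game keys
key_combos = list(product([0, 1], repeat=len(KEYS)))
print(len(key_combos))                 # a small, finite list of options

def discretize(angle_deg, num_bins, low=-30.0, high=30.0):
    """Snap a continuous steering angle to the centre of one of num_bins bins."""
    width = (high - low) / num_bins
    index = min(int((angle_deg - low) / width), num_bins - 1)
    return low + (index + 0.5) * width

# Continuous command: a steering angle can take any value in a range.
angle = 4.2
coarse = discretize(angle, 4)    # few options, large rounding error
fine = discretize(angle, 60)     # small error, but many more options to learn
print(coarse, fine)
```

Three keys give only eight possible combinations. A steering angle has no such natural list: forcing it into a few options loses precision, and using many options makes the learning problem larger.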

For that reason, expecting this technology to be used for digital tasks, such as filling out forms or spreadsheets, is more realistic than expecting humanoid robots trained on YouTube tutorials.
