A new player has entered the game: a bot from OpenAI has watched 70,000 hours of people playing Minecraft. According to a report from the MIT Technology Review, the bot was trained on tens of thousands of hours of video of the popular sandbox title. It can now complete complex tasks, such as crafting in-game diamond tools, which would take a well-versed human player at least twenty minutes. The Minecraft AI learned to perform these complicated sequences after binging on videos from sites such as YouTube. With that as its source of training data, the bot has been punching trees like any other player.
What makes this interesting is that having a bot watch people play a game is a new technique, and it opens up a potentially untapped source of training data for future AI programs. As the report states, the bot’s results are a breakthrough for imitation learning, in which neural networks are trained to perform tasks by watching humans and imitating their actions. Taken to its logical next step, this could advance robotics and vehicle automation. For those training AI, the sheer quantity of video data available online through sites such as YouTube is astronomical. A simple search for “How to…” on the site shows you why.
Millions of people flock to these websites to learn new skills. Following the same principle, an AI can be trained to pick up new skills by imitating humans accomplishing tasks, the same way people do. The researchers point to GPT-3 and what it did for large language models. Bowen Baker at OpenAI, a member of the team behind the new Minecraft bot, stated, “In the last few years we’ve seen the rise of this GPT-3 paradigm where we see amazing capabilities come from big models trained on enormous swathes of the internet,… A large part of that is because we’re modeling what humans do when they go online.”
But how does the team accomplish this feat without running into a bottleneck? Imitation learning normally needs video in which every action is labeled, and labeling by hand does not scale. According to the team, they use an approach called Video Pre-Training (VPT). It gets around the bottleneck by training another neural network to label videos automatically. According to Peter Stone, executive director of Sony AI America, who has worked on imitation learning, “Video is a training resource with a lot of potential.” Imitation learning is an alternative to reinforcement learning, which works best when a task has a clear goal. This is where Minecraft comes into play. Since it’s a sandbox game, there isn’t a clear goal. The world is whatever you want to make of it. So this open-ended world was the perfect setting for VPT to train in.
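To make the labeling idea concrete, here is a toy sketch in Python. It is not OpenAI’s implementation: it assumes a small set of clips with recorded actions is available to train the labeler, each “frame” is just an integer position, and simple lookup tables stand in for the neural networks. Every name in it (train_labeler, label_video, clone_behaviour) is hypothetical.

```python
from collections import Counter, defaultdict

def train_labeler(labeled_clips):
    """Learn which action best explains each frame-to-frame change.

    Toy stand-in for the labeling network: a lookup table, not a neural net.
    """
    votes = defaultdict(Counter)
    for frames, actions in labeled_clips:
        for before, after, action in zip(frames, frames[1:], actions):
            votes[after - before][action] += 1
    return {delta: counts.most_common(1)[0][0] for delta, counts in votes.items()}

def label_video(labeler, frames):
    """Infer the unseen action behind every transition in an unlabeled video."""
    return [labeler.get(after - before, "noop")
            for before, after in zip(frames, frames[1:])]

def clone_behaviour(videos, labeler):
    """Imitation step: for each state, copy the action seen most often there."""
    votes = defaultdict(Counter)
    for frames in videos:
        for state, action in zip(frames, label_video(labeler, frames)):
            votes[state][action] += 1
    return {state: counts.most_common(1)[0][0] for state, counts in votes.items()}

# A small set of clips that come with recorded actions (assumed to exist).
labeled_clips = [
    ([0, 1, 2, 1], ["right", "right", "left"]),
    ([5, 5, 4], ["noop", "left"]),
]
# A larger pile of videos with no action labels at all.
unlabeled_videos = [[0, 1, 2, 3], [1, 2, 3, 3], [4, 3]]

labeler = train_labeler(labeled_clips)
policy = clone_behaviour(unlabeled_videos, labeler)
print(policy)
```

The point of the design is that the expensive, hand-recorded data is only needed to train the labeler; once it exists, any amount of unlabeled video can be turned into training data for the imitating policy.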
The AI won’t be limited to Minecraft in the future. According to the researchers, it can be trained to carry out other tasks that require mouse clicks and keyboard use. From booking flights to navigating websites and shopping online, the bot could, in theory, be trained to perform many real-world actions. It would just need to learn by first copying a person doing them. Baker and the team plan to continue feeding the AI more Minecraft videos. The goal? Getting as human as possible. “But with more data and bigger models, I would expect it to feel like you’re watching a human playing the game, as opposed to a baby AI trying to mimic a human,” Baker stated.