Thanks to a collaborative effort between Google DeepMind and 33 academic labs, the world of robotics might be turned upside down. The goal of this venture is to break robotics out of the rigid paradigm of specific datasets for specific tasks.
According to their blog, if successful, the Open X-Embodiment dataset and the RT-X model could lead to a general-purpose robot that learns and adapts across diverse robot types. As described in the project’s paper, this large team of institutions worked together to amass data from an impressive array of 22 robot embodiments.
In this colossal effort, more than 500 skills and 150,000 tasks were demonstrated across more than 1 million episodes. This dataset stands as the most extensive and diverse collection of robotic data to date.
A dataset on its own can’t control a robot; it needs a model that can learn from it, and that’s where the RT-X model comes into play. This model was born from Google DeepMind’s robotics transformer models RT-1 and RT-2. It was trained on the Open X-Embodiment dataset and demonstrates remarkable performance across various robot embodiments.
In evaluations at partner universities, RT-1-X outperformed models tailored for specific tasks, with a success rate that was 50% higher on average. This success signals a monumental shift towards the development of more versatile, adaptable robots.
Now the real magic of RT-X lies in its ability to acquire emergent skills by drawing knowledge from different robots. In experiments, RT-2-X was three times as successful as its predecessor, RT-2, on tasks that were previously beyond that model’s capabilities.
This includes better spatial understanding, as seen in the nuanced difference between “move apple near cloth” and “move apple on cloth” commands. RT-2-X’s expanded skill set highlights the immense potential of co-training with diverse data, especially when coupled with high-capacity architectures.
As this work demonstrates, models capable of generalizing across embodiments are not only possible but incredibly promising. The possibilities are vast, from integrating self-improvement mechanisms to exploring the impact of different dataset mixtures on cross-embodiment generalization. These advancements are propelling robotics research into uncharted territories.
With progress like this, the day of general-purpose robots may be closer than previously expected. And if such robots prove cost-effective to build at scale, they could help revolutionize industry in much the same way the steam engine did.
Google DeepMind also released an interesting animation about RT-X, which you can see below: