Imagine having a robot that understands how to use tools and can rapidly learn to carry out repairs around your home with a hammer, wrench, and screwdriver. This isn't a scene from a sci-fi movie; it's a real possibility, thanks to a new technique developed by researchers at the Massachusetts Institute of Technology (MIT).
Their innovative approach combines training data across domains, modalities, and tasks using generative AI models known as diffusion models. This method gives a robot the ability to learn to perform new tasks in unseen environments by creating a combined strategy from different datasets.
Redefining robot training
The data used to train robots varies widely. Some datasets consist of color images, while others contain tactile imprints. Data may also come from different domains, such as simulation or human demonstrations, and each dataset may capture a unique task and environment.
Combining such diverse data into a single machine-learning model has long been a challenge, so many methods train a robot on just one type of data. The downside is that robots trained this way, on a relatively small amount of task-specific data, often struggle to perform new tasks in unfamiliar environments.
"Diffusion models" lead the way
Addressing this issue, the MIT researchers have developed a technique that uses a type of generative AI known as diffusion models to combine multiple sources of data across different modalities, tasks, and domains.
The researchers train a separate diffusion model to learn a strategy, or policy, for completing one task using one specific dataset. They then merge these learned policies into a general policy that lets a robot perform multiple tasks in different settings. The method, named Policy Composition (PoCo), showed a 20 percent improvement in task performance compared with traditional techniques.
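To make the idea concrete, here is a minimal sketch of score-level policy composition. It assumes (this is an illustration, not PoCo's actual implementation) that each per-dataset policy is a diffusion denoiser mapping a noisy action and a timestep to a noise estimate, and that composition simply blends those estimates with fixed weights at each reverse-diffusion step. The names `compose_policies`, `grasp`, and `place` are hypothetical.

```python
import numpy as np

def compose_policies(denoisers, weights, action_dim=2, steps=50, seed=0):
    """Sample an action by blending several diffusion policies.

    At each reverse-diffusion step, the per-policy noise estimates are
    combined with a weighted sum, so the final action reflects all
    constituent policies. Each denoiser is a callable
    (action, t) -> predicted noise. Toy schedule, no stochastic term.
    """
    rng = np.random.default_rng(seed)
    action = rng.standard_normal(action_dim)  # start from pure noise
    for t in range(steps, 0, -1):
        # Weighted combination of each policy's noise prediction
        eps = sum(w * d(action, t) for w, d in zip(weights, denoisers))
        # Simplified denoising update: move against the blended noise
        action = action - (1.0 / steps) * eps
    return action

# Two toy "policies", each denoising toward a different target action
def pull_to(target):
    target = np.asarray(target, dtype=float)
    return lambda action, t: action - target

grasp = pull_to([1.0, 0.0])   # hypothetical policy from dataset A
place = pull_to([0.0, 1.0])   # hypothetical policy from dataset B

combined = compose_policies([grasp, place], weights=[0.5, 0.5])
```

With equal weights, the blended noise estimates pull the sample toward a compromise between the two policies' target actions, which is the intuition behind mixing and matching separately trained policies.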
The road ahead
According to Lirui Wang, an electrical engineering and computer science (EECS) graduate student and lead author of the paper on PoCo, this approach could be a significant step forward for the robotics field. PoCo's strength lies in combining policies: because they are trained separately, they can be mixed and matched for the task at hand, leading to better results.
The researchers have successfully tested PoCo on real robotic arms performing a variety of tasks. In the future, they plan to apply this technique to long-horizon tasks where a robot would pick up one tool, use it, then switch to another tool, all while incorporating larger robotics datasets for superior performance.
Disclaimer: The above article was written with the assistance of AI. The original sources can be found on ScienceDaily.