Google DeepMind recently gave insight into two artificial intelligence systems it has created: ALOHA Unleashed and DemoStart. The company said that both of these systems aim to help robots perform complex tasks that require dexterous movement.
Dexterity is a deceptively difficult skill to acquire. There are many tasks that we do every day without thinking twice, like tying our shoelaces or tightening a screw, that could take weeks of training for a robot to do reliably.
The DeepMind team asserted that for robots to be more useful in people’s lives, they need to get better at making contact with physical objects in dynamic environments.
The Alphabet unit’s ALOHA Unleashed is aimed at helping robots learn to perform complex and novel two-armed manipulation tasks. DemoStart uses simulations to improve real-world performance on a multi-fingered robotic hand.
By helping robots learn from human demonstrations and translate images to action, these systems are paving the way for robots that can perform a wide variety of helpful tasks, said DeepMind.
ALOHA Unleashed enables manipulation with two robotic arms
Until now, most advanced AI robots have only been able to pick up and place objects using a single arm. ALOHA Unleashed achieves a high level of dexterity in bi-arm manipulation, according to Google DeepMind.
The researchers said that with this new method, Google’s robot learned to tie a shoelace, hang a shirt, repair another robot, insert a gear, and even clean a kitchen.
ALOHA Unleashed builds on DeepMind’s ALOHA 2 platform, which was based on the original ALOHA low-cost, open-source hardware for bimanual teleoperation from Stanford University. ALOHA 2 is more dexterous than prior systems because it has two hands that can be teleoperated for training and data-collection purposes. It also allows robots to learn how to perform new tasks with fewer demonstrations.
Google also said it has improved upon the robotic hardware’s ergonomics and enhanced the learning process in its latest system. First, it collected demonstration data by teleoperating the robot through difficult tasks such as tying shoelaces and hanging T-shirts.
Next, it applied a diffusion method that predicts robot actions from random noise, similar to how the Imagen model generates images. This helps the robot learn from the demonstration data so that it can perform the same tasks on its own, said DeepMind.
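As a rough illustration of that idea, the sketch below shows a diffusion-style sampling loop in Python: an action vector starts as pure Gaussian noise and is iteratively refined, conditioned on an observation. The denoiser, action dimensions, and update rule here are toy stand-ins, not DeepMind’s actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

ACTION_DIM = 14   # e.g., 7 joints per arm on a bimanual robot (assumption)
NUM_STEPS = 50    # number of denoising iterations

def toy_denoiser(noisy_action, observation, t):
    """Stand-in for a trained network that predicts the noise to remove.

    A real policy would be a neural network trained on teleoperated
    demonstrations; here we nudge toward a fixed fake target so the
    loop runs end to end.
    """
    target = np.tanh(observation[:ACTION_DIM])  # fake "correct" action
    return noisy_action - target                # pretend this is the noise

def sample_action(observation):
    # Start from pure Gaussian noise, as in diffusion sampling.
    action = rng.standard_normal(ACTION_DIM)
    for t in reversed(range(NUM_STEPS)):
        predicted_noise = toy_denoiser(action, observation, t)
        action = action - predicted_noise / (t + 1)  # remove a fraction
    return action

obs = rng.standard_normal(32)  # fake camera/proprioception features
print(sample_action(obs))      # 14-dim action refined from noise
```

In a production system, the toy denoiser would be replaced by a network trained on the demonstration data described above.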
DeepMind uses reinforcement learning to teach dexterity
Controlling a dexterous robotic hand is a complex task. It becomes even more complex with each additional finger, joint, and sensor. This is a challenge Google DeepMind is hoping to tackle with DemoStart, which it presented in a new paper. DemoStart uses a reinforcement learning algorithm to help robots acquire dexterous behaviors in simulation.
These learned behaviors can be especially useful for complex hardware, like multi-fingered hands. DemoStart begins learning from easy states and, over time, progresses to more difficult states until it masters a task to the best of its ability.
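A minimal sketch of that progressive-curriculum idea follows; the names, states, and toy success model are illustrative, not the paper’s actual algorithm. Episodes reset to states drawn from a demonstration, starting near the goal (easy) and gradually unlocking earlier, harder start states as the policy improves.

```python
import random

random.seed(0)

DEMO_STATES = list(range(10))        # 0 = start of demo (hard), 9 = near goal

def run_episode(start_state, skill):
    # Toy stand-in for an RL rollout: success is more likely when the
    # episode starts close to the goal or the policy's "skill" is high.
    return random.random() < 0.5 + 0.05 * start_state + skill

skill = 0.0
start_index = len(DEMO_STATES) - 1   # begin with the easiest start states
while start_index >= 0:
    batch = [run_episode(random.choice(DEMO_STATES[start_index:]), skill)
             for _ in range(50)]
    if sum(batch) / len(batch) >= 0.9:
        start_index -= 1             # unlock harder (earlier) states
    skill += 0.005                   # pretend the policy improves each batch

print("curriculum complete; trained from every demonstration state")
```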
This system requires 100x fewer simulated demonstrations to learn how to solve a task in simulation than is usually needed when learning from real-world examples, said DeepMind.
After training, the research robot achieved a success rate of over 98% on a number of different tasks in simulation. These include reorienting cubes with a certain color showing, tightening a nut and bolt, and tidying up tools.
In the real-world setup, it achieved a 97% success rate on cube reorientation and lifting, and 64% on a plug-socket insertion task that required a high degree of finger coordination and precision.
Training in simulation offers benefits, challenges
Google says it developed DemoStart with MuJoCo, its open-source physics simulator. After mastering a range of tasks in simulation and using standard techniques to reduce the sim-to-real gap, such as domain randomization, its approach was able to transfer nearly zero-shot to the physical world.
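For readers unfamiliar with the technique, the snippet below sketches per-episode domain randomization using the open-source MuJoCo Python bindings. The scene and parameter ranges are invented for illustration and are not DeepMind’s setup; randomizing physics parameters each episode exposes a policy to many plausible dynamics, which helps it transfer to a real robot.

```python
import mujoco
import numpy as np

# A falling cube on a plane: a deliberately tiny illustrative scene.
XML = """
<mujoco>
  <worldbody>
    <body name="cube" pos="0 0 0.1">
      <freejoint/>
      <geom name="cube_geom" type="box" size="0.03 0.03 0.03" mass="0.1"/>
    </body>
    <geom name="floor" type="plane" size="1 1 0.1"/>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
rng = np.random.default_rng(0)

def randomize(model):
    # Perturb friction and mass within hand-picked ranges (assumed values;
    # inertia is left unchanged for brevity).
    geom_id = mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_GEOM, "cube_geom")
    body_id = mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_BODY, "cube")
    model.geom_friction[geom_id, 0] = rng.uniform(0.5, 1.5)  # sliding friction
    model.body_mass[body_id] = rng.uniform(0.05, 0.2)        # cube mass (kg)

for episode in range(3):
    randomize(model)
    data = mujoco.MjData(model)
    for _ in range(100):                 # short rollout with random physics
        mujoco.mj_step(model, data)
    print(f"episode {episode}: cube z = {data.qpos[2]:.3f}")
```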
Robotic learning in simulation can reduce the cost and time needed to run physical experiments, but Google said these simulations are difficult to design, and the resulting behaviors don’t always translate successfully into real-world performance.

By combining reinforcement learning with learning from a few demonstrations, DemoStart automatically generates a progressive curriculum that bridges the sim-to-real gap, making it easier to transfer knowledge from a simulation to a physical robot.
To enable more advanced robot learning through intensive experimentation, Google tested this new approach on a three-fingered robotic hand, called DEX-EE, which was developed in collaboration with Shadow Robot.
Google said that while it still has a long way to go before robots can grasp and handle objects with the ease and precision of people, it is making significant progress.