Researchers Use AI Concepts to Enable Robots to Adapt on Their Own
Engineering360 News Desk | May 25, 2015
Researchers at the University of California, Berkeley, have developed algorithms that enable robots to learn motor tasks through trial and error, using a process that closely approximates the way humans learn. The researchers say this marks a milestone in the field of artificial intelligence (AI).
The researchers demonstrated their technique, a type of reinforcement learning, by having a robot complete various tasks—putting a clothes hanger on a rack, assembling a toy plane, screwing a cap on a water bottle and more—without preprogrammed details about its surroundings.
"What we're reporting on here is a new approach to empowering a robot to learn," says Professor Pieter Abbeel in UC Berkeley's Department of Electrical Engineering and Computer Sciences. "The key is that when a robot is faced with something new, we won't have to reprogram it." (Watch a video as a research robot uses deep learning concepts to assemble a toy.)
Most robotic applications are in controlled environments where objects are in predictable positions, the researchers say. One challenge of putting robots into real-life settings, like homes or offices, is that those environments are constantly changing. The robot must be able to perceive and adapt to its surroundings.
The university researchers turned to a new branch of AI known as deep learning, which they say is inspired by the neural circuitry of the human brain when it perceives and interacts with the world.
Deep learning programs create "neural nets" in which layers of artificial neurons process overlap the raw sensory data—whether sound waves or image pixels. This helps the robot to recognize patterns and categories among the data it is receiving. People who use Siri on their iPhones, Google's speech-to-text program or Google Street View might already have experienced deep learning concepts in speech and vision recognition.
Applying deep reinforcement learning to motor tasks has been more challenging, however, since the task goes beyond the passive recognition of images and sounds.
"Moving about in an unstructured 3D environment is a whole different ballgame," says one member of the Berkeley research team. "There are no labeled directions, no examples of how to solve the problem in advance. There are no examples of the correct solution like one would have in speech and vision recognition programs."
In the experiments, the UC Berkeley researchers worked with a Willow Garage Personal Robot 2 (PR2), which they nicknamed BRETT, or Berkeley Robot for the Elimination of Tedious Tasks.
They presented BRETT with a series of motor tasks, such as placing blocks into matching openings or stacking Lego blocks. The algorithm controlling BRETT's learning included a reward function that provided a score based upon how well the robot was doing with the task.
BRETT takes in the scene through its camera, including the position of its own arms and hands. The algorithm provides real-time feedback in the form of the score, based upon the robot's movements. Movements that bring the robot closer to completing the task score higher than those that do not. The score feeds back through the neural net, so the robot can learn which movements are better for the task at hand.
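This feedback loop can be illustrated with a deliberately crude trial-and-error sketch in Python. The Berkeley algorithm is far more sophisticated; here a stand-in "arm" position simply keeps any small random movement that improves an invented distance-based score, mirroring the idea that higher-scoring movements are reinforced.

```python
# A crude trial-and-error loop (not the Berkeley algorithm): random
# movements are kept only when they raise a distance-based score.
import numpy as np

rng = np.random.default_rng(1)
target = np.array([0.5, -0.2, 0.8])       # invented goal position
arm = np.zeros(3)                         # simplified arm state

def reward(position):
    """Score a state: closer to completing the task scores higher."""
    return -np.linalg.norm(position - target)

for step in range(200):
    trial = arm + rng.normal(0, 0.05, 3)  # try a small random movement
    if reward(trial) > reward(arm):       # keep it only if the score rises
        arm = trial

print(arm.round(2))                       # ends up near the target
```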
This end-to-end training process underlies the robot's ability to learn on its own. As the PR2 moves its joints and manipulates objects, the algorithm calculates good values for the 92,000 parameters that its neural net needs to learn.
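In a similarly simplified spirit, the sketch below applies the same keep-what-scores-higher idea to the parameters of a tiny stand-in policy rather than to individual movements. The 3x3 weight matrix, dynamics and step sizes are all invented and stand in for the real 92,000-parameter network, which was trained with far more advanced methods.

```python
# A hedged sketch of parameter learning: a tiny "policy" (a single 3x3
# weight matrix) is scored on how close it drives a simulated state to a
# target, and random parameter changes are kept only when the score rises.
import numpy as np

rng = np.random.default_rng(2)
target = np.array([0.5, -0.2, 0.8])           # invented goal state
W = rng.normal(0, 0.1, (3, 3))                # the policy's learnable parameters

def score(weights):
    """Roll out the policy from a fixed start and score the final state."""
    state = np.zeros(3)
    for _ in range(20):
        state = state + 0.1 * np.tanh(weights @ (target - state))
    return -np.linalg.norm(state - target)

print("before:", round(score(W), 3))
for step in range(300):
    candidate = W + rng.normal(0, 0.1, W.shape)  # try new parameter values
    if score(candidate) > score(W):              # keep them if the score rises
        W = candidate
print("after:", round(score(W), 3))              # closer to 0 after learning
```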
With this approach, when given the relevant coordinates for the beginning and end of the task, the PR2 could master a typical assignment in about 10 minutes. When the robot is not told where objects are in the scene, however, and must learn vision and control together, the process takes about three hours.