Generally speaking, robots respond to spoken human language by identifying cues in commands and sentence structures. They infer the desired action, which triggers an algorithm.
But consider the varying levels of specificity involved in communication. Someone working in a warehouse alongside a robotic forklift might say, “Grab that pallet.” The command is highly abstract, implying a number of smaller sub-steps – lining up the lift, putting the fork underneath, hoisting it up. Other commands, such as “Tilt the fork back a little,” are far more specific. If the robot is unable to take the level of specificity into account, it might overplan for simple instructions, or underplan for more complex ones.
A new system based on research by Brown University computer scientists is designed to address that very problem. As presented at the Robotics: Science and Systems 2017 conference in Boston, the system adds a level of sophistication to existing robot language models. Beyond inferring a desired task, the system also analyzes the language to infer a distinct level of abstraction.
The researchers used a virtual task domain called Cleanup World and Mechanical Turk, Amazon’s crowdsourcing marketplace, to develop the new model. The online domain consists of a few color-coded rooms, a robotic agent and an object to be manipulated, in this case a chair. Mechanical Turk volunteers watched the robot agent perform a task in the virtual domain; they were then asked to list the instructions they would have given to get the robot to perform the task they’d just witnessed.
The volunteers were told which of three levels of specificity their instructions should use. At the high level, for example, an instruction might simply be, “Take the chair to the blue room.” At the stepwise level, that same instruction might read, “Take five steps north, turn right, take two more steps, get the chair, turn left, turn left, take five steps south.” A third level of abstraction used terminology in between those two.
The researchers then used those instructions to train their system to recognize the kinds of words used at each level – enabling the system to adjust its hierarchical planning algorithm accordingly. They tested it in Cleanup World as well as in the physical world, with a Roomba-like robot operating in a similar space. When the robot was able to infer both the task and the specificity of the instructions, it responded to commands in one second 90 percent of the time. By contrast, when no level of specificity was inferred, 50 percent of the tasks required 20 seconds or more of planning time.
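To make the idea of learning which words belong to each level more concrete, here is a minimal sketch in Python. It is not the researchers' model, which this article does not describe in detail. It swaps in a simple naive Bayes word-count classifier as a stand-in, and the names `TRAINING`, `train`, `infer_level` and `PLANNER_START` are hypothetical. Only the first high-level instruction and the first stepwise instruction come from the examples quoted above; the other training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Illustrative training data (instruction, level). Only the first "high" and
# the first "stepwise" sentence are quoted in the article; the rest are
# made-up assumptions, not data from the study.
TRAINING = [
    ("take the chair to the blue room", "high"),
    ("move the chair into the red room", "high"),
    ("go to the door then bring the chair to the blue room door", "middle"),
    ("go to the chair then push it to the door of the blue room", "middle"),
    ("take five steps north turn right take two more steps get the chair "
     "turn left turn left take five steps south", "stepwise"),
    ("go three steps east turn left take one step north", "stepwise"),
]

# Hypothetical mapping from inferred level to where a hierarchical planner
# would start searching (assumption: not taken from the article).
PLANNER_START = {
    "high": "plan over rooms",
    "middle": "plan over doors and objects",
    "stepwise": "plan over single moves",
}


def tokenize(text):
    """Lowercase the text and split it into words, ignoring commas and periods."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def train(examples):
    """Count how often each word appears at each level of abstraction."""
    word_counts = defaultdict(Counter)
    level_counts = Counter()
    for text, level in examples:
        level_counts[level] += 1
        word_counts[level].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, level_counts, vocab


def infer_level(text, model):
    """Return the most likely level under naive Bayes with add-one smoothing."""
    word_counts, level_counts, vocab = model
    total_examples = sum(level_counts.values())
    best_level, best_score = None, -math.inf
    for level, count in level_counts.items():
        score = math.log(count / total_examples)
        words_at_level = sum(word_counts[level].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen in training
            score += math.log(
                (word_counts[level][word] + 1) / (words_at_level + len(vocab))
            )
        if score > best_score:
            best_level, best_score = level, score
    return best_level


model = train(TRAINING)
for command in ("take the chair to the red room",
                "turn left and take two steps south"):
    level = infer_level(command, model)
    print(command, "->", level, "->", PLANNER_START[level])
```

With this toy data, words such as “steps” and “turn” pull a command toward the stepwise level, while “room” and “chair” pull it toward the high level. A real system would need far more training data and a richer language model.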
The work was done in the lab of Brown computer science professor Stefanie Tellex, who specializes in human–robot collaboration. “We ultimately want to see robots that are helpful partners in our homes and workplaces,” she said. “This work is a step toward the goal of enabling people to communicate with robots in much the same way that we communicate with each other.”