Researchers from the Georgia Institute of Technology (Georgia Tech) have developed a new approach that enables robots to detect the location of the humans they work alongside using the sounds people naturally produce as they move around an environment.

Current approaches that enable robots to localize humans, thereby avoiding accidents and collisions, typically rely on computer vision techniques using cameras or other visual sensors. The new approach instead relies on the subtle sounds humans make as they move, rather than on extraneous sounds, such as talking or clapping, that people are typically required to produce for acoustic human detection.

Source: Georgia Tech

Instead, the acoustic localization approach described by the Georgia Tech team uses machine learning algorithms to detect the subtle sounds people inadvertently produce as they move around a space.

To accomplish this, the team created a dataset, dubbed the Robot Kidnapper dataset, to train their algorithm. The dataset features 14 hours of four-channel audio recordings paired with 360° RGB camera footage, collected during trials in which people moved around robots in assorted ways.
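For a rough sense of what such paired recordings might look like in code, the sketch below defines a container that couples a multichannel audio clip with synchronized camera frames and labels. The field names, tensor shapes and labels are invented for illustration; the actual dataset format may differ.

```python
# Purely illustrative container for one paired recording; all field
# names and tensor shapes here are assumptions, not the actual
# Robot Kidnapper dataset format.
from dataclasses import dataclass
import torch

@dataclass
class RecordingClip:
    audio: torch.Tensor    # (4, samples): four-channel microphone audio
    frames: torch.Tensor   # (n_frames, H, W, 3): synchronized 360° RGB video
    person_present: bool   # label: is anyone near the robot in this clip?
    bearing_rad: float     # label: direction to the person, in radians
```

Presumably the camera footage supplies ground-truth labels during training, while the audio serves as the model's only input at inference time.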

The researchers reportedly recorded participants as they walked in different ways around a Stretch RE-1 robot from Hello Robot. This data was then used to train machine learning models that take audio, represented as spectrograms, as input and predict whether a human is nearby and, if so, where they are located relative to the robot.
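As a minimal sketch of how such a spectrogram-based predictor could be wired up, assuming a small convolutional network with separate presence and bearing outputs (the architecture and every hyperparameter below are guesses, not the team's published model):

```python
# Hypothetical spectrogram-based human localizer; the layer sizes,
# sample rate and output heads are illustrative assumptions only.
import torch
import torch.nn as nn
import torchaudio

SAMPLE_RATE = 16_000   # assumed microphone sample rate
N_CHANNELS = 4         # the dataset uses four-channel audio

class AcousticLocalizer(nn.Module):
    """Predicts (a) whether a person is nearby and (b) a bearing angle."""
    def __init__(self):
        super().__init__()
        # Convert each microphone channel's waveform to a mel spectrogram.
        self.to_spec = torchaudio.transforms.MelSpectrogram(
            sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=256, n_mels=64)
        self.encoder = nn.Sequential(
            nn.Conv2d(N_CHANNELS, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.presence_head = nn.Linear(64, 1)  # logit: is someone there?
        self.bearing_head = nn.Linear(64, 2)   # (sin, cos) of direction

    def forward(self, waveform):               # waveform: (batch, 4, samples)
        spec = torch.log(self.to_spec(waveform) + 1e-6)  # log-scale spectrograms
        features = self.encoder(spec)                    # (batch, 64)
        return self.presence_head(features), self.bearing_head(features)

# Example: one second of four-channel audio from a single clip.
model = AcousticLocalizer()
presence_logit, bearing = model(torch.randn(1, N_CHANNELS, SAMPLE_RATE))
print(torch.sigmoid(presence_logit).item())   # probability someone is nearby
```

Predicting the bearing as a (sin, cos) pair rather than a raw angle is a common choice because it avoids the wraparound discontinuity at ±180°.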

Because the technique localizes humans based exclusively on sound, the Georgia Tech team also trained the model to ignore external, irrelevant noises, such as those emitted by heating, ventilation and air conditioning (HVAC) systems and sounds made by the robot itself.
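One common way to build in that kind of robustness, offered here as a plausible sketch rather than the team's actual training recipe, is to mix recorded background noise into the training clips at random levels so the model learns to discount it:

```python
# Hypothetical noise-augmentation step; the team's actual procedure may
# differ. Mixing ambient noise (HVAC hum, robot motors) into training
# clips at a random signal-to-noise ratio teaches the model to ignore it.
import torch

def mix_in_noise(clean: torch.Tensor, noise: torch.Tensor, snr_db: float) -> torch.Tensor:
    """Mix a noise clip into a clean clip at a target SNR in decibels."""
    clean_power = clean.pow(2).mean()
    noise_power = noise.pow(2).mean().clamp(min=1e-10)
    # Scale noise so that 10 * log10(clean_power / scaled_noise_power) == snr_db.
    scale = torch.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

# Example: augment a one-second, four-channel clip at a random 0-20 dB SNR.
clip = torch.randn(4, 16_000)      # stand-in for a recorded training clip
ambient = torch.randn(4, 16_000)   # stand-in for recorded HVAC/robot noise
augmented = mix_in_noise(clip, ambient, snr_db=torch.empty(1).uniform_(0.0, 20.0).item())
```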

"We believe our audio-based method for human detection is important for the development of multi-modal person detection systems that are robust to failures," the researchers explained. "Robots commonly use cameras or lidar to navigate around people, but should those sensors fail or become unavailable (low-lit environments, occlusions, etc.), our method allows robots to fall back solely onto audio, which is usually already available in most hardware setups. Moreover, when interacting with robots, people should not be expected to intentionally create extra sounds, which is what previous works rely on."

The team reported that the technique performed twice as well as other acoustic localization methods and shows promise for use with robots that work alongside humans, such as robots used in household, industrial and security applications.

A paper detailing the technique, "The Un-Kidnappable Robot: Acoustic Localization of Sneaking People," is available on the arXiv preprint server.

For more information, watch the accompanying video that appears courtesy of Georgia Tech.

To contact the author of this article, email mdonlon@globalspec.com