Teaching robots to be even smarter by asking questions
A team of Princeton University and Google engineers has developed a novel method of teaching robots to be smarter through the use of large language models (LLMs).
Contemporary robots can sense their surroundings and respond to language, but what they don't know is often more important than what they do know. Teaching robots to ask for help when they need it is fundamental to making them safer and more efficient.
Engineers at Princeton University, in collaboration with Google, have developed a new approach that teaches robots to recognise their own uncertainty. The method quantifies the ambiguity of human language and uses that measure to decide when a robot should ask for further instructions. Telling a robot to pick up a bowl from a table holding a single bowl, for instance, is unambiguous. But telling it to pick up a bowl when the table holds several bowls is far less clear-cut, making the robot much more likely to ask for clarification.
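One way to picture this kind of ambiguity measure is to have an LLM score a set of candidate next actions and look at how the probability mass is spread: a single dominant option signals a clear instruction, while a near-even split signals ambiguity. The sketch below is purely illustrative; the scores are invented, and softmax normalisation is one simple choice, not necessarily the scoring used in the study.

```python
import numpy as np

def option_probabilities(log_scores):
    """Turn LLM log-likelihoods for candidate actions into a probability distribution."""
    z = np.exp(log_scores - np.max(log_scores))  # subtract the max for numerical stability
    return z / z.sum()

# One bowl on the table: one action dominates, so the instruction is clear.
clear = option_probabilities(np.array([4.0, -1.5, -2.0]))
print(clear)      # ~[0.99, 0.004, 0.002]

# Several bowls: the mass splits across options, signalling ambiguity.
ambiguous = option_probabilities(np.array([2.0, 1.9, -2.0]))
print(ambiguous)  # ~[0.52, 0.47, 0.01]
```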
Because real-world tasks are far more complex than simple commands like “pick up a bowl”, the engineers employed LLMs — the technology underpinning tools such as ChatGPT — to evaluate uncertainty in these more intricate settings. According to Anirudha Majumdar, assistant professor of mechanical and aerospace engineering at Princeton and the study's senior author, LLMs greatly enhance a robot's ability to comprehend human language, but their outputs can be unreliable.
Majumdar emphasised that LLM-driven robots need to recognise their own limitations in order to avoid potentially unsafe or untrustworthy actions. The system also lets users set a desired success level, which is tied to an uncertainty threshold that prompts the robot to ask for assistance. A surgical robot, for instance, would be given a far lower tolerance for error than a robot cleaning a living room.
Allen Ren, a graduate student in mechanical and aerospace engineering at Princeton and the study's lead author, described the approach as a trade-off: achieve the success level the user asks for while keeping the robot's requests for help to a minimum. Ren's work was recognised with a best student paper award at the Conference on Robot Learning in Atlanta on 8 November. Compared with other methods, theirs achieved high accuracy while triggering fewer requests for human assistance.
The research team tested the approach on a simulated robotic arm and on various physical robots at Google facilities in New York City and Mountain View, California. While serving as a student research intern at Google, Ren ran experiments in which a robotic arm sorted toy food items, as well as further tests with a robotic arm mounted on a wheeled platform in an office kitchen.
To determine when the robot should request human input, the researchers used a statistical technique called conformal prediction together with a user-defined success rate. In one experiment, a robot asked to place a bowl in the microwave was confronted with two bowls; the algorithm flagged the ambiguity, and the robot asked for clarification.
In contrast, an instruction to dispose of a rotten apple did not trigger a request for help: throwing the apple away scored as far more probable than any alternative, so the robot simply acted. Majumdar highlighted conformal prediction's ability to deliver high success rates while keeping requests for help to a minimum.
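In outline, this calibrate-then-decide loop works as follows: on a set of labelled tasks, record how much probability the LLM assigned to the correct option; from those scores and the user's target success rate, compute a conformal threshold; at run time, keep every option that clears the threshold and ask for help whenever more than one survives. The sketch below follows standard split conformal prediction; the function names, calibration numbers and test-time scores are all invented for illustration and are not the study's actual data or code.

```python
import numpy as np

def calibrate_threshold(cal_scores, epsilon):
    """Split conformal calibration.

    cal_scores: nonconformity scores, here 1 - p(correct option), from labelled tasks
    epsilon:    acceptable error rate (the target success rate is 1 - epsilon)
    Returns the score quantile used as the test-time threshold.
    """
    n = len(cal_scores)
    # Finite-sample-corrected quantile level, standard in split conformal prediction.
    level = min(1.0, np.ceil((n + 1) * (1 - epsilon)) / n)
    return np.quantile(cal_scores, level, method="higher")

def prediction_set(option_probs, q_hat):
    """Return the indices of every option whose nonconformity score clears the threshold."""
    return [i for i, p in enumerate(option_probs) if 1.0 - p <= q_hat]

# Calibration: LLM probabilities assigned to the correct option on past tasks (invented).
cal_probs = np.array([0.72, 0.55, 0.41, 0.87, 0.64, 0.48, 0.37, 0.91, 0.63, 0.45])
q_hat = calibrate_threshold(1.0 - cal_probs, epsilon=0.25)  # target ~75% success

# Test time: candidate next actions for "put the bowl in the microwave" with two bowls.
options = ["pick up the metal bowl", "pick up the plastic bowl", "pick up the cup"]
probs = [0.48, 0.45, 0.07]  # illustrative LLM scores

kept = prediction_set(probs, q_hat)
if len(kept) == 1:
    print("Confident - executing:", options[kept[0]])
else:
    # More than one plausible option (or none clears the bar): ask a human.
    print("Uncertain - asking for help; plausible options:", [options[i] for i in kept])
```

Run on these numbers, both bowl options clear the threshold and the robot asks for help; raising epsilon (tolerating more failures) shrinks the prediction sets and the robot asks less often, which is the user-tunable trade-off the article describes.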
The physical limitations of robots often give researchers unique insights into design challenges, noted coauthor Andy Zeng, a research scientist at Google DeepMind. The collaboration between Majumdar, Ren and Zeng grew out of Zeng's participation in the Princeton Robotics Seminar series; his account of how Google was using LLMs in robotics, and of the challenges that remained, led to Ren's internship and ultimately to the development of the new method.
Ren is currently extending this research to active perception challenges for robots, such as locating household objects in different parts of a home. This requires a model that integrates both vision and language information, which introduces new challenges in estimating uncertainty and deciding when to ask for assistance.
The paper, titled “Robots That Ask for Help: Uncertainty Alignment for Large Language Model Planners,” was presented on 8 November at the Conference on Robot Learning. The research team, including Ren, Zeng, Majumdar, and others from Princeton and Google DeepMind, received support from the U.S. National Science Foundation and the Office of Naval Research.