Google DeepMind's robotics team announced three new advances designed to help robots make faster, better, and safer decisions in complex environments. One of them is a system for collecting training data, equipped with a "robot constitution" to ensure your robot office assistant can fetch you more printing paper without bumping into human colleagues on the way.
Google's data collection system, AutoRT, uses a visual language model (VLM) and a large language model (LLM) that work together to understand the environment, adapt to unfamiliar situations, and decide on appropriate tasks. The "Robot Constitution," inspired by Isaac Asimov's "Three Laws of Robotics," is described as a set of "safety-focused cues" that guide the LLM to avoid choosing tasks involving humans, animals, sharp objects, and even electrical appliances.
For additional safety, DeepMind programmed the robots to stop automatically when the force on their joints exceeds a certain threshold, and added a physical kill switch that human operators can use to deactivate them. Over a seven-month period, Google deployed 53 AutoRT robots in four different office buildings and conducted more than 77,000 trials. Some robots were remotely controlled by human operators, while others operated according to scripts or completely autonomously using Google's Robotic Transformer (RT-2) AI learning model.
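The automatic stop described above comes down to a simple check. The Python below is only an illustrative sketch: the article does not give the threshold value or say how the check is implemented, so the 30-newton limit and the function name `should_stop` are assumptions made for this example.

```python
from typing import Iterable

# Hypothetical limit; the article only says "a certain threshold".
FORCE_LIMIT_NEWTONS = 30.0


def should_stop(
    joint_forces: Iterable[float],
    limit: float = FORCE_LIMIT_NEWTONS,
    kill_switch_pressed: bool = False,
) -> bool:
    """Return True if the robot should halt.

    The robot halts when a human operator presses the kill switch or when
    the magnitude of the force on any joint exceeds the limit.
    """
    if kill_switch_pressed:
        return True
    return any(abs(force) > limit for force in joint_forces)
```

In a real robot a check like this would run on every control cycle, with the kill switch wired in hardware so that it works even if the software fails.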
The robots used in the trial have a practical design, each equipped with a camera, a robotic arm, and a mobile base. AutoRT follows four steps for each task. Google noted in its blog post: "For each robot, the system uses the VLM to understand its environment and objects within its field of view. Next, the LLM proposes a list of creative tasks, such as 'Put a snack on the counter,' and plays the role of a decision maker, selecting the appropriate task for the robot to perform."
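The flow Google describes (the VLM describes the scene, the LLM proposes tasks, and safety rules shape the final choice) can be sketched in a few lines of Python. This is a rough illustration, not DeepMind's code. The rule text paraphrases the article's summary of the constitution, the keyword check is a stand-in added here as a second guard, and `fake_vlm` and `fake_llm` are hypothetical stubs in place of real models.

```python
from typing import Callable, List, Optional

# Hypothetical rules paraphrasing the article; not DeepMind's actual prompt text.
CONSTITUTION = [
    "Do not propose tasks involving humans.",
    "Do not propose tasks involving animals.",
    "Do not propose tasks involving sharp objects.",
    "Do not propose tasks involving electrical appliances.",
]

# Hypothetical keyword guard applied after the model responds.
BLOCKED_WORDS = ("human", "person", "animal", "knife", "scissors", "appliance")


def build_prompt(scene_description: str) -> str:
    """Prepend the safety rules to the scene description for the LLM."""
    rules = "\n".join(f"- {rule}" for rule in CONSTITUTION)
    return (
        f"Rules:\n{rules}\n\n"
        f"Scene: {scene_description}\n"
        "Propose tasks the robot could attempt, one per line."
    )


def is_allowed(task: str) -> bool:
    """Reject any task that mentions a blocked word."""
    lowered = task.lower()
    return not any(word in lowered for word in BLOCKED_WORDS)


def choose_task(
    image: object,
    describe_scene: Callable[[object], str],
    propose_tasks: Callable[[str], List[str]],
) -> Optional[str]:
    """Describe the scene, ask for tasks, and return the first allowed one."""
    scene = describe_scene(image)
    candidates = propose_tasks(build_prompt(scene))
    allowed = [task for task in candidates if is_allowed(task)]
    return allowed[0] if allowed else None


def fake_vlm(image: object) -> str:
    # Stub standing in for a visual language model.
    return "a counter, a bag of chips, a pair of scissors"


def fake_llm(prompt: str) -> List[str]:
    # Stub standing in for a large language model.
    return ["Hand the scissors to a person", "Put a snack on the counter"]
```

Calling `choose_task(None, fake_vlm, fake_llm)` skips the scissors task and returns the snack task. In the system the article describes, the safety rules act as prompts to the LLM; the keyword filter here is only an extra check added for the sake of the example.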
Other new technologies from DeepMind include SARA-RT, a neural network architecture designed to improve the accuracy and speed of the existing Robotic Transformer RT-2, and RT-Trajectory, which adds 2D contours to help robots better perform specific physical tasks, such as wiping a table.
While we still seem a long way off from robots that can wait on you and fluff your pillows completely autonomously, when they do arrive they may have learned some lessons from systems like AutoRT.