On December 30th, Spiritual Intelligence released Psi R0, the first end-to-end embodied model based on reinforcement learning (RL).
1AI has learned that the model supports two dexterous hands working together on complex operations. By training multiple skills jointly and chaining them in sequence, Psi R0 produces an agent with reasoning ability that can complete long-horizon dexterous manipulation tasks in a closed loop. Psi R0 can also generalize at both the object level and the scene level.
Take an e-commerce scenario as an example. Packing goods is a typical long-horizon task: tens of thousands of items must be grasped, scanned, placed, and sealed in plastic bags. Psi R0 can complete this series of actions smoothly with a pair of dexterous hands (according to the company, this series of actions can replace a complete workstation at the customer's site), making it the first embodied robot trained with reinforcement learning to perform long-horizon dexterous manipulation tasks.
According to the company, the RL-based Psi R0 model trains a two-handed manipulation agent on massive simulation data and chains multiple skills through a bidirectional training framework. The company describes it as the first in the industry to complete long-horizon tasks in open environments, with strong generalization and high robustness.
The skill training framework abstracts key information from the spatio-temporal trajectories of objects to construct a general objective function, which sidesteps the difficulty of hand-designing reward functions. In the post-training phase, the model is aligned with a small amount of high-quality real-robot data, which further improves the success rate on long-horizon tasks.
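The idea of building an objective from object trajectories can be illustrated with a small sketch. This is not Psi R0's actual objective, which the announcement does not specify; it assumes a hypothetical reference trajectory of 3D waypoints and a reward that decays exponentially with the object's distance from the waypoint for the current step. The names `trajectory_reward`, `reference_traj`, and `scale` are all illustrative.

```python
import math

def trajectory_reward(object_pos, reference_traj, step, scale=5.0):
    """Hypothetical trajectory-tracking reward in (0, 1].

    Returns 1.0 when the object sits exactly on the reference waypoint
    for this step, and decays toward 0 as the distance grows.
    """
    # Clamp the step so the last waypoint acts as the final goal.
    idx = min(step, len(reference_traj) - 1)
    target = reference_traj[idx]
    dist = math.dist(object_pos, target)  # Euclidean distance
    return math.exp(-scale * dist)

# Illustrative waypoints (metres): lift the object, then move it sideways.
reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.1), (0.1, 0.0, 0.2)]

on_track = trajectory_reward((0.0, 0.0, 0.1), reference, step=1)
off_track = trajectory_reward((0.3, 0.0, 0.1), reference, step=1)
```

Because the reward depends only on where the object is over time, the same function could in principle be reused across skills and objects without writing task-specific reward terms, which is the property the paragraph above attributes to the framework.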
In addition, the transition feasibility function in the bidirectional training framework plays an important role. It is used to fine-tune the individual skills so that chaining them succeeds more often and generalizes better, and it gives the model the ability to switch skills autonomously, so that it can quickly adjust its strategy after a failed operation and maintain a high success rate.
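Autonomous skill switching driven by a feasibility function can be sketched as follows. This is a minimal illustration under stated assumptions, not the company's implementation: `select_next_skill`, the `threshold` value, and the toy feasibility rules are all hypothetical, and a real feasibility function would be learned rather than hand-written.

```python
def select_next_skill(state, skills, feasibility, current_index, threshold=0.5):
    """Pick the index of the skill to run next in a fixed skill chain.

    Tries to advance to the next skill; if its feasibility score from the
    current state is below the threshold, steps back through earlier skills
    until one is feasible. Falls back to the first skill otherwise.
    """
    start = min(current_index + 1, len(skills) - 1)
    for idx in range(start, -1, -1):
        if feasibility(skills[idx], state) >= threshold:
            return idx
    return 0

def toy_feasibility(skill, state):
    """Hand-written stand-in for a learned feasibility function."""
    if skill == "grasp":
        return 1.0
    if skill == "scan":
        return 1.0 if state["in_hand"] else 0.0
    if skill == "place":
        return 1.0 if state["in_hand"] and state["scanned"] else 0.0
    return 0.0

skills = ["grasp", "scan", "place"]

# Grasp succeeded: the chain advances to "scan".
advance = select_next_skill({"in_hand": True, "scanned": False},
                            skills, toy_feasibility, current_index=0)

# Object was dropped after scanning: the chain recovers by re-grasping.
recover = select_next_skill({"in_hand": False, "scanned": True},
                            skills, toy_feasibility, current_index=1)
```

In the second call the item has been dropped, so neither "place" nor "scan" is feasible and the selector returns to "grasp", which mirrors the recovery-after-failure behaviour the article describes.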