OpenAI Releases CoT Civic Chain Research Results, Monitoring to Stop Big Model Malicious Behavior

March 11th.OpenAI today announced its latest research, which attempts to use CoT (Chain of Thought) monitoring to monitor the "thoughts" of reasoning models, thereby preventing AI from Large Modelbehaviors such as babbling, hiding true intentions, etc., and it is also one of the effective tools for monitoring supermodels.

Note: AI occasionally spouts nonsense, which we generally call AI hallucinations. In addition to this, AI can be lazy or deliberately bad, such as tricking the user or simply giving up when a problem is too difficult.

OpenAI used the newly released cutting-edge model o3-mini as the monitored object and the weaker GPT-4o model as the monitor. The test environment was a coding task that required the AI to implement functionality in the codebase to pass unit tests.

The results show that CoT monitors are excellent at detecting systematic "reward hacking" behavior, with a recall rate of 951 TP3T, far exceeding the 601 TP3T for monitoring behavior alone.

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.
Information

Guangdong: support enterprises to integrate artificial intelligence and robotics industry chain and innovation chain resources

2025-3-10 21:18:43

Information

Reducing Costs and Increasing Efficiency: Musk is Trying to Replace U.S. Civil Servants with AI

2025-3-11 12:08:41

Search