OpenAI releases CoT Civic Chain research results, monitoring to stop big model malicious behavior

March 11th.OpenAI today announced its latest research, which attempts to use CoT (Chain of Thought) monitoring to monitor the "thoughts" of reasoning models, thereby preventing AI from Large Modelbehaviors such as babbling, hiding true intentions, etc., and it is also one of the effective tools for monitoring supermodels.

Note: AI occasionally spouts nonsense, which we generally call AI hallucinations. In addition to this, AI can be lazy or deliberately bad, such as tricking the user or simply giving up when a problem is too difficult.

OpenAI used the newly released cutting-edge model o3-mini as the monitored object and the weaker GPT-4o model as the monitor. The test environment was a coding task that required the AI to implement functionality in the codebase to pass unit tests.

The results show that CoT monitors are excellent at detecting systematic "reward hacking" behavior, with a recall rate of 951 TP3T, far exceeding the 601 TP3T for monitoring behavior alone.

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.

OpenAI Releases CoT Civic Chain Research Results, Monitoring to Stop Big Model Malicious Behavior

Guangdong: support enterprises to integrate artificial intelligence and robotics industry chain and innovation chain resources

Reducing Costs and Increasing Efficiency: Musk is Trying to Replace U.S. Civil Servants with AI

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

Related content:

Guangdong: support enterprises to integrate artificial intelligence and robotics industry chain and innovation chain resources

Reducing Costs and Increasing Efficiency: Musk is Trying to Replace U.S. Civil Servants with AI

Big models usher in the "AppStore moment", OpenAI's new imagination for 2024

Sky-high price! OpenAI's new large models "Strawberry" and "Orion" plan to charge $2,000 per month!

OpenAI, Meta to Train Big Models in African Languages to Make AI Reach More People

Some netizens exposed the invitation email sent by OpenAI to red team testers: GPT-5 has started red team testing

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow