o3 压台登场：OpenAI 卷动推理 AI模型风云，迈向 AGI 新巅峰

On December 21st, "12 Days of OpenAI"The event has drawn to a close with OpenAI's o3 The series of large models on the stage.Officials claim that in some scenarios, its reasoning ability is very close to that of general-purpose artificial intelligence (AGI).

name

Why the latest AI model skips o2 and is called o3?OpenAI CEO Sam Altman, speaking at a live event this morning, said it's to circumvent a trademark conflict with British telecom operator O2.

Invitation to Security Testing

o3 is the successor to the o1 inference model and contains both a full version and a lite version (o3-mini), the latter of which has been fine-tuned primarily for specific tasks.

OpenAI has not yet fully opened up the o3 and o3-mini models, but is inviting security researchers to sign up for a preview version of the o3-mini model, and then launch a preview version of the o3 model.

Now, interested parties can submit an application at https://openai.com/index/early-access-for-safety-testing/.

Altman has not announced a specific open date for the o3 model, revealing only that the o3-mini will be launched at the end of January 2025, with the o3 to follow.

o3 Model reasoning

One of the biggest differences between OpenAI o3 models and mainstream AI models is that fact-checking is carried out so that some common modeling pitfalls can be circumvented, but this process incurs a response delay of typically a few seconds to a few minutes, depending on the difficulty of the reasoning.

Another highlight of the o3 family of models is the use of a "private chain of thought" for "thinking", which allows one to pause before responding, consider the cues and interpret their reasoning, and ultimately summarize the most accurate answer. the most accurate answer.

One of the new features of o3 is the ability to adjust the inference time, which is categorized into three computation levels: low, medium, and high; the higher the computation level, the better the task execution performance of o3.

Performance and AGI

The full name of AGI is artificial general intelligence, directly translated as general artificial intelligence, which refers to AI that can perform any task like humans, and is officially defined by OpenAI as "a highly autonomous system that exceeds human beings in the most economically valuable work".

The OpenAI company is moving aggressively towards its AGI goals, which have particular implications in the investment space, in addition to solidifying its position in the AI space.

Under the terms of OpenAI's deal with Microsoft, a close partner and investor, the company is no longer obligated to provide its state-of-the-art technology (i.e., technology that meets OpenAI's AGI definition) to Microsoft once OpenAI reaches AGI.

And o3 is OpenAI is an important step toward that goal, in the ARC-AGI benchmarkingThe o3 scored 87.5% on the high compute setting and 75.7% on the low compute setting, tripling the performance of the o1.

Admittedly high-computing setups are very expensive, costing thousands of dollars per task, says ARC-AGI co-founder François Chollet.

Citing the outlet, 1AI reported that the o3 performed well in other benchmarks:

In the SWE-Bench Verified Programming Task Benchmark, o3 is better than o1 22.8 percentage points higher;
In the Codeforces Programming Skills Test.o3 has received 2727 ratings;
In the 2024 U.S. Math Invitational.o3 Score 96.71 TP3T;
In the GPQA Diamond Postgraduate Level Biology, Physics and Chemistry test.o3 Score 87.7%;
In EpochAI's Frontier Math Benchmark.o3 Resolved 25.2% (no other model exceeds 2%), setting a new record.

These results are from OpenAI's internal evaluation and await further validation from benchmarking results from external customers and organizations.

Safety

The release of o3 marks an important step for OpenAI in the field of general-purpose AI. o3's capabilities are impressive, but its potential risks require attention. While o3's capabilities are impressive, its potential risks need to be taken seriously, and OpenAI is committed to working on model safety and collaborating with other organizations to build a better benchmarking system.

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.

o3 Pressing the Stage: OpenAI Rolls Up the Reasoning AI Modeling Storm, Toward a New Peak of AGI

OpenAI Updates ChatGPT Client for macOS: Reads System Memo Apps, Analyzes Multiple IDEs Simultaneously

Google Expands Gemini AI Depth Research Model to Support 40+ Languages, Including Chinese

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

Related content:

OpenAI Updates ChatGPT Client for macOS: Reads System Memo Apps, Analyzes Multiple IDEs Simultaneously

Google Expands Gemini AI Depth Research Model to Support 40+ Languages, Including Chinese

OpenAI and Google: The fast pace of AI online battle

OpenAI CEO Altman predicts AGI could be realized in 5 years, but with little short-term societal impact

OpenAI Altman: Humans may see systems with generalized AI for the first time next year

OpenAI Calls Out Musk Again: You Can't Achieve AGI With Lawsuits

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow