Nov. 19 - A research team from Peking University, Tsinghua University, Peng Cheng Laboratory, Alibaba's DAMO Academy, and Lehigh University has released LLaVA-o1, a visual language model described as the first to perform spontaneous, GPT-o1-style systematic reasoning (spontaneous AI is explained at the end of this article).
LLaVA-o1 is a novel visual language model (VLM) designed to perform autonomous multi-stage reasoning.
With 11 billion parameters, LLaVA-o1 is built on the Llama-3.2-11B-Vision-Instruct model and reasons in four stages: summary, caption, reasoning, and conclusion.
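As a rough illustration, the model's output can be thought of as four tagged blocks generated in order. The sketch below parses such a response; the tag names follow the format described for LLaVA-o1, but the parser itself is illustrative, not the project's actual code.

```python
import re

# Illustrative parser for a four-stage, tagged LLaVA-o1-style response (sketch only).
STAGES = ["SUMMARY", "CAPTION", "REASONING", "CONCLUSION"]

def parse_stages(response: str) -> dict:
    """Split a tagged model response into its four reasoning stages."""
    stages = {}
    for tag in STAGES:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", response, re.DOTALL)
        stages[tag.lower()] = match.group(1).strip() if match else ""
    return stages

example = (
    "<SUMMARY>Outline the approach to the question.</SUMMARY>"
    "<CAPTION>Describe the relevant parts of the image.</CAPTION>"
    "<REASONING>Work through the problem step by step.</REASONING>"
    "<CONCLUSION>State the final answer.</CONCLUSION>"
)
print(parse_stages(example)["conclusion"])  # -> "State the final answer."
```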
The model is fine-tuned on a dataset called LLaVA-o1-100k, built from visual question answering (VQA) sources with structured reasoning annotations generated by GPT-4o.
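One way to picture that dataset construction is prompting GPT-4o to rewrite each VQA sample's answer into the four tagged stages. The snippet below is a minimal sketch under that assumption; `call_gpt4o` is a hypothetical stand-in for whatever API client the authors used, and the prompt wording is not their actual template.

```python
# Minimal sketch of deriving one structured training sample, assuming a
# hypothetical call_gpt4o(prompt, image) helper; not the authors' pipeline.
ANNOTATION_PROMPT = (
    "Answer the question about the image in four tagged stages: "
    "<SUMMARY>...</SUMMARY> <CAPTION>...</CAPTION> "
    "<REASONING>...</REASONING> <CONCLUSION>...</CONCLUSION>\n"
    "Question: {question}"
)

def build_training_sample(question, image, call_gpt4o):
    """Produce one LLaVA-o1-100k-style record from a VQA source item."""
    structured_answer = call_gpt4o(ANNOTATION_PROMPT.format(question=question), image)
    return {"image": image, "question": question, "answer": structured_answer}
```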
At inference time, LLaVA-o1 applies an inference-time scaling technique called stage-level beam search: it generates multiple candidate outputs at each reasoning stage and selects the best one before moving on, as sketched below.
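In spirit, stage-level beam search samples several candidate continuations for a stage, commits to the best, and only then generates the next stage. The sketch below shows that control flow with hypothetical `generate_stage` and `score_candidate` helpers; it illustrates the idea rather than reproducing LLaVA-o1's implementation, which reportedly uses the model itself to compare candidates.

```python
import random

def stage_level_beam_search(prompt, stages, generate_stage, score_candidate, beam_width=4):
    """Stage-level search sketch: sample candidates per stage, keep the best.

    generate_stage(context, stage) -> str     one sampled candidate for this stage
    score_candidate(context, candidate) -> float   higher is better
    Both helpers are hypothetical placeholders for the model and its judge.
    """
    context = prompt
    for stage in stages:
        candidates = [generate_stage(context, stage) for _ in range(beam_width)]
        best = max(candidates, key=lambda c: score_candidate(context, c))
        context += best  # commit the chosen stage before generating the next one
    return context

# Toy usage with dummy helpers, just to show the control flow.
stages = ["summary", "caption", "reasoning", "conclusion"]
dummy_gen = lambda ctx, stage: f"<{stage.upper()}>candidate {random.randint(0, 9)}</{stage.upper()}>"
dummy_score = lambda ctx, cand: len(cand)  # stand-in for a real judge/comparison score
print(stage_level_beam_search("Q: What is in the image?\n", stages, dummy_gen, dummy_score))
```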
This structure gives the model a stronger handle on complex tasks, letting it move past the limits of traditional visual language models on complex visual question answering.
Compared with its base model, LLaVA-o1 improves performance by 8.9% on multimodal reasoning benchmarks, outperforming many larger and even closed-source competitors.
The introduction of LLaVA-o1 fills an important gap between textual and visual question-answering models. Its strong results across several benchmarks, particularly on visual reasoning problems in math and science, underline the importance of structured reasoning in visual language models.
Spontaneous AI refers to AI systems that mimic the spontaneous behavior of animals. Research in this area focuses on designing robots or intelligent systems that exhibit such spontaneous behavior through machine learning and complex temporal patterns.
Reference link attached.