Zhipu AI open-sources video understanding model CogVLM2-Video, which can answer time-related questions

Zhipu AI announced that it has trained and open-sourced a new video understanding model, CogVLM2-Video.

It is reported that most current video understanding models rely on frame averaging and video token compression, which discards temporal information and prevents them from accurately answering time-related questions. Meanwhile, models trained specifically on temporal question-answering datasets are often restricted to narrow formats and domains, sacrificing broader question-answering ability.
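The frame-averaging limitation described above can be shown with a toy example: two clips containing the same frames in opposite temporal order produce identical averaged features, so a model that only sees the average cannot distinguish them. This is purely an illustrative sketch, not code from CogVLM2-Video.

```python
# Toy illustration: averaging per-frame features destroys temporal order.

def average_features(frames):
    """Average a list of per-frame feature vectors element-wise."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]

clip = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # frames in forward order
reversed_clip = clip[::-1]                    # same frames, reversed order

# Both orderings collapse to the same averaged representation,
# so "what happened first?" becomes unanswerable from this input.
assert average_features(clip) == average_features(reversed_clip)
```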


▲ Official demonstration of the model's results

Zhipu AI proposed an automated temporal grounding data construction method based on visual models, generating 30,000 time-related video question-answering examples. Building on this new dataset together with existing open-domain question-answering data, the team introduced multi-frame video images and timestamps as encoder inputs and trained the CogVLM2-Video model.
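The idea of feeding multi-frame images together with their timestamps to the encoder can be sketched roughly as follows. This is a minimal, hypothetical illustration: the function names, token format, and sampling scheme are assumptions for clarity, not CogVLM2-Video's actual API.

```python
# Hypothetical sketch: uniformly sample frames from a video and pair each
# frame with its timestamp, mimicking "multi-frame images + timestamps as
# encoder inputs". All names here are illustrative.

def sample_frames_with_timestamps(duration_s, fps, num_frames=8):
    """Pick `num_frames` evenly spaced frame indices and their timestamps (s)."""
    total = int(duration_s * fps)
    step = max(total // num_frames, 1)
    indices = [i * step for i in range(num_frames) if i * step < total]
    timestamps = [round(i / fps, 2) for i in indices]
    return indices, timestamps

def build_encoder_input(timestamps):
    """Interleave timestamp markers with per-frame placeholders as text tokens,
    so the encoder sees *when* each frame occurs, not just its content."""
    return "".join(f"<t={ts}s><frame_{k}>" for k, ts in enumerate(timestamps))

indices, stamps = sample_frames_with_timestamps(duration_s=60.0, fps=30)
prompt = build_encoder_input(stamps)  # e.g. "<t=0.0s><frame_0><t=7.5s>..."
```

Because the timestamp tokens travel with the frames, questions like "what happens at 15 seconds?" remain answerable, which frame averaging alone cannot support.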

Zhipu AI said that CogVLM2-Video not only achieves state-of-the-art performance on public video understanding benchmarks, but also excels at video caption generation and temporal grounding.

Attached related links:
