Doubao Open-Sources Video Generation Model VideoWorld: Industry's First to Understand the World Without Relying on a Language Model

February 10 news: VideoWorld, an experimental video generation model jointly developed by ByteDance's Doubao Big Model team, Beijing Jiaotong University, and the University of Science and Technology of China, was open-sourced today. Unlike mainstream multimodal models such as Sora, DALL-E, and Midjourney, VideoWorld is the first model in the industry to understand the world without relying on a language model.

According to the team, most existing models rely on language or labeled data to learn knowledge and rarely learn from purely visual signals. However, language cannot capture all knowledge of the real world; complex tasks such as folding origami or tying a bow tie are difficult to express clearly in language. VideoWorld removes the language model entirely and performs understanding and reasoning tasks in a unified manner.

VideoWorld is also built on a latent dynamics model that efficiently compresses the information about changes between video frames, significantly improving the efficiency and effectiveness of knowledge learning. Without relying on any reinforcement-learning search or reward-function mechanism, VideoWorld reaches a professional 5-dan level in 9x9 Go and can perform robotic tasks in a variety of environments.
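To make the idea of a latent dynamics model concrete, below is a minimal PyTorch sketch of the general principle: the change between two consecutive frames is squeezed into a compact latent code, and the next frame must be reconstructed from the current frame plus that code, so the code is forced to carry only the inter-frame variation. The class name, layer sizes, and training step here are illustrative assumptions, not the actual VideoWorld architecture.

```python
import torch
import torch.nn as nn

class LatentDynamicsModel(nn.Module):
    """Toy latent dynamics model (hypothetical, for illustration only):
    compress the change between two consecutive frames into a small
    latent code z, then reconstruct frame t+1 from frame t plus z."""

    def __init__(self, frame_dim: int = 64 * 64 * 3, latent_dim: int = 32):
        super().__init__()
        # Encoder sees both frames and summarizes what changed.
        self.encoder = nn.Sequential(
            nn.Linear(frame_dim * 2, 512),
            nn.ReLU(),
            nn.Linear(512, latent_dim),
        )
        # Decoder predicts the next frame from the current frame
        # and the compact change code.
        self.decoder = nn.Sequential(
            nn.Linear(frame_dim + latent_dim, 512),
            nn.ReLU(),
            nn.Linear(512, frame_dim),
        )

    def forward(self, frame_t: torch.Tensor, frame_t1: torch.Tensor):
        z = self.encoder(torch.cat([frame_t, frame_t1], dim=-1))
        pred_t1 = self.decoder(torch.cat([frame_t, z], dim=-1))
        return pred_t1, z

# One training step: reconstructing frame t+1 forces the latent z
# to carry only the information that changed between frames.
model = LatentDynamicsModel()
f_t = torch.randn(4, 64 * 64 * 3)   # batch of flattened frames at time t
f_t1 = torch.randn(4, 64 * 64 * 3)  # frames at time t+1
pred, z = model(f_t, f_t1)
loss = nn.functional.mse_loss(pred, f_t1)
loss.backward()
```

Because the latent code is much smaller than a full frame, a downstream model can reason over sequences of these codes far more cheaply than over raw frames, which is the stated source of the efficiency gain.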

The relevant links are attached below:

  • Paper: https://arxiv.org/abs/2501.09781

  • Code: https://github.com/bytedance/VideoWorld

  • Project page: https://maverickren.github.io/VideoWorld.github.io
