Tsinghua team's homegrown "Sora" draws attention: Shengshu Technology releases the large video model Vidu

At the Future Artificial Intelligence Pioneer Forum of the Zhongguancun Forum, Shengshu Technology and Tsinghua University jointly launched "Vidu", China's first large video model with long duration, high consistency, and high dynamics.

The core of this video model is U-ViT, an architecture that fuses Diffusion and Transformer. Vidu can generate a 16-second high-definition video at 1080P resolution in one click, and it shows striking imagination while simulating the real physical world. Multi-shot generation and strong spatiotemporal consistency are its distinguishing features.

Notably, since its release Vidu has achieved significant breakthroughs, reaching a level comparable to the world's top-tier models, and it is still being iterated and optimized. This achievement rests on the team's deep experience and original work in Bayesian machine learning and multimodal large models.

In particular, the U-ViT architecture, which the team proposed in September 2022, was the world's first architecture to fuse Diffusion and Transformer, and it laid the foundation for Vidu. In March 2023, the team open-sourced UniDiffuser, a multimodal diffusion model built on U-ViT, verifying that the architecture scales to large models.

Drawing on its in-depth understanding of the U-ViT architecture and its engineering and data experience, the team overcame several key technical challenges in long-video representation and processing in a short time and developed the Vidu video model. The model improves the coherence and dynamics of generated video.

The launch of Vidu again verifies the performance of the U-ViT fusion architecture on large-scale visual tasks and demonstrates Shengshu Technology's capacity for continued innovation in multimodal native large models. As a general-purpose visual model, Vidu can generate more diverse and longer video content, and its flexible architecture leaves room for supporting more modalities and extending general multimodal capabilities in the future.

Application form:

https://shengshu.feishu.cn/share/base/form/shrcnybSDE4Id1JnA5EQ0scv1Ph
