Zhipu AI open-sources the CogVideoX-5B video generation model, which can run on an RTX 3060 graphics card

News on August 28: Zhipu AI has open-sourced the CogVideoX-5B video generation model. Compared with the previously open-sourced CogVideoX-2B, the company says the new model delivers higher video generation quality and better visual effects.

According to the official statement, the model's inference performance has been substantially optimized and the hardware threshold for inference greatly lowered: CogVideoX-2B can run on older graphics cards such as the GTX 1080 Ti, and CogVideoX-5B can run on mainstream "sweet-spot" desktop cards such as the RTX 3060.
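As a rough sketch of what a low-memory run might look like, the snippet below uses the CogVideoXPipeline class from the Hugging Face diffusers library with CPU offloading and VAE slicing and tiling enabled. The function name, output path, step count, frame count, and guidance scale are illustrative assumptions, not a configuration stated in the announcement, and actual memory use will depend on the hardware and library versions.

```python
MODEL_ID = "THUDM/CogVideoX-5b"  # Hugging Face repository named in the article


def generate_video(prompt, out_path="output.mp4", steps=50, num_frames=49):
    """Generate a short clip with memory-saving options enabled (hypothetical helper)."""
    # Heavy dependencies are imported lazily so the module loads without a GPU.
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    # Trade speed for memory: keep weights on the CPU and move them to the GPU
    # piece by piece, and decode the video latents in slices and tiles.
    pipe.enable_sequential_cpu_offload()
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()

    frames = pipe(
        prompt=prompt,
        num_inference_steps=steps,  # assumed value
        num_frames=num_frames,      # assumed value
        guidance_scale=6.0,         # assumed value
    ).frames[0]

    export_to_video(frames, out_path, fps=8)
    return out_path
```

Calling `generate_video("a panda playing a guitar in a bamboo forest")` would download the weights on first use and write the clip to `output.mp4`. Sequential offloading is considerably slower than keeping the whole model on the GPU, which is the price of fitting into a smaller memory budget.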

CogVideoX is a large-scale DiT (diffusion transformer) model for text-to-video tasks. It mainly uses the following techniques:

  • 3D causal VAE: compresses video data into a latent space and decodes it along the temporal dimension, enabling efficient video reconstruction.
  • Expert Transformer: combines text embeddings with video embeddings, uses 3D-RoPE for positional encoding, applies expert adaptive layer normalization to handle the two modalities separately, and uses a 3D full attention mechanism for joint spatiotemporal modeling.
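To give a feel for how much the 3D causal VAE shrinks the data the transformer has to process, here is a minimal sketch of the latent-shape arithmetic. It assumes the compression factors reported in the CogVideoX paper (4x along time, 8x8 spatially) and the usual causal convention that the first frame is encoded on its own; the function name and the 49-frame, 480x720 input are illustrative choices, not figures from this article.

```python
def latent_shape(num_frames, height, width, temporal_factor=4, spatial_factor=8):
    """Return (latent_frames, latent_height, latent_width) for a causal video VAE.

    Assumes the first frame is encoded alone and every following group of
    `temporal_factor` frames collapses into one latent frame.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if (num_frames - 1) % temporal_factor != 0:
        raise ValueError("num_frames must have the form temporal_factor * k + 1")
    if height % spatial_factor != 0 or width % spatial_factor != 0:
        raise ValueError("height and width must be multiples of spatial_factor")

    latent_frames = (num_frames - 1) // temporal_factor + 1
    return latent_frames, height // spatial_factor, width // spatial_factor


print(latent_shape(49, 480, 720))  # (13, 60, 90)
```

Under these assumptions a 49-frame clip becomes 13 latent frames of 60x90 positions, so the attention layers operate on far fewer positions than there are raw pixels.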

The detailed parameters of CogVideoX-5B and CogVideoX-2B are as follows:

[Image: parameter comparison table for CogVideoX-5B and CogVideoX-2B]

Related links:

  • Code repository: https://github.com/THUDM/CogVideo
  • Model download: https://huggingface.co/THUDM/CogVideoX-5b
  • Paper: https://arxiv.org/pdf/2408.06072