Zhipu AI today announced an upgrade to its video generation model and officially launched its next-generation product, CogVideoX.
The CogVideoX model is now available in the Zhipu Qingyan PC client, mobile app, and mini program. All consumer users can try AI text-to-video and image-to-video generation for free through Qingying ("Clear Shadow"), Zhipu Qingyan's AI video generation feature.
The core technical features of CogVideoX are described as follows:
- Content coherence: Zhipu AI developed an efficient 3D variational autoencoder (3D VAE) that compresses the original video data to 2% of its original size, which reduces the cost and difficulty of training the video diffusion model. Combined with a 3D RoPE positional encoding module, it improves the model's ability to capture inter-frame relationships along the time dimension and so to establish long-range dependencies in the video.
- Controllability: Zhipu AI built an end-to-end video understanding model that generates descriptions for large volumes of video data. This strengthens the model's text understanding and instruction following, so the generated video matches the user's input more closely and the model can handle very long, complex prompts.
- Architecture: The model uses a transformer architecture that fuses the text, time, and space dimensions. It drops the traditional cross-attention module and instead uses an Expert Block to align the two modalities, text and video, with a full attention mechanism optimizing the interaction between them.
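The "2% of its original size" figure for the 3D VAE can be sanity-checked with simple arithmetic. The sketch below is illustrative only and rests on assumptions that this announcement does not state: a 4x reduction along time, an 8x reduction along each spatial axis, and a change from 3 RGB channels to 16 latent channels. The function names and the example clip shape are hypothetical.

```python
def compression_ratio(time_factor: int, space_factor: int,
                      in_channels: int, latent_channels: int) -> float:
    """Fraction of the original element count that remains after encoding."""
    # Time shrinks by time_factor, height and width each by space_factor,
    # and the channel count changes from in_channels to latent_channels.
    return latent_channels / (in_channels * time_factor * space_factor ** 2)


def latent_shape(frames: int, height: int, width: int,
                 time_factor: int, space_factor: int,
                 latent_channels: int) -> tuple:
    """Latent tensor shape (channels, frames, height, width) for a clip."""
    if frames % time_factor or height % space_factor or width % space_factor:
        raise ValueError("clip dimensions must be divisible by the factors")
    return (latent_channels,
            frames // time_factor,
            height // space_factor,
            width // space_factor)


if __name__ == "__main__":
    # Assumed factors (not from the announcement): 4x time, 8x8 space, 3 -> 16 channels.
    ratio = compression_ratio(time_factor=4, space_factor=8,
                              in_channels=3, latent_channels=16)
    print(f"remaining fraction: {ratio:.2%}")
    # Hypothetical 48-frame, 480x720 clip.
    print(latent_shape(48, 480, 720, 4, 8, 16))
```

Under these assumed factors the remaining fraction is 16 / 768, or roughly 2%, which is consistent with the figure quoted above. Different factors would give a different ratio.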
The main features of Qingying are as follows:
- Fast generation: A 6-second video is generated in just 30 seconds.
- Effective instruction following: Qingying understands and accurately executes even complex prompts.
- Content coherence: Generated videos more faithfully reproduce motion as it occurs in the physical world.
- Flexible camera movement: For example, the camera can smoothly follow three dogs in the frame.
In addition, Qingying has also been deployed on the Zhipu large model open platform (bigmodel.co.uk). Enterprises and developers can access Qingying's text-to-video and image-to-video capabilities through API calls.
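Because generation takes on the order of tens of seconds, a client calling such an API may need to submit a job and then poll for the result. The sketch below shows only that general pattern. It assumes an asynchronous submit-then-poll interface, and `submit`, `fetch`, the payload fields (`prompt`, `image_url`), and the status strings are all hypothetical placeholders rather than the platform's documented API.

```python
import time
from typing import Callable, Optional


def build_request(prompt: str, image_url: Optional[str] = None) -> dict:
    """Build a hypothetical payload: text-to-video, or image-to-video
    when an image URL is supplied. Field names are placeholders."""
    payload = {"prompt": prompt}
    if image_url is not None:
        payload["image_url"] = image_url
    return payload


def generate_video(submit: Callable[[dict], str],
                   fetch: Callable[[str], dict],
                   payload: dict,
                   poll_interval: float = 5.0,
                   timeout: float = 120.0,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Submit a job, then poll until it finishes, fails, or times out.

    `submit` and `fetch` stand in for whatever HTTP or SDK calls the
    real platform exposes; they are injected so the sketch stays
    independent of any specific client library.
    """
    task_id = submit(payload)
    waited = 0.0
    while True:
        result = fetch(task_id)
        status = result["status"]  # hypothetical: "processing", "done", "failed"
        if status == "done":
            return result["video_url"]
        if status == "failed":
            raise RuntimeError(f"video generation failed for task {task_id}")
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} not finished after {timeout} s")
        sleep(poll_interval)
        waited += poll_interval
```

Passing the transport functions in as arguments keeps the polling logic testable with stand-in functions; a real integration would replace them with calls taken from the platform's own API reference.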