Kuaishou's Keling launched two new features today: "Image to Video" and "Video Continuation".
The Image to Video feature converts a static image into a 5-second video, and users can control the movement of objects in the image through a text prompt. The Video Continuation feature extends a generated video with one click and can be applied repeatedly, up to a total length of about 3 minutes. In addition, Text to Video now offers 9:16 and 1:1 video size options.
Kuaishou Keling official website: https://www.1ai.net/12558.html
Keling is a large video generation model developed by Kuaishou that can generate large, physically plausible motion and simulate the characteristics of the physical world.
Keling uses the DiT (Diffusion Transformer) architecture, and Kuaishou has raised the dimensionality of modules in the model such as latent space encoding/decoding and temporal modeling.
For latent space encoding/decoding, Kuaishou developed its own 3D VAE network, which compresses video jointly in space and time, achieves high reconstruction quality, and balances training efficiency against output quality. For temporal modeling, Kuaishou designed a full attention mechanism as the spatiotemporal modeling module.
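To give a rough sense of why joint space-time compression matters when full attention is used, the sketch below counts latent tokens and attention pairs for a 5-second clip. It is not Kuaishou's implementation: the frame rate, resolution, compression factors, and every name in the code (`VideoShape`, `compress`, `full_attention_pairs`, `factorized_attention_pairs`) are assumptions chosen for illustration, and each latent position is treated as one token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    frames: int
    height: int
    width: int


def compress(video: VideoShape, t_factor: int, s_factor: int) -> VideoShape:
    """Latent grid after downsampling time by t_factor and space by s_factor.

    Uses ceiling division so a partial trailing block still yields one latent.
    The factors are hypothetical; the article does not state Keling's values.
    """
    def ceil_div(a: int, b: int) -> int:
        return -(-a // b)

    return VideoShape(
        ceil_div(video.frames, t_factor),
        ceil_div(video.height, s_factor),
        ceil_div(video.width, s_factor),
    )


def full_attention_pairs(latent: VideoShape) -> int:
    """Every spatiotemporal token attends to every other token."""
    n = latent.frames * latent.height * latent.width
    return n * n


def factorized_attention_pairs(latent: VideoShape) -> int:
    """Comparison baseline: spatial attention per frame plus temporal
    attention per spatial position (not the design the article describes)."""
    hw = latent.height * latent.width
    t = latent.frames
    return t * hw * hw + hw * t * t


# Assumed input: 5 seconds at 30 fps, 1280x720 (values not from the article).
clip = VideoShape(frames=150, height=720, width=1280)
latent = compress(clip, t_factor=4, s_factor=8)

print("latent grid:", latent)
print("full attention pairs:", full_attention_pairs(latent))
print("factorized attention pairs:", factorized_attention_pairs(latent))
```

Under these assumed numbers, full attention compares far more token pairs than a factorized scheme, which is one reason a compact latent from the 3D VAE makes full spatiotemporal attention more practical.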