Tencent's Hunyuan team, Sun Yat-sen University, and the Hong Kong University of Science and Technology have jointly released a new image-to-video model, "Follow-Your-Pose-v2". The related results have been published on arXiv (DOI: 10.48550/arXiv.2406.03035).
According to reports, "Follow-Your-Pose-v2" needs only a picture of a person and an action video as input: it animates the person in the picture to follow the motion in the video, and the generated clip can be up to 10 seconds long.
Compared with its predecessor, "Follow-Your-Pose-v2" supports multi-person action generation while requiring less inference time.
In addition, the model generalizes well: it can generate high-quality videos regardless of the subject's age or clothing, how cluttered the background is, or how complex the movements in the action video are.
Separately, Tencent has released an acceleration library for its open-source text-to-image model, Hunyuan DiT, claiming it greatly improves inference efficiency and shortens image generation time by 75%.
Officials said the barrier to using the Hunyuan DiT model has also been greatly lowered: users can now access Tencent's Hunyuan text-to-image capabilities through the ComfyUI graphical interface.