Researchers from Mines Paris and the Technion (Israel Institute of Technology) have jointly introduced a video editing model called Slicedit. The model can modify the main objects in a video without changing the background. For example, it can turn a surfer into Iron Man, or a boy spinning a ball into NBA superstar Stephen Curry.
The Slicedit model combines a text-to-image diffusion model with preprocessing of spatiotemporal slices of the video. Some blurring and distortion may appear in the edited video, but for amateurs unfamiliar with professional video editing software, Slicedit offers a quick way to modify video content, much like a video version of Photoshop. This makes it well suited to making funny videos for short-video platforms such as Douyin and Kuaishou.
Slicedit tackles the challenges of video editing with three key techniques.
Spatial-Temporal Slicing: A spatiotemporal slice is a 2D plane extracted from the 3D volume of the video (two spatial axes plus time). It can be an ordinary video frame at a fixed point in time, or a plane that cuts across consecutive frames, spanning time along a specific spatial direction. This allows the model to handle dynamic elements in the video while keeping the background and other non-target regions stable and intact.
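To make the idea of a slice concrete, here is a minimal NumPy sketch. It is only an illustration, not the Slicedit code: the toy video shape and the helper names (`spatial_slice`, `yt_slice`, `xt_slice`) are assumptions made for this example.

```python
import numpy as np

# Hypothetical toy video: T frames of H x W pixels with C color channels.
T, H, W, C = 8, 4, 6, 3
video = np.arange(T * H * W * C, dtype=np.float32).reshape(T, H, W, C)

def spatial_slice(video, t):
    """Ordinary frame: fix the time t and keep the (y, x) plane."""
    return video[t]  # shape (H, W, C)

def yt_slice(video, x):
    """Spatiotemporal slice: fix the column x and keep the (y, t) plane."""
    # video[:, :, x] has shape (T, H, C); reorder it to (H, T, C).
    return video[:, :, x].transpose(1, 0, 2)

def xt_slice(video, y):
    """Spatiotemporal slice: fix the row y and keep the (t, x) plane."""
    return video[:, y]  # shape (T, W, C)

frame = spatial_slice(video, 0)
yt = yt_slice(video, 2)
xt = xt_slice(video, 1)
print(frame.shape, yt.shape, xt.shape)
```

Each slice is an ordinary 2D image, which is why an image model can in principle be applied to it; the same pixel of the video shows up in one frame and in one slice of each orientation.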
Extended Attention: Slicedit extends the standard attention mechanism so that it can handle sequences of frames. When processing the current frame, the model attends not only to the information in that frame but also to neighboring frames, which lets it capture the changes between video frames.
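A rough sketch of this idea in NumPy follows. It assumes a single attention head and a hypothetical `window` parameter controlling how many neighboring frames are visible; which frames Slicedit actually attends to is not stated here, so treat the details as illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def extended_attention(q, k, v, frame_idx, window=1):
    """q, k, v have shape (T, N, D): T frames, N tokens per frame, D features.

    Queries come from the current frame only; keys and values are gathered
    from the current frame and its neighbors within `window` frames.
    """
    T = q.shape[0]
    lo = max(0, frame_idx - window)
    hi = min(T, frame_idx + window + 1)
    keys = k[lo:hi].reshape(-1, k.shape[-1])  # ((hi - lo) * N, D)
    vals = v[lo:hi].reshape(-1, v.shape[-1])
    scores = q[frame_idx] @ keys.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ vals, weights

rng = np.random.default_rng(0)
T, N, D = 5, 4, 8
q = rng.standard_normal((T, N, D))
k = rng.standard_normal((T, N, D))
v = rng.standard_normal((T, N, D))
out, weights = extended_attention(q, k, v, frame_idx=2, window=1)
print(out.shape, weights.shape)
```

With `window=0` this reduces to ordinary per-frame self-attention, so the only change is the larger set of keys and values each query can look at.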
DDPM Inversion: Slicedit runs the denoising process in reverse. Starting from the target data, it finds a set of noise vectors that reconstruct the original data when fed through the DDPM generation process. In practice, the input video frames are mapped into noise space, and conditional denoising is then performed so that the result matches the user's editing instructions.
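The reconstruction property described above can be checked on a toy example. The sketch below is a simplified assumption, not the Slicedit implementation: `toy_eps_model` is a hypothetical stand-in for a trained noise-prediction network, the noise schedule is arbitrary, and text conditioning is left out entirely.

```python
import numpy as np

T = 10                                 # number of diffusion steps (toy value)
betas = np.linspace(1e-4, 0.2, T)      # arbitrary noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)
sigmas = np.sqrt(betas)                # one common choice of step noise scale

def toy_eps_model(x, t):
    """Hypothetical stand-in for a trained noise predictor."""
    return 0.1 * x * (t / T)

def posterior_mean(x_t, t):
    """Mean of the DDPM step from x_t to x_{t-1}, with t in 1..T."""
    i = t - 1
    eps = toy_eps_model(x_t, t)
    return (x_t - betas[i] / np.sqrt(1.0 - alpha_bars[i]) * eps) / np.sqrt(alphas[i])

def ddpm_invert(x0, rng):
    """Find x_T and noise vectors z_t that regenerate x0 exactly."""
    xs = [x0]
    for t in range(1, T + 1):
        i = t - 1
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(alpha_bars[i]) * x0 + np.sqrt(1.0 - alpha_bars[i]) * noise)
    # Solve x_{t-1} = mean(x_t) + sigma_t * z_t for z_t at every step.
    zs = {t: (xs[t - 1] - posterior_mean(xs[t], t)) / sigmas[t - 1]
          for t in range(T, 0, -1)}
    return xs[T], zs

def ddpm_generate(x_T, zs):
    """Run the generation process using the recovered noise vectors."""
    x = x_T
    for t in range(T, 0, -1):
        x = posterior_mean(x, t) + sigmas[t - 1] * zs[t]
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))       # stands in for a frame or a slice
x_T, zs = ddpm_invert(x0, rng)
x_rec = ddpm_generate(x_T, zs)
print(np.allclose(x_rec, x0))
```

In an editing setting, the generation pass would reuse the same noise vectors but condition the denoiser on a different text prompt, so that the structure of the input is kept while its content changes.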
The researchers said they plan to open-source the Slicedit model soon so that more developers can build their own video editors.
The development of this technology could have a significant impact on the field of video editing, making it easier and more accessible, as well as opening up more innovative possibilities for content creators.
Paper: https://arxiv.org/pdf/2405.12211