Meta launches V-JEPA model, using AI to efficiently fill in the masked parts of videos

Meta Chief AI Scientist Yann LeCun introduced the JEPA (Joint Embedding Predictive Architecture) model architecture in 2022. The following year, an image prediction model called "I-JEPA" was developed based on the JEPA architecture, and Meta has now launched a video prediction model called "V-JEPA".

According to reports, the JEPA architecture and the I-JEPA/V-JEPA models focus on "predictive ability": they are said to use abstraction to efficiently predict and fill in the obscured parts of images and videos in a "human-understandable" way.

IT Home noticed that the researchers trained the I-JEPA/V-JEPA models on a series of specially masked videos. The model is required to fill in the missing content using an "abstract method", so that it learns the scene while filling it in and can go on to predict future events or actions, thereby gaining a deeper understanding of the world.
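
To make the idea of filling in masked content "abstractly" more concrete, here is a minimal Python sketch. It is not Meta's implementation: the linear `encode` function, the mean-based `predict_masked` predictor, and the mean-absolute-error loss are all assumptions chosen for illustration. The only point it shows is that the prediction error is measured between embeddings of the masked patches, not between raw pixels.

```python
def encode(patch, weights):
    """Toy encoder (assumption): linear projection of a patch to an embedding."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


def predict_masked(visible_embeddings):
    """Toy predictor (assumption): guess a masked embedding as the mean of the visible ones."""
    n = len(visible_embeddings)
    dim = len(visible_embeddings[0])
    return [sum(e[i] for e in visible_embeddings) / n for i in range(dim)]


def latent_loss(patches, mask, weights):
    """Mean absolute error between the predicted and actual embeddings of masked patches.

    `mask[i]` is True when patch i is hidden from the predictor.
    At least one patch must be visible and at least one must be masked.
    """
    visible = [encode(p, weights) for p, m in zip(patches, mask) if not m]
    targets = [encode(p, weights) for p, m in zip(patches, mask) if m]
    guess = predict_masked(visible)
    total = sum(abs(g - t) for target in targets for g, t in zip(guess, target))
    return total / (len(targets) * len(guess))


# Hypothetical data: four two-value "patches", with the second and fourth hidden.
weights = [[0.5, 0.5], [1.0, -1.0]]
patches = [[1.0, 2.0], [1.5, 2.5], [1.0, 2.0], [4.0, 0.0]]
mask = [False, True, False, True]
loss = latent_loss(patches, mask, weights)
```

In a real system the encoder and predictor would be trained neural networks and the loss would drive their updates; the sketch only evaluates the loss once.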

▲ Image source: Meta official press release (the same below)

The researchers said this training method lets the model focus on the high-level concepts of a video rather than "getting bogged down in details that are not important for downstream tasks." They gave an example: "When humans watch a video containing trees, they don't particularly care about the movement of leaves." The researchers therefore claim that a model built on such abstract concepts is more efficient than competing models in the industry.

The researchers also mentioned that V-JEPA uses an approach called "frozen evaluation", meaning that "the core part of the model will not change after pre-training". Adapting the model to a new task therefore only requires adding a small specialized layer on top, which makes the model more versatile.
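
As a rough illustration of this idea, the Python sketch below keeps a stand-in "pretrained" encoder fixed and trains only a small head on top of it for a new task. Everything here is hypothetical: `FROZEN_WEIGHTS`, the logistic-regression head, and the toy data are assumptions for illustration and do not reflect V-JEPA's actual architecture or evaluation code.

```python
import math

# Stand-in for a pretrained backbone (assumption); it is never updated below.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_encode(x):
    """Map an input to features using the fixed backbone weights."""
    return [sum(w * v for w, v in zip(row, x)) for row in FROZEN_WEIGHTS]


def train_head(inputs, labels, epochs=200, lr=0.1):
    """Train only a small logistic-regression head on top of the frozen features."""
    feats = [frozen_encode(x) for x in inputs]  # computed once; backbone stays fixed
    head = [0.0] * len(feats[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(h * v for h, v in zip(head, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y                     # gradient of the log loss w.r.t. z
            head = [h - lr * err * v for h, v in zip(head, f)]
            bias -= lr * err
    return head, bias


def predict(x, head, bias):
    """Classify an input using the frozen backbone plus the trained head."""
    f = frozen_encode(x)
    return 1 if sum(h * v for h, v in zip(head, f)) + bias > 0 else 0


# Hypothetical downstream task with four labelled examples.
inputs = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]
head, bias = train_head(inputs, labels)
predictions = [predict(x, head, bias) for x in inputs]
```

Only `head` and `bias` change during training, so the same backbone could be reused for another task by training a different head.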
