March 28 - Early this morning, Alibaba's Tongyi Qianwen (Qwen) team announced the launch of its next-generation visual reasoning model, QVQ-Max.
According to the official introduction, QVQ-Max not only understands image and video content but can also analyze and reason about that information. Beyond analysis and reasoning, it can design illustrations, generate short video scripts, and create role-playing content based on user needs.
Core competencies: from observation to reasoning
QVQ-Max's capabilities can be summarized in three areas: careful observation, in-depth reasoning and flexible application. Here's how it performs in each of these areas.
- Careful Observation: Capturing Every Detail
- QVQ-Max is very good at parsing images. Whether it is given a complex diagram or a casual everyday photo, it can quickly recognize the key elements. For example, it can tell you what items are in a photo, what text or logos appear in it, and even point out small details that you might have missed.
- In-Depth Reasoning: Not Just "Seeing" but "Thinking"
- Recognizing what is in an image is not enough; QVQ-Max can further analyze the information and draw conclusions using background knowledge. For example, in a geometry problem, it can work out the answer from the figure accompanying the question; in a video, it can predict what might happen next based on the current scene.
- Flexible Application: From Answering Questions to Creating
- In addition to analyzing and reasoning, QVQ-Max can do some interesting things, such as helping you design illustrations, generating short video scripts, and creating role-playing content according to your needs. If you upload a draft, it may help you refine it into a complete work; if you upload an everyday photo, it can take on the persona of a sharp-tongued critic or a fortune teller.
QVQ-Max has a wide range of applications and can come in handy at school, at work, and in everyday life.
- Career Tools: At work, QVQ-Max can assist with tasks such as analyzing data, organizing information, and writing code.
- Learning Assistant: For students, QVQ-Max can help with difficult questions in subjects such as math and physics, especially those with diagrams. It also makes learning easier by explaining complex concepts in an intuitive way.
- Life's Little Helper: QVQ-Max can also provide practical advice in everyday life. For example, it can recommend what to wear based on photos of your closet, or walk you through cooking a new dish based on pictures of a recipe.
1AI notes that the model is now available on Qwen Chat, where users can try QVQ-Max's reasoning capabilities by uploading any image or video, asking a question, and clicking the "Thinking" button.
Alibaba said that this release is just one stage in the model's evolution, and that it will continue to optimize the model's performance and expand its functionality in the future.