According to a report from Tencent Technology today, the August list of the SuperCLUE-V benchmark for Chinese multimodal large models has been released, and the Tencent Hunyuan large model ranks first among domestic large models with 71.95 points.
Tencent Technology claims that the model can accurately identify image elements and generate natural language descriptions, understanding the overall picture while also capturing the details. This evaluation covered 12 highly representative multimodal understanding models from China and abroad. The Tencent Hunyuan model scored 71.95 points across basic multimodal capabilities and application capabilities.
A check of the list shows that the August edition includes 12 of the most representative multimodal understanding models from China and abroad. The Tencent Hunyuan large model ranks second overall, behind only GPT-4o. GPT-4o scored 74.36 points to lead the multimodal benchmark, and it scored above 70 points in both basic multimodal cognitive ability and application ability, giving it a measurable lead in both technology and application.
▲ Image source: "CLUE Chinese Language Comprehension Assessment Benchmark" official account (same for the images below)
SuperCLUE's assessment is that, in terms of basic capabilities, domestic large models still lag behind overseas models. The gap is most visible in fine-grained visual cognition tasks, where the best domestic and overseas models are 5 points apart, and multimodal deep cognition capabilities need further optimization and improvement.
This evaluation selected 4 overseas models and 8 representative domestic multimodal models. To further assess the respective progress of open-source and closed-source models, the participating models include 4 open-source models and 8 closed-source models.