Back in the game! Gemini-Pro's multimodal capabilities are on par with GPT-4V
A recent evaluation report on Gemini-Pro shows that it has made significant progress in multimodal tasks: its performance is comparable to GPT-4V and better in some respects. On MME, a benchmark dedicated to multimodal models, Gemini-Pro surpassed GPT-4V with an overall score of 1933.4, showing a broad advantage across perception and cognition. Across the 37 visual understanding tasks, Gemini-Pro stood out in text translation, color/landmark/person recognition, and OCR, demonstrating strong basic perception. …