In an interview, OpenAI Chief Executive Officer Sam Altman discussed GPT-4o and shared some information about GPT-5. GPT-4o is a large multimodal model that can reason across text, audio, and video. Altman said he has long wanted to control computers by voice, and that GPT-4o's ability to reason across modalities will deliver an unprecedented user experience. Compared with existing voice assistants such as Apple's Siri, GPT-4o is more autonomous and better at semantic understanding.
Image source note: the image was generated by AI and is licensed through Midjourney.
Altman said that while trying GPT-4o, one use case that surprised him was the ability to complete, on a single platform, many tasks that would otherwise require frequent switching between applications and browser tabs, such as real-time translation, voice interaction, and video analysis. This is a significant change for developers and other professionals whose productivity depends on staying focused.
GPT-4o also has low latency, averaging only about 200 to 300 milliseconds. This makes it suitable for applications such as real-time translation, medical image interpretation, and medical record analysis.
Altman said that medicine will be one of the fields that benefits most from GPT-4o.
Regarding GPT-5, Altman revealed that it will be a very special product and may be given a new name. He said GPT-5 may resemble a "virtual brain" that can help users handle a wide variety of tasks. Compared with previous products in the GPT series, GPT-5 will be a far more ambitious undertaking.
GPT-4o and the upcoming GPT-5 demonstrate OpenAI's innovation and breakthroughs in artificial intelligence. These large multimodal models are expected to offer a smarter, more efficient experience and to provide better services and assistance to people in many fields.