According to The Information, OpenAI recently demonstrated to some customers a new multimodal AI model capable of voice conversations and object recognition. Sources say this may be one of the announcements OpenAI plans to make on May 13.
According to the report, the new model can process image and audio information faster and more accurately than OpenAI's existing standalone image recognition and text-to-speech models. For example, it could help customer service agents "better understand a caller's tone of voice and determine if they are using a sarcastic tone." Theoretically, the model could also assist students in learning math or translating real-world sign language.
However, the report also noted that while the model outperforms GPT-4 Turbo on certain kinds of questions, it can still confidently give wrong answers.
Developer Ananay Arora posted a screenshot of call-related code, suggesting that OpenAI may be giving ChatGPT the ability to make phone calls. Arora also found evidence that OpenAI is configuring servers for real-time audio and video communication.
OpenAI CEO Sam Altman has categorically denied that the upcoming release is GPT-5, the large language model said to be significantly better than GPT-4. The Information says that GPT-5 could be officially unveiled before the end of the year. Altman also said that OpenAI will not be releasing a new AI search engine.
If The Information's report is accurate, OpenAI's new release could steal some attention from the upcoming Google I/O developer conference. Google is also known to be testing technology that uses AI to make phone calls. In addition, Google is rumored to be working on a project codenamed "Pixie," a multimodal Google Assistant replacement that recognizes objects through the device's camera and gives users information such as how to get to a place where the object can be purchased or how to use it.