On the 30th local time, OpenAI announced that, effective immediately, it is opening the GPT-4o voice mode (Alpha version) to a subset of ChatGPT Plus users, with a rollout to all ChatGPT Plus subscribers planned for this fall.
In May of this year, OpenAI CTO Mira Murati said of the model:
"With GPT-4o, we trained a new unified model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network.
Since GPT-4o is our first model to combine all of these modalities, we are still in the early stages of exploring the model's capabilities and its limitations."
OpenAI had originally planned to invite a small group of ChatGPT Plus users to test the GPT-4o voice mode at the end of June this year, but in June it announced a delay, saying it needed more time to polish the model and to improve its ability to detect and reject certain content.
According to previously disclosed information, the voice mode based on GPT-3.5 has an average voice response latency of 2.8 seconds, while GPT-4's is 5.4 seconds, which falls short for voice conversation; the upcoming GPT-4o is expected to cut this latency dramatically, enabling nearly seamless dialogue.
The GPT-4o voice mode responds quickly and sounds much like a real person. OpenAI further claims that it can sense emotional tones in speech, including sadness, excitement, or singing.
OpenAI spokesperson Lindsay McCallum said that ChatGPT cannot imitate other people's voices, including those of private individuals and public figures, and will block output that differs from its preset voices.