Is OpenAI really picking a fight with Google? Haha! They held a new product launch event before Google did.
The livestream was in the early hours of this morning, and by the time I woke up I found I could already try the new model.
Judging from the recording, there was no rumored GPT-5 and no so-called search feature.
However, this update is still outstanding.
In fact, Sam Altman had already said on X that it was neither GPT-5 nor search, but that it felt like magic to him.
After watching the entire event, I felt that the future has already arrived.
Obviously, ChatGPT is no longer just a text model.
It can perceive the sounds and images of the outside world, even the emotion in your voice when you speak, and respond with matching emotion.
It easily brings to mind the sci-fi movie "Her".
Apparently, Sam is heading towards this goal too. He posted a tweet 7 hours ago with only one word: "her".
"Her" is a sci-fi romance set in the near future about a man who falls in love with an artificial intelligence. The male protagonist's virtual lover in the movie (the artificial intelligence system OS1) is voiced by the sexy goddess Scarlett Johansson.
Black Widow is the dream lover of many men. With an AI like this one, why would you need a girlfriend?
I've drifted a bit off topic, so let's get back to GPT-4o. "Her" is clearly still a goal, not a reality.
But it is definitely worth talking about. Maybe today we are already witnessing history.
I haven’t carefully compared the promotional video with actual usage, but I saw someone on X (@minchoi) summarize and demonstrate ten usage scenarios of GPT-4o, which is quite interesting, so I’d like to share it with you.
We can also better understand this model through some practical usage scenarios.
1. Real-time visual assistant
This is probably the most amazing demonstration. You can talk with GPT-4o directly about what is in front of you, and it understands what you are seeing in real time.
The demonstration with the ducks is also very interesting. Google has done this before, but everyone later found out that Google relied on editing to achieve the real-time interaction, while GPT-4o seems able to interact directly in real time through the camera.
I haven't tried it myself yet, but this feature is also shown in the official demonstration video.
2. Assisted learning
The video demonstrates how GPT-4o can directly read questions shown on an iPad and interact with parents and children in real time by voice.
This is simply good news for struggling students and their parents.
GPT should make a good teacher. At least it won't lose its temper the way I do. Haha!
Anyone who has helped their kids with their homework knows how frustrating this process is. Soon, GPT may be able to take over this task. For me, this is a necessity.
3. Real-time translation
The real-time translation feature lets GPT act as your personal interpreter, so that two people who speak different languages can communicate fluently.
As long as you agree on the rules with GPT, you can speak Chinese directly and it will be translated into Japanese immediately. If the other party speaks Japanese, it can be translated into Chinese immediately.
Just think about it, this is a basic need for so many people.
I have studied English for decades and I am still hopeless at it. In the end, I still have to rely on technology.
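As a rough illustration of what "agreeing on the rules" could look like, here is a small Python sketch that builds such an instruction as plain text. The function name `translation_rules` and the exact wording are my own assumptions, not OpenAI's official prompt; the idea is simply that you state the two languages and the direction rules once, up front.

```python
def translation_rules(lang_a: str, lang_b: str) -> str:
    """Build a plain-text instruction for two-way interpreting.

    Hypothetical helper for illustration; the wording is an assumption,
    not an official prompt.
    """
    return (
        f"You are an interpreter. When you hear {lang_a}, repeat it in {lang_b}. "
        f"When you hear {lang_b}, repeat it in {lang_a}. "
        "Translate only; do not add commentary."
    )


if __name__ == "__main__":
    # Example: the Chinese/Japanese pairing mentioned above.
    print(translation_rules("Chinese", "Japanese"))
```

You would say or paste the resulting text at the start of the conversation and then simply start talking.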
4. Meeting Assistant
I don't like meetings, so I'll skip the introduction and let you imagine it yourself. For example, it can record the meeting for you, take minutes, and write up a summary.
5. Interrupt and change emotions in real time
Anyone who has used the old GPT-4 voice calls will have found them a little maddening. GPT-4's voice responses were very, very slow.
It first had to convert your voice into text, then pass the text to the back end for processing, and once that was done it had to convert the text reply back into voice, so the whole process was painful.
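To make the old pipeline concrete, here is a minimal Python sketch of that three-step relay. Everything in it is an assumption for illustration: the names `speech_to_text`, `text_model`, `text_to_speech` and `run_cascade` are made up, the stages are stubs rather than real OpenAI calls, and the latency numbers are placeholders, not measurements. It only shows why the delays add up when each step has to wait for the one before it.

```python
from typing import Callable, List, Tuple

# Each stage takes a payload and returns (new_payload, seconds_spent).
# All stages are hypothetical stand-ins, and the latency figures are
# made-up placeholders for illustration only.
Stage = Callable[[str], Tuple[str, float]]


def speech_to_text(audio: str) -> Tuple[str, float]:
    # Pretend transcription: emphasis and tone are flattened away here.
    return audio.lower().strip("!"), 0.5


def text_model(prompt: str) -> Tuple[str, float]:
    # Pretend text-only model that only ever sees the transcript.
    return f"reply to: {prompt}", 2.0


def text_to_speech(text: str) -> Tuple[str, float]:
    # Pretend speech synthesis of the text reply.
    return f"<audio>{text}</audio>", 0.5


def run_cascade(audio: str, stages: List[Stage]) -> Tuple[str, float]:
    """Run the stages one after another and sum their latencies."""
    payload, total = audio, 0.0
    for stage in stages:
        payload, spent = stage(payload)
        total += spent
    return payload, total


OLD_VOICE_MODE: List[Stage] = [speech_to_text, text_model, text_to_speech]

if __name__ == "__main__":
    reply, latency = run_cascade("HELLO!", OLD_VOICE_MODE)
    print(reply, latency)
```

Because the stages run strictly in sequence, the total wait in this sketch is the sum of all three, and whatever the transcription step drops never reaches the model at all.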
Things are much better now: with the new GPT-4o you can interrupt it at any time.
The response is also very fast, and it seems it can reply within a few hundred milliseconds. This is a huge improvement in practicality.
In addition to being able to speak and interrupt at any time, it also has the ability to understand and express emotions through sound.
What this means is that she can feel your joy, anger, sorrow and happiness from your voice, and you can also let her talk to you with different emotions.
For example: "Hi GPT, act cute for me", or "Please say 'come on, baby' in an extremely excited tone". Hahahaha~~
6. Add text to the image
This function does not need much explanation. The picture demonstrates it very intuitively. No need to edit the picture, just generate it directly. You can see that after adding the text to the picture, it blends in seamlessly with the picture.
7. Multi-person meeting records
You can ask questions directly about a meeting recording, such as "How many people were in the recording and what was said?".
The answer was "There are four people. It sounds like a project management meeting. Mark is introducing himself..."
It then lays out who said what in text form.
This feature isn't amazing, but it's useful.
8. 3D Object Synthesis
Now we can not only generate pictures, but also 3D animations?
7. Poster making
Input two people's photos and make a poster of a blockbuster movie. Good friends stand together!
8. Create stylized photos
Upload a photo, add a description, and you can generate a stylized photo.
This function is not new, and plenty of traditional software has it too. The only difference is that now it can all be done through conversation in a single chat window.
This mole is well preserved!
I feel GPT could devour every app.
9. Advanced photo editing with precise positioning
Provide an OpenAI logo and a coaster with no branding.
Then, through a description, have the OpenAI logo and text engraved on the coaster.
Please note that it does not feel like it is pasted on, but rather like it is engraved.
The blend is very natural.
10. Generate text in special fonts
A description such as "The letters KLM NOP QRS are displayed in three lines, like a font shown in a copybook. This is a super futuristic font, a symbol of the AI revolution." is enough to generate a special font.
Ten scenes have been discussed. The first few are more grand, and the latter ones are more detailed.
The biggest updates this time are in vision and sound, and in these two areas OpenAI is probably "far ahead"!
From the perspective of general large models, OpenAI's overall strength is indeed strong, but the more annoying thing is that it is not open source and difficult to copy!
Finally, please note! The new model's API gives you more without charging more; in fact, the price has been reduced. API players can laugh 😊!
In addition, ChatGPT desktop version is coming! PC users rejoice 😁.
In addition, most of OpenAI's new models and a large number of previously paid features can now be used for free.
These include:
GPT-4 and the app store, vision, web browsing, memory, and expanded data analysis.