"You'll never go to a search site again."
This is what Bill Gates said at the AI Forward 2023 event in San Francisco.
Seventeen months later, AI search is ushering in its 2.0 era!
Just last week, Kimi launched its Explore Edition, an AI search that supports deep reasoning. Even with each person limited to five uses a day, Kimi's servers were overwhelmed.
And once again, vendors are piling on updates as if they had coordinated in advance.
The AI search engine Perplexity has also gone live with a Pro Search powered by OpenAI's o1-mini!
Outrageously, in OpenAI's own chat interface, o1-preview doesn't yet support web search.
That made me curious which other familiar faces have "quietly" added deep search.
One search later:
- Perplexity - "Pro Search"
- Kimi - "Explore Edition"
- Doubao - "In-depth Search"
- Zhipu - "AI Search" (the reasoning version by default)
- 360AI Slow Search - "Slow Thinking Model"
It is reasonable to expect that the 2.0 generation of AI search products will replay the "hundred-model war" we saw among AI models, and that reasoning ability will become standard.
The hard part is that AI search, unlike AI models, has no leaderboards where you can simply pick the top entry and use it.
So today I'll measure how well these 2.0 products work along three dimensions: ease of use, comprehension, and accuracy.
I. Ease of use
Let's start with the conclusion: Doubao = Zhipu = 360AI > Perplexity > Kimi
I break ease of use down into the three main pain points of using AI search:
- Whether there is a limit to the number of times it can be used
- Number of pages supported in a single search
- Forms of sharing search results
Perplexity offers five free Pro Search uses every four hours. Kimi Explore Edition is currently limited to five uses per person per day. Doubao's In-depth Search, Zhipu's AI Search, and 360AI Slow Search have no explicit usage limit.
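To put the two capped quotas on the same scale, here is a rough sketch of the arithmetic. It assumes the quota resets in fixed windows and that every window is fully used, which is my simplification and not something either vendor has stated; the function name is made up for illustration.

```python
def max_free_uses_per_day(uses_per_window: int, window_hours: int) -> int:
    """Upper bound on free uses in 24 hours.

    Assumption: the quota resets in fixed, back-to-back windows
    and every window is used up completely.
    """
    windows_per_day = 24 // window_hours
    return uses_per_window * windows_per_day


# Quotas quoted above: Perplexity 5 uses / 4 hours, Kimi 5 uses / day.
perplexity_daily = max_free_uses_per_day(5, 4)
kimi_daily = max_free_uses_per_day(5, 24)

print(f"Perplexity Pro Search: up to {perplexity_daily} free uses per day")
print(f"Kimi Explore Edition: up to {kimi_daily} free uses per day")
```

Under those assumptions Perplexity tops out at 30 free uses a day against Kimi's 5, which is why Perplexity sits ahead of Kimi on this point.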
Since Kimi Explore Edition puts particular emphasis on being able to read more than 500 pages in a single search, I also compared how many pages each product visited per search by asking the same difficult question three times.
After all, the number of pages indirectly determines how wide the AI model's "eyes" are open.
Next is the form of sharing. When we use a search engine, the most common thing we do is copy the link and paste it wherever it's needed.
Kimi, Zhipu, Doubao, and 360AI all support copying the full text and generating images and links. Doubao additionally lets you set whether other users can access the files that appear in the conversation log, and Perplexity can restrict a link so that only you can open it.
II. Comprehension and accuracy
Since the main selling point of all of them is AI search with reasoning capabilities,
comprehension corresponds to the first step:
after the question is sent to the AI, can it understand it correctly and search for relevant web pages?
Accuracy corresponds to the last step:
does the model integrate the content of the pages accurately, without making things up?
So I prepared two questions, at two levels of difficulty:
- "Elon Musk's SpaceX: Timeline of all 5 Starship flight tests, including dates and reasons for failures or notable successes."
- "OpenAI is reportedly involved in a trademark dispute with Guy Ravine, who owns the 'Open AI' (with a space) trademark. Tell me the timeline of this matter."
One spans a long timeline, and the other hides a name trap.
A quiet aside: I skipped the OU test because I wanted to simulate everyday usage as closely as possible. Unless I'm deliberately trying to stump GPT, I rarely have occasion to make it solve test problems.
(Due to image size limitations, the screenshots below only show the form of the interaction; I'll put links to the original images in the comments section.)
1. Perplexity
Just to clarify, the questions are in English because Perplexity has also added a setting where it decides whether the current question is worth o1's time. In other words, if your question isn't tricky enough, o1 won't be brought in.
Comment: Perplexity understood both questions accurately, searching out 12 web pages and selecting five of them as sources, all without error. It also added more detailed supplementary information without being asked. The one miss was the latest test, Starship Flight 5, which it judged as not having taken place yet.
2. Kimi
Comment: Another shout-out to Kimi's auto-expand feature, because when we use AI search we often need to reread the sources to double-check the accuracy of the information. Judging from the results, Kimi's sources are half English and half Chinese, it understood both questions accurately, and the final answers had no problems.
3. Doubao
Comment: Doubao struggled a bit. The second question errored out halfway every time I asked it. For the first question, the search sources were all Chinese sites, so the source pages are probably restricted, which makes the information sources less broad. Still, the final answer was correct and not much affected.
4. Zhipu
Comment: The same goes for Zhipu: its search sources are all Chinese sites. On the first question, its answer about the fifth flight made the same type of mistake as Perplexity, assuming the test hadn't happened yet. It did manage to answer the second question, but unfortunately the answer was still partly wrong, on the point about Guy Ravine filing a trademark application that predates the founding of OpenAI.
5. 360AI
Comment: One strong advantage of 360AI search is that it visualizes the thought process, making it easy to follow what it did and find the problem. Here we can see that in the first step it only searched for the fifth flight test, but during its later reflection step it retrieved the results of the previous four tests. Sadly, the fifth test was still not answered correctly, and in the second question the events that took place in 2024 were left out.
Individual accounts are limited in how many times they can be used, so the comparison above took three days, with each question repeated five times.
With that, the conclusion is fresh out of the oven:
Kimi > Perplexity = Zhipu = 360AI > Doubao
By the way, does anyone still remember Bing AI, from way back on the shores of Daming Lake?
You can also pick which AI search to try first by weighing these three metrics~