"Retarded Bar" Becomes the Most Popular Chinese AI Training Database, Speech Feed Big Data on CAS Research Paper

The "Retarded Bar" (Ruozhiba) is a sub-forum of Baidu's online forum. There, users create content built on puns, double meanings, causal inversions, and homophones that is either brilliant or a genuine brain-teaser. Some of the content contains logical traps that challenge even humans.

In April of this year, a team of researchers from the Chinese Academy of Sciences (CAS) reported in a study entitled "COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning" that a large model fine-tuned on "Retarded Bar post titles + GPT-4 responses" scored better in evaluation than models fine-tuned on the other supervised fine-tuning instruction datasets they had collected. The latter came from social media platforms including Xiaohongshu, Douban, and Zhihu.

Illustration: screenshot of the paper

"Humor is the watershed that separates humans from machines."

Nobody expected the paper to go viral. The owner of the bar, Gongsun 闬, told a Vertical News reporter: "By last December there was already a lot of 'AI vs. Retarded Bar' content on the Internet, but we really didn't expect the Chinese Academy of Sciences to use it to train AI."

On video sites, netizens put questions from the Retarded Bar to AI systems to test their comprehension and logical reasoning. "All of these videos get a lot of traffic, while our own video account gets little attention," Gongsun 闬 said with a smile. They are not in it for the traffic anyway: "The important thing is that we have fun ourselves."

"How many half hours is an hour and a half?" "Sashimi is dead fish fillet" "Waiting for a red light is waiting for a green light" "Caffeine comes from coffee berries" "Putting out a fire is putting out a fire ""My newest picture is actually my oldest picture"" ......

At first glance, the creations of Retarded Bar members are simply humorous and witty. On closer reading, though, they deconstruct and reconstruct the real world, and in doing so contribute to human thinking about logic, humor, and philosophy. In that sense, their creators can be called joke writers, poets, and philosophers.

Source: screenshot of the Retarded Bar

The Retarded Bar was established in 2004. Five years later, 14-year-old Gongsun 闬 began posting there and interacting with other members. He never thought he would one day become the bar owner and bring the bar to so many people's attention. "It was more like a chat room, with a relaxed community atmosphere where people shared their whims and fancies."

"Humor is an important watershed between humans and machines." From the initial relaxed and lively community atmosphere, to today's big data chat library, Gongsun Zu hopes to play happily to explore the extent to which AI can understand human humor, "Now the AI has no human flavor, too serious. When I send a terrier, AI will only explain it in a single line, instantly losing the interest in communication."

"Big models get smarter, with my contribution."

Humor is a scarce and precious human ability, which may help explain why stand-up comedy has become such a popular form of comedy.

Retarded Bar member Hu Luo Bei graduated from Tianjin Polytechnic University with a degree in mathematics. "Guarding the best homophone puns" is the tagline of his other, better-known identity: stand-up comedian. Last month he held a solo stand-up show, which was recommended by the well-known stand-up comedian Li Xueqin.

Asked why he goes by Hu Luo Bei, a near-homophone of the Chinese word for carrot, he gave a very "mathematical" answer: "Because a search for 'carrot' turns up food, while Hu Luo Bei is unique."

In 2019, Hu Luo Bei came across the Retarded Bar's selected posts. "At the time I was particularly struck by the line 'Sashimi is dead fish fillet.' I thought I could write like that too, and that this would be a good place to post what I wrote." From then on, Hu Luo Bei gradually began posting his own creations in the community.

Researchers call the complexity of humor the "final frontier" of artificial intelligence. "Whatever cause you plant bears its fruit; plant caffeine and you get coffee fruit" is one of Hu Luo Bei's creations. He admitted that he never expected Retarded Bar content to be fed to models as a big-data corpus. "AI seems to have nothing to do with ordinary people, but in fact our everyday activity is, in a way, feeding data to the AI of the future."

Photo credit: Bund Conference

On September 7, Hu Luo Bei will share "The 'Inside' Story of My Speech on the Bund" at the Innovators Stage of the Bund Conference. Staff told reporters that the Bund Conference launched the Innovators Stage for the first time this year, hoping to bring in more interesting and diverse innovators, including ordinary people with an interest in science and technology, and to give them a chance to present.

"Mountains are waves of extremely slow geological age" "Trash bags in the air are filled with wind that no one wants" ...... Retarded Bar member Iihi introduced the bar's creations to Vertical News, who also likes to Using the art of rhetoric to create, "Poems need to be created out of the fixed minds of regular people, yet they need to have some relevance, and they need to find an intention that fits."

It is easy to see that literary language expressing complex human emotions relies heavily on rhetorical devices. In a sense, rhetoric breaks the fixed logic of language, which makes such expressions difficult for a one-dimensional AI to handle, let alone use in interaction with humans.

Image source: Internet

The seemingly nonsensical content of the Retarded Bar, once filtered and collected by researchers, becomes challenging and realistic Chinese interaction data that is very valuable for training and evaluating a large language model's ability to understand and follow Chinese instructions. In layman's terms, the large model makes fewer errors in its answers when interacting with users, that is, it less often outputs something that contradicts facts or common sense.

Iihi said that although he is an ordinary person, he hopes to do his part to help AI better understand human beings. He gave the reporter an example: if a mother learns that the weather has turned cold in her child's city, she will ask whether her child is dressed warmly. But does she just want to know whether her child is dressed warmly?

"No, she misses her children." Iihi said, "If one day AI can read beyond our words, I believe it will be able to better serve humanity." (Oriental - Vertical News Chen Lina Ding Yihan)
