According to TechCrunch, as AI video generation technology continues to evolve, an interesting phenomenon has caught on in the industry: whenever a company releases a new AI video generator, it seems the first thing someone does with it is make a video of Will Smith eating spaghetti. Not only has this become an internet sensation, it has also turned into an unofficial benchmark of how well a new AI video generator performs, a test of whether it can realistically portray Smith wolfing down his noodles. Smith himself joined the frenzy by posting a parody video on Instagram last February.
"Will Smith eats spaghetti" is just one of many bizarre unofficial benchmarks for AI in 2024. Earlier this year, a 16-year-old developer created an app that lets an AI control Minecraft and tests its building abilities, while a British programmer built a platform where AIs play games such as Pictionary and Connect 4 against each other.
There is no shortage of more academic performance tests in the AI space, so why have these quirky ones gone viral instead? One reason is that many industry-standard AI benchmarks are too obscure for the average person to understand. Companies often tout their models' ability to solve Olympiad math problems or PhD-level puzzles, but most people use chatbots just to chat or to answer emails.
Even the industry's most widely used measurements are not necessarily more reliable or informative. Chatbot Arena, for example, a public benchmarking platform closely watched by AI enthusiasts and developers, lets any web user rate how well an AI performs a specific task, such as building a web application or generating an image. But the users who cast these ratings are often unrepresentative, coming mostly from the AI and tech industries, and their votes tend to reflect personal, hard-to-pin-down preferences.
Peculiar AI benchmarks like Connect 4, Minecraft, and "Will Smith eats spaghetti" are clearly not rigorous empirical studies, nor are they broadly generalizable: even if an AI can perfectly generate a video of Will Smith eating spaghetti, that doesn't mean it can generate, say, a convincing image of a burger.
These alternative AI benchmarks probably won't go away anytime soon; after all, they're not only entertaining but also easy to understand. Which novelty benchmark will go viral in 2025?