Apple researchers question AI's reasoning ability: simple math questions can be answered incorrectly with minor changes

In recent years, artificial intelligence (AI) has made significant progress in many areas, with large language models (LLMs) capable of generating human-level text and even exceeding human performance on some tasks. However, researchers have called LLMs' reasoning ability into question: they found that these models make mistakes on simple mathematical problems when just a few minor changes are introduced, suggesting that they may not be capable of true logical reasoning.


On Thursday, a group of Apple researchers published a paper titled "Understanding the Limitations of Mathematical Reasoning in Large Language Models," revealing that LLMs are susceptible to interference when solving mathematical problems. The researchers tested the models' reasoning ability by making small changes to math problems, such as adding irrelevant information. It turns out that the performance of these models drops dramatically in the face of such changes.

For example, the researchers gave the models a simple math problem: "Oliver picked 44 kiwis on Friday and 58 kiwis on Saturday. On Sunday, he picked twice as many kiwis as he did on Friday. How many kiwis did Oliver pick in total?" The LLMs were able to calculate the answer correctly. However, when the researchers added an irrelevant detail, "On Sunday, he picked twice as many kiwis as he did on Friday, and five of them were smaller than average," the answers became incorrect. OpenAI's o1-mini, for instance, answered, "... Sunday, where 5 kiwis are smaller than average. We need to subtract them from the Sunday total: 88 (Sunday kiwis) - 5 (smaller kiwis) = 83 kiwis."
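To make the arithmetic concrete: the correct total is 44 + 58 + 2 × 44 = 190 kiwis, and the size of five of Sunday's kiwis has no bearing on the count. The snippet below is a minimal sketch (not code from the paper) contrasting the correct calculation with the erroneous subtraction described above.

```python
friday = 44
saturday = 58
sunday = 2 * friday  # "twice as many as on Friday" -> 88

# Correct reasoning: the smaller kiwis are still kiwis, so nothing is subtracted.
correct_total = friday + saturday + sunday            # 44 + 58 + 88 = 190

# The flawed answer subtracts the 5 smaller kiwis from Sunday's count.
model_total = friday + saturday + (sunday - 5)        # 44 + 58 + 83 = 185

print(correct_total)  # 190
print(model_total)    # 185
```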

The above is just one simple example. The researchers modified hundreds of questions in this way, and almost all of the modifications led to a significant drop in the models' success rate.
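As an illustration of how this kind of perturbation test could be automated, the sketch below generates a clean and a perturbed version of a question from a template and compares the model's answers. This is a hypothetical outline, not the paper's benchmark code; `ask_model` is a stand-in for whatever LLM API is under evaluation.

```python
def ask_model(question: str) -> str:
    """Placeholder for a call to the LLM being evaluated."""
    raise NotImplementedError

CLEAN = ("Oliver picked {fri} kiwis on Friday and {sat} kiwis on Saturday. "
         "On Sunday, he picked twice as many kiwis as he did on Friday. "
         "How many kiwis did Oliver pick in total?")

# An irrelevant clause that should not change the answer.
IRRELEVANT = " Five of the kiwis picked on Sunday were smaller than average."

def run_pair(fri: int, sat: int) -> tuple[str, str]:
    clean_q = CLEAN.format(fri=fri, sat=sat)
    perturbed_q = clean_q + IRRELEVANT
    return ask_model(clean_q), ask_model(perturbed_q)

# The expected answer is fri + sat + 2 * fri in both cases; a model that truly
# reasons should return the same number for the clean and perturbed prompts.
```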

According to the researchers, this phenomenon suggests that LLMs do not really understand math problems, but instead make predictions based on patterns in their training data. When genuine reasoning is required, such as deciding whether the smaller kiwis should still be counted, they produce strange and implausible results.

This finding has important implications for the development of AI. Although LLMs perform well in many areas, their reasoning ability still has clear limitations. Going forward, researchers will need to explore how to improve LLMs' reasoning so that the models can better understand and solve complex problems.
