
Business

Apple researchers question AI’s reasoning ability in mathematics

October 12, 2024 10:57 AM

New Delhi, Oct 12: A team of Apple researchers has questioned the formal reasoning capabilities of large language models (LLMs), particularly in mathematics.

They found that LLMs exhibit noticeable variance when responding to different instantiations of the same question.
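The idea of "different instantiations of the same question" can be sketched with a small script, in the spirit of the symbolic templates the paper's title refers to: the same grade-school word problem is regenerated with different names and numbers, so the correct answer changes but the reasoning required does not. The template and values below are illustrative assumptions, not taken from the paper.

```python
import random

# Hypothetical question template: names and numbers vary between
# instantiations, while the underlying arithmetic stays identical.
TEMPLATE = ("{name} has {x} apples and buys {y} more bags with "
            "{z} apples each. How many apples does {name} have now?")

def instantiate(seed):
    """Produce one variant of the question plus its ground-truth answer."""
    rng = random.Random(seed)
    x, y, z = rng.randint(2, 20), rng.randint(2, 5), rng.randint(2, 10)
    name = rng.choice(["Asha", "Ravi", "Meera"])
    question = TEMPLATE.format(name=name, x=x, y=y, z=z)
    return question, x + y * z

# A model that truly reasons should score the same across all variants;
# the researchers report noticeable variance instead.
variants = [instantiate(s) for s in range(5)]
```

Comparing a model's accuracy across such variants, rather than on one fixed question set, is what exposes the variance the researchers describe.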

The literature suggests that the reasoning process in LLMs is probabilistic pattern-matching rather than formal reasoning.

Although LLMs can match more abstract reasoning patterns, they fall short of true logical reasoning. Small changes in input tokens can drastically alter model outputs, indicating a strong token bias and suggesting that these models are highly sensitive and fragile.

“Additionally, in tasks requiring the correct selection of multiple tokens, the probability of arriving at an accurate answer decreases exponentially with the number of tokens or steps involved, underscoring their inherent unreliability in complex reasoning scenarios,” the Apple researchers wrote in their paper, titled “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models.”
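The exponential claim can be made concrete with a back-of-the-envelope model: if each of n required tokens is selected correctly with independent probability p, the chance of a fully correct answer is p**n. The independence assumption and the 95% figure below are illustrative, not from the paper.

```python
def chance_all_correct(p, n):
    """Probability of getting all n tokens right, assuming each is an
    independent choice with per-token success probability p."""
    return p ** n

# Even a 95%-reliable per-token choice degrades quickly over many steps.
for n in (1, 10, 50):
    print(n, round(chance_all_correct(0.95, n), 3))
# 1 step: 0.95, 10 steps: 0.599, 50 steps: 0.077
```

This is why long multi-step solutions are far more fragile than any single step suggests: reliability compounds multiplicatively.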

The ‘GSM8K’ benchmark is widely used to assess the mathematical reasoning of models on grade-school level questions.

While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities have genuinely advanced, raising questions about the reliability of the reported metrics.
