
Researchers question AI’s ‘reasoning’ ability as models stumble on math problems with trivial changes

October 12, 2024

Written by Devin Coldewey


How do machine learning models do what they do? And are they really “thinking” or “reasoning” the way we understand those things? This is a philosophical question as much as a practical one, but a new paper making the rounds Friday suggests that the answer is, at least for now, a pretty clear “no.”

A group of AI research scientists at Apple released their paper, “Understanding the limitations of mathematical reasoning in large language models,” to general commentary Thursday. While the deeper concepts of symbolic learning and pattern reproduction are a bit in the weeds, the basic concept of their research is very easy to grasp.

Let’s say I asked you to solve a simple math problem like this one:

Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday. How many kiwis does Oliver have?

Obviously, the answer is 44 + 58 + (44 * 2) = 190. Though large language models are actually spotty on arithmetic, they can pretty reliably solve something like this. But what if I threw in a little random extra info, like this:

Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?

It’s the same math problem, right? And of course even a grade-schooler would know that even a small kiwi is still a kiwi. But as it turns out, this extra data point confuses even state-of-the-art LLMs. Here’s GPT-o1-mini’s take:

… on Sunday, 5 of these kiwis were smaller than average. We need to subtract them from the Sunday total: 88 (Sunday’s kiwis) – 5 (smaller kiwis) = 83 kiwis
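
For concreteness, here is the arithmetic both ways; a quick Python check makes it plain that size has no bearing on the count, so the subtraction has no basis in the problem:

    # The correct reading: all kiwis count, small or not.
    friday = 44
    saturday = 58
    sunday = friday * 2                        # 88
    print(friday + saturday + sunday)          # 190

    # The model's reading: the five smaller kiwis get subtracted.
    print(friday + saturday + (sunday - 5))    # 185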

This is just one simple example out of the hundreds of questions that the researchers lightly modified; nearly all of the modifications led to enormous drops in success rates for the models attempting them.

Image Credits: Mirzadeh et al.
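
To give a rough feel for the methodology, here is an illustrative sketch in Python. It is not the researchers' actual code or benchmark items; query_model and extract_answer are hypothetical placeholders for an LLM API call and a numeric-answer parser.

    # Illustrative sketch of the perturbation test (not the paper's code).
    # query_model() and extract_answer() are hypothetical placeholders.

    base = ("Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on "
            "Saturday. On Sunday, he picks double the number of kiwis he did "
            "on Friday{clause}. How many kiwis does Oliver have?")
    expected = 44 + 58 + 44 * 2  # 190, with or without the extra clause

    variants = {
        "original": base.format(clause=""),
        "perturbed": base.format(
            clause=", but five of them were a bit smaller than average"),
    }

    for name, question in variants.items():
        answers = [extract_answer(query_model(question)) for _ in range(20)]
        accuracy = sum(a == expected for a in answers) / len(answers)
        print(f"{name}: {accuracy:.0%} correct")

The paper's point lives in the gap between those two accuracy figures: the perturbed variant should be exactly as easy, yet measured success rates fall sharply.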

Now, why should this be? Why would a model that understands the problem be thrown off so easily by a random, irrelevant detail? The researchers propose that this reliable mode of failure means the models don’t really understand the problem at all. Their training data does allow them to respond with the correct answer in some situations, but as soon as the slightest actual “reasoning” is required, such as whether to count small kiwis, they start producing weird, unintuitive results.

As the researchers put it in their paper:

[W]e investigate the fragility of mathematical reasoning in these models and demonstrate that their performance significantly deteriorates as the number of clauses in a question increases. We hypothesize that this decline is due to the fact that current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data.

This observation is consistent with the other qualities often attributed to LLMs due to their facility with language. When, statistically, the phrase “I love you” is followed by “I love you, too,” the LLM can easily repeat that — but it doesn’t mean it loves you. And although it can follow complex chains of reasoning it has been exposed to before, the fact that this chain can be broken by even superficial deviations suggests that it doesn’t actually reason so much as replicate patterns it has observed in its training data.

Mehrdad Farajtabar, one of the co-authors, breaks down the paper very nicely in this thread on X.

An OpenAI researcher, while commending Mirzadeh et al.'s work, objected to their conclusions, saying that correct results could likely be achieved in all these failure cases with a bit of prompt engineering. Farajtabar (responding with the typical yet admirable friendliness researchers tend to employ) noted that while better prompting may work for simple deviations, the model may require exponentially more contextual data in order to counter complex distractions, ones that, again, a child could trivially point out.
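
A minimal sketch of what that prompt-engineering fix might look like, reusing the same hypothetical helpers as above; the guard text is one plausible example, not a quoted prompt from either researcher:

    # Illustrative prompt-engineering guard (hypothetical helpers as above).
    guard = ("Note: the problem may contain details that are irrelevant to "
             "the calculation. Identify and ignore them before answering.")

    question = ("Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on "
                "Saturday. On Sunday, he picks double the number of kiwis he "
                "did on Friday, but five of them were a bit smaller than "
                "average. How many kiwis does Oliver have?")

    response = query_model(guard + "\n\n" + question)   # hypothetical call
    print(extract_answer(response))                     # the hope: 190, not 185

Farajtabar's objection is that this kind of patch may not scale: as the distractions grow more complex, the corrective context required may grow much faster than the problem itself.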

Does this mean that LLMs don’t reason? Maybe. That they can’t reason? No one knows. These are not well-defined concepts, and the questions tend to appear at the bleeding edge of AI research, where the state of the art changes on a daily basis. Perhaps LLMs “reason,” but in a way we don’t yet recognize or know how to control.

It makes for a fascinating frontier in research, but it's also a cautionary tale when it comes to how AI is being sold. Can these systems really do the things their makers claim, and if they do, how? As AI becomes an everyday software tool, this kind of question is no longer academic.
