Telling AI model to “take a deep breath” causes math scores to soar in study

Although AI chatbots aren’t human, they have been trained on so much human-written text that they can perform better when invited to ‘take a deep breath,’ according to this recent report in Ars Technica:

In a paper called “Large Language Models as Optimizers” listed this month on arXiv, DeepMind scientists introduced Optimization by PROmpting (OPRO), a method to improve the performance of large language models (LLMs) such as OpenAI’s ChatGPT and Google’s PaLM 2. This new approach sidesteps the limitations of traditional math-based optimizers by using natural language to guide LLMs in problem-solving. “Natural language” is a fancy way of saying everyday human speech…

Interestingly, in this latest study, DeepMind researchers found “Take a deep breath and work on this problem step by step” to be the most effective prompt when used with Google’s PaLM 2 language model. The phrase achieved the top accuracy score of 80.2 percent in tests against GSM8K, which is a data set of grade-school math word problems. By comparison, PaLM 2, without any special prompting, scored only 34 percent accuracy on GSM8K, and the classic “Let’s think step by step” prompt scored 71.8 percent accuracy.
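The technique being compared here is simple prompt prefixing: the optimized instruction is prepended to each math problem before it is sent to the model. A minimal sketch of the idea, where `ask_llm` is a hypothetical stand-in for a real chat-completion call (it is stubbed here so the example runs on its own):

```python
# Sketch of prompt prefixing, the technique the study evaluates.
# `ask_llm` is a hypothetical placeholder, not a real API; swap in an
# actual client call (e.g. to PaLM 2 or ChatGPT) to use it for real.

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call; stubbed to echo the prompt it received."""
    return f"[model response to: {prompt!r}]"

# The instruction the DeepMind researchers found most effective on GSM8K.
PREFIX = "Take a deep breath and work on this problem step by step."

def solve(question: str) -> str:
    # The optimized instruction is simply prepended to the task text.
    return ask_llm(f"{PREFIX}\n\n{question}")

print(solve("A train travels 60 miles in 1.5 hours. What is its speed?"))
```

The accuracy differences in the study come entirely from changing `PREFIX`; the rest of the pipeline stays the same.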

So when you aren’t getting the answers you want from an AI chatbot, maybe it’s time for both of you to ‘take a deep breath’?

Read the full article here.
