Chain-of-Thought Prompting

Chain-of-Thought (CoT) prompting asks Large Language Models (LLMs) to replicate how humans solve complex problems: by breaking them down into a sequence of smaller, easier steps. This approach improves problem-solving performance and leads to more precise and reliable answers [1].
Imagine a coin placed heads up on a table. You need to track whether the coin stays heads up after several participants either flip it or leave it as is. Each participant announces whether they flip the coin. For example, if Jim flips the coin and Alice does not, the coin will... not be heads up, obviously. How many participants' actions can you keep track of? As each participant takes their turn, you update your mental representation of the coin based on their choice. You could surely make it to at least the 100th round.
LLMs fail at round 2.

This is because LLMs lack the intuition to break tasks into small, easy steps. Instead, they merge all the information to generate an answer. It's up to you to use CoT to provide them with a sequence of easy steps.

How?

To write a CoT prompt, consider how you would approach the problem yourself. You don't need to solve the problem, only to plan how to tackle it. In your prompt, use this plan to solve an alternative version of the problem, which demonstrates the CoT to the LLM. Then, present your actual problem.

Example prompt:

Alternative problem: A coin is heads up. Jim flips it. Tom does not flip it. Is the coin still heads up?

Answer:
  • The coin starts heads up.
  • Jim flips the coin, so the state changes to tails up.
  • Tom does not flip the coin, so the state remains tails up.

Therefore, after all the actions are performed, the coin is tails up. So the answer is no.

Actual problem: A coin is heads up. Jim does not flip it. Tom flips it. Alice flips it. Is the coin still heads up?
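The prompt above can also be assembled programmatically. The following is a minimal sketch, not a definitive implementation: the helper names (coin_cot_steps, describe, build_cot_prompt) are hypothetical, and it only builds the prompt text. Sending the prompt to an LLM is left out, since that depends on the API you use.

```python
def coin_cot_steps(actions):
    """Return (reasoning_lines, still_heads_up) for a coin that starts heads up.

    `actions` is a list of (name, flips) pairs, e.g. [("Jim", True)].
    """
    heads_up = True
    lines = ["The coin starts heads up."]
    for name, flips in actions:
        if flips:
            heads_up = not heads_up
            verb = "flips the coin, so the state changes to"
        else:
            verb = "does not flip the coin, so the state remains"
        side = "heads" if heads_up else "tails"
        lines.append(f"{name} {verb} {side} up.")
    return lines, heads_up


def describe(actions):
    """Render the problem statement for a list of (name, flips) pairs."""
    parts = ["A coin is heads up."]
    for name, flips in actions:
        parts.append(f"{name} flips it." if flips else f"{name} does not flip it.")
    parts.append("Is the coin still heads up?")
    return " ".join(parts)


def build_cot_prompt(demo_actions, actual_actions):
    """Solve the demonstration step by step, then append the unsolved actual problem."""
    lines, heads_up = coin_cot_steps(demo_actions)
    side = "heads" if heads_up else "tails"
    answer = "yes" if heads_up else "no"
    return "\n".join([
        f"Alternative problem: {describe(demo_actions)}",
        "",
        "Answer:",
        *[f"  • {line}" for line in lines],
        "",
        f"Therefore, after all the actions are performed, the coin is {side} up. "
        f"So the answer is {answer}.",
        "",
        f"Actual problem: {describe(actual_actions)}",
    ])


prompt = build_cot_prompt(
    demo_actions=[("Jim", True), ("Tom", False)],
    actual_actions=[("Jim", False), ("Tom", True), ("Alice", True)],
)
print(prompt)
```

The demonstration is solved in full, while the actual problem is left open for the LLM to complete in the same step-by-step style.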

When?

CoT enhances the ability of LLMs to manage multi-step reasoning processes. It is useful for tasks that are too complex for the LLM to solve directly but can be broken down into shorter, easier subtasks. Typical examples include:

  • Arithmetic reasoning requires understanding and interpreting numerical information and relationships within a given context to find a solution. An example of arithmetic reasoning is: "Alice consumed 300 kilocalories at breakfast. She eats two meals during the day, worth 600 kilocalories each. How many kilocalories has Alice consumed in total?" The two meals add 2 * 600 = 1200 kilocalories, so the answer is 300 + 1200 = 1500 kilocalories.
  • Commonsense problems involve using general knowledge and everyday reasoning to evaluate the plausibility of situations or statements. For instance, a commonsense prompt might ask if the following statement is plausible: "Rafael Nadal scored 26 points in the 2018 NBA finals." This is not plausible, as Rafael Nadal plays tennis, not basketball.
  • Symbolic reasoning requires tracking abstract representations of numbers and logical relationships. A common example is last-letter concatenation: given a person's full name, e.g., "John Doe", concatenate the last letter of each word to obtain "ne".
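To make the symbolic reasoning example concrete, here is a small sketch of the step-by-step decomposition a CoT demonstration for last-letter concatenation might spell out. The function name and the wording of the steps are assumptions made for illustration.

```python
def last_letter_concat_steps(full_name):
    """Return (reasoning_lines, result) for last-letter concatenation."""
    lines = []
    result = ""
    for word in full_name.split():
        letter = word[-1]  # last character of the current word
        result += letter
        lines.append(f'The last letter of "{word}" is "{letter}".')
    lines.append(f'Concatenating them gives "{result}".')
    return lines, result


lines, answer = last_letter_concat_steps("John Doe")
print("\n".join(lines))
```

Each line handles one word, which is the kind of small, easy step that a CoT prompt aims to demonstrate.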

Pro tips

CoT prompts can sometimes be adjusted to use fewer tokens while maintaining most of their reliability. Here are some common techniques to achieve this.

Improved plan: The LLM's CoT is only as good as your own. You might find success in instructing the LLM to follow clever plans. For instance, in the coin flipping problem, note that the coin remains heads up if and only if the number of flips is even. Hence, an improved CoT might be:

Alternative problem: A coin is heads up. Jim flips it. Tom does not flip it. Is the coin still heads up?

Answer: The coin was flipped by Jim. So the coin was flipped 1 time, which is an odd number. The coin started heads up, so after an odd number of flips, it is not heads up. So the answer is no.

Actual problem: A coin is heads up. Jim does not flip it. Tom flips it. Alice flips it. Is the coin still heads up?
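The parity argument behind this improved plan can be checked with a few lines of code. This is only a sketch under the assumption that actions are given as (name, flips) pairs; the function name is hypothetical.

```python
def parity_rationale(actions):
    """Return (rationale_text, still_heads_up) using only the number of flips."""
    count = sum(1 for _name, flips in actions if flips)
    is_even = count % 2 == 0
    parity = "even" if is_even else "odd"
    times = "time" if count == 1 else "times"
    state = "still heads up" if is_even else "not heads up"
    answer = "yes" if is_even else "no"
    text = (
        f"The coin was flipped {count} {times}, which is an {parity} number. "
        f"The coin started heads up, so after an {parity} number of flips, "
        f"it is {state}. So the answer is {answer}."
    )
    return text, is_even


demo_text, demo_heads_up = parity_rationale([("Jim", True), ("Tom", False)])
actual_text, actual_heads_up = parity_rationale(
    [("Jim", False), ("Tom", True), ("Alice", True)]
)
print(demo_text)
print(actual_text)
```

Because only the count of flips matters, the rationale stays the same length no matter how many participants take a turn.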

Concise prompting: Intentionally concise writing styles can help reduce token consumption, and even improve reliability.

Example: Consider the arithmetic reasoning problem: "Alice consumed 300 kilocalories at breakfast. She eats two burgers during the day, each providing her with 600 kilocalories. How many kilocalories has Alice consumed in total?"

A typical CoT prompt could be: "Alice started with 300 kilocalories. Two burgers worth 600 kilocalories each equals 1200 kilocalories. 300 + 1200 = 1500. The answer is 1500 kilocalories."

Reducing this CoT to: "2 * 600 = 1200 kilocalories from the burgers. So Alice consumed 1200 + 300 = 1500 kilocalories in total." yields the same answer.
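A quick way to compare the two versions is to measure their size. The sketch below counts whitespace-separated words as a crude stand-in for tokens; this is an assumption, since real tokenizers split text differently, so treat the numbers as indicative only.

```python
verbose = (
    "Alice started with 300 kilocalories. Two burgers worth 600 kilocalories "
    "each equals 1200 kilocalories. 300 + 1200 = 1500. "
    "The answer is 1500 kilocalories."
)
concise = (
    "2 * 600 = 1200 kilocalories from the burgers. "
    "So Alice consumed 1200 + 300 = 1500 kilocalories in total."
)


def rough_size(text):
    """Count whitespace-separated words (a rough proxy, not a real token count)."""
    return len(text.split())


print(rough_size(verbose), rough_size(concise))
```

For an accurate comparison, count tokens with the tokenizer of the model you actually use.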

Equation only: For arithmetic reasoning problems involving one or two steps, CoT prompts reduced to the equation modeling the problem may suffice. However, thorough testing is advised. Since this technique prevents the LLM from leveraging its natural language abilities, the reliability of these prompts is not guaranteed.

Example: Given the previous arithmetic reasoning example, an equation-only CoT would be: "300 + 2 * 600 = 1500."
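Since an equation-only demonstration has no surrounding text to catch a slip, it helps to compute the result rather than type it by hand. A minimal sketch, with a hypothetical helper name and parameters:

```python
def equation_only(breakfast_kcal, n_items, kcal_each):
    """Build an equation-only demonstration, computing the total instead of typing it."""
    total = breakfast_kcal + n_items * kcal_each
    return f"{breakfast_kcal} + {n_items} * {kcal_each} = {total}."


demo = equation_only(300, 2, 600)
print(demo)
```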