Researchers at Mila have introduced a technique called Markovian Thinking that lets large language models (LLMs) carry out extended reasoning without the steep computational costs that usually accompany long reasoning chains. The approach is detailed in a recent paper.
The key idea behind Markovian Thinking is to restructure the reasoning chain into fixed-size chunks inside an environment the authors call Delethink. By avoiding the scaling problem that plagues lengthy LLM responses, the method can cut training costs substantially: initial estimates suggest up to a two-thirds reduction for a 1.5B-parameter model compared with standard long-chain training.
The technique directly targets the quadratic-growth problem of long-chain reasoning in LLMs: as a reasoning trace lengthens, each new token must attend over an ever-larger context. By decoupling how long the model thinks from how much context it processes, the Markovian Thinker paradigm turns that quadratic computational cost into a linear one.
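To make the quadratic-versus-linear claim concrete, here is a back-of-the-envelope cost model. It is an illustrative sketch, not code from the paper: it simply counts how many tokens each generated token attends over, for a growing full context versus a fixed window.

```python
# Illustrative attention-cost comparison (not from the paper).
# Standard long-chain reasoning: token i attends over ~i prior tokens -> O(N^2) total.
# Fixed-window (Markovian-style) reasoning: token i attends over at most C tokens -> O(N*C).

def full_context_cost(n_tokens: int) -> int:
    """Total attention operations when the context grows with every token."""
    return sum(range(1, n_tokens + 1))

def chunked_cost(n_tokens: int, chunk: int) -> int:
    """Total attention operations when the context is capped at `chunk` tokens."""
    return sum(min(i, chunk) for i in range(1, n_tokens + 1))

if __name__ == "__main__":
    N, C = 96_000, 8_000  # e.g. a 96K-token trace vs twelve 8K-token chunks
    ratio = full_context_cost(N) / chunked_cost(N, C)
    print(f"full-context cost is about {ratio:.1f}x the chunked cost")
```

The gap widens linearly with trace length: doubling the total reasoning budget roughly doubles the advantage of the fixed window.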
Through Delethink, the model reasons in sequential fixed-size chunks of 8,000 tokens each, keeping the reasoning context window constant. To preserve continuity, the model learns to embed a summary of its progress near the end of each chunk; that carried-over state, together with the unmodified original prompt, is all the next chunk sees.
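The chunked control flow described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: `generate` is a hypothetical stand-in for an LLM call, and the carryover size and handoff mechanics are placeholders, not the Delethink environment's actual implementation.

```python
# Sketch of Markovian-style chunked reasoning (illustrative only).
CHUNK_TOKENS = 8_000      # fixed per-chunk reasoning budget, per the article
CARRYOVER_CHARS = 512     # size of the carried state; an assumption, not from the paper

def generate(prompt: str, max_tokens: int) -> str:
    """Hypothetical stand-in for a model call; a real system would query an LLM."""
    return f"<reasoning continued from: {prompt[-40:]}>"

def markovian_reason(question: str, n_chunks: int) -> str:
    carryover = ""  # model-written summary of progress so far
    for _ in range(n_chunks):
        # The context is always the original prompt plus a bounded carryover,
        # never the full reasoning trace, so the window stays constant-size.
        context = question + "\n" + carryover
        chunk = generate(context, max_tokens=CHUNK_TOKENS)
        carryover = chunk[-CARRYOVER_CHARS:]  # keep only the tail as state
    return carryover
```

The point of the sketch is the invariant in the loop: no matter how many chunks the model thinks for, the input to each step never grows, which is what makes total compute linear in thinking length.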
The implications of this novel approach are significant, particularly in enterprise applications where efficiency and performance are paramount. By enabling models to reason for longer durations with reduced computational costs, Markovian Thinking paves the way for next-generation AI capabilities, potentially unlocking the ability for models to ‘think’ for millions of tokens and driving advancements in scientific discovery.
Source: VentureBeat