DeepMind’s RETRO Retrieval-Enhanced Transformer Retrieves from Trillions of Tokens, Achieving Performance Comparable to GPT-3 With 25× Fewer Parameters

By Storm Warden · March 16, 2026 · 1 min read

ai
machine learning & data science
research
ai
artificial intelligence

A DeepMind research team proposes RETRO (Retrieval-Enhanced Transformer), an enhanced auto-regressive language model that conditions on document chunks retrieved from a large corpus and achieves performance comparable to GPT-3 and Jurassic-1 on the Pile dataset while using 25× fewer parameters.