In a laboratory in San Sebastián, Spain, Borja Aizpurua and his team at Multiverse Computing have quietly made headway on a problem that has been gnawing at the artificial intelligence industry: how to make language models smarter without making them prohibitively expensive to run. Their approach comes from an unexpected direction, quantum computing, and it changes how a model represents what it has learned.
Large language models have become the backbone of everyday AI. ChatGPT, Claude, and dozens of other systems now rely on billions, or even trillions, of adjustable parameters, the mathematical knobs and dials that determine how these models understand and generate human language. More parameters generally mean better performance. But there is a catch. Each parameter has to be stored in physical memory, and the mathematics of scaling is brutal. GPT-5.5, for instance, is estimated to have somewhere between two and five trillion parameters. The infrastructure cost and energy demands of training and running such models are becoming so steep that they threaten to limit how far the field can advance.
Rather than throwing even more classical computing power at the problem, Aizpurua's team took a different approach. They inserted small quantum circuit blocks directly into the architecture of a pre-trained large language model. These quantum components act as mathematical compressors: they can encode complex relationships in a far more compact form than traditional parameters would require. The resulting system is hybrid: the original model runs on standard hardware, while the quantum pieces execute on IBM's 156-qubit superconducting processor.
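To get a feel for why a small quantum circuit can be so sparing with parameters, consider the toy sketch below. It is not the team's circuit, whose design the article does not describe. It assumes a generic textbook pattern (one rotation angle per qubit per layer, followed by a chain of CNOT gates) and simulates it classically in plain Python. The function names, the six-qubit size, and the three-layer depth are all illustrative choices.

```python
import math
import random

def apply_ry(state, qubit, theta):
    """Apply a Y rotation by angle theta to one qubit, in place."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << qubit
    for i in range(len(state)):
        if (i & bit) == 0:
            a, b = state[i], state[i | bit]
            state[i] = c * a - s * b
            state[i | bit] = s * a + c * b

def apply_cnot(state, control, target):
    """Swap amplitudes so the target bit flips wherever the control bit is 1."""
    cbit, tbit = 1 << control, 1 << target
    for i in range(len(state)):
        if (i & cbit) and not (i & tbit):
            j = i | tbit
            state[i], state[j] = state[j], state[i]

def run_circuit(thetas):
    """thetas[layer][qubit] holds one trainable angle per qubit per layer."""
    n_qubits = len(thetas[0])
    state = [0.0] * (2 ** n_qubits)
    state[0] = 1.0  # start with every qubit in the 0 state
    for layer in thetas:
        for q, theta in enumerate(layer):
            apply_ry(state, q, theta)
        for q in range(n_qubits - 1):
            apply_cnot(state, q, q + 1)
    return state

# Illustrative sizes only; the article does not give the real ones.
n_qubits, n_layers = 6, 3
random.seed(0)
thetas = [[random.uniform(0, 2 * math.pi) for _ in range(n_qubits)]
          for _ in range(n_layers)]
amplitudes = run_circuit(thetas)

circuit_params = n_qubits * n_layers   # 18 trainable angles
dense_params = (2 ** n_qubits) ** 2    # 4096 entries in a dense 64x64 matrix
```

In this sketch, 18 angles steer a state described by 64 amplitudes, whereas a general dense matrix acting on the same 64-dimensional space would need 4,096 entries. The comparison is loose, since a circuit this shallow reaches only a small family of transformations, but it shows the kind of trade the article is describing.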
When the team tested their method on Llama 3.1 8B, Meta's eight-billion-parameter model, they achieved a 1.4 percent reduction in perplexity, a key measure of how reliably a model can predict the next word in a sequence (lower is better), while adding just 6,000 extra parameters. To put that in perspective, 6,000 out of roughly eight billion is about 0.000075 percent, less than one ten-thousandth of a percent of the original model's size. The gain is modest, but it comes at almost no cost in model size.
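For readers who want to check the numbers, here is a minimal sketch of the two calculations involved. Perplexity is conventionally defined as the exponential of the average negative log-probability a model assigns to each correct next token. The parameter count uses the nominal figure of eight billion, and the baseline perplexity of 10.0 is a made-up value for illustration, since the article reports only the relative change.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of the correct next tokens."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# Sanity check: a model that always spreads its bets evenly over 4 options
# has a perplexity of 4.
uniform_ppl = perplexity([math.log(0.25)] * 5)

# Parameter overhead, assuming the nominal 8 billion figure for Llama 3.1 8B.
base_params = 8_000_000_000
added_params = 6_000
overhead_percent = added_params / base_params * 100  # about 0.000075 percent

# What a 1.4 percent reduction means on a hypothetical baseline of 10.0.
hypothetical_baseline = 10.0
reduced = hypothetical_baseline * (1 - 0.014)  # about 9.86
```

The overhead works out to roughly 0.000075 percent, which is indeed below the one ten-thousandth of a percent (0.0001 percent) quoted above.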
They also tested the method on SmolLM2, a smaller model with 135 million parameters, chosen because it allowed them to study the effects more systematically. The results were promising: performance improved consistently as the quantum components grew larger, and the quantum-enhanced version correctly answered questions that two purely classical versions of the same model had gotten wrong.
The researchers are candid about the current limitations. The performance gains, while real, are modest by today's standards, and what they have achieved reflects the constraints of quantum hardware as it exists right now. But by demonstrating that quantum enhancement can work at all on a widely used, real-world model, they have opened a door. As quantum processors become more powerful and reliable over the coming years, the improvements could grow substantially. If they do, the approach could reshape how the AI industry tackles its most pressing problem: building more capable systems without the runaway costs that currently threaten to define the field's future.
