Gabriel Rocklin and his team at Northwestern Medicine have opened a door that structural biologists have been trying to unlock for years: they can now watch thousands of proteins dance at once, revealing the hidden choreography that helps determine whether a protein stays healthy or contributes to disease.
The breakthrough matters because proteins are restless by nature. Every folded protein in your body is constantly shifting between different shapes — from its preferred stable form to higher-energy shapes that appear only briefly and rarely, yet profoundly influence how that protein behaves, interacts with other molecules, and sometimes misfolds into dangerous aggregates. Until now, scientists could only study these conformational fluctuations one protein at a time, and even then, the rare, high-energy states were nearly invisible to existing tools. It's like trying to understand traffic patterns by watching one car on a highway.
The Northwestern team developed a method called multiplexed hydrogen-deuterium exchange mass spectrometry (mHDX-MS) that removes this bottleneck. Using DNA oligo pool library synthesis, they produced customized synthetic proteomes containing up to 1,300 small protein domains in a single mixture, each domain between 28 and 64 amino acids in length. They then analyzed these mixtures with mHDX-MS, which measures how fast individual amino acid residues transition between closed conformations and higher-energy open conformations, transitions that earlier tools could barely detect. The result, published in Nature, is the first large-scale experimental map of protein energy landscapes.
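For readers curious how an exchange rate becomes an energy, here is a minimal sketch. It assumes the textbook EX2 limit of hydrogen exchange, in which the opening free energy is RT ln(k_int / k_obs); this is an assumption for illustration, not a description of the team's actual analysis. The function name, parameter names, and example rates are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def opening_free_energy(k_int, k_obs, temp_k=298.15):
    """Estimate an opening free energy (kcal/mol) from exchange rates.

    Assumes the EX2 limit of hydrogen exchange (an assumption, not taken
    from the article): dG_open = R * T * ln(k_int / k_obs), where
    k_int is the exchange rate of a fully exposed amide and
    k_obs is the slower rate observed in the folded protein.
    Both rates must use the same units.
    """
    if k_int <= 0 or k_obs <= 0:
        raise ValueError("exchange rates must be positive")
    return R_KCAL * temp_k * math.log(k_int / k_obs)


if __name__ == "__main__":
    # Hypothetical rates: exchange slowed 1000-fold by the folded structure.
    print(round(opening_free_energy(k_int=1000.0, k_obs=1.0), 2))
```

A larger slowdown in exchange corresponds to a rarer open state and therefore a higher opening energy, which is the quantity the energy landscape maps.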
The numbers are staggering. The scientists measured the opening energy distributions of more than 5,700 protein domains from ten domain families, revealing patterns that had remained hidden. The dataset exposed striking differences in energy landscapes between protein sequences with the same overall fold, showed how domains with identical global folding stability can behave completely differently, and identified systematic differences between entire domain families. Machine learning analysis then helped identify common determinants of energy landscapes across this vast range of sequences.
What makes this work transformative is its practical reach. "Previously, we could study one protein at a time, but we couldn't look at tens or hundreds of these proteins to analyze protein dynamics in parallel," said Állan Ramos Ferrari, the study's lead author. Now researchers can examine conformational fluctuations for thousands of different protein sequences at once, a capacity that was out of reach until very recently. For drug designers and biotech engineers, this is a game-changer. When designing a new therapeutic or a biosensor, the question has always been: which amino acid sequence will work best? The old answer was trial and error, guided by computational models that struggled to predict rare, high-energy states. The new answer can draw on direct measurements across thousands of sequences.
The immediate applications cascade outward. Researchers can now take known disease-causing mutations in protein families and ask directly how those mutations alter protein dynamics, and why one variant triggers disease while a nearly identical sequence remains benign. The dataset itself becomes a training ground for better artificial intelligence models of protein behavior, improving the accuracy of computational predictions. And for protein engineering, teams can now rapidly test thousands of sequence variations to find the one that best suits the function they need.
"There's an unlimited number of possible different combinations of these amino acids," Rocklin noted, capturing the scale of possibility now within reach. The multiplexed method doesn't just answer old questions faster — it asks entirely new ones that were previously off-limits.
