Machine learning without prior training

New algorithm identifies disease-linked changes

Zeping Mao was staring at a digital forest of protein data when the breakthrough clicked—not with a eureka, but with a quiet hum of code uncovering what no one had told it to look for. At the University of Waterloo, Mao and his team have developed RNovA, a machine-learning algorithm that can detect previously unknown protein modifications linked to diseases like cancer and Alzheimer’s—without needing to be trained on what to expect. In a field where discovery often depends on knowing what you’re hunting for, RNovA hunts in the dark and still finds something.

Proteins are the workhorses of our cells, and after they’re built, they’re often chemically tweaked in ways that change how they function. These tweaks, called post-translational modifications (PTMs), are crucial to cellular life—but when they go wrong, they can signal or even drive serious diseases. For decades, scientists have relied on mass spectrometry and reference databases to identify PTMs, but those methods only catch what’s already known. If a modification is rare, unexpected, or absent from the database, it slips through the cracks. “It’s like trying to solve a puzzle but only being able to see a few pieces,” Mao explains.
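To see why lookup-based identification misses novel modifications, consider a minimal sketch of the idea. This is an illustration only, not RNovA's method and not any real search engine's code. The function name `lookup_modification`, the four-entry table, and the 0.01 Da tolerance are assumptions chosen for the example; the mass shifts are standard monoisotopic values for these common modifications.

```python
# Toy reference table (hypothetical, for illustration): monoisotopic mass
# shifts in daltons for a handful of well-known modifications.
KNOWN_PTMS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "oxidation": 15.99491,
}


def lookup_modification(observed_shift, tolerance=0.01):
    """Return the closest known PTM within `tolerance` daltons, or None."""
    best_name, best_err = None, tolerance
    for name, shift in KNOWN_PTMS.items():
        err = abs(observed_shift - shift)
        if err <= best_err:
            best_name, best_err = name, err
    return best_name


# A shift that matches a table entry is identified.
print(lookup_modification(79.966))   # phosphorylation
# A shift absent from the table returns nothing, however real it may be.
print(lookup_modification(123.456))  # None
```

The second call is the "missing puzzle piece" case: the measurement exists, but a lookup against a fixed list has no way to name it.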

RNovA changes that. Unlike traditional tools, it doesn’t need labeled training data or a predefined list of modifications. Instead, it uses a zero-shot approach, meaning it can identify modifications it has never been shown examples of. In tests using spiked-in synthetic peptides, RNovA demonstrated high accuracy in detecting PTMs without a predefined list, even those never before cataloged. Published in Nature Biotechnology (2026), the research marks a leap not just for disease research but for the role of AI in basic science.

The implications are profound. By expanding the known universe of PTMs, RNovA could uncover new biomarkers for early disease detection—like a hidden signature of cancer appearing in a blood sample years before symptoms arise. It could also accelerate drug development by revealing new therapeutic targets. For biologists, it’s like being handed a flashlight in a room that was always assumed to be empty.

This isn’t just about efficiency; it’s about possibility. Traditional PTM analysis is slow and expensive, often requiring specialized labs and painstaking calibration. RNovA offers a faster, cheaper alternative that scales across datasets and diseases. The team envisions it becoming a standard tool in proteomics labs worldwide, democratizing access to cutting-edge discovery.

As machine learning continues to reshape science, tools like RNovA remind us that the most powerful algorithms aren’t just faster—they’re more curious. They don’t wait for permission to find what’s missing. And in the tangled world of human biology, that curiosity might just lead to cures we haven’t yet imagined.
