The Neural Network That Listens to Biological Noise — and Learns from It
A new Oxford framework teaches AI to treat biological variability not as error to be erased, but as a signal worth decoding.
Coral reef recovery on the Great Barrier Reef reveals noise that grows with population — and a neural net that figures t
The Problem with "Good Enough" Noise
When a biologist measures how many cells are in a flask, or how much coral has grown back on a reef, the measurement is never perfect. There is always noise — scatter around the true value. The question is: what kind of noise?
For decades, most mathematical models have assumed the simplest possible answer: the noise is constant. Measure a population of ten cells or ten thousand, and the wobble around the true value is the same size. Statisticians call this homoscedastic noise — same variance everywhere. It is clean, mathematically convenient, and almost always wrong in biology.
In reality, biological variability tends to scale with the thing being measured. A small colony of bacteria will have small absolute fluctuations; a large, dense population will have large ones. The noise grows with the signal. This is called heteroscedastic noise — different variance in different places. Ignoring this structure doesn't just muddy your uncertainty estimates; it can distort your understanding of the underlying biological mechanism entirely.
That is the core problem that Rebecca Crossley and Ruth Baker at the University of Oxford set out to fix (Crossley and Baker, 2026). Their paper introduces the NLL–BINN — a framework that teaches a neural network to simultaneously discover how a population grows and how the noise in its measurements scales, without being told either in advance.
The Science
To understand what NLL–BINNs do, it helps to understand what BINNs — Biologically-Informed Neural Networks — already did. BINNs belong to a family of hybrid models that blend machine learning with mechanistic structure. Rather than fitting a purely data-driven black box, BINNs embed known biological constraints directly into the training process. A standard BINN uses two neural networks in tandem: one approximates the smooth underlying trajectory of a system (say, a population size over time), and another approximates the unknown growth law governing that trajectory. Both are trained together, with a penalty that rewards solutions consistent with the governing differential equation.
The governing equation here is beautifully simple in form:
$$\frac{dN}{dt} = N\,f(N),$$
where $N$ is population density and $f(N)$ is the crowding function, the term that encodes how growth slows as the population fills its available space. The biological question is: what is $f$? Is it linear (the logistic model)? Logarithmic (the Gompertz model)? Something in between (the Richards' model)? These three classical growth laws produce frustratingly similar S-shaped curves, making them hard to tell apart even with clean data. With noisy data, it's harder still.
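To make the three candidate growth laws concrete, here is a minimal sketch that integrates each one forward in time. The textbook forms of the crowding functions are used; the carrying capacity `k`, the separate `rate` constant, the Richards' exponent `beta`, and the initial density are illustrative assumptions, not values taken from the paper.

```python
import math

def logistic_crowding(n, k=1.0):
    """Linear crowding function: 1 - n/k."""
    return 1.0 - n / k

def gompertz_crowding(n, k=1.0):
    """Logarithmic crowding function: log(k/n)."""
    return math.log(k / n)

def richards_crowding(n, k=1.0, beta=2.0):
    """Power-law crowding function: 1 - (n/k)**beta (beta is an assumed value)."""
    return 1.0 - (n / k) ** beta

def simulate(crowding, n0=0.05, rate=1.0, t_end=20.0, dt=0.01):
    """Integrate dn/dt = rate * n * crowding(n) with the classical RK4 scheme."""
    def rhs(n):
        return rate * n * crowding(n)

    n, traj = n0, [n0]
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(n)
        k2 = rhs(n + 0.5 * dt * k1)
        k3 = rhs(n + 0.5 * dt * k2)
        k4 = rhs(n + dt * k3)
        n += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        traj.append(n)
    return traj

curves = {
    "logistic": simulate(logistic_crowding),
    "gompertz": simulate(gompertz_crowding),
    "richards": simulate(richards_crowding),
}

# All three start at the same density and saturate at the same capacity;
# they differ only in how quickly they get there.
for name, traj in curves.items():
    print(f"{name:>9}: density at t=5 is {traj[500]:.3f}, at t=20 is {traj[-1]:.3f}")
```

Every curve is S-shaped and ends at the carrying capacity, which is why the shape of the crowding function, rather than the endpoint, is what has to be inferred.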
What existing BINNs did not do was model the noise itself. They minimised mean squared error, which implicitly assumes constant Gaussian noise, and called it a day. Crossley and Baker change this by introducing a learnable noise model. Specifically, they adopt a power-law form for the noise magnitude as a function of population density $N$:
$$\sigma(N) = \sigma_0 N^{\alpha}.$$
Here, $\sigma_0$ is a scale parameter and $\alpha$ is the key exponent controlling how noise scales with population density. When $\alpha = 0$, noise is purely additive and constant: the classic assumption. When $\alpha = 1$, noise is multiplicative: the standard deviation scales linearly with the population, a common pattern in count data. When $0 < \alpha < 1$, you get something in between. Both $\sigma_0$ and $\alpha$ are treated as learnable parameters, so the network figures them out from data alongside the growth law itself.
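The power-law noise model is easy to explore numerically. The sketch below assumes the noise standard deviation is the scale parameter times the density raised to the exponent, and uses the three parameter pairs quoted for the synthetic experiments; the function names and the two example densities are hypothetical choices for illustration.

```python
import random

def noise_std(density, sigma0, alpha):
    """Power-law noise magnitude: sigma0 * density**alpha."""
    return sigma0 * density ** alpha

def add_noise(trajectory, sigma0, alpha, seed=0):
    """Add zero-mean Gaussian noise whose standard deviation depends on
    the local density through the power law above."""
    rng = random.Random(seed)
    return [n + rng.gauss(0.0, noise_std(n, sigma0, alpha)) for n in trajectory]

# (sigma0, alpha) pairs for the three synthetic noise regimes.
regimes = {
    "additive": (0.1, 0.0),
    "intermediate": (0.05, 0.5),
    "multiplicative": (0.2, 1.0),
}

for name, (sigma0, alpha) in regimes.items():
    low = noise_std(0.1, sigma0, alpha)
    high = noise_std(1.0, sigma0, alpha)
    print(f"{name:>14}: std at density 0.1 = {low:.4f}, at density 1.0 = {high:.4f}")
```

In the additive regime the two printed values coincide, while in the multiplicative regime the noise at the higher density is ten times larger, matching the tenfold difference in density.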
The training loss function reflects this probabilistic framing. Instead of minimising a mean squared error, the NLL–BINN minimises the negative log-likelihood (NLL), the standard tool for fitting probabilistic models, under the power-law noise model. The total loss has three components: a data term (how well does the predicted trajectory match observations, accounting for the noise structure?), an ODE term (does the trajectory actually satisfy the governing differential equation?), and a biological term (is the population non-negative?). Training involves an ensemble of ten independent networks with different random seeds, with results reported as ensemble means and standard deviations to assess reliability.
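The data term can be illustrated without any neural network. The sketch below is not the authors' implementation: it evaluates a Gaussian negative log-likelihood with a power-law standard deviation against a known logistic curve (standing in for the network's predicted trajectory), and replaces gradient-based training with a coarse grid search over the exponent. The ODE and non-negativity terms are omitted, and all names, the sample size, and the grid are assumptions made for illustration.

```python
import math
import random

def gaussian_nll(obs, pred, sigma0, alpha):
    """Gaussian negative log-likelihood with std = sigma0 * pred**alpha."""
    total = 0.0
    for y, n in zip(obs, pred):
        var = (sigma0 * n ** alpha) ** 2
        total += 0.5 * math.log(2.0 * math.pi * var) + (y - n) ** 2 / (2.0 * var)
    return total

def best_sigma0(obs, pred, alpha):
    """Closed-form maximum-likelihood scale for a fixed exponent:
    the root mean square of the residuals rescaled by pred**alpha."""
    scaled = [((y - n) / n ** alpha) ** 2 for y, n in zip(obs, pred)]
    return math.sqrt(sum(scaled) / len(scaled))

# Synthetic logistic trajectory with multiplicative noise (alpha = 1, sigma0 = 0.2).
rng = random.Random(1)
times = [10.0 * i / 1999 for i in range(2000)]
pred = [1.0 / (1.0 + 19.0 * math.exp(-t)) for t in times]
obs = [n + rng.gauss(0.0, 0.2 * n) for n in pred]

# Profile the NLL over a coarse grid of exponents, using the best scale for each.
profile = {}
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25):
    profile[alpha] = gaussian_nll(obs, pred, best_sigma0(obs, pred, alpha), alpha)

best_alpha = min(profile, key=profile.get)
print("exponent with the lowest NLL on this grid:", best_alpha)
print("matching scale estimate:", round(best_sigma0(obs, pred, best_alpha), 3))
```

With a constant-variance assumption the log-variance term is the same for every point and the NLL reduces to a rescaled mean squared error, which is why an MSE loss cannot reveal how noise scales with density.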
The team tested the framework on synthetic data generated from all three growth models (logistic, Gompertz, and Richards') across three noise regimes: additive ($\alpha = 0$, $\sigma_0 = 0.1$), intermediate ($\alpha = 0.5$, $\sigma_0 = 0.05$), and multiplicative ($\alpha = 1$, $\sigma_0 = 0.2$). They also applied it to real coral reef regrowth observations from two sites near Lady Musgrave Island on the Great Barrier Reef, a case where replicates are unavailable and noise structure cannot be estimated by traditional means.
What They Found
The results are striking in their clarity. Across all three synthetic growth models and all three noise regimes, the NLL–BINN recovered both the underlying growth law and the noise exponent with high accuracy (Crossley and Baker, 2026). The ensemble mean trajectories closely tracked the ground truth. The learned crowding functions matched their known analytical forms — linear for logistic, logarithmic for Gompertz, nonlinear for Richards' — even though the network was never told which functional form to expect.
Noise Exponent α Recovered by NLL–BINN vs. Ground Truth
The NLL–BINN framework was tested on synthetic data from three growth models under three noise regimes. The table lists the true noise exponent α (0 = additive, 0.5 = intermediate, 1 = multiplicative) applied in each experiment shown.
| Growth model (noise regime) | True noise exponent α |
|---|---|
| Logistic (Additive) | 0 |
| Gompertz (Intermediate) | 0.5 |
| Richards' (Multiplicative) | 1 |
More importantly, the framework correctly identified which type of noise was present in each dataset. When the ground truth was additive noise ($\alpha = 0$), the network learned a value of $\alpha$ near zero. When noise was multiplicative ($\alpha = 1$), the network converged to values near one. Intermediate noise ($\alpha = 0.5$) was recovered accurately too. The learned noise profiles matched the empirical variance in the observations closely, a non-trivial achievement given that the noise structure was inferred alongside the dynamics rather than estimated separately.
The uncertainty calibration results (Figure 3 in the paper) are perhaps the most practically significant finding. When the authors checked what proportion of true observations fell within the model's predicted confidence intervals at various coverage levels, the NLL–BINN was near-perfectly calibrated across all three noise regimes. If the model says "there's a 90% chance the true value falls in this range," the true value does so roughly 90% of the time. This is exactly what a well-calibrated probabilistic model should do — and it is exactly what a model trained with the wrong noise assumption cannot achieve.
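A coverage check of this kind can be sketched in a few lines. The example below is an illustration on synthetic data, not a reproduction of the paper's Figure 3: it draws observations with multiplicative noise around a known mean, then compares 90% interval coverage under the correct power-law standard deviation and under a single constant standard deviation fitted to the residuals. The density range, sample size, and all names are assumptions.

```python
import math
import random
from statistics import NormalDist

def empirical_coverage(obs, mean, std, level):
    """Fraction of observations inside the central Gaussian interval
    mean +/- z * std, where z is the two-sided quantile for `level`."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    inside = sum(1 for y, m, s in zip(obs, mean, std) if abs(y - m) <= z * s)
    return inside / len(obs)

# Densities from 0.05 to 1.0 with multiplicative noise (std = 0.2 * density).
rng = random.Random(2)
mean = [0.05 + 0.95 * i / 4999 for i in range(5000)]
true_std = [0.2 * m for m in mean]
obs = [m + rng.gauss(0.0, s) for m, s in zip(mean, true_std)]

# Constant-noise alternative: one standard deviation for every point,
# set to the root mean square of the residuals.
rms = math.sqrt(sum((y - m) ** 2 for y, m in zip(obs, mean)) / len(obs))
const_std = [rms] * len(obs)

half = len(obs) // 2
for label, std in (("power-law std", true_std), ("constant std", const_std)):
    low = empirical_coverage(obs[:half], mean[:half], std[:half], 0.9)
    high = empirical_coverage(obs[half:], mean[half:], std[half:], 0.9)
    print(f"{label}: 90% coverage at low density = {low:.3f}, at high density = {high:.3f}")
```

The constant-noise intervals are too wide where the population is small and too narrow where it is large, so they over-cover at low density and under-cover at high density even though the overall residual variance is matched.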
Noise Scale Parameter σ₀ by Model and Regime
The σ₀ scale parameters used to generate synthetic data across three growth models and three noise regimes. These are the ground-truth values the NLL–BINN was tasked with recovering.
| Growth model / noise regime | Scale parameter σ₀ |
|---|---|
| Logistic / Additive (α=0) | 0.1 |
| Gompertz / Intermediate (α=0.5) | 0.05 |
| Richards' / Multiplicative (α=1) | 0.2 |
To assess whether the noise modelling actually improves mechanistic recovery (not just uncertainty estimates), the team compared NLL–BINN directly against a standard BINN trained with root mean squared error (RMSE) loss on multiplicative noise data — the most challenging case for a constant-noise assumption. They computed the root mean squared error between the true population trajectory and the trajectory generated by simulating forward using the learned growth law. The NLL–BINN consistently outperformed the RMSE-based BINN (Crossley and Baker, 2026). Ignoring noise structure doesn't just give you bad confidence intervals; it gives you a worse picture of the biology.
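The comparison metric described here, the error of a trajectory simulated forward from a learned growth law, can be sketched as follows. The two "candidate" crowding functions are hypothetical stand-ins for learned laws, chosen only to show how the metric ranks a closer and a poorer approximation; they are not results from either method, and forward Euler is an assumed integrator.

```python
import math

def simulate(crowding, n0=0.05, t_end=15.0, dt=0.01):
    """Forward-Euler integration of dn/dt = n * crowding(n)."""
    n, traj = n0, [n0]
    for _ in range(int(round(t_end / dt))):
        n += dt * n * crowding(n)
        traj.append(n)
    return traj

def rmse(a, b):
    """Root mean squared difference between two equal-length trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def true_law(n):
    """Ground-truth logistic crowding function with unit carrying capacity."""
    return 1.0 - n

def candidate_a(n):
    """Hypothetical learned law that slightly overestimates the capacity."""
    return 1.0 - n / 1.02

def candidate_b(n):
    """Hypothetical learned law that badly overestimates the capacity."""
    return 1.0 - n / 1.2

truth = simulate(true_law)
error_a = rmse(truth, simulate(candidate_a))
error_b = rmse(truth, simulate(candidate_b))
print(f"trajectory RMSE, candidate A: {error_a:.4f}")
print(f"trajectory RMSE, candidate B: {error_b:.4f}")
```

Because the error is measured on the simulated trajectory rather than on the fit to noisy observations, it scores the recovered mechanism itself.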
Three Noise Regimes: Key Parameters
Summary of the three synthetic noise configurations used to evaluate the NLL–BINN framework, showing the noise exponent α and scale σ₀ for each regime.
| Noise regime | Noise exponent α | Scale σ₀ |
|---|---|---|
| Additive | 0 | 0.1 |
| Intermediate | 0.5 | 0.05 |
| Multiplicative | 1 | 0.2 |
The coral reef application is the real-world proof of concept. At two reef sites near Lady Musgrave Island — designated Site 1 and Site 3 — the framework was applied to coral cover recovery data, where there is only one measurement per time point (no replicates to estimate variance empirically). At both sites, the NLL–BINN inferred a noise model where variability increases with population density — consistent with multiplicative-style heteroscedastic noise. The framework also recovered smooth, biologically plausible growth trajectories and crowding functions, all from a single sparse time series with no ground truth to check against. The noise parameters were reported for each site, giving ecologists a quantitative window into measurement variability that was previously inaccessible.
Why This Changes Things
The deeper conceptual shift here is about what noise is in biological modelling. The conventional view treats noise as a nuisance: variation you subtract out, average over, or paper over with wide confidence bands. This paper argues, implicitly but forcefully, that noise structure is itself biological information. If the variance in your population measurements scales with population size, that tells you something real about the mechanisms generating variability — whether it's demographic stochasticity, measurement technology limitations, environmental heterogeneity, or ecological patchiness.
By learning the noise model rather than assuming it, the NLL–BINN framework makes that information legible. It converts a statistical artefact into a mechanistic signal.
This matters most for the kinds of data biologists actually collect. Ecological field data is sparse — you can't sample a reef every hour. Clinical trial data from patient cohorts has natural heteroscedasticity — variability in tumour growth or immune cell counts scales with baseline levels. Single-cell sequencing data has noise that depends on cell state. In all these settings, the standard MSE assumption is not just slightly wrong; it is structurally wrong in ways that propagate into parameter estimates and mechanistic conclusions.
The comparison with existing methods is also instructive. Sparse identification of nonlinear dynamics (SINDy) and symbolic regression can discover governing equations, but they typically require pre-specified function libraries or relatively clean data. Gaussian process approaches offer principled uncertainty quantification but can struggle to enforce mechanistic constraints. Full Bayesian inference over neural ODEs is principled but computationally expensive. NLL–BINNs occupy a practical middle ground: more flexible than library-based methods, more computationally tractable than full Bayesian approaches, and more mechanistically grounded than purely statistical models. For the kinds of sparse, noisy, biologically constrained datasets that ecologists, cell biologists, and epidemiologists routinely work with, that is a genuinely useful position to occupy.
The three growth models used — logistic, Gompertz, and Richards' — were chosen precisely because they are hard to distinguish. They produce similar S-shaped population curves; their differences lie in the curvature of the crowding function, which only becomes apparent in regions of the data where populations are changing most rapidly. The fact that NLL–BINN recovers the correct functional form across all three, under three different noise regimes, suggests the approach is robust to the combination of model ambiguity and noise confounding that makes real biological data so challenging.
What's Next
The framework as presented has several natural extensions worth noting. The power-law noise model is flexible but still parametric: it assumes a particular functional family for the noise structure. Crossley and Baker acknowledge that alternative functional forms could be used where appropriate, but the question of how to choose between noise models, or learn a fully nonparametric noise structure, remains open. A neural network parameterisation of the noise magnitude itself, meaning a network that learns the shape of the noise without assuming a power law, would be a natural next step, albeit one that introduces additional training complexity.
The framework currently uses Gaussian noise as its distributional assumption. This is appropriate for many biological measurements, but count data — cell counts, organism tallies, read depths in sequencing — often follows Poisson or negative binomial distributions. The NLL–BINN architecture is general enough to accommodate these alternatives by swapping the likelihood function, but this has not yet been demonstrated.
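As a rough idea of what swapping the likelihood could look like, the sketch below writes a Poisson negative log-likelihood for count data. This is an illustration of the general idea only; as noted above, it has not been demonstrated in the paper, and the function name and toy counts are assumptions.

```python
import math

def poisson_nll(counts, rates):
    """Negative log-likelihood of integer counts under Poisson(rate):
    sum of rate - k*log(rate) + log(k!)."""
    return sum(
        rate - k * math.log(rate) + math.lgamma(k + 1)
        for k, rate in zip(counts, rates)
    )

# Toy counts sharing a single underlying rate.
counts = [2, 4, 6]
for rate in (3.5, 4.0, 4.5):
    nll = poisson_nll(counts, [rate] * len(counts))
    print(f"rate = {rate}: Poisson NLL = {nll:.4f}")
```

A Poisson variable has variance equal to its mean, so its standard deviation grows as the square root of the expected count. In the language of the power-law model, that corresponds to an exponent of 0.5 with no free scale parameter.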
The coral reef application hints at the most exciting direction: deploying this framework on real, messy, unreplicated field data where mechanistic understanding is genuinely limited. The Great Barrier Reef results show that the framework can extract meaningful signal even without ground truth — which is, after all, the condition under which most real science operates. Scaling this to multi-site ecological monitoring, longitudinal clinical studies, or spatiotemporal reaction-diffusion systems (the original BINN application domain) would substantially broaden the framework's reach.
There are also open questions about identifiability. The three growth models tested here are distinguishable in principle, but in practice, with finite and noisy data, the learned crowding functions carry uncertainty — particularly at low population densities, where the ODE is difficult to constrain. The paper notes this limitation honestly: predictions in the near-zero density regime are less reliable, because the training data provide little information there. Understanding the fundamental limits of what these frameworks can distinguish — and under what data conditions — will be important for practitioners who want to deploy them with appropriate confidence.
What the paper establishes clearly is that the assumption of constant noise is not a neutral default. It is a choice with consequences. When the noise structure in your data tells a story about biology — about how variability scales with population size, about how measurement error relates to the underlying state — ignoring that structure means leaving information on the table. The NLL–BINN framework, with its simple but powerful extension to a learnable power-law noise model, offers a principled way to stop ignoring it.
In a field where data are precious and mechanisms are elusive, that is a meaningful advance.
Rather than treating uncertainty as a by-product of modelling, the NLL–BINN framework repositions it as an integral component of mechanistic inference.