3 — Additive (α=0), intermediate (α=0.5), multiplicative (α=1) noise on 3 growth models

Growth models evaluated

3 — Logistic, Gompertz, and Richards' — all producing similar S-curves but with distinct crowding mechanisms

26 — Uniformly spaced time points over t ∈ [0, 100], representing sparse biological data conditions

10 — Independent BINNs per experiment with different random seeds, for robust uncertainty quantification

4,000 — Per ensemble member, using Adam optimiser with learning rate 10⁻³

Real-world application

Great Barrier Reef — Coral regrowth at Lady Musgrave Island — no replicates available; noise structure inferred from single time series

Neural Networks That Learn Biological Noise

The Problem with "Good Enough" Noise

When a biologist measures how many cells are in a flask, or how much coral has grown back on a reef, the measurement is never perfect. There is always noise — scatter around the true value. The question is: what kind of noise?

For decades, most mathematical models have assumed the simplest possible answer: the noise is constant. Measure a population of ten cells or ten thousand, and the wobble around the true value is the same size. Statisticians call this homoscedastic noise — same variance everywhere. It is clean, mathematically convenient, and almost always wrong in biology.

In reality, biological variability tends to scale with the thing being measured. A small colony of bacteria will have small absolute fluctuations; a large, dense population will have large ones. The noise grows with the signal. This is called heteroscedastic noise — different variance in different places. Ignoring this structure doesn't just muddy your uncertainty estimates; it can distort your understanding of the underlying biological mechanism entirely.

That is the core problem that Rebecca Crossley and Ruth Baker at the University of Oxford set out to fix (Crossley and Baker, 2026). Their paper introduces the NLL–BINN — a framework that teaches a neural network to simultaneously discover how a population grows and how the noise in its measurements scales, without being told either in advance.

The Science

To understand what NLL–BINNs do, it helps to understand what BINNs — Biologically-Informed Neural Networks — already did. BINNs belong to a family of hybrid models that blend machine learning with mechanistic structure. Rather than fitting a purely data-driven black box, BINNs embed known biological constraints directly into the training process. A standard BINN uses two neural networks in tandem: one approximates the smooth underlying trajectory of a system (say, a population size over time), and another approximates the unknown growth law governing that trajectory. Both are trained together, with a penalty that rewards solutions consistent with the governing differential equation.

The governing equation here is beautifully simple in form:

$\frac{du}{dt} = u \, g(u)$

where $u (t)$ is population density and $g (u)$ is the crowding function — the term that encodes how growth slows as the population fills its available space. The biological question is: what is $g (u)$ ? Is it linear (the logistic model)? Logarithmic (the Gompertz model)? Something in between (the Richards' model)? These three classical growth laws produce frustratingly similar S-shaped curves, making them hard to tell apart even with clean data. With noisy data, it's harder still.
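To make the comparison concrete, here is a minimal sketch of the three crowding functions in their standard textbook parameterisations, integrated forward with a simple Runge-Kutta scheme. The function names and the values of `r`, `K`, `u0`, and `nu` are illustrative assumptions, not the settings used in the paper, and the paper's exact parameterisation may differ.

```python
import numpy as np

def g_logistic(u, r, K):
    """Linear crowding: per-capita growth falls linearly with density."""
    return r * (1.0 - u / K)

def g_gompertz(u, r, K):
    """Logarithmic crowding: per-capita growth falls with log density."""
    return r * np.log(K / u)

def g_richards(u, r, K, nu):
    """Power-law crowding; nu = 1 recovers the logistic form."""
    return r * (1.0 - (u / K) ** nu)

def simulate(g, u0, t, **params):
    """Integrate du/dt = u * g(u) with classical RK4 on the time grid t."""
    f = lambda u: u * g(u, **params)
    u = np.empty_like(t, dtype=float)
    u[0] = u0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = f(u[i])
        k2 = f(u[i] + 0.5 * h * k1)
        k3 = f(u[i] + 0.5 * h * k2)
        k4 = f(u[i] + h * k3)
        u[i + 1] = u[i] + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

# Illustrative parameter values only (assumed, not taken from the paper).
t = np.linspace(0.0, 100.0, 1001)
curves = {
    "logistic": simulate(g_logistic, 0.05, t, r=0.1, K=1.0),
    "gompertz": simulate(g_gompertz, 0.05, t, r=0.1, K=1.0),
    "richards": simulate(g_richards, 0.05, t, r=0.1, K=1.0, nu=2.0),
}
for name, u in curves.items():
    print(f"{name:>8}: u(50) = {u[500]:.3f}, u(100) = {u[-1]:.3f}")
```

All three curves rise from the same initial density and saturate at the same carrying capacity; only the shape of the approach differs, which is the ambiguity the crowding function has to resolve.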

What existing BINNs did not do was model the noise itself. They minimised mean squared error — which implicitly assumes constant Gaussian noise — and called it a day. Crossley and Baker change this by introducing a learnable noise model. Specifically, they adopt a power-law form for the noise magnitude:

$\sigma(u) = \sigma_{0} |u|^{\alpha}$

Here, $\sigma_{0} > 0$ is a scale parameter and $\alpha$ is the key exponent controlling how noise scales with population density. When $\alpha = 0$, noise is purely additive and constant: the classic assumption. When $\alpha = 1$, noise is multiplicative: the standard deviation scales linearly with the population, a common pattern in count data. When $\alpha = 0.5$, the standard deviation grows with the square root of the population, so the variance grows linearly with it. Both $\sigma_{0}$ and $\alpha$ are treated as learnable parameters, so the network infers them from data alongside the growth law itself.
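A short sketch of what the power-law noise model does to synthetic observations may help. It uses the three $(\sigma_{0}, \alpha)$ pairs and the 26-point time grid quoted in this article; the underlying logistic curve, its parameters, and the function names `noise_sd` and `add_noise` are assumptions made purely for illustration.

```python
import numpy as np

def noise_sd(u, sigma0, alpha):
    """Power-law noise magnitude sigma(u) = sigma0 * |u|**alpha."""
    return sigma0 * np.abs(u) ** alpha

def add_noise(u, sigma0, alpha, rng):
    """Draw one Gaussian observation per entry of u with sd sigma(u)."""
    return u + noise_sd(u, sigma0, alpha) * rng.standard_normal(u.shape)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 100.0, 26)                   # 26 uniformly spaced time points
u_true = 1.0 / (1.0 + 19.0 * np.exp(-0.1 * t))    # assumed logistic curve (r=0.1, K=1, u0=0.05)

# (sigma0, alpha) pairs for the three regimes described in the article.
regimes = {
    "additive": (0.1, 0.0),
    "intermediate": (0.05, 0.5),
    "multiplicative": (0.2, 1.0),
}
for name, (sigma0, alpha) in regimes.items():
    y = add_noise(u_true, sigma0, alpha, rng)
    sd = noise_sd(u_true, sigma0, alpha)
    print(f"{name:>14}: sd at t=0 is {sd[0]:.4f}, sd at t=100 is {sd[-1]:.4f}")
```

In the additive regime the two printed standard deviations coincide, while in the other two the noise at saturation is larger than at the start of the curve.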

The training loss function reflects this probabilistic framing. Instead of minimising a mean squared error, the NLL–BINN minimises the negative log-likelihood (NLL), the standard tool for fitting probabilistic models, under the power-law noise model. The total loss has three components: a data term (how well does the predicted trajectory match observations, accounting for the noise structure?), an ODE term (does the trajectory actually satisfy the governing differential equation?), and a biological term (is the population non-negative?). Training uses an ensemble of ten independent networks with different random seeds, and results are reported as ensemble means and standard deviations to assess reliability.
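The three-part loss can be sketched in a framework-agnostic way as follows. This is a simplified illustration, not the authors' implementation: the weights `w_ode` and `w_bio`, the stabilising constant `eps`, and all function names are assumptions, and a real BINN would obtain `dudt` by automatic differentiation of the trajectory network rather than from a known solution.

```python
import numpy as np

def gaussian_nll(y, u, sigma0, alpha, eps=1e-8):
    """Mean Gaussian NLL of observations y around u with sd sigma0 * |u|**alpha."""
    # eps is an assumed numerical safeguard for alpha > 0 and u close to zero.
    var = (sigma0 * np.abs(u) ** alpha) ** 2 + eps
    return float(np.mean(0.5 * np.log(2.0 * np.pi * var) + 0.5 * (y - u) ** 2 / var))

def ode_residual(u, dudt, g):
    """Mean squared violation of du/dt = u * g(u) at collocation points."""
    return float(np.mean((dudt - u * g) ** 2))

def nonneg_penalty(u):
    """Mean squared negative part of u; zero when u >= 0 everywhere."""
    return float(np.mean(np.minimum(u, 0.0) ** 2))

def total_loss(y, u_obs, u_col, dudt_col, g_col, sigma0, alpha, w_ode=1.0, w_bio=1.0):
    """Data term plus weighted ODE and non-negativity terms (weights are assumed)."""
    return (gaussian_nll(y, u_obs, sigma0, alpha)
            + w_ode * ode_residual(u_col, dudt_col, g_col)
            + w_bio * nonneg_penalty(u_col))

# Illustrative check on an exact logistic solution (assumed r=0.1, K=1, u0=0.05).
t = np.linspace(0.0, 100.0, 26)
u = 1.0 / (1.0 + 19.0 * np.exp(-0.1 * t))
g = 0.1 * (1.0 - u)        # logistic crowding function evaluated along u
dudt = u * g               # exact derivative, so the ODE term vanishes
y = u + 0.2 * u * np.random.default_rng(1).standard_normal(u.shape)

loss_mult = total_loss(y, u, u, dudt, g, sigma0=0.2, alpha=1.0)
loss_add = total_loss(y, u, u, dudt, g, sigma0=0.2, alpha=0.0)
print(f"alpha=1: {loss_mult:.3f}   alpha=0: {loss_add:.3f}")
```

Because the data here carry multiplicative noise, the loss evaluated with $\alpha = 1$ should come out lower than the loss with $\alpha = 0$; that difference is the signal gradient descent can use to move $\alpha$ towards the right regime.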

Figure 1: Plots demonstrating the recovery of population dynamics, growth laws, and noise structure using the NLL–BINN framework for synthetic data generated from the logistic, Gompertz, and Richards' models (Section 3) with parameters $r$, $K$, $u_{0}$, and $\nu$ as specified in Section 3.1. Noise was added according to Equation (3), using $\alpha = 0$ and $\sigma_{0} = 0.1$ for the logistic growth model, $\alpha = 0.5$ and $\sigma_{0} = 0.05$ for the Gompertz model, and $\alpha = 1$ with $\sigma_{0} = 0.2$ for the Richards' model. For clarity, the logistic, Gompertz, and Richards' datasets shown in this figure also correspond to the additive, intermediate, and multiplicative noise models shown in Figures 2 and 3, respectively. Top row: mean learned trajectory $u_{\theta}(t)$ in solid orange lines, with one standard deviation (light blue) and noisy observations from all ten replicates (grey). Middle row: inferred crowding functions $g_{\phi}(u)$ with the mean shown in orange, the truth as a dashed blue line and the variability shown by blue shading. Bottom row: learned noise profiles, with the empirical results as blue dots and the average learned noise model across all repetitions in orange. Source: Rebecca M. Crossley, Ruth E. Baker

The team tested the framework on synthetic data generated from all three growth models (logistic, Gompertz, and Richards') across three noise regimes: additive ($\alpha = 0$, $\sigma_{0} = 0.1$), intermediate ($\alpha = 0.5$, $\sigma_{0} = 0.05$), and multiplicative ($\alpha = 1$, $\sigma_{0} = 0.2$). They also applied it to real coral reef regrowth observations from two sites near Lady Musgrave Island on the Great Barrier Reef, a case where replicates are unavailable and noise structure cannot be estimated by traditional means.

What They Found

The results are striking in their clarity. Across all three synthetic growth models and all three noise regimes, the NLL–BINN recovered both the underlying growth law and the noise exponent $α$ with high accuracy (Crossley and Baker, 2026). The ensemble mean trajectories closely tracked the ground truth. The learned crowding functions $g_{ϕ} (u)$ matched their known analytical forms — linear for logistic, logarithmic for Gompertz, nonlinear for Richards' — even though the network was never told which functional form to expect.

Noise Exponent α Recovered by NLL–BINN vs. Ground Truth

The NLL–BINN framework was tested on synthetic data from three growth models under three noise regimes. The values below are the true noise exponents α (0 = additive, 0.5 = intermediate, 1 = multiplicative) applied in each experiment, which the framework was asked to recover.

Growth model (noise regime)	True α
Logistic (additive)	0
Gompertz (intermediate)	0.5
Richards' (multiplicative)	1

More importantly, the framework correctly identified which type of noise was present in each dataset. When the ground truth was additive noise ($\alpha = 0$), the network learned a value of $α$ near zero. When noise was multiplicative ($\alpha = 1$), the network converged to values near one. Intermediate noise ($\alpha = 0.5$) was recovered accurately too. The learned noise profiles matched the empirical variance in the observations closely — a non-trivial achievement given that the noise structure was inferred alongside the dynamics rather than estimated separately.
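To see why the likelihood carries enough information to identify $\alpha$, consider a deliberately simplified version of the problem in which the true trajectory is known and only the noise parameters are fitted. The sketch below is not the NLL–BINN training procedure, which learns the trajectory, growth law, and noise jointly by gradient descent; it swaps in a profile-likelihood grid search, and the logistic curve, the grid for $\alpha$, and the name `fit_noise` are assumptions for illustration.

```python
import numpy as np

def fit_noise(y, u, alphas=None):
    """Profile-likelihood estimate of (sigma0, alpha) for sd = sigma0 * |u|**alpha."""
    if alphas is None:
        alphas = np.linspace(-0.5, 1.5, 201)      # assumed search grid for alpha
    u = np.broadcast_to(u, y.shape)
    resid_sq = (y - u) ** 2
    mean_log_u = np.mean(np.log(np.abs(u)))
    best = None
    for alpha in alphas:
        # For a fixed alpha, the maximum-likelihood sigma0 has a closed form.
        sigma0_sq = np.mean(resid_sq / np.abs(u) ** (2.0 * alpha))
        # Gaussian NLL at that sigma0, dropping terms that do not depend on alpha.
        nll = 0.5 * np.log(sigma0_sq) + alpha * mean_log_u
        if best is None or nll < best[0]:
            best = (nll, float(np.sqrt(sigma0_sq)), float(alpha))
    return best[1], best[2]

rng = np.random.default_rng(0)
t = np.linspace(0.0, 100.0, 26)
u_true = 1.0 / (1.0 + 19.0 * np.exp(-0.1 * t))    # assumed logistic curve

regimes = {"additive": (0.1, 0.0), "intermediate": (0.05, 0.5), "multiplicative": (0.2, 1.0)}
estimates = {}
for name, (sigma0, alpha) in regimes.items():
    # Ten replicate series, each observed at the 26 time points.
    noise = sigma0 * u_true ** alpha * rng.standard_normal((10, t.size))
    estimates[name] = fit_noise(u_true + noise, u_true)
    s_hat, a_hat = estimates[name]
    print(f"{name:>14}: true alpha={alpha:.2f}, fitted alpha={a_hat:.2f}, fitted sigma0={s_hat:.3f}")
```

Even this stripped-down estimator separates the three regimes, because the residuals shrink or stay constant at low density depending on $\alpha$. The harder part, which the paper addresses, is doing this while the trajectory itself is still being learned.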

Figure 2: Left: learned variance $\sigma^{2}(u)$ as a function of the population density, $u$, for ten replicated synthetic datasets generated from the logistic, Gompertz, and Richards' models (Section 3) with parameters $r$, $K$, $u_{0}$, and $\nu$ as specified in Section 3.1. Noise was added according to Equation (3), using $\alpha = 0$ (additive, blue), $\alpha = 0.5$ (intermediate, green), and $\alpha = 1$ (multiplicative, red). For clarity, the logistic, Gompertz, and Richards' datasets shown in Figure 1 correspond, respectively, to the additive, intermediate, and multiplicative noise models shown in this figure and Figure 3. Right: inferred noise parameters $(\sigma_{0}, \alpha)$ across ten ensemble runs plotted against their true values (additive in blue, multiplicative in red and intermediate in green). True $\alpha$ values are plotted as dashed lines, whereas true $\sigma_{0}$ values are plotted as dotted lines. Source: Rebecca M. Crossley, Ruth E. Baker

The uncertainty calibration results (Figure 3 in the paper) are perhaps the most practically significant finding. When the authors checked what proportion of true observations fell within the model's predicted confidence intervals at various coverage levels, the NLL–BINN was near-perfectly calibrated across all three noise regimes. If the model says "there's a 90% chance the true value falls in this range," the true value does so roughly 90% of the time. This is exactly what a well-calibrated probabilistic model should do — and it is exactly what a model trained with the wrong noise assumption cannot achieve.
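A calibration check of this kind can be sketched in a few lines: for each nominal level, count how often the observations land inside the corresponding central Gaussian interval. The example below generates data from a known multiplicative noise model and evaluates the correctly specified model against it. The logistic curve, the 400 replicates (far more than the ten used in the paper, chosen here only to make the estimate stable), and the function name are assumptions.

```python
import numpy as np
from statistics import NormalDist

def empirical_coverage(y, mu, sd, levels=(0.5, 0.8, 0.9, 0.95)):
    """Fraction of observations inside each central Gaussian interval mu +/- z * sd."""
    z = np.abs(y - mu) / sd
    return {p: float(np.mean(z <= NormalDist().inv_cdf(0.5 + p / 2.0))) for p in levels}

rng = np.random.default_rng(0)
t = np.linspace(0.0, 100.0, 26)
u_true = 1.0 / (1.0 + 19.0 * np.exp(-0.1 * t))   # assumed logistic curve
sd_true = 0.2 * u_true                           # multiplicative noise: sigma0=0.2, alpha=1

# 400 replicate series is an assumption made for a stable coverage estimate.
y = u_true + sd_true * rng.standard_normal((400, t.size))

coverage = empirical_coverage(y, u_true, sd_true)
for p, c in coverage.items():
    print(f"nominal {p:.2f} -> empirical {c:.3f}")
```

For a well-calibrated model the empirical column tracks the nominal one. Plotting one against the other gives the kind of calibration curve described above, with the diagonal as the target.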

Noise Scale Parameter σ₀ by Model and Regime

The σ₀ scale parameters used to generate synthetic data across three growth models and three noise regimes. These are the ground-truth values the NLL–BINN was tasked with recovering.

Growth model / noise regime	σ₀
Logistic / Additive (α=0)	0.1
Gompertz / Intermediate (α=0.5)	0.05
Richards' / Multiplicative (α=1)	0.2

To assess whether the noise modelling actually improves mechanistic recovery (not just uncertainty estimates), the team compared NLL–BINN directly against a standard BINN trained with root mean squared error (RMSE) loss on multiplicative noise data — the most challenging case for a constant-noise assumption. They computed the root mean squared error between the true population trajectory and the trajectory generated by simulating forward using the learned growth law. The NLL–BINN consistently outperformed the RMSE-based BINN (Crossley and Baker, 2026). Ignoring noise structure doesn't just give you bad confidence intervals; it gives you a worse picture of the biology.
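The evaluation metric described here can be sketched as follows: take a crowding function, simulate the governing equation forward from the initial condition, and measure the RMSE against the true trajectory. The two "learned" crowding functions below are hypothetical stand-ins chosen to show how the metric behaves; they are not outputs of either BINN and say nothing about the paper's actual numbers.

```python
import numpy as np

def simulate_forward(g, u0, t, substeps=20):
    """Integrate du/dt = u * g(u) with RK4, returning u at the times in t."""
    f = lambda u: u * g(u)
    u = float(u0)
    out = [u]
    for i in range(len(t) - 1):
        h = (t[i + 1] - t[i]) / substeps
        for _ in range(substeps):
            k1 = f(u)
            k2 = f(u + 0.5 * h * k1)
            k3 = f(u + 0.5 * h * k2)
            k4 = f(u + h * k3)
            u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append(u)
    return np.array(out)

def trajectory_rmse(u_true, u_sim):
    """Root mean squared error between two trajectories on the same time grid."""
    return float(np.sqrt(np.mean((u_true - u_sim) ** 2)))

t = np.linspace(0.0, 100.0, 26)
u_true = 1.0 / (1.0 + 19.0 * np.exp(-0.1 * t))   # assumed logistic truth (r=0.1, K=1, u0=0.05)

g_true = lambda u: 0.10 * (1.0 - u / 1.00)
g_close = lambda u: 0.10 * (1.0 - u / 1.02)      # hypothetical well-recovered growth law
g_far = lambda u: 0.12 * (1.0 - u / 1.20)        # hypothetical poorly recovered growth law

rmse_exact = trajectory_rmse(u_true, simulate_forward(g_true, 0.05, t))
rmse_close = trajectory_rmse(u_true, simulate_forward(g_close, 0.05, t))
rmse_far = trajectory_rmse(u_true, simulate_forward(g_far, 0.05, t))
print(f"exact: {rmse_exact:.2e}   close: {rmse_close:.3f}   far: {rmse_far:.3f}")
```

Simulating forward from the learned growth law, rather than scoring the fitted curve directly, tests whether the mechanism has been recovered, not just whether the data have been interpolated.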

Figure 4: Plot showing the distribution of the root mean squared error between the true underlying dynamics and the mechanistically simulated solutions using the learned growth dynamics across ten replicated synthetic datasets generated from the logistic, Gompertz, and Richards' models (Section 3) with parameters $r$, $K$, $u_{0}$, and $\nu$ as specified in Section 3.1. Noise was added according to Equation (3), using $\alpha = 1$ and $\sigma_{0} = 0.2$. Results are tested for both the NLL–BINN introduced in this work (orange) and a standard BINN, trained using a root mean squared error (RMSE, green) for the data loss. A seed is set for each repetition such that the data split into training, test and validation subsets is the same for both BINNs, as well as the initialisation. Thereafter, differences in the training process and overall results are driven purely by the differences in the data loss functions for the BINNs (NLL, or RMSE). Source: Rebecca M. Crossley, Ruth E. Baker

Three Noise Regimes: Key Parameters

Summary of the three synthetic noise configurations used to evaluate the NLL–BINN framework, showing the noise exponent α and scale σ₀ for each regime.

Noise regime	α	σ₀
Additive	0	0.1
Intermediate	0.5	0.05
Multiplicative	1	0.2

The coral reef application is the real-world proof of concept. At two reef sites near Lady Musgrave Island — designated Site 1 and Site 3 — the framework was applied to coral cover recovery data, where there is only one measurement per time point (no replicates to estimate variance empirically). At both sites, the NLL–BINN inferred a noise model where variability increases with population density — consistent with multiplicative-style heteroscedastic noise. The framework also recovered smooth, biologically plausible growth trajectories and crowding functions, all from a single sparse time series with no ground truth to check against. The noise parameters $(σ_{0}, α)$ were reported for each site, giving ecologists a quantitative window into measurement variability that was previously inaccessible.

Figure 5: Application of the NLL–BINN framework to coral re-growth data from two reef sites near Lady Musgrave Island, Australia (top: Site 1; bottom: Site 3). Left column: learned trajectories $u_{\theta}(t)$ (solid blue) fitted to the observed data (points), showing sigmoid growth curves. Middle column: inferred crowding functions $g_{\phi}(u)$, capturing the density-dependent growth behaviour without prescribing a functional form. Right column: learned state-dependent noise models $\sigma(u)$, with shaded regions indicating variability across the ensemble. In both sites, the inferred noise increases with population density, consistent with heteroscedastic variability. Estimated noise parameters $(\sigma_{0}, \alpha)$ are shown for each site. Source: Rebecca M. Crossley, Ruth E. Baker

Why This Changes Things

The deeper conceptual shift here is about what noise is in biological modelling. The conventional view treats noise as a nuisance: variation you subtract out, average over, or paper over with wide confidence bands. This paper argues, implicitly but forcefully, that noise structure is itself biological information. If the variance in your population measurements scales with population size, that tells you something real about the mechanisms generating variability — whether it's demographic stochasticity, measurement technology limitations, environmental heterogeneity, or ecological patchiness.

By learning the noise model rather than assuming it, the NLL–BINN framework makes that information legible. It converts a statistical artefact into a mechanistic signal.

This matters most for the kinds of data biologists actually collect. Ecological field data is sparse — you can't sample a reef every hour. Clinical trial data from patient cohorts has natural heteroscedasticity — variability in tumour growth or immune cell counts scales with baseline levels. Single-cell sequencing data has noise that depends on cell state. In all these settings, the standard MSE assumption is not just slightly wrong; it is structurally wrong in ways that propagate into parameter estimates and mechanistic conclusions.

The comparison with existing methods is also instructive. Sparse identification of nonlinear dynamics (SINDy) and symbolic regression can discover governing equations, but they typically require pre-specified function libraries or relatively clean data. Gaussian process approaches offer principled uncertainty quantification but can struggle to enforce mechanistic constraints. Full Bayesian inference over neural ODEs is principled but computationally expensive. NLL–BINNs occupy a practical middle ground: more flexible than library-based methods, more computationally tractable than full Bayesian approaches, and more mechanistically grounded than purely statistical models. For the kinds of sparse, noisy, biologically constrained datasets that ecologists, cell biologists, and epidemiologists routinely work with, that is a genuinely useful position to occupy.

The three growth models used — logistic, Gompertz, and Richards' — were chosen precisely because they are hard to distinguish. They produce similar S-shaped population curves; their differences lie in the curvature of the crowding function, which only becomes apparent in regions of the data where populations are changing most rapidly. The fact that NLL–BINN recovers the correct functional form across all three, under three different noise regimes, suggests the approach is robust to the combination of model ambiguity and noise confounding that makes real biological data so challenging.

What's Next

The framework as presented has several natural extensions worth noting. The power-law noise model $\sigma(u) = \sigma_{0} |u|^{\alpha}$ is flexible but still parametric: it assumes a particular functional family for the noise structure. Crossley and Baker acknowledge that alternative functional forms could be used where appropriate, but the question of how to choose between noise models, or learn a fully nonparametric noise structure, remains open. A neural network parameterisation of $\sigma(u)$ itself, that is, a network that learns the shape of the noise without assuming a power law, would be a natural next step, albeit one that introduces additional training complexity.

The framework currently uses Gaussian noise as its distributional assumption. This is appropriate for many biological measurements, but count data — cell counts, organism tallies, read depths in sequencing — often follows Poisson or negative binomial distributions. The NLL–BINN architecture is general enough to accommodate these alternatives by swapping the likelihood function, but this has not yet been demonstrated.
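As a rough idea of what swapping the likelihood could look like, the sketch below writes per-observation negative log-likelihoods for Poisson and negative binomial counts. This is speculative illustration only: nothing like it is demonstrated in the paper, and the function names and the mean-dispersion parameterisation of the negative binomial (variance `mu + mu**2 / phi`) are assumptions.

```python
import math

def poisson_nll(k, mu):
    """Negative log-likelihood of count k under a Poisson with mean mu."""
    return mu - k * math.log(mu) + math.lgamma(k + 1)

def negbin_nll(k, mu, phi):
    """Negative log-likelihood of count k under a negative binomial
    with mean mu and variance mu + mu**2 / phi (phi > 0 is the dispersion)."""
    log_pmf = (math.lgamma(k + phi) - math.lgamma(phi) - math.lgamma(k + 1)
               + phi * math.log(phi / (phi + mu))
               + k * math.log(mu / (phi + mu)))
    return -log_pmf

# Hypothetical example: a model predicts a mean of 5 cells and 12 are counted.
predicted_mean, observed_count = 5.0, 12
print("Poisson NLL:           ", poisson_nll(observed_count, predicted_mean))
print("Neg. binomial NLL (phi=2):", negbin_nll(observed_count, predicted_mean, 2.0))
```

In such a variant the predicted trajectory would play the role of `mu`, and a dispersion parameter like `phi` could be learned in place of $(\sigma_{0}, \alpha)$. The overdispersed negative binomial penalises the surprising count less heavily than the Poisson does, which is the practical reason to prefer it for noisy count data.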

The coral reef application hints at the most exciting direction: deploying this framework on real, messy, unreplicated field data where mechanistic understanding is genuinely limited. The Great Barrier Reef results show that the framework can extract meaningful signal even without ground truth — which is, after all, the condition under which most real science operates. Scaling this to multi-site ecological monitoring, longitudinal clinical studies, or spatiotemporal reaction-diffusion systems (the original BINN application domain) would substantially broaden the framework's reach.

There are also open questions about identifiability. The three growth models tested here are distinguishable in principle, but in practice, with finite and noisy data, the learned crowding functions carry uncertainty — particularly at low population densities, where the ODE is difficult to constrain. The paper notes this limitation honestly: predictions in the near-zero density regime are less reliable, because the training data provide little information there. Understanding the fundamental limits of what these frameworks can distinguish — and under what data conditions — will be important for practitioners who want to deploy them with appropriate confidence.

What the paper establishes clearly is that the assumption of constant noise is not a neutral default. It is a choice with consequences. When the noise structure in your data tells a story about biology — about how variability scales with population size, about how measurement error relates to the underlying state — ignoring that structure means leaving information on the table. The NLL–BINN framework, with its simple but powerful extension to a learnable power-law noise model, offers a principled way to stop ignoring it.

In a field where data are precious and mechanisms are elusive, that is a meaningful advance.
