Parameters for static ESC

1 per input channel — vs. up to 6 for conventional sinusoidal ESC

Parameters for dynamic ESC

1 additional (τs) — Total of 2 per channel: nominal gain + system time constant

Optimal point shift in simulation

[0.2,0.7] → [0.8,0.3] — Two-input system; controller tracked both step changes successfully

Hold time at K=0.01 vs K=0.001

10× speed difference — Higher gain converges 10× faster but with proportionally larger oscillation

U.S. DoD ESTCP — Environmental Security Technology Certification Program — targets military facility energy efficiency

Simpler Self-Tuning Controller for Real-World

Somewhere in a commercial building right now, a chiller is running at a setpoint chosen by an engineer who visited the site once, years ago. It's not the optimal setpoint. It's probably not even close. The engineer knew that, but configuring a proper real-time optimization controller would have taken days of modeling, tuning, and testing — expertise the building operator doesn't have and the budget doesn't support. So the chiller runs where it runs, burning more energy than it needs to.

This is the quiet failure mode of building automation, and it's not unique to buildings. Bioreactors, solar tracking systems, autonomous vehicles, anti-lock brakes — anywhere a system has to find a performance sweet spot without an accurate internal model, the same problem appears. The math for solving it has existed for decades. Getting that math to work in the real world, without a control engineer babysitting the configuration, is another matter entirely.

A new paper from Timothy I. Salsbury and Min Gyung Yu at Pacific Northwest National Laboratory proposes a controller that could genuinely change this (Salsbury and Yu, 2026). The core claim is striking in its directness: their new extremum-seeking controller (ESC) requires just one configurable parameter per input channel for static systems, and only one additional parameter — a single time constant — to handle dynamic systems. For a field where six parameters per channel is the norm, that's not an incremental improvement. It's a different category of tool.

The Science

Extremum-seeking control is, at heart, an elegant idea. You don't need to know the mathematical model of your system. You just nudge it, watch which way the output moves, and adjust your inputs accordingly — feel your way to the summit, or in the case of minimization problems, the valley. The "seeking" is the algorithm's continuous probing for the direction of improvement.

The dominant flavor of ESC, popularized by Krstic and Wang (2000), works in the frequency domain. It injects a sinusoidal "dither" signal — a deliberate, small oscillation — into the system inputs, then demodulates the output to extract gradient information. It's mathematically rigorous and well-studied, but each input channel needs its own dither frequency, phase, and amplitude, plus filter coefficients and an integrator gain. For a two-input system, you could easily be configuring ten or more parameters before you start. And they interact in non-obvious ways.

An older, less-celebrated alternative uses relays instead of sinusoids. A relay is binary: it switches the input either up or down depending on whether the output cost function is currently improving. No external oscillation needed — the perturbation emerges naturally from the switching behavior itself. This relay-based approach needs fewer parameters, but until recently it had mostly been confined to single-input problems. Extending it to multiple inputs creates a mathematical tangle: when two or more inputs are changing simultaneously, how do you tell which input's direction is actually responsible for the improvement or deterioration you're seeing?

Salsbury and Yu's insight is to solve this identification problem stochastically — by deliberately randomizing the gains of the relay signals.

Figure 1: Multi-relay ESC for static map. Source: Timothy I. Salsbury, Min Gyung Yu

The logic runs as follows. At any moment, the rate of change of the output cost function $y$ is related to the partial derivatives (gradients) of the cost function with respect to each input, multiplied by how fast each input is changing — this is just the multivariable chain rule:

$\frac{d y}{d t} = \frac{\partial y}{\partial θ _{1}} \cdot \frac{d θ _{1}}{d t} + \dots + \frac{\partial y}{\partial θ _{p}} \cdot \frac{d θ _{p}}{d t}$

With multiple inputs all changing at once, you have one equation but $p$ unknowns (the partial derivatives). You need $p$ distinct observations to solve the system. The trick is that if you randomly vary the relay gains at each timestep — making each input move at a slightly different random speed each time — then sequential observations become linearly independent from each other. The matrix you need to invert in order to recover the gradient estimates is no longer singular. Inversion becomes possible, and the gradient direction — the crucial piece of information telling you whether to push each input up or down — can be recovered.
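The identification argument can be illustrated numerically. The following is a minimal sketch, not the authors' code: the gradient values, the relay states, and the two-observation setup are assumptions chosen for illustration. It shows that repeating the same fixed gains yields a rank-deficient system, while randomized gains yield one that can be solved for the gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

g_true = np.array([0.6, -0.4])       # hypothetical gradient, unknown to the controller
k0 = np.array([0.01, 0.01])          # nominal relay gains (illustrative values)
relay_state = np.array([1.0, -1.0])  # assumed relay states, held over both observations

def observe(gains):
    """Return the input rates and the dy/dt they produce via the chain rule."""
    theta_dot = -gains * relay_state
    return theta_dot, theta_dot @ g_true

# Fixed gains: both observations give the same row, so the matrix is singular.
rows_fixed = np.array([observe(k0)[0], observe(k0)[0]])
rank_fixed = np.linalg.matrix_rank(rows_fixed)

# Randomized gains: the rows differ, so the 2x2 system has a unique solution.
obs = [observe(2.0 * k0 * rng.uniform(0.0, 1.0, size=2)) for _ in range(2)]
rows_rand = np.array([rates for rates, _ in obs])
dy_rand = np.array([dy for _, dy in obs])
g_est = np.linalg.solve(rows_rand, dy_rand)

print("rank with fixed gains:", rank_fixed)
print("recovered gradient:", g_est)
```

In this idealized setting the gradient is held constant across both observations. In a running controller the inputs move between samples, so the recovered gradient is only approximate.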

The stochastic gains take the form $K_{t} = 2 K_{0} \otimes D_{t}$ , where $K_{0}$ is your nominal gain vector and $D_{t}$ is a vector of independent uniform random numbers between 0 and 1. The expected value of the randomized gain equals the nominal gain, so on average the controller is doing exactly what you'd design it to do — but the random variation around that average is precisely what makes gradient identification tractable.
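As a quick check on that claim, assuming each component of $D_{t}$ is uniform on $[0, 1]$ and therefore has mean $\tfrac{1}{2}$, linearity of expectation gives:

$E [K_{t}] = 2 K_{0} \otimes E [D_{t}] = 2 K_{0} \otimes \tfrac{1}{2} \mathbf{1} = K_{0}$

Each component of the randomized gain therefore ranges between $0$ and twice its nominal value.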

For dynamic systems — where the output doesn't respond to input changes instantaneously — an additional wrinkle appears. The controller needs to hold its relay states long enough that the system has approximately reached a new steady state before switching direction; otherwise the gradient estimates reflect transient dynamics, not the true cost landscape. The authors handle this with a relay hold time $T_{d}$ set equal to the dominant system time constant $τ_{s}$ . Gradient estimation shifts from a simple matrix inversion to recursive least squares (RLS) with exponential forgetting, but critically, the forgetting factor $λ$ is also derived automatically from $τ_{s}$ . One number configures everything.
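For readers unfamiliar with recursive least squares, the sketch below shows the standard RLS update with exponential forgetting applied to gradient estimation. It is a generic textbook form, not the paper's implementation. The class name `ForgettingRLS`, the initial covariance `p0`, and the value `lam=0.9` are assumptions. The paper derives $λ$ from $τ_{s}$, but that mapping is not reproduced here, so the forgetting factor is simply passed in.

```python
import numpy as np

class ForgettingRLS:
    """Recursive least squares with exponential forgetting (generic form)."""

    def __init__(self, n_inputs, lam, p0=1e3):
        self.lam = lam                      # forgetting factor, 0 < lam <= 1
        self.g_hat = np.zeros(n_inputs)     # current gradient estimate
        self.P = p0 * np.eye(n_inputs)      # covariance of the estimate

    def update(self, phi, dy):
        """phi: input changes over one hold period; dy: resulting cost change."""
        P_phi = self.P @ phi
        gain = P_phi / (self.lam + phi @ P_phi)
        self.g_hat = self.g_hat + gain * (dy - phi @ self.g_hat)
        self.P = (self.P - np.outer(gain, P_phi)) / self.lam
        return self.g_hat

# Usage: recover a hypothetical gradient from noise-free, randomized input moves.
rng = np.random.default_rng(1)
g_true = np.array([0.6, -0.4])
est = ForgettingRLS(n_inputs=2, lam=0.9)
for _ in range(300):
    phi = rng.uniform(-0.02, 0.02, size=2)  # stand-in for randomized relay moves
    est.update(phi, phi @ g_true)
print(est.g_hat)
```

A forgetting factor below 1 discounts old observations, which lets the estimate follow a gradient that changes as the inputs move or the optimum shifts.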

Figure 2: Multi-relay ESC for general MISO dynamic system. Source: Timothy I. Salsbury, Min Gyung Yu

What They Found

The stability proof is compact and satisfying. Using Lyapunov's direct method, a classical technique for proving that a system will converge rather than diverge, the authors choose $V (θ) = Q (θ) - Q (θ^{*})$ as a Lyapunov function, where $Q (θ)$ is the cost at input $θ$ and $Q (θ^{*})$ is the optimal (minimum) cost. This function is positive everywhere except at the optimum, where it is zero. For the controller to be provably stable, the time derivative of $V$ must be non-positive along the closed-loop trajectory. Writing $g$ for the gradient of $Q$ with respect to $θ$:

$\frac{d V}{d t} = g^{T} \dot{θ} = g^{T} (- K \otimes sgn (g)) = - ∥ (K \otimes g) ∥_{1}$

The $L_{1}$ norm, essentially the sum of absolute values, is always non-negative, so $\frac{d V}{d t}$ is always non-positive. The cost function can only decrease or stay flat. With strictly positive gains, it is flat only when the gradient $g = 0$, which means you're at the optimum. The proof closes. It's the kind of mathematical argument that's elegant because it doesn't need to be longer.
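The middle step of that identity can be written out per component. Assuming every gain $K_{i}$ is strictly positive and using $g_{i} \, sgn (g_{i}) = | g_{i} |$:

$g^{T} (K \otimes sgn (g)) = \sum_{i = 1}^{p} K_{i} g_{i} \, sgn (g_{i}) = \sum_{i = 1}^{p} K_{i} | g_{i} | = ∥ K \otimes g ∥_{1}$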

The simulations use a two-input quadratic cost function, with the optimal point $θ^{*}$ stepped from $[0.2, 0.7]$ to $[0.8, 0.3]$ mid-run — a realistic test of how the controller recovers when the world changes.

Figure 4: Simulation results for static version. Source: Timothy I. Salsbury, Min Gyung Yu

With a gain of $K = 0.01$ , the controller converges quickly to both the initial and shifted optima but oscillates noticeably around them. With $K = 0.001$ , oscillations shrink to near-zero but convergence takes ten times longer. This tradeoff — between speed of arrival and amplitude of steady-state chatter — is a known limitation of relay-based ESC.
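A rough feel for this tradeoff can be had from a small simulation. The sketch below is an illustrative reconstruction under stated assumptions, not the authors' simulation code: it uses a unit time step, a cost of the form $Q (θ) = ∥ θ - θ^{*} ∥^{2}$, a starting point of $[0.5, 0.5]$, a sliding window of the two most recent observations, and a least-squares solve for the gradient. The function name `run_static_esc`, the run length, and the shift time are hypothetical choices.

```python
from collections import deque

import numpy as np

def run_static_esc(k0, n_steps=3000, shift_at=1500, seed=0):
    """Simulate a two-input stochastic relay ESC on a quadratic static map."""
    rng = np.random.default_rng(seed)
    p = 2
    k0_vec = np.full(p, k0)
    theta = np.array([0.5, 0.5])            # assumed starting inputs
    theta_opt = np.array([0.2, 0.7])        # initial optimum from the article
    cost = lambda th, opt: float(np.sum((th - opt) ** 2))
    y_prev = cost(theta, theta_opt)
    g_hat = np.zeros(p)
    window = deque(maxlen=p)                # last p (input change, cost change) pairs
    errors = np.empty(n_steps)
    for t in range(n_steps):
        if t == shift_at:
            theta_opt = np.array([0.8, 0.3])    # optimum steps mid-run
        gains = 2.0 * k0_vec * rng.uniform(0.0, 1.0, size=p)
        relay = np.where(g_hat >= 0.0, 1.0, -1.0)
        d_theta = -gains * relay            # move against the estimated gradient sign
        theta = theta + d_theta
        y = cost(theta, theta_opt)
        window.append((d_theta, y - y_prev))
        y_prev = y
        A = np.array([row for row, _ in window])
        b = np.array([dy for _, dy in window])
        g_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
        errors[t] = np.max(np.abs(theta - theta_opt))
    return theta, errors

_, err_fast = run_static_esc(k0=0.01)
_, err_slow = run_static_esc(k0=0.001)
print("K=0.01  mean error over last 200 steps:", err_fast[-200:].mean())
print("K=0.001 mean error over last 200 steps:", err_slow[-200:].mean())
print("steps to first reach error < 0.05:",
      int(np.argmax(err_fast < 0.05)), "vs", int(np.argmax(err_slow < 0.05)))
```

The least-squares solve is used in place of a direct inverse so that an occasional near-singular pair of observations does not halt the run. The numbers this produces depend on the assumed setup and should not be read as the paper's results.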

Static ESC: Convergence Speed vs. Oscillation Trade-off

Comparison of relay gain settings for the static map simulation. Higher gains converge faster but oscillate more around the optimum.

Gain setting	Value (relative units)
K = 0.01 (high gain)	10
K = 0.001 (low gain)	1

The dynamic version, tested with a first-order system time constant $τ_{s} = 10$ seconds, shows the same qualitative tradeoff, with larger oscillations than the static case at equivalent gains — expected, because the longer hold time means each relay step integrates to a larger input excursion before correcting.

Figure 5: Simulation results for dynamic version including adaptive gain approach. Source: Timothy I. Salsbury, Min Gyung Yu

Parameter Count: New vs. Conventional ESC Approaches

Number of configurable parameters required per input channel for different extremum-seeking control strategies.

Approach	Parameters per input channel
Sinusoidal ESC (SISO)	6
Prior relay ESC (multivariable)	4
New stochastic relay ESC (static)	1
New stochastic relay ESC (dynamic)	2

The adaptive gain extension is where things get particularly interesting. Rather than fixing $K$ , the controller adjusts its gain magnitude based on the estimated gradient: larger steps when far from the optimum (large gradient), smaller steps when close (small gradient). The formula adds the vector of absolute gradient values to the base gain, so the controller is effectively running fast in open country and slowing near the target. The simulation result using this adaptive approach outperforms both fixed-gain settings — converging as fast as the aggressive $K = 0.01$ setting while oscillating nearly as little as the conservative $K = 0.001$ setting.
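The idea can be sketched in a few lines. The exact form and scaling of the paper's adaptive gain equation are not reproduced here, so the function below is an assumed form: a random base term scaled by $ζ$, plus the absolute value of the gradient estimate. The name `adaptive_gain` and the example gradient values are hypothetical.

```python
import numpy as np

def adaptive_gain(g_hat, zeta, rng):
    """Assumed form: random base term scaled by zeta, plus |gradient estimate|."""
    d_t = rng.uniform(0.0, 1.0, size=g_hat.shape)
    return np.abs(g_hat) + 2.0 * zeta * d_t

rng = np.random.default_rng(0)
far = adaptive_gain(np.array([0.8, -0.6]), zeta=0.001, rng=rng)     # steep gradient
near = adaptive_gain(np.array([0.01, -0.02]), zeta=0.001, rng=rng)  # near the optimum
print("gain far from optimum:", far)
print("gain near optimum:   ", near)
```

Under this assumed form the random term keeps the gains distinct from step to step even when the gradient estimate is close to zero, which is what gradient identification relies on.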

Dynamic ESC: Adaptive Gain vs. Fixed Gain Performance

Qualitative comparison of three controller configurations for the dynamic system simulation: fixed low gain, fixed high gain, and adaptive gain.

Configuration	Value (relative units)
Fixed K = 0.001	1
Fixed K = 0.01	10
Adaptive gain (ζ = 0.001)	9

Why This Changes Things

To appreciate why parameter minimization matters, consider the practical alternative. A conventional sinusoidal ESC applied to a building's air handling system with three control inputs might require an engineer to specify: three dither frequencies (which must be carefully chosen to avoid interference with each other and with the system's natural frequencies), three dither amplitudes, three dither phases, three high-pass filter cutoffs, three low-pass filter cutoffs, and three integrator gains. At six parameters per channel, that's potentially eighteen parameters, many of which require either detailed system modeling or extensive field tuning to set correctly.

The new algorithm, applied to the same three-input system, would require: three relay gains (set to the maximum allowable rate of change for each manipulated variable — a number any plant operator can look up or measure) and one system time constant (obtainable from a simple step test). Four numbers, all physically interpretable.

This is not a marginal improvement. It's the difference between a tool that requires a specialist and a tool that can be deployed by the person who already operates the building. The authors are explicit about this motivation: the work is funded by the U.S. Department of Defense's Environmental Security Technology Certification Program (ESTCP), which focuses on energy efficiency in military facilities — environments where energy optimization expertise is often thin but energy waste is very real.

The broader context matters too. Buildings account for roughly 40% of total energy consumption in developed economies. A significant fraction of that consumption is governed by control systems that are never properly optimized because configuration is too burdensome. Any algorithm that meaningfully lowers that barrier has implications at scale.

There's also a conceptual advance worth noting. The relay-based ESC family has historically been seen as the simpler but less capable cousin of sinusoidal ESC — useful for SISO problems, awkward for multi-input ones. This paper closes much of that capability gap. By routing around the need for external perturbation entirely and solving the gradient identification problem through stochastic gain variation, the authors achieve something the previous relay literature largely couldn't: a provably stable multi-input controller that doesn't reintroduce the complexity it was trying to avoid. Earlier multi-input relay ESC methods, such as those by Aminde et al. (2021) and Peixoto et al. (2020), ended up adding periodic functions and extra configuration parameters that undermined the simplicity argument. The stochastic relay approach sidesteps that trap cleanly.

The speed-vs-oscillation tradeoff, while a genuine limitation compared to sinusoidal ESC, is addressed by the adaptive gain scheme described in Section 4.1 of the paper. And the tradeoff itself is at least predictable: the expected steady-state error $E [e_{θ}] = K_{0} T_{d}$, where $K_{0}$ is the nominal gain and $T_{d}$ is the relay hold time, gives operators a direct formula relating gain magnitude and hold time to how tightly the controller will cluster around the optimum. Predictable tradeoffs are manageable tradeoffs.
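As an illustrative calculation, assuming the hold time equals the 10-second time constant used in the dynamic simulation ($T_{d} = τ_{s} = 10$) and that the gain is expressed in input units per second, the two gain settings discussed earlier would give:

$K_{0} = 0.01: \; E [e_{θ}] = 0.01 \times 10 = 0.1 \qquad K_{0} = 0.001: \; E [e_{θ}] = 0.001 \times 10 = 0.01$

These figures are arithmetic on the stated formula, not results reported in the paper.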

What's Next

The paper's current simulation framework is deliberately simple — a quadratic cost function with first-order dynamics. This is standard for initial ESC validation and sufficient to demonstrate the core properties, but real systems are messier. Cost functions in HVAC systems aren't quadratic; they have flat regions, local minima, measurement noise, and actuator nonlinearities. How the stochastic relay approach performs under these conditions is an open question that field testing will need to answer.

The authors note that $τ_{s}$, the one additional parameter required for dynamic systems, "will not be known exactly." They argue that overestimating it is safer than underestimating: too long a relay hold time costs convergence speed, while too short a hold time risks insufficient time-scale separation and failed convergence. This asymmetry is practically useful guidance, but it also means the algorithm is not entirely self-configuring: someone still needs to estimate the system time constant, even approximately. A future step would be an online identification routine for $τ_{s}$ that eliminates even this requirement.

The adaptive gain scheme described in the paper (Equation 16) is promising but still involves one small additional parameter $ζ$ that scales the random component of the adaptive gain. The authors set it to $0.001$ in their simulation without extended discussion of sensitivity, and whether this generalizes across different system scales and dynamics warrants further study.

Finally, the MISO (multiple-input, single-output) framing is a constraint worth acknowledging. Many real optimization problems have multiple measurable outputs — building energy efficiency is one cost function, but comfort might be another, and they may conflict. Extending the stochastic relay ESC framework to multi-objective or truly multi-output settings would substantially expand its applicable domain.

What Salsbury and Yu have delivered is something that the control community talks about wanting — a rigorous, provably stable optimizer that a non-specialist can actually deploy — and they've demonstrated it works in simulation. The next chapter is the field. If the performance holds in real buildings, real bioreactors, and real energy systems, the impact could be substantial: not because the algorithm is the most powerful optimization technique ever devised, but because it's the one that might actually get turned on.

The Six-Parameter Problem: How a Simpler Optimization Controller Could Unlock Real-World Energy Savings
