The CNC Algorithm That Cuts Machining Time by 15% — Without a Human Tuning It

Every CNC machine tool on a factory floor — the kind that mills turbine blades, molds smartphone cases, and shapes the titanium hip implants inside people's bodies — is moving slower than physics allows. Not because of the hardware. Because of the software telling it how fast to go.
The gap between "how fast the machine actually moves" and "how fast it theoretically could move" has been known for decades. Closing it has been the subject of a rich academic literature. But almost none of that work has made it into real factory controllers. The reason: the clever optimization algorithms researchers developed were simply too slow to run in real time, and required expert engineers to hand-tune their parameters for every new part geometry. On the shop floor, those are fatal flaws.
A paper from Haijia Xu and Alexander Verl at the University of Stuttgart changes that calculus in a concrete way. Their proposed algorithm, a lexicographic linear programming framework with sequential windowing, reduces finishing time by more than 15% compared to an industrial CNC kernel on a five-axis freeform test contour, processes 100,000 constraint checkpoints in about 14 seconds on a single core of a 2012 Intel i5 processor, requires no manual weight tuning, and scales linearly to toolpaths with one million constraint checkpoints (Xu & Verl, 2026). That combination of speed, quality, and practicality is genuinely new.
The Science
To understand what makes this hard, it helps to know what a CNC controller actually does every millisecond. A machining toolpath is a curve in space — the path a cutting tool must follow across a workpiece. The controller's job is to decide, at every moment, how fast the tool should travel along that curve: the feedrate. Move too fast and you violate the machine's actuator limits — the maximum velocities and accelerations of each servo motor — causing vibration, error, or mechanical damage. Move too slow and you waste time and money.
The feedrate profile for a complex freeform path is not a simple shape. It dips near tight curves where centripetal acceleration would otherwise overwhelm the motors, climbs in straight sections, and must smoothly rise and fall everywhere. The theoretically optimal profile, the one that minimizes total machining time while respecting every constraint, is a freeform curve determined by the geometry of the path and the machine's physical limits. Industrial controllers have historically approximated it using rigid templates: seven-phase trapezoidal profiles with piecewise linear segments that are fast to compute but inherently cannot match the true optimum.
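The dip near tight curves follows from the centripetal relation $a = v^2 / r$: for a fixed acceleration budget, the admissible speed shrinks with the square root of the radius. A minimal Python sketch, assuming a single scalar limit `a_max` and a hypothetical helper name (a real five-axis machine has per-axis limits, so this is only the simplest case):

```python
import math


def curvature_speed_limit(a_max, radius):
    """Largest tangential speed v with v**2 / radius <= a_max (centripetal bound only)."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return math.sqrt(a_max * radius)


# Hypothetical numbers: 2 m/s^2 of lateral acceleration budget.
for radius_m in (0.5, 0.05, 0.005):
    v = curvature_speed_limit(2.0, radius_m)
    print(f"radius {radius_m} m -> at most {v:.3f} m/s")
```

Shrinking the radius by a factor of 100 lowers the admissible speed by a factor of 10, which is why tight features dominate the shape of the feedrate profile.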
Optimization-based methods that compute the true optimum exist, but they have been blocked from industrial adoption by two problems. First, they are computationally expensive — previous approaches using second-order cone programming or interior point methods take far too long for real-time execution. Second, they involve competing objectives: maximizing speed versus keeping the motion smooth. Combining those objectives requires a weighting factor that has no clear physical meaning and must be re-tuned every time the part geometry changes.
Xu and Verl tackle both problems simultaneously. Their foundation is a linear programming (LP) formulation. LP is a class of optimization that can be solved extremely efficiently because the objective and constraints are all linear (straight-line) functions of the unknowns. The key mathematical trick is a variable substitution: instead of optimizing the feedrate directly, they optimize its square, which converts the nonlinear velocity and acceleration constraints into linear ones. The resulting problem is sparse (most of the entries in the giant constraint matrix are zero), and they exploit that structure with the HiGHS solver, a state-of-the-art sparse LP engine (Huangfu & Hall, 2017).
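To see why a substitution of this kind linearizes the constraints, consider the form that is standard in time-optimal path planning: with $q = \dot{u}^2$, an axis moving along $x(u)$ has squared velocity $x'(u)^2 \, q$ and acceleration $x''(u)\, q + \tfrac{1}{2} x'(u)\, \frac{dq}{du}$, both linear in $q$ and its derivative. The Python sketch below checks that identity numerically. The path, the time law, and all names are hypothetical, and the paper's own notation may differ.

```python
import math


# Hypothetical single-axis path x(u) and its derivatives with respect to u.
def x(u):
    return math.sin(2.0 * u)


def dx(u):
    return 2.0 * math.cos(2.0 * u)


def ddx(u):
    return -4.0 * math.sin(2.0 * u)


def u_of_t(t):
    """Hypothetical time law: u = t**2, so u_dot = 2t and q(u) = u_dot**2 = 4u."""
    return t * t


def axis_velocity_squared(u, q):
    """Squared axis velocity; linear in q."""
    return dx(u) ** 2 * q


def axis_acceleration(u, q, dq_du):
    """Axis acceleration; linear in q and dq/du."""
    return ddx(u) * q + 0.5 * dx(u) * dq_du


def finite_difference_acceleration(t, h=1e-4):
    """Second time derivative of x(u(t)) by central differences, as an independent check."""
    f = lambda s: x(u_of_t(s))
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


t = 0.5
u = u_of_t(t)
q, dq_du = 4.0 * u, 4.0          # q(u) = 4u for the time law above
a_linear = axis_acceleration(u, q, dq_du)
a_check = finite_difference_acceleration(t)
print(f"linear-in-q formula: {a_linear:.6f}, finite difference: {a_check:.6f}")
```

Because the coefficients `dx(u)` and `ddx(u)` depend only on the geometry, they can be precomputed at every checkpoint, leaving bounds that are linear in the unknowns.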
A five-axis machine tool controls both where the tool tip goes (position: $X$, $Y$, $Z$) and *which direction* the tool points (orientation: $U$, $V$, $W$). Previous methods often handled these separately and then tried to synchronize them. Xu and Verl instead express everything through a single unified path parameter $u \in [0, 1]$, with a kinematic Jacobian linking the machine's five axes to the six-dimensional tool state. Position and orientation constraints are simultaneously enforced within the same optimization problem, so synchronization comes for free.
What They Found
The paper's central claim is that a lexicographic design principle resolves the speed-versus-smoothness tension without any human intervention. Lexicographic optimization, named after the logic of a dictionary, where you sort by first letter before worrying about the second, works in two sequential steps. First, maximize feedrate to find the optimal finishing time. Second, minimize motion roughness (formally, the total variation in the feedrate's second derivative) subject to the constraint that total feedrate cannot degrade by more than a set tolerance from the first-step optimum.
The tolerance has a direct physical meaning: it bounds how much feedrate, and hence roughly how much finishing time, you are willing to sacrifice for smoothness. Setting it to 1% on the freeform test contour reduced acceleration-level chattering by 24% while increasing finishing time by only 1.3%, with no user intervention beyond specifying that one interpretable number (Xu & Verl, 2026). Compare that to the weighted-sum approach (equation (8) in the paper), where the weight $\gamma$ has no obvious physical meaning and must be laboriously re-tuned for every new part geometry.
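The two-step logic can be illustrated without an LP solver. The Python sketch below applies it to a small finite set of candidate profiles: stage one finds the best achievable total feedrate, and stage two picks the smoothest candidate that stays within the tolerance of it. The candidates, the roughness measure, and the function names are all hypothetical; the paper solves each stage as a linear program over continuous variables rather than by enumeration.

```python
def roughness(profile):
    """Sum of absolute second differences, a simple stand-in for a smoothness measure."""
    return sum(
        abs(profile[i - 1] - 2 * profile[i] + profile[i + 1])
        for i in range(1, len(profile) - 1)
    )


def lexicographic_pick(candidates, tol):
    """Stage 1: best total feedrate. Stage 2: smoothest candidate within `tol` of it."""
    best_total = max(sum(p) for p in candidates)
    floor = (1.0 - tol) * best_total
    admissible = [p for p in candidates if sum(p) >= floor]
    return min(admissible, key=roughness)


# Hypothetical feedrate profiles sampled at five checkpoints.
fast_but_rough = (0, 40, 20, 40, 0)     # total 100, roughness 160
nearly_as_fast = (0, 33, 32, 33, 0)     # total 98,  roughness 70
slow_and_smooth = (0, 25, 25, 25, 0)    # total 75,  roughness 50
candidates = [fast_but_rough, nearly_as_fast, slow_and_smooth]

print(lexicographic_pick(candidates, tol=0.0))    # speed only
print(lexicographic_pick(candidates, tol=0.05))   # give up at most 5% of total feedrate
```

The tolerance is the only knob, and it is stated in the units of the first objective, which is what makes it easier to reason about than a weight that mixes two unrelated quantities.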
Lexicographic Smoothing: Speed vs. Smoothness Trade-off
Effect of 1% lexicographic tolerance on finishing time and acceleration chattering reduction
| Metric | Change |
|---|---|
| Chattering reduction | 24% |
| Finishing time increase | 1.3% |
The scalability results are equally striking. The sequential windowing strategy, which runs the LP optimizer on short overlapping windows of the toolpath rather than solving the whole thing at once, produces linear scaling with toolpath length. A one-shot optimizer's computational cost grows superlinearly; as the toolpath grows longer, the computation time explodes. The windowed approach's does not. Processing 10,500 constraint checkpoints takes 1.5 seconds on the legacy Intel i5-3470. Processing 100,000 takes 14.3 seconds. One million checkpoints, an ultra-long toolpath with a finishing time exceeding 5.4 hours, takes 51.5 seconds on a high-performance AMD 9950X and 145.7 seconds on the older Intel chip. The windowed solution's finishing time stays within ±0.2% of the one-shot global optimum; the one-shot solver, ironically, sometimes does worse on large problems because it terminates prematurely.
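The windowing idea can be sketched on a deliberately simplified problem. In the Python below, each checkpoint has a cap on the squared feedrate and neighbouring values may differ by at most `d`. For that constraint set a forward-backward pass returns the pointwise-largest feasible profile, so it stands in for the LP solve here. Forcing every window to end at rest is an assumption made for this sketch so that committed values can never become infeasible later; the paper's actual boundary handling, constraints, and solver are different.

```python
def max_profile(cap, d, q_start, q_end):
    """Pointwise-largest q with q[i] <= cap[i], |q[i+1] - q[i]| <= d,
    q[0] <= q_start and q[-1] <= q_end (forward-backward pass)."""
    q = list(cap)
    q[0] = min(q[0], q_start)
    q[-1] = min(q[-1], q_end)
    for i in range(1, len(q)):             # acceleration limit
        q[i] = min(q[i], q[i - 1] + d)
    for i in range(len(q) - 2, -1, -1):    # deceleration limit
        q[i] = min(q[i], q[i + 1] + d)
    return q


def windowed_profile(cap, d, window, overlap):
    """Solve overlapping windows left to right, committing all but the overlap."""
    assert 1 <= overlap < window
    n, step = len(cap), window - overlap
    out, start, q_in = [], 0, 0            # the tool starts at rest
    while True:
        end = min(start + window, n)
        # Each window must be able to stop at its own end, so nothing beyond
        # it can make the already committed part infeasible.
        q = max_profile(cap[start:end], d, q_in, 0)
        if end == n:                       # last window: keep everything
            out.extend(q)
            return out
        out.extend(q[:step])
        q_in = q[step]                     # hand-over value for the next window
        start += step


# Hypothetical caps: a slow feature in the middle of an otherwise fast path.
cap = [100] * 20 + [16] * 5 + [100] * 20
d = 10
one_shot = max_profile(cap, d, 0, 0)
wide = windowed_profile(cap, d, window=26, overlap=11)
narrow = windowed_profile(cap, d, window=6, overlap=2)
print(sum(one_shot), sum(wide), sum(narrow))
```

Each window costs a fixed amount of work and the number of windows grows in proportion to the path length, which is where the linear scaling comes from. In this toy setting the windowed result matches the one-shot result once the overlap is long enough to brake from the highest cap, while a too-short overlap stays feasible but gives up speed.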
Sequential Windowing: Linear Scaling with Toolpath Length
Computation time vs. number of constraint checkpoints on Intel i5-3470 (single core)
| Constraint checkpoints | Computation time |
|---|---|
| 10,500 | 1.5 s |
| 100,000 | 14.3 s |
| 1,000,000 | 145.7 s |
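A quick way to read the linear-scaling claim off these numbers is to divide each computation time by its checkpoint count. The short Python check below does that with the three reported figures; only the variable names are invented.

```python
# Checkpoints -> seconds on the Intel i5-3470, as reported above.
timings = {10_500: 1.5, 100_000: 14.3, 1_000_000: 145.7}

per_point_us = {n: seconds / n * 1e6 for n, seconds in timings.items()}
for n, microseconds in per_point_us.items():
    print(f"{n:>9,} checkpoints: {microseconds:.1f} microseconds each")

# Ratio between the most and least expensive per-checkpoint cost.
spread = max(per_point_us.values()) / min(per_point_us.values())
print(f"spread across two orders of magnitude: {spread:.3f}x")
```

The per-checkpoint cost stays close to constant while the problem size grows by a factor of roughly 100, which is what linear scaling looks like in practice.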
The benchmark against a real industrial CNC kernel is where the practical significance lands. On a five-axis freeform test contour, the TwinCAT 3 CNC kernel finished the path in 20.24 seconds. The proposed LexLP+SeWin method finished it in 16.86 seconds, a reduction of about 16.7%. The one-shot feedrate maximization baseline (no smoothing, no windowing) achieved 16.54 seconds, meaning smoothing and windowing together cost only about 0.3 seconds, or roughly 2%, over the raw theoretical minimum (Xu & Verl, 2026).
Finishing Time Comparison: LexLP vs. Industrial CNC Kernel
Machining finishing times for three methods on the freeform five-axis test contour
| Method | Finishing time |
|---|---|
| Industrial CNC kernel | 20.24 s |
| LexLP + SeWin (proposed) | 16.86 s |
| One-shot max (theoretical best) | 16.54 s |
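The percentages behind these three finishing times are easy to recompute. The Python lines below do so from the reported values; the variable names are illustrative.

```python
kernel_s, proposed_s, one_shot_s = 20.24, 16.86, 16.54   # seconds, from the table

reduction = (kernel_s - proposed_s) / kernel_s    # saving versus the industrial kernel
overhead_s = proposed_s - one_shot_s              # cost of smoothing plus windowing
overhead_rel = overhead_s / one_shot_s

print(f"reduction vs. kernel: {reduction:.1%}")
print(f"overhead vs. one-shot maximum: {overhead_s:.2f} s ({overhead_rel:.1%})")
```

The saving works out to a little under 17%, comfortably above the "more than 15%" headline, and the gap to the unsmoothed one-shot baseline is about a third of a second.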
Axis-level setpoints and Cartesian motion profiles were validated against the physical limits. Only minor constraint violations were observed at the 1 kHz interpolation rate, within the expected numerical tolerance of discretization. Crucially, the unified parameterization scheme meant that translational and rotational Cartesian constraints were enforced simultaneously without separate arc-length or arc-radian parameterization, simplifying the geometric preprocessing pipeline.
Physical experiments on a real five-axis machine confirmed the simulation results. A trajectory with a total arc length of 3,766 mm and arc radian of 232° was executed. The 15% finishing time reduction held up in hardware. Tracking errors — how closely the machine actually follows the commanded path — remained at a similar level to the CNC kernel baseline. The control signals from the LexLP method were slightly larger during transient phases (direction reversals), reflecting less conservative use of the acceleration envelope, but well within acceptable bounds.
Why This Changes Things
The 15% figure deserves some unpacking. On any individual part, 15% sounds modest. Across a factory running CNC machines continuously, it is not. A machine whose cutting time per part drops by 15% can, in rough terms, turn out about 18% more parts in the same hours (1/0.85 ≈ 1.18) with no capital expenditure, at least where cutting time dominates the cycle. For high-value machining (aerospace components, medical implants, precision molds), where machine-hours cost hundreds of dollars, that is a meaningful economic shift. For high-mix, low-volume production where parts change constantly, the tuning-free property matters even more: there is no specialist available to re-optimize for every new job.
There is also a precision angle that the paper mentions but deserves more emphasis. Because the sequential windowing is computationally cheap enough to afford fine discretization, it could allow ultra-precise feedrate control at the level needed for laser micromachining — where positional errors are measured in micrometers and motion smoothness directly determines surface quality. The same framework that speeds up a milling machine also has the potential to improve precision.
The deeper conceptual contribution is the lexicographic design principle itself. Multi-objective optimization problems in engineering almost always get handled by weighted sums: combine your objectives into one formula, tune the weights, hope they generalize. Weighted sums are simple but fundamentally unprincipled — the weights have no clear physical interpretation, their optimal values depend on the problem's scale and geometry, and they require domain expertise to set. The lexicographic approach instead encodes a genuine hierarchy of priorities (speed first, smoothness second, with a physically meaningful tolerance between them) and solves each stage sequentially. This is a strategy that could be applied far beyond feedrate planning — anywhere two objectives need to be balanced adaptively across varying problem geometries.
The real-time execution on a single core of a decade-old processor is not a minor footnote. CNC systems in the field run on embedded industrial PCs, often with older hardware and strict real-time operating constraints. Research algorithms that require a GPU cluster or a modern workstation are irrelevant to those systems. Demonstrating verified performance on an Intel i5-3470 — released in 2012, comparable to what sits inside many industrial control cabinets today — is a deliberate design choice that makes the path to deployment concrete.
What's Next
The authors identify parallel kinematics as the next target for the framework. Unlike the serial five-axis machine tested here, parallel kinematic machines (such as hexapods and Delta robots) have direct analytical expressions relating Cartesian space to joint space, which could allow even more efficient LP formulations. Extension to jerk-limited planning — constraining not just acceleration but the rate of change of acceleration, which matters for very high-speed precision applications — is noted as feasible within the LP framework using the pseudo-jerk concept from prior work (Fan et al., 2013).
Some open questions remain. The current validation uses a single freeform test contour; generalization across the full diversity of real-world industrial toolpaths (highly variable curvature, near-singular machine configurations, tool-axis flips) would need systematic study before confident deployment claims. The window length and overlap parameters do require some configuration, even if they are more physically interpretable than the weighted-sum weight $\gamma$; their optimal selection relative to available CPU time and path geometry is not yet fully automated.
The tolerance is physically meaningful, but the paper's experiments use only a single value (1%). Understanding how users should select this parameter across different quality requirements — and whether it interacts with the geometric complexity of the toolpath in non-obvious ways — is a practical question that will matter when the system reaches the hands of machinists rather than researchers.
Still, the overall picture is one of genuine progress on a problem that has frustrated the manufacturing research community for years. The gap between the theoretical time-optimal solution and what industrial machines actually achieve has long been justified by the impossibility of running sophisticated optimization in real time. That justification is now considerably weaker. A 2012 desktop processor, one CPU core, 14 seconds, 15% improvement. The numbers are specific enough to take seriously — and specific enough to challenge.
Manufacturing productivity is not a glamorous research topic. But the machines that fabricate almost every physical object in modern life — from the housings of medical devices to the turbine discs inside jet engines — run on software that has barely changed in its fundamental architecture for two decades. Work like this is how that changes.