The AI That Learns How Radio Waves Actually Travel, Without Mapping Every Bounce

The Heart of Wireless — and Why It's So Hard to Model

Roughly 8.9 billion cellular subscriptions exist worldwide, and every single one depends on something engineers call the "channel" — the physical medium through which radio signals travel between a transmitter and a receiver. A signal rarely travels in a straight line. It scatters off buildings, diffracts around corners, reflects off cars, and arrives at your phone as a superposition of dozens or hundreds of copies of itself, each delayed and rotated by the geometry it encountered along the way. The engineering question is deceptively simple: can we predict how that channel will behave, so that a system can be designed to handle it?

The answer has been stuck in an uncomfortable middle ground for decades. Stochastic models — the kind used in international standards — treat propagation statistically, averaging over many environments. They're fast and broadly applicable, but they can't tell you what the signal looks like in a specific street canyon in Tokyo or a particular warehouse in Munich. Deterministic models do the opposite: they use exact 3D geometry and ray-tracing algorithms to simulate every bounce of every signal path. They're accurate, but brutally expensive to compute, making them impractical for general-purpose network design. A machine learning solution that could generalize across environments while still capturing the right statistical behavior would be genuinely transformative.

That's precisely what Richmond Boamah and Ferdous Pervej set out to build (Boamah & Pervej, 2026). Their proposed model — the TimesNet-TimeFilter hybrid, or TNTF — doesn't try to predict the exact sequence of channel values moment to moment. Instead, it learns to generate future channel realizations whose statistics match the true underlying channel. That shift in objective — from predicting instantaneous values to reproducing statistical structure — turns out to be the key that unlocks both accuracy and scalability.

The Science

The mathematical object at the center of this work is the double-directional (DD) channel model, a representation of the wireless channel that captures not just when multipath signals arrive (the delay domain) but also the direction in which each one leaves the transmitter (direction of departure, or DoD) and the direction from which it reaches the receiver (direction of arrival, or DoA). In formal terms, the channel impulse response at receiver position $r^{l}$ is expressed as a sum over $N(r^{l})$ multipath components (MPCs):

$h(t, \tau, \Omega, \Psi; \tilde{r}, r^{l}) = \sum_{n=1}^{N(r^{l})} \lvert g_{r^{l},n} \rvert \, e^{j \phi_{n}} \, \delta(\tau - \tau_{r^{l},n}) \, \delta(\Omega - \Omega_{r^{l},n}) \, \delta(\Psi - \Psi_{r^{l},n}) \, e^{j 2 \pi \nu_{r^{l},n} t}$

where $τ$ is delay, $Ω$ is direction of departure, $Ψ$ is direction of arrival, and $ν$ is the Doppler shift from receiver motion. It's a complete picture of the propagation channel, but that completeness is also the problem. As a receiver moves, objects enter and leave line-of-sight, and the number of paths $N(r^{l})$ changes constantly. Most standard ML architectures expect fixed-shape input tensors and cannot directly consume a path list whose length changes from step to step.

The researchers solve this with a principled approximation: select only the top-$M$ MPCs, ranked by received power, where $M ≪ N(r^{l})$. This trims the representation to a fixed size without discarding the most physically meaningful signal components. Even with $M = 5$ (just five signal paths out of potentially hundreds) the results, as we'll see, are remarkably accurate.
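The top-$M$ truncation can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function name `select_top_m`, the flat array layout, and the decision to zero-pad when fewer than $M$ paths exist are all assumptions made here.

```python
import numpy as np

def select_top_m(gains, delays, m):
    """Keep the m strongest paths, ranked by power |g|^2.

    Hypothetical helper: zero-pads the output when fewer than m paths
    are available, so the result always has a fixed length m.
    """
    gains = np.asarray(gains, dtype=complex)
    delays = np.asarray(delays, dtype=float)
    power = np.abs(gains) ** 2
    # Indices of the strongest paths first, truncated to at most m entries.
    order = np.argsort(power)[::-1][:m]
    out_gains = np.zeros(m, dtype=complex)
    out_delays = np.zeros(m, dtype=float)
    out_gains[: order.size] = gains[order]
    out_delays[: order.size] = delays[order]
    return out_gains, out_delays
```

Whatever the number of paths at a given receiver position, the output has shape `(m,)`, which is what makes the downstream tensors fixed-size. The same indexing would apply to the angle arrays.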

Those top-$M$ paths are then structured as a learnable graph, with nodes representing different physical parameter types: gains, delays, and the azimuth and zenith angles of departure and arrival. Manhattan distances ($L_1$ norms) between patches at different receiver positions define the edge weights, and a K-nearest-neighbors rule keeps only each node's closest neighbors, which yields a sparse adjacency matrix. The graph encodes physical intuition directly into the model's data structure.
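A rough sketch of that graph construction, under stated assumptions: each node is represented by one fixed-length patch vector, edges are directed from a node to its `k` nearest neighbors, and the adjacency is stored as a plain 0/1 mask. How the $L_1$ distances are turned into edge weights is not described here, so the sketch only records which edges exist. The name `knn_adjacency` is hypothetical.

```python
import numpy as np

def knn_adjacency(patches, k):
    """Build a sparse 0/1 adjacency mask from pairwise L1 distances.

    patches: array of shape (num_nodes, patch_len), one row per node
             (assumed layout, for illustration only).
    """
    x = np.asarray(patches, dtype=float)
    # Pairwise Manhattan (L1) distance between every pair of patches.
    dist = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)            # exclude self-loops
    nearest = np.argsort(dist, axis=1)[:, :k] # k closest nodes per row
    adj = np.zeros((x.shape[0], x.shape[0]))
    rows = np.repeat(np.arange(x.shape[0]), k)
    adj[rows, nearest.ravel()] = 1.0
    return adj
```

Each row of the result has exactly `k` non-zero entries, so the number of edges grows linearly with the number of nodes instead of quadratically.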

Figure 1: Overview of the proposed solution: conversion of DD channel realizations into a learning graph, followed by transforming node representation into higher-dimensional embedding and transformation for training the proposed TNTF model with a statistics-aided loss function. Source: Richmond Boamah, Ferdous Pervej

The TNTF model itself combines two existing architectures that have excelled independently. TimesNet transforms 1D time series into 2D representations using fast Fourier transforms, then applies 2D convolutional kernels to learn periodic patterns, an approach reported to outperform Transformers on several time-series benchmarks. TimeFilter uses mixture-of-experts routing to allocate specialized filters to different subgraphs, learning which temporal, spatial, or spatio-temporal correlations matter most for each node. The fusion of their outputs is additive: $Y := X_{\text{embed}} + H_{a} + X_{p}$, blending the embedding, temporal analysis, and graph prediction streams.
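The 1D-to-2D step attributed to TimesNet can be illustrated as follows. This is a simplified sketch, not TimesNet's implementation: it picks a single dominant period from the FFT amplitude spectrum (TimesNet works with several top frequencies), and the helper names `dominant_period` and `fold_to_2d` are hypothetical.

```python
import numpy as np

def dominant_period(x):
    """Estimate the strongest period of a 1D series from its FFT amplitudes."""
    x = np.asarray(x, dtype=float)
    amp = np.abs(np.fft.rfft(x))
    amp[0] = 0.0                        # ignore the DC component
    freq = max(int(np.argmax(amp)), 1)  # cycles per window; guard against 0
    return len(x) // freq

def fold_to_2d(x, period):
    """Reshape a 1D series into rows of one period each, zero-padding the tail."""
    x = np.asarray(x, dtype=float)
    n_rows = -(-len(x) // period)       # ceiling division
    padded = np.zeros(n_rows * period)
    padded[: len(x)] = x
    return padded.reshape(n_rows, period)
```

After folding, variation within a period runs along the columns and variation between periods runs down the rows, which is what lets a 2D convolution see both at once.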

The training objective is what makes the whole system tick. Rather than minimizing prediction error on individual channel realizations (which tends to produce models that are accurate over milliseconds and useless over seconds), the loss function directly penalizes errors in delay spread ($s_\tau$, which measures how spread out in time the multipath components are) and angular spread ($s_\Psi$, which measures how widely they are spread in angle). These are among the quantities that largely determine whether a MIMO antenna array or a beamforming system will perform well in a given environment. The loss also includes terms for per-path gain accuracy. The entire model is then trained with mini-batch stochastic gradient descent using backpropagation.
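To make the idea of a statistics-aided loss concrete, here is a minimal NumPy sketch. The spread is computed as the standard power-weighted RMS deviation; applying the same formula to angles ignores wrap-around and is a simplification. The exact form and weighting of the paper's loss are not given here, so `statistics_aided_loss`, its dictionary layout, and the weights `w_delay`, `w_angle`, `w_gain` are assumptions for illustration only.

```python
import numpy as np

def power_weighted_spread(values, gains):
    """RMS spread of `values` (delays or angles), weighted by path power |g|^2."""
    p = np.abs(np.asarray(gains)) ** 2
    v = np.asarray(values, dtype=float)
    mean = np.sum(p * v) / np.sum(p)
    return float(np.sqrt(np.sum(p * (v - mean) ** 2) / np.sum(p)))

def statistics_aided_loss(pred, true, w_delay=1.0, w_angle=1.0, w_gain=1.0):
    """Hypothetical loss: squared errors on the spreads plus a per-path gain term.

    pred / true: dicts with 'gain', 'delay', 'angle' arrays of length M
    (assumed layout).
    """
    delay_err = (power_weighted_spread(pred["delay"], pred["gain"])
                 - power_weighted_spread(true["delay"], true["gain"])) ** 2
    angle_err = (power_weighted_spread(pred["angle"], pred["gain"])
                 - power_weighted_spread(true["angle"], true["gain"])) ** 2
    gain_diff = np.asarray(pred["gain"]) - np.asarray(true["gain"])
    gain_err = float(np.mean(np.abs(gain_diff) ** 2))
    return w_delay * delay_err + w_angle * angle_err + w_gain * gain_err
```

The point of the design is that two predictions with different individual path values but the same spreads incur no spread penalty. For actual training the same arithmetic would be written in an automatic-differentiation framework so that gradients can flow through it.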

Experiments ran on two datasets: synthetic data from a geometry-based stochastic channel model (GBSCM), where receiver trajectories follow a Lissajous-like curve and MPCs are drawn from a Poisson point process; and deterministic ray-tracing data generated using OpenStreetMap, Blender, and NVIDIA Sionna for real urban geometry near Utah State University in Logan, Utah. The two datasets stress-test the model in complementary ways — one probing statistical generalization, the other probing fidelity to real-world geometry.
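For intuition about the synthetic setup, the two ingredients named above can be sketched as follows. Every number here is a placeholder: the frequency ratio, phase, radius, mean scatterer count, and region size are hypothetical and not taken from the paper. The sketch also assumes the Poisson point process governs scatterer positions in a square region, which is one common reading of a geometry-based stochastic model.

```python
import numpy as np

def lissajous_trajectory(num_steps, a=3, b=2, phase=np.pi / 2, radius=50.0):
    """Receiver positions along a Lissajous curve (all parameters hypothetical)."""
    t = np.linspace(0.0, 2.0 * np.pi, num_steps, endpoint=False)
    x = radius * np.sin(a * t + phase)
    y = radius * np.sin(b * t)
    return np.stack([x, y], axis=1)

def draw_scatterers(rng, mean_count, area_side):
    """Homogeneous Poisson point process on a square centered at the origin.

    The number of points is Poisson distributed; given that number,
    positions are independent and uniform over the square.
    """
    n = rng.poisson(mean_count)
    return rng.uniform(-area_side / 2, area_side / 2, size=(n, 2))

rng = np.random.default_rng(0)
trajectory = lissajous_trajectory(400)
scatterers = draw_scatterers(rng, mean_count=20.0, area_side=100.0)
```

A closed, self-crossing trajectory like this revisits similar geometry repeatedly, which gives a sequence model recurring structure to learn from.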

What They Found

The core result is captured in Table I of the paper: using just $M = 5$ signal paths, the TNTF model achieves a normalized mean squared error (NMSE; lower is better, with more negative values indicating smaller error relative to the signal's own power) of −8.80 dB on delay spread, and approximately −8.49 dB and −8.47 dB on azimuth angular spread for the arrival and departure directions respectively. Zenith angular spread (the vertical dimension of signal spread) is harder, coming in around −4.49 dB for arrival and −4.47 dB for departure. Those values are still negative, meaning the error power is below the power of the statistic itself (roughly 36% of it, versus about 13% for delay spread), though there's more room to improve.
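Since the results are quoted in decibels, it helps to see how NMSE in dB relates to a plain error ratio. The sketch below assumes the usual definition, error energy divided by ground-truth energy; the paper's exact normalization may differ, and the function names are hypothetical.

```python
import numpy as np

def nmse_db(pred, true):
    """NMSE in dB, assuming normalization by the ground-truth energy."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    ratio = np.sum((pred - true) ** 2) / np.sum(true ** 2)
    return float(10.0 * np.log10(ratio))

def db_to_ratio(db):
    """Convert a dB value back to a linear power ratio."""
    return 10.0 ** (db / 10.0)
```

Under that definition, `db_to_ratio(-8.80)` is about 0.13 and `db_to_ratio(-4.47)` is about 0.36, so every 10 dB drop corresponds to a tenfold reduction in relative error power.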

NMSE [dB] on SCM Dataset by Statistic and M Value (lower = better)

Normalized mean squared error between generated and ground-truth statistics on the synthetic SCM dataset for M=5, M=10, and M=15 top multipath components. The values listed are those for M=5.

Statistic	NMSE [dB]
Delay Spread	-8.803
Az AoA Spread	-8.489
Az AoD Spread	-8.4657
Zn AoA Spread	-4.4881
Zn AoD Spread	-4.4667
Gain	-24.8145

Increasing $M$ to 10 or 15 provides modest but consistent improvements across all metrics. Gain prediction is the standout: with $M = 15$, NMSE on signal gain reaches −26.97 dB, an error power of roughly 0.2% of the signal power. This makes physical sense: the strongest paths dominate the total received power, so including more of them narrows the margin between the top-$M$ approximation and the complete channel.

The cumulative distribution function (CDF) plots tell a complementary story.

Figure 2: CDF of the Statistics on SCM Datasets: $L = 100$, $P = 300$, and $M = 5$. Source: Richmond Boamah, Ferdous Pervej

For the SCM dataset with $L = 100$ historical steps and $P = 300$ future predictions, the generated channel statistics track the ground truth CDFs closely across delay spread, azimuth angular spread, and gain. The model isn't just getting the mean right — it's capturing the shape of the distribution, including the tails. That matters enormously for system design, where extreme-but-plausible conditions often drive the engineering requirements.
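The article's comparison is visual, but the same check can be put into numbers. The sketch below computes empirical CDFs and their largest vertical gap, which is the two-sample Kolmogorov-Smirnov distance. That metric is a choice made here for illustration and is not one the paper is stated to report; the function names are hypothetical.

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Fraction of samples less than or equal to each grid point."""
    s = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(s, grid, side="right") / s.size

def max_cdf_gap(generated, truth):
    """Largest vertical distance between two empirical CDFs (KS distance)."""
    generated = np.asarray(generated, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Evaluating at every observed value is enough to find the maximum gap.
    grid = np.sort(np.concatenate([generated, truth]))
    gap = np.abs(empirical_cdf(generated, grid) - empirical_cdf(truth, grid))
    return float(np.max(gap))
```

A gap near 0 means the generated statistic follows the ground-truth distribution across its whole range, tails included, while a gap near 1 means the two distributions barely overlap.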

On the ray-tracing dataset — the harder, more realistic test — results with $M = 2$ paths remain strong.

Figure 3: CDF of the Statistics on Ray Tracing Dataset: $L = 100$, $P = 300$, and $M = 2$. Source: Richmond Boamah, Ferdous Pervej

The CDF alignment holds across statistics, validating that the method works in genuine urban geometry, not just mathematical abstractions.

Figure 4: NMSE [in dB] for different $P$ on SCM Dataset: $L = 100$ and $M = 5$. Source: Richmond Boamah, Ferdous Pervej

Figure 5: NMSE [in dB] for different $P$ on Ray Tracing Dataset: $L = 100$ and $M = 2$. Source: Richmond Boamah, Ferdous Pervej

Figures 4 and 5 show NMSE performance across different prediction lengths $P$; the TNTF model consistently outperforms baseline approaches including standalone TimesNet and TimeFilter variants, maintaining its advantage as the prediction horizon extends.

NMSE on Gain Improves Significantly as M Increases

Gain prediction NMSE across M=5, M=10, and M=15 on the SCM dataset, illustrating how including more signal paths yields sharply better gain fidelity.

M	Gain NMSE [dB]
5	-24.8145
10	-26.2784
15	-26.9663

The gain metric deserves special attention. In real deployments, engineers need to know the distribution of received power: not just average signal strength but how it varies as a user moves. The model's ability to reproduce this distribution with an NMSE below −24 dB, using only the handful of strongest paths, suggests it has captured the statistical structure of propagation rather than merely memorized training trajectories.

Statistical Accuracy Profile: M=5 vs M=10 (NMSE, negated for readability)

Radar chart of negated NMSE values (so larger area = better accuracy) comparing M=5 and M=10 across all five spread statistics on the SCM dataset. The values listed are those for M=5.

Statistic	Negated NMSE [dB]
Delay Spread	8.803
Az AoA Spread	8.489
Az AoD Spread	8.4657
Zn AoA Spread	4.4881
Zn AoD Spread	4.4667

Why This Changes Things

The telecommunications industry is heading toward a world defined by extremely dense antenna arrays, millimeter-wave frequencies, and reconfigurable intelligent surfaces, all of which require far more detailed channel models than today's standards provide. 5G NR already strains conventional stochastic modeling; 6G research groups are actively debating how to handle the gap between tractable models and physical reality.

The conventional answer has been: more ray-tracing. But ray-tracing a city block takes hours to days of compute per scenario, and network operators need to evaluate thousands of scenarios. The alternative — sticking with statistical models — means missing the spatial precision that next-generation beamforming requires. This paper carves a third path: use ML to learn the statistical fingerprint of a channel from limited observations, then generate statistically faithful synthetic realizations on demand.

What's particularly elegant about the statistics-aided framing is that it sidesteps the hardest problem in channel prediction: instantaneous accuracy over long horizons. Predicting the exact multipath structure 300 time steps ahead is effectively out of reach because the environment is too dynamic. But it's entirely possible to predict that, over those 300 steps, delay spread will be distributed in a certain way and angular spread will cluster around certain values. That statistical prediction is what base station designers and protocol engineers actually need.

The graph-based data representation is also a meaningful contribution in its own right. By encoding physical meaning into the graph structure — grouping gains with gains, delays with delays, arrival angles with arrival angles — the model doesn't have to learn from scratch that delay and angle are different kinds of information. It's baked into the architecture. This is the kind of domain-knowledge injection that often separates ML models that work in the lab from those that generalize in deployment.

There's a practical implication for spectrum policy and infrastructure investment too. Realistic channel models underpin spectrum auctions, interference analysis, and regulatory decisions about how densely networks can be deployed. If those models can be generated quickly and accurately for arbitrary environments using the approach described here, regulators and operators would have far better tools to answer questions like: what's the realistic coverage of a mmWave small cell in a mixed-use urban block? How much interference can a dense indoor deployment tolerate?

What's Next

The authors are transparent about what this first version doesn't yet address. The statistics computed here are first-order: delay spread, angular spread, and gain distribution. Second-order statistics — things like the rate of change of these quantities, their autocorrelation in time, or cross-correlations between delay and angular spread — are left explicitly for future work. Those quantities matter for link adaptation and handover decisions, so their inclusion would substantially widen the model's applicability.

The birth-death process of MPCs — the fact that individual signal paths appear and disappear as objects move in the environment — is handled here by selecting top-$M$ paths at each step and accepting that different physical paths may occupy those slots at different times. This is practically reasonable but theoretically imprecise: the model isn't tracking which physical reflector each path corresponds to. A future extension using path-tracking algorithms or persistent labeling across time steps could improve physical interpretability, especially in environments with clearly identifiable reflectors like buildings and vehicles.

The ray-tracing validation, while rigorous, was conducted in a single urban area (Logan, Utah). Generalizing across dramatically different propagation environments (dense urban canyons, indoor factories, suburban campuses) would require either retraining or demonstrating transfer learning capability. The modular graph-plus-statistics-loss framework looks well-suited to transfer, but that remains to be tested.

Finally, the computational cost of TNTF relative to baselines isn't reported in detail in the paper's current form. For a solution to be adopted in real network planning pipelines, the training and inference time must compare favorably not just to deterministic ray-tracing (where any ML model will win easily) but also to leaner statistical tools. That comparison would sharpen the case for deployment.

None of these limitations diminish what's been achieved. Building a wireless channel model that is simultaneously physically grounded, statistically faithful, computationally tractable, and capable of handling the messy variability of real propagation environments is genuinely hard. The TNTF model makes a credible step toward all four of those goals at once — and in doing so, sketches what realistic ML-aided network design might look like as we move toward 6G and beyond.
