The Weather Prediction Sweet Spot Nobody Can Nail

Taken to its logical extreme, a new PNAS paper suggests we could stop running new weather simulations altogether - just keep recycling old ones forever, like a meteorological perpetual motion machine. Fortunately, the actual research is far more grounded than that, but the core idea is still kind of wild: your old weather forecasts aren't trash. They're a goldmine you've been throwing away.

Here's the problem. Predicting tomorrow's weather? We're pretty good at that. Guessing whether next winter will be warmer than average? Decent enough. But that awkward middle zone - two weeks to two months out - is what forecasters call the "predictability desert," and it's exactly as hospitable as it sounds. Too far out for the atmosphere to remember its starting conditions, too soon for big slow-moving forces like ocean temperatures to take the wheel.

This desert matters. A lot. The World Bank estimates that just 24 hours of better advance warning can cut storm and heatwave damage by 30%. Scale that to weeks, and we're talking billions of dollars in agriculture, energy planning, and disaster prep. Weather disasters already cost roughly $180-200 billion per year globally, and US crop losses alone hit $21 billion in 2023.

So yeah, cracking the subseasonal-to-seasonal (S2S) forecast is kind of a big deal.

Old Forecasts: Not Dead, Just Resting

Daisuke Tokuda (University of Tokyo) and Paul Dirmeyer (George Mason University) asked a question that sounds almost too simple: what if we stopped treating yesterday's ensemble forecast like yesterday's leftovers?

Quick detour for the uninitiated: ensemble forecasting is when, instead of running one weather simulation, you run a bunch of them - typically 12 to 51 - each starting from slightly different conditions. It's like asking 50 slightly different versions of a meteorologist to make predictions and then seeing where they agree. More agreement means more confidence. It's how we get those "70% chance of rain" numbers.
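
To make that concrete, here's a minimal Python sketch of turning ensemble agreement into a probability. The rainfall numbers, the 1 mm threshold, and the function name are all made up for illustration; real forecast centers calibrate their probabilities far more carefully than this.

```python
def event_probability(member_values, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    if not member_values:
        raise ValueError("need at least one ensemble member")
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)


# Hypothetical rainfall forecasts (mm) from a 10-member ensemble.
rain_mm = [0.0, 2.5, 4.1, 0.3, 6.0, 1.8, 0.0, 3.2, 5.5, 2.1]

# Seven of the ten members predict more than 1 mm of rain.
print(event_probability(rain_mm, threshold=1.0))  # 0.7
```

Seven members out of ten agree on rain, so the headline number is a 70% chance.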

The standard practice has been to throw away old ensemble runs once a new one comes in. Some researchers tried a "lagged ensemble" approach - pooling older and newer forecasts together - but it was like inviting everyone to the party without checking the guest list. The bad forecasters dragged down the good ones, and overall skill barely budged.

LEAS: The Bouncer at the Forecast Party

Enter LEAS - Lagged Ensemble Analog Subselection. The name is a mouthful, but the concept is beautifully simple. Instead of letting every old forecast member crash the party, LEAS checks IDs at the door. It looks at previous ensemble members and asks: "Did you correctly predict what actually happened between your initialization and now?" If yes, welcome aboard. If not, you're out.

Think of it like hiring a contractor. You wouldn't just grab anyone off the street. You'd check their references - specifically, whether their recent work was any good. LEAS does the same thing with forecast members, keeping only the ones that proved they had a good read on the atmosphere and land surface conditions.

The genius is that this acts like getting a free upgrade to your model's starting conditions. Those surviving members carry forward accurate representations of soil moisture, snow cover, and atmospheric states - all the slow-moving factors that matter for multiweek predictions - without anyone having to rerun a single simulation.
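
Here's a rough Python sketch of the bouncer logic. This post doesn't spell out the paper's actual selection criterion, so treat everything here as an assumption: scoring older members by RMSE against observations, keeping a fixed number of them, and averaging them with the newest ensemble are illustrative choices, and the member names and temperatures are invented.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(forecast) != len(observed):
        raise ValueError("length mismatch")
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(observed))


def subselect_lagged_members(lagged_members, observed_so_far, keep):
    """Keep the `keep` older members that best matched what actually happened.

    Each member is a dict with:
      "verified": its forecast for days that have since been observed
      "future":   its forecast for the days still ahead
    """
    ranked = sorted(lagged_members, key=lambda m: rmse(m["verified"], observed_so_far))
    return ranked[:keep]


def combined_forecast(latest_futures, selected_members):
    """Average the newest ensemble with the surviving older members, day by day."""
    pool = list(latest_futures) + [m["future"] for m in selected_members]
    n_days = len(pool[0])
    return [sum(run[day] for run in pool) / len(pool) for day in range(n_days)]


# Hypothetical daily maximum temperatures (deg C) observed since the older run started.
observed_so_far = [30.0, 31.0, 33.0]

# Hypothetical members from an earlier ensemble run.
lagged_members = [
    {"name": "A", "verified": [30.2, 31.1, 32.8], "future": [34.0, 35.0]},
    {"name": "B", "verified": [27.0, 28.0, 29.0], "future": [30.0, 30.0]},
    {"name": "C", "verified": [29.5, 31.5, 33.5], "future": [35.0, 36.0]},
]

# Hypothetical members from the newest run (future days only).
latest_futures = [[34.5, 35.5], [33.5, 34.5]]

selected = subselect_lagged_members(lagged_members, observed_so_far, keep=2)
print([m["name"] for m in selected])                # ['A', 'C']
print(combined_forecast(latest_futures, selected))  # [34.25, 35.25]
```

Member B ran cold for three days straight, so it doesn't get in. A and C join the two newest members, and the pooled forecast is built from four runs instead of two at no extra simulation cost.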

The Results: Not Subtle

Tokuda and Dirmeyer tested LEAS on daily maximum temperature forecasts across North America using four independent state-of-the-art S2S models. The results were consistent across all four systems:

Forecast errors dropped by up to roughly 10% in some regions
Improvements held across weeks one through five
Extreme heat prediction got measurably better
Both systematic bias and variance errors shrank
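
For anyone wondering what "systematic bias and variance errors" means in practice, here's a small Python sketch of the textbook split: mean squared error equals the squared average error (bias) plus the spread of the errors around that average (variance). The temperatures are invented, and this is the standard identity rather than necessarily the exact metric the paper used.

```python
from statistics import fmean, pvariance


def decompose_mse(forecasts, observations):
    """Return (mse, bias, variance) of the forecast errors.

    bias     = average error (the systematic part)
    variance = population variance of the errors (the scatter part)
    mse      = bias**2 + variance
    """
    errors = [f - o for f, o in zip(forecasts, observations)]
    bias = fmean(errors)
    variance = pvariance(errors)
    mse = fmean([e * e for e in errors])
    return mse, bias, variance


# Hypothetical daily maximum temperatures (deg C).
forecasts = [31.0, 33.0, 35.0, 30.0]
observations = [30.0, 31.0, 34.0, 30.0]

mse, bias, variance = decompose_mse(forecasts, observations)
print(mse, bias, variance)  # 1.5 1.0 0.5
```

Here the forecast runs one degree warm on average (bias), with another half a degree squared of scatter on top (variance). Shrinking either piece lowers the total error.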

As Tokuda put it: "Such a simple strategy worked consistently across all four independent forecast systems." That consistency is the real kicker. A method that only works on one model might be a fluke. One that works on four is a pattern.

The AI Elephant in the Room

Meanwhile, the machine learning crowd has been throwing enormous compute at this same problem. DeepMind's GenCast (Price et al., 2024) uses diffusion models to generate probabilistic forecasts that outperform ECMWF's ensemble system (ENS) on 97.2% of the 1,320 verification targets evaluated. FuXi-S2S (Chen et al., 2024) trained on 72 years of reanalysis data to extend skillful MJO prediction from 30 to 36 days.

These are impressive achievements, but they require massive computational resources and entirely new model architectures. LEAS takes the opposite approach: squeeze more juice from the infrastructure you already have. And here's the neat part - the authors suggest LEAS could be applied on top of ML-based forecasts too, since those systems also use repeated initialization and ensemble methods.

It's the difference between buying a faster car and finding a shortcut on your existing route. Both get you there quicker.

Why Recycling Forecasts Is Secretly Brilliant

The deeper insight here connects to an idea that's more than half a century old. In 1969, Edward Lorenz - yes, the chaos theory butterfly guy - explored "analog forecasting": find a historical weather state that looks like today, then use what happened next as your prediction (Lorenz, 1969). The concept largely fell out of favor as numerical models got better, but LEAS gives it a clever twist. Instead of searching through decades of observed weather for analogs, it searches through recent forecast members. It's analog forecasting for the ensemble age.
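
The classic analog idea fits in a few lines of Python. This is a toy sketch with an invented three-entry archive: each "state" is just a (pressure, temperature) pair, and the nearest match is found by plain Euclidean distance. A real analog search would compare whole maps and scale the variables first so one unit doesn't dominate.

```python
import math


def analog_forecast(archive, today):
    """Return what followed the archived state most similar to today's.

    archive: list of (state, what_happened_next) pairs, where state is a
             tuple of numbers with the same length as `today`.
    """
    if not archive:
        raise ValueError("archive is empty")
    _, what_happened_next = min(archive, key=lambda pair: math.dist(pair[0], today))
    return what_happened_next


# Hypothetical archive: (pressure in hPa, temperature in deg C) -> next-day max temperature.
archive = [
    ((1012.0, 18.0), 21.0),
    ((998.0, 25.0), 29.5),
    ((1020.0, 10.0), 12.0),
]

today = (1000.0, 24.0)
print(analog_forecast(archive, today))  # 29.5
```

Swap the archive of observed days for a pool of recent forecast members, and you have the flavor of the twist described above.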

The Bottom Line

Sometimes the best innovation isn't building a bigger model. It's being smarter about what you already have. LEAS doesn't require new satellites, new supercomputers, or retraining billion-parameter neural networks. It just requires not throwing away good data.

As Dirmeyer noted: "Previous forecasts are not outdated." Turns out, one forecaster's recycling bin is another's treasure chest.

References

Tokuda, D. & Dirmeyer, P.A. (2026). Selective reuse of prior ensemble data improves the latest air temperature forecast over North America. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.2524516123. PMID: 41950091
Price, I. et al. (2024). Probabilistic weather forecasting with machine learning. Nature, 636, 84-90. DOI: 10.1038/s41586-024-08252-9
Chen, L. et al. (2024). A machine learning model that outperforms conventional global subseasonal forecast models. Nature Communications, 15, 6603. DOI: 10.1038/s41467-024-50714-1
Lorenz, E.N. (1969). Atmospheric Predictability as Revealed by Naturally Occurring Analogues. Journal of the Atmospheric Sciences, 26(4), 636-646.
Vitart, F. et al. (2025). The WWRP/WCRP S2S Project and Its Achievements. Bulletin of the American Meteorological Society, 106(5).

Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.

AIb2.io - AI Research Decoded