Looking beyond the extremes
When evaluating forecast accuracy, a few extreme days can skew the picture. Anyone who has compared predicted and actual solar yields knows that unusual events such as sudden haze, freak storms, or sensor glitches can produce enormous percentage errors. They tell us something about robustness, but they can also drown out the everyday performance we care about most.
That is why I re-ran the analysis of my GTI, AI, and Fusion forecasts after winsorizing the top and bottom 3 percent of error values. In other words, I capped the most extreme results at the 3rd and 97th percentiles so they would not dominate the statistics. I also removed the unfinished day 2025-08-30 from the dataset, since its values are not representative. What remains is a fairer test of everyday forecasting quality.
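The capping step is easy to get subtly wrong (dropping values instead of clamping them), so here is a minimal sketch of how it could look. The `percentile` and `winsorize` helpers, the linear-interpolation percentile rule, and the sample numbers are all assumptions for illustration, not the code or data behind the results in this post.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def winsorize(values, lower_pct=3.0, upper_pct=97.0):
    """Clamp values to the given percentiles; no value is dropped."""
    lo = percentile(values, lower_pct)
    hi = percentile(values, upper_pct)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical daily errors in kWh, keyed by date (not the real dataset).
daily_errors = {
    "2025-08-27": 2.1,
    "2025-08-28": -1.4,
    "2025-08-29": 38.0,  # a freak-storm style outlier
    "2025-08-30": -9.9,  # unfinished day, excluded below
}
EXCLUDED_DAYS = {"2025-08-30"}

# Drop the unfinished day first, then cap what remains.
kept = {day: err for day, err in daily_errors.items() if day not in EXCLUDED_DAYS}
capped = dict(zip(kept, winsorize(list(kept.values()))))
```

Unlike trimming, this keeps every remaining day in the dataset; only the size of the most extreme errors is limited.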
Counting the winners
Each day, I compare the absolute errors of the three forecasts and mark which was closest to reality. With outliers winsorized, the distribution of winners looks like this:
- Before winsorization: Fusion won 54 days, AI 61 days, and GTI 21 days
- After winsorization: Fusion won 52 days, AI 63 days, and GTI 21 days
Fusion and AI are close, and AI’s lead widens slightly (from 7 days to 11) once the most extreme errors are capped. GTI remains a distant third.
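The win count itself is a simple per-day comparison. The sketch below shows one way to do it, assuming the absolute errors are already available as one list per forecast; the function name `count_daily_wins`, the tie-breaking rule, and the four sample days are illustrative assumptions rather than my actual pipeline.

```python
from collections import Counter


def count_daily_wins(abs_errors):
    """Count the days on which each forecast had the smallest absolute error.

    abs_errors maps a forecast name to a list of daily absolute errors (kWh).
    All lists must have the same length. Ties go to the forecast listed
    first, which is an arbitrary choice made for this sketch.
    """
    names = list(abs_errors)
    n_days = len(abs_errors[names[0]])
    if any(len(abs_errors[name]) != n_days for name in names):
        raise ValueError("every forecast needs one error per day")

    wins = Counter({name: 0 for name in names})
    for day in range(n_days):
        winner = min(names, key=lambda name: abs_errors[name][day])
        wins[winner] += 1
    return wins


# Hypothetical absolute errors for four days (not the real dataset).
example = {
    "Fusion": [1.0, 4.0, 1.0, 0.2],
    "AI": [0.5, 1.0, 6.0, 0.5],
    "GTI": [10.0, 10.0, 0.5, 12.0],
}
wins = count_daily_wins(example)
print(dict(wins))
```

Because the function only looks at the errors it is given, the same call works on raw errors and on winsorized ones, which is how a before/after comparison like the one above can be produced.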
Error distributions
The count of “daily wins” is just one view. Looking at the distribution of errors shows why Fusion still matters.
AI scores more outright wins, but its error distribution has longer tails. Fusion’s errors are more tightly clustered, which means fewer disastrous misses, even if it loses by a small margin more often.
Summary statistics
Numbers tell the story even more clearly. Based on 137 days of complete data, the errors look like this:
| Forecast | Mean Error (kWh) | Median Error (kWh) | StdDev (kWh) |
|----------|------------------|--------------------|--------------|
| AI | 4.99 | 2.82 | 5.48 |
| GTI | 12.70 | 10.84 | 10.18 |
| Fusion | 4.57 | 3.40 | 4.31 |
As of now, we only have 137 days with complete data for GTI, AI, and Fusion. I look forward to revisiting these statistics later in the year, when the dataset is larger and even more representative.
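For anyone who wants to reproduce this kind of summary on their own data, a minimal sketch using only Python’s standard library follows. It assumes the inputs are absolute daily errors in kWh and uses the sample standard deviation; whether sample or population deviation is the better choice is a judgment call, and the `summarize` helper and its sample values are hypothetical.

```python
import statistics


def summarize(abs_errors):
    """Return mean, median, and sample standard deviation of daily errors (kWh)."""
    return {
        "mean": statistics.mean(abs_errors),
        "median": statistics.median(abs_errors),
        "stdev": statistics.stdev(abs_errors),  # sample (n - 1) deviation
    }


# Hypothetical absolute errors for five days (not the real dataset).
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
stats = summarize(sample)
print({name: round(value, 2) for name, value in stats.items()})
```

A mean well above the median, as in this small sample, is the typical signature of a few large misses pulling the average up.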
Why Fusion still matters
Fusion works by blending the learning power of AI with the physics of GTI. It adjusts weights dynamically, applies an intercept to correct systematic bias, and can fall back to a safe 70/30 split when training data is scarce.
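To make the idea concrete, here is a rough sketch of one way such a blend could be fitted. It is not the actual Fusion implementation: it assumes the two weights sum to 1, fits them by ordinary least squares on a recent training window, and assumes the fallback gives 70 percent to AI and 30 percent to GTI. The names `fit_fusion`, `fuse`, `MIN_TRAINING_DAYS`, and `FALLBACK_AI_WEIGHT`, along with the 14-day threshold, are hypothetical.

```python
MIN_TRAINING_DAYS = 14    # hypothetical threshold for "scarce" training data
FALLBACK_AI_WEIGHT = 0.7  # assumption: the 70 goes to AI, the 30 to GTI


def fit_fusion(ai, gti, actual):
    """Fit fusion = w * ai + (1 - w) * gti + b by ordinary least squares.

    Returns (w, b). Falls back to the fixed split with no intercept when
    there are too few training days or the two forecasts never differ.
    """
    n = len(actual)
    if n < MIN_TRAINING_DAYS:
        return FALLBACK_AI_WEIGHT, 0.0

    # Rewrite as (actual - gti) = w * (ai - gti) + b: a one-variable regression.
    x = [a - g for a, g in zip(ai, gti)]
    y = [t - g for t, g in zip(actual, gti)]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        return FALLBACK_AI_WEIGHT, 0.0

    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    w = min(max(cov_xy / var_x, 0.0), 1.0)  # keep the AI weight in [0, 1]
    b = mean_y - w * mean_x                 # intercept absorbs systematic bias
    return w, b


def fuse(ai_value, gti_value, w, b):
    """Blend one day's AI and GTI forecasts (kWh) with weight w and intercept b."""
    return w * ai_value + (1.0 - w) * gti_value + b
```

Refitting `fit_fusion` on a rolling window of recent days would be one way to make the weights adjust over time; the window length is another choice this sketch leaves open.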
The advantage is that Fusion avoids the most painful misses. AI does win slightly more often under winsorization, but its worst days are still worse than Fusion’s. GTI, meanwhile, remains reliable but rarely closest.
Climate change and the road ahead
One reason this hybrid approach is so important is the changing climate itself. As the atmosphere warms, it is not simply delivering more sun. Instead, we are seeing more haze, shifting wind patterns, and erratic cloud formations. AI tends to interpret rising temperatures as an upward trend in production, and it will take years of retraining on new data before AI alone adapts.
Fusion provides resilience right now. By balancing AI’s optimism with GTI’s physics-based anchor, it delivers forecasts that are both realistic and robust in the face of climate uncertainty.
Conclusion
When outliers are capped, AI wins by a small margin in terms of the sheer number of days closest to reality. But Fusion continues to offer the most balanced performance across the board, with fewer dramatic failures and a robustness that makes it the safer bet in everyday use.
The lesson is simple: in solar forecasting, as in life, balance beats extremes. Fusion delivers that balance, and in a warming, hazier world it is the approach I trust most for planning my daily energy use.