Economic Scenario Generation Beyond Historical Data: Why the Past Is a Poor Guide to Future Market Risk

The Calibration Problem

Every economic scenario generator has a calibration problem.

The model needs historical data to calibrate shock magnitudes, correlations between risk factors, and the joint distribution of market moves. Historical data is the only empirical grounding available. Without it, scenarios are fiction.

But historical data has a structural limitation that becomes dangerous precisely when it matters most: it can only represent risks that have already occurred. By definition, it cannot calibrate the risk that hasn't happened yet — the scenario that sits outside the historical distribution, in the tail that no past event has yet populated.

This is not a new observation. It is the fundamental critique levelled at the VaR models in use before the 2008 crisis, and it applies with equal force to the scenario libraries that most insurance companies use to satisfy their ORSA and stress testing obligations under Solvency II.

The question this piece addresses is concrete: what does it take to build an economic scenario generation system that is genuinely forward-looking — not just historically calibrated — while remaining auditable and defensible within the Solvency II framework?

How Most ESG Systems Are Actually Built

The typical ESG architecture for a Solvency II insurer looks like this:

A historical dataset of risk factor movements — yield curves, credit spreads by rating and sector, equity indices, FX rates — is assembled, usually covering one to two decades. Statistical models are fitted to this data to capture factor dynamics and correlations. Scenario shocks are then drawn from this fitted distribution, either as historical replays of specific stress events (2008, 2011, 2020, 2022) or as parametric draws from the estimated joint distribution.

This architecture is defensible and auditable. A historical scenario replay, such as 2022, naturally captures the joint movement of all risk factors: rates, spreads, equity, and currency all move together as they actually did in the event. That joint capture is a genuine strength of the historical approach. Historical episodes also serve as reference points for named tail scenarios such as "1970s-style stagflation" and "severe global recession", with their observed dynamics informing shock magnitude and correlation assumptions.
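The parametric branch of this architecture can be sketched in a few lines. This is a minimal illustration, assuming a multivariate normal fit and synthetic data standing in for the historical dataset; the function name fit_and_draw and the three-factor covariance matrix are hypothetical, and a production ESG would use richer factor dynamics.

```python
import numpy as np

def fit_and_draw(history, n_scenarios, seed=0):
    """Fit a multivariate normal to historical factor changes and draw shocks.

    history: array of shape (T, K), T periods of changes in K risk factors.
    """
    mu = history.mean(axis=0)
    cov = np.cov(history, rowvar=False)  # K x K sample covariance
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mu, cov, size=n_scenarios)

# Synthetic stand-in for 20 years of monthly changes in three factors
# (rates, spreads, equity). The covariance below is made up for illustration.
data_rng = np.random.default_rng(42)
assumed_cov = np.array([
    [1.0, 0.3, -0.2],
    [0.3, 1.0, -0.5],
    [-0.2, -0.5, 1.0],
])
history = data_rng.multivariate_normal(np.zeros(3), assumed_cov, size=240)

scenarios = fit_and_draw(history, n_scenarios=10_000)
```

Every drawn scenario inherits the means and correlations of the fitted sample, which is exactly why this approach is auditable and also why it cannot step outside the regime it was fitted on.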

The structural weakness appears elsewhere: when a forward-looking risk that has no historical precedent needs to be modelled. The decade of near-zero rates and compressed spreads from 2012 to 2021 produced scenario libraries with no adequate representation of a rapid rate normalisation shock. When that shock arrived in 2022 — EUR swap rates moving +350bps in under 12 months alongside simultaneous credit spread widening — the scenario was simply not in the historical library. No historical replay could have generated it from the pre-2022 dataset.

That is the structural limitation: historical ESG cannot generate a scenario for a regime it hasn't yet observed.

Three Structural Weaknesses of Pure Historical ESG

1. Regime dependency

Historical calibration implicitly assumes that the future will broadly resemble the past in its statistical properties. Markets are not stationary. They exhibit distinct regimes — low volatility periods, crisis periods, structural repricing phases — with fundamentally different dynamics in each.

A spread risk scenario calibrated on 2015–2021 data will produce shocks that are small relative to what a 2026 credit environment makes plausible. The relevant question for scenario generation is not "what spread widening has occurred historically?" It is "what is plausible given current credit fundamentals, monetary policy trajectory, and fiscal positions of EUR sovereigns?" These are different questions. Historical calibration answers the first. Forward-looking scenario generation needs to answer the second.

2. Novel macro shocks and causal chain modelling

The 2022 market shock had a single causal driver, aggressive monetary tightening, that simultaneously pushed rates higher, widened spreads, and repriced equities. A historical scenario replay of 2022 captures this naturally: all risk factors move together as they actually did.

The challenge arises when constructing a novel forward-looking scenario with no historical analogue. In that case, the correlations between risk factors cannot simply be inherited from historical averages. AI systems can vary dozens of interdependent variables simultaneously and simulate outcomes for each combination, surfacing scenarios so complex that a human team might not have conceived them. But the more important point is that the relationship between risk factors in a forward-looking scenario should reflect the causal logic of the scenario narrative, not the historically averaged correlation structure.

For example: a scenario centred on EUR sovereign fiscal stress would drive periphery sovereign spreads sharply wider, transmit into banking sector credit, keep rates elevated (removing the usual rate-fall buffer in a credit stress), and produce equity drawdowns concentrated in financial sector names. The correlation structure implied by this causal chain differs from the historical average. Modelling it correctly requires building the scenario from the narrative down — defining the causal sequence first, then deriving the risk factor shocks from it.
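One way to picture "narrative down" construction is as a small causal graph in which each link carries a pass-through coefficient. The sketch below is illustrative only: the Link class, the propagate function, the factor names, and every coefficient are hypothetical assumptions, not calibrated values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    source: str
    target: str
    beta: float  # hypothetical pass-through from source shock to target shock

def propagate(trigger, size, links):
    """Derive factor shocks by walking the causal links in narrative order."""
    shocks = {trigger: size}
    for link in links:  # links must be listed in causal order
        if link.source in shocks:
            contribution = link.beta * shocks[link.source]
            shocks[link.target] = shocks.get(link.target, 0.0) + contribution
    return shocks

# Sovereign fiscal stress narrative: spreads widen first, then transmit to
# bank credit, rates stay elevated, and financial equities draw down.
links = [
    Link("periphery_sovereign_spread_bps", "bank_credit_spread_bps", 0.6),
    Link("periphery_sovereign_spread_bps", "eur_10y_rate_bps", 0.1),
    Link("bank_credit_spread_bps", "financials_equity_pct", -0.1),
]
shocks = propagate("periphery_sovereign_spread_bps", 200.0, links)
```

The point of the design is that the co-movement of the factors comes from the documented transmission path, so rates rise alongside spreads here even if the historical average correlation would have them fall.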

It's worth being precise about what this means in the context of the standard formula. The BSCR aggregation applies a prescribed correlation matrix across sub-modules — calibrated to historical averages, working exactly as designed for capital adequacy purposes. But for ORSA scenario analysis, where the goal is to understand the actual risk dynamics of a specific stress, the causal chain needs to be modelled explicitly at the scenario construction stage, before any aggregation step. That's where the analytical value lives.

3. Tail calibration breaks down exactly when it matters

For rare events, such as the 1-in-50 year scenario or the 1-in-200 year scenario that Solvency II's 99.5% confidence level is designed to capture, the historical database may contain zero or one observation.

Calibrating a 1-in-200 year spread shock from a 20-year historical dataset is a statistical problem with an effective sample size of at most one. The scenario that emerges is highly sensitive to which specific historical episode dominates the calibration, and may bear little relationship to the actual tail risk of the current portfolio in the current market environment.
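The arithmetic behind this is short. As a rough sketch, assuming each year is an independent draw with a fixed annual probability (a simplification, since real crises cluster), the chance that a 20-year window contains even one true 1-in-200 event is under 10%. The function name prob_at_least_one is hypothetical.

```python
def prob_at_least_one(return_period_years, window_years):
    """Probability that a window contains at least one event of the given
    return period, assuming independent years (a simplifying assumption)."""
    p_annual = 1.0 / return_period_years
    return 1.0 - (1.0 - p_annual) ** window_years

p = prob_at_least_one(return_period_years=200, window_years=20)
expected_count = 20 / 200  # expected number of such events in the window

print(f"P(at least one 1-in-200 event in 20 years) = {p:.3f}")
print(f"Expected number of events = {expected_count:.2f}")
```

In other words, the most likely case is that the dataset contains no observation at the target quantile at all, and the calibrated tail is an extrapolation from milder events.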

What Forward-Looking ESG Actually Requires

Narrative-first scenario construction

The most robust forward-looking scenarios start with a macroeconomic narrative, not a statistical draw. The narrative defines the causal logic: what triggers the stress, which risk factors move first, what the transmission mechanism is, and what the policy response looks like.

A narrative like "EUR peripheral sovereign fiscal stress, triggered by rising debt service costs at elevated rates, spreads to banking sector credit" generates specific, justified shocks: spreads widen sharply in BBB financials and periphery sovereigns; interest rates stay elevated rather than falling and acting as a hedge; equity drawdowns concentrate in financial sector names. This is a forward-looking construction, calibrated to 2026 conditions, with a documented causal chain that is legible to a risk manager, a board director, and an EIOPA supervisor alike.

Regime-conditional calibration

Rather than calibrating uniformly to the full historical distribution, forward-looking ESG should condition on the current market regime. Static models recalibrated on an annual cycle can no longer keep pace with market volatility, so market data has to be treated as a live input rather than a historical archive.

In practice, this means identifying the relevant historical analogues to current conditions and using their dynamics to inform shock magnitude — not as a replay, but as a calibration reference. A current-regime calibration for EUR spread risk in 2026 draws on periods with similar starting spread levels, similar rate environments, and similar sovereign fiscal conditions.
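A simple way to make "relevant historical analogues" operational is a nearest-neighbour search over a vector of starting conditions. The sketch below assumes three state variables and made-up values for five historical periods; the function name nearest_analogues and the choice of standardised Euclidean distance are illustrative assumptions, and other similarity measures would be equally defensible.

```python
import numpy as np

def nearest_analogues(states, current, k):
    """Return indices of the k historical periods whose starting conditions
    are closest to the current state, using standardised Euclidean distance."""
    mu = states.mean(axis=0)
    sd = states.std(axis=0)
    z_hist = (states - mu) / sd
    z_now = (current - mu) / sd
    dist = np.linalg.norm(z_hist - z_now, axis=1)
    return np.argsort(dist)[:k]

# Columns: starting spread level (bps), 10y rate (%), sovereign debt/GDP (%).
# All values are invented for illustration.
states = np.array([
    [80.0, 0.5, 85.0],
    [150.0, 2.8, 95.0],
    [300.0, 3.5, 90.0],
    [140.0, 3.0, 98.0],
    [60.0, 0.1, 88.0],
])
current = np.array([145.0, 2.9, 97.0])

idx = nearest_analogues(states, current, k=2)
```

The selected periods then supply the calibration reference for shock magnitude, while the low-rate, tight-spread periods in rows 0 and 4 are left out because their starting conditions differ from today's.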

AI-assisted parameterisation

This is where AI adds genuine value in ESG — not by replacing actuarial judgment, but by accelerating the translation from macro narrative to quantitative parameters.

Given a scenario narrative, an AI system can identify relevant historical analogues, propose shock magnitudes by risk factor, flag cross-factor interactions implied by the narrative logic, and generate ORSA documentation explaining the rationale. The critical constraint is that the output must map cleanly to the risk factor structure: a spread shock expressed as a function of credit quality step and modified duration; an interest rate shock as a curve shape transformation with defined pivot point; an equity shock as a gross drawdown with the symmetric adjustment applied. The narrative layer is useful only when it translates precisely into the parameter layer.
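To make the parameter layer concrete, the sketch below expresses the three shock types as small functions. It is a simplified illustration, not the standard formula: the spread factors by credit quality step are placeholder numbers chosen for the example, the linear-in-duration form ignores the duration bands used in practice, and the corridor of plus or minus 10 percentage points on the symmetric adjustment reflects the standard formula as understood here and should be checked against the regulation. All function and variable names are hypothetical.

```python
# Placeholder spread stress per year of duration, keyed by credit quality
# step (CQS). Illustrative numbers only, not regulatory values.
SPREAD_FACTOR_BY_CQS = {0: 0.010, 1: 0.012, 2: 0.015, 3: 0.030}

def spread_shock(market_value, modified_duration, cqs):
    """Loss from a spread shock, simplified as linear in modified duration."""
    return market_value * modified_duration * SPREAD_FACTOR_BY_CQS[cqs]

def twist_curve(curve, pivot_tenor, bps_per_year):
    """Rotate a yield curve around a pivot tenor.

    curve: dict mapping tenor in years to rate as a decimal.
    Rates beyond the pivot move by bps_per_year per year of distance,
    rates before it move the opposite way, and the pivot is unchanged.
    """
    return {
        tenor: rate + bps_per_year * (tenor - pivot_tenor) / 10_000
        for tenor, rate in curve.items()
    }

def equity_shock(market_value, base_drawdown, symmetric_adjustment):
    """Equity loss: base drawdown plus symmetric adjustment, with the
    adjustment limited to +/- 10 percentage points (assumed corridor)."""
    sa = max(-0.10, min(0.10, symmetric_adjustment))
    return market_value * (base_drawdown + sa)

bond_loss = spread_shock(1_000_000, modified_duration=4.0, cqs=2)
curve = {1: 0.030, 5: 0.028, 10: 0.029}
steepened = twist_curve(curve, pivot_tenor=5, bps_per_year=10)
equity_loss = equity_shock(1_000_000, base_drawdown=0.39, symmetric_adjustment=0.15)
```

Whatever the narrative layer proposes, it has to arrive in this shape: a handful of numbers per risk factor that the existing valuation models can consume without reinterpretation.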

The Competitive Implication

Insurance companies that move from purely historical ESG to forward-looking scenario generation don't just improve their ORSA quality. They fundamentally expand the range of risk questions they can ask and act on.

A historical scenario library answers "what happened before?" A narrative-driven ESG engine answers "what is plausible now?" — and produces a fully parameterised, regime-calibrated, causally consistent stress scenario in hours rather than weeks. When a central bank surprises, when a geopolitical shock reshapes the credit environment, when a new macro regime begins to form, the risk team with a narrative ESG engine can construct and run the relevant scenario the same day — with the causal logic documented, the risk factor shocks calibrated to current conditions, and the output ready for management, ORSA, or further analysis.

Climate variables that once appeared in quarterly risk reviews are now showing up in daily pricing models — the shift is not merely about using more data but using it dynamically. The same dynamic applies to market risk scenario generation. The competitive advantage belongs to teams that can generate forward-looking, causally grounded scenarios faster than the market moves — not teams that are waiting for history to repeat itself.

Effi Mor is the founder of RemitRix, a scenario-based risk intelligence platform focused on Solvency II market risk — covering SCR modules, economic scenario generation, and AI-assisted portfolio stress testing for insurance companies and pension funds. Risk Intelligence Weekly publishes every Tuesday.