Observing risk, mission possible
In our last post, we took a closer look at Value at Risk (VaR) and discovered something rather unsettling: it's fundamentally unobservable. During the VaR era, we seem to have consistently mistaken risk predictions for actual risk measurements. It was as if we expected reality to bend to the will of our models, rather than the other way around. The exact opposite of what the scientific method prescribes.
Photo: Dario Morandotti
This leads to a common refrain: risk, we're told, is a latent variable. Therefore, no matter which risk measure we choose, it will always remain elusive, forever beyond our direct observation. We've heard this argument countless times. But is this correct?
Buy-side risk professionals measure risk in terms of variance and its variants (volatility, tracking error, and so on), yet they never find themselves wrestling with complex models of P&L distributions, overflowing with assumptions. Instead, they estimate “realized variance” directly from past returns. Grinold and Kahn (1999, 2nd edition, p. 41), for example, state with disarming simplicity that “Risk is the standard deviation of returns,” and they build an entire theory for Active Portfolio Management, without ever mentioning any P&L model.
What do we learn from this? Simply that variance is observable and can be directly estimated from observational data. And so is risk, provided one doesn't choose an unobservable risk measure. More precisely, “realized variance” is a (positively biased) estimator of the average true variance of a portfolio over a chosen observation period — a direct (prudential) measurement of the risk a portfolio actually experienced, not just a prediction under a hypothetical model.
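To make "estimated directly from past returns" concrete, here is a minimal sketch of a realized-variance calculation. The post does not give a formula, so the convention below (the plain average of squared returns, with no mean subtracted) is an assumption. Under that convention the expected value equals the true variance plus the squared mean return, which is one way to read the "positively biased" remark above. The function names and the 252-day annualization factor are illustrative choices, not taken from the post.

```python
import numpy as np

def realized_variance(returns):
    """Average squared return over the observation window.

    Assumed convention: uncentered (no mean subtracted), so the
    estimate can only overshoot the centered sample variance.
    """
    r = np.asarray(returns, dtype=float)
    return float(np.mean(r ** 2))

def realized_volatility(returns, periods_per_year=252):
    """Annualized square root of realized variance (252 is an assumed day count)."""
    return float(np.sqrt(realized_variance(returns) * periods_per_year))

# Hypothetical daily returns, for illustration only.
sample_returns = [0.01, -0.02, 0.03]
rv = realized_variance(sample_returns)
```

No P&L distribution model appears anywhere: the only input is the observed return series.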
The problem with VaR lies in the fact that there doesn't (and cannot) exist such a thing as "realized VaR." This is demonstrated by the equivalent statement that the VaR backtest is not "sharp" (Acerbi and Székely 2023, sec. 3.2), meaning it measures only the model acceptance probability (p-value) but is completely insensitive to the discrepancy between predicted VaR and true VaR, the latter remaining intrinsically unknowable.
So, if you've sometimes had the impression that buy-side risk professionals are allergic to VaR, you were probably right, and you now understand they may actually have a valid point.
But what about our colleagues in sell-side risk and the insurance industry? After all, variance isn't ideal when it comes to measuring one-sided tail risk. Is there an observable alternative to VaR for them? Surprisingly, there is. And it's not some revolutionary new idea. The solution has been around for quite some time: Expected Shortfall (ES).
However, the observability of ES only becomes apparent when it is backtested directly, using the "ES ridge backtest." Simply calculating ES instead of VaR, as the industry has largely done since ES was introduced (Rockafellar and Uryasev 2002; Acerbi and Tasche 2002), is akin to playing make-believe with ES predictions, mistaking them for actual risk measurements, just as we did with VaR. Sadly, this is also what Basel III still prescribes today within the Internal Model Approach for Market Risk: banks are required to calculate ES, yet backtest VaR. This inconsistent approach was chosen only because a robust ES backtest wasn't available when the first draft of the Fundamental Review of the Trading Book was written in 2012.
How to backtest ES has been a tough nut to crack, one that kept academics busy for over a decade. We’ll cover this interesting subject in a dedicated post. As you will see, ES is rather like Schrödinger’s cat: backtestable and not backtestable at the same time. What matters here is that a conclusive result was obtained (Acerbi and Székely 2019): there exists an optimal backtest (minimally and prudentially biased), the “ridge backtest,” which is the only test for ES whose bias doesn’t produce type II errors (acceptance of wrong models). All other existing or conceivable tests for ES do produce type II errors and are therefore unsuitable for prudential regulation purposes.
Beyond these fundamental results, what is most interesting here is that the ridge backtest can, surprisingly, be expressed as the difference between the average predicted ES and an emerging notion of "Realized ES," in complete analogy with "Realized variance."
Realized ES is a (positively biased) estimator of the average true ES of a portfolio over a chosen observation period — a direct (prudential) measurement of the risk a portfolio actually experienced, not just a prediction under a hypothetical model. Yes, you’re right: we copy-pasted a previous paragraph. Find: “variance”; replace: “ES”.
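The post does not spell out a formula for Realized ES, so what follows is only a sketch under stated assumptions. It builds on the Rockafellar and Uryasev representation, in which ES at tail level alpha is the minimum over v of v + E[max(-(X + v), 0)] / alpha, with X the P&L and the minimum attained at the true VaR. Plugging in a predicted VaR that differs from the true one can therefore only push the expectation upward, which is consistent with the "positively biased" description above. The function names, the sign conventions (P&L with losses negative, VaR and ES quoted as positive numbers), and the default alpha of 2.5% are assumptions made for illustration; the exact definition in the cited paper may differ.

```python
import numpy as np

def realized_es(pnl, var_pred, alpha=0.025):
    """Sketch of a 'realized ES' over an observation window.

    pnl      : observed P&L per period (losses are negative)
    var_pred : predicted VaR per period, as a positive loss threshold
    alpha    : tail probability level (assumed, e.g. 2.5%)
    """
    x = np.asarray(pnl, dtype=float)
    v = np.asarray(var_pred, dtype=float)
    # Loss in excess of the predicted VaR; zero on days without an exceedance.
    excess = np.maximum(-(x + v), 0.0)
    return float(np.mean(v + excess / alpha))

def ridge_statistic(pnl, var_pred, es_pred, alpha=0.025):
    """Average predicted ES minus realized ES, as described in the text."""
    return float(np.mean(es_pred)) - realized_es(pnl, var_pred, alpha)

# Hypothetical toy numbers, chosen only so the arithmetic is easy to follow.
toy_pnl = [-3.0, 1.0, 0.0, 2.0]
res = realized_es(toy_pnl, var_pred=2.0, alpha=0.5)
z = ridge_statistic(toy_pnl, var_pred=2.0, es_pred=3.0, alpha=0.5)
```

Read this way, the statistic is expressed in the same units as ES itself: a clearly negative value would suggest that predicted ES fell short of what the portfolio actually experienced. The sketch stops at the point estimate and does not include any significance testing.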
ES is observable, unlike VaR. As simple as that.
The ridge backtest is more than just another backtest. It serves as a bridge between sell-side and buy-side risk standards and introduces a new paradigm in risk model validation, dispelling the long-held belief that risk is a latent variable and thus unobservable.
Sophisticated sell-side risk technologies will be available to funds in the future, and realized risk metrics will permit banks to observe and manage their true risk.
We have become accustomed to the binary idea that models must be either accepted or rejected, simply because this was the only output a VaR backtest could provide. However, we now see that models can also be corrected, provided we use an observable risk measure (and ES is not unique in this regard) and an adequate backtest that also measures prediction discrepancies. This fact has enormous consequences, worth another conversation. Stay tuned.
Hidden in plain sight, lack of observability has been the Achilles’ heel of VaR models, and the source of the main troubles in banking regulation over the past three decades. This is finally coming to an end.