Data on NatCat damage is often hard to find: it is rarely open-source and frequently expensive. So when I came across publicly available data on the website of Assuralia, the association of Belgian insurers, I couldn’t ignore it.
In an age where companies and analysts get excited about terabytes and petabytes of data, I’ll focus on a dataset with just 29 observations—because meaningful insight can sometimes hide in as few as that. This particular dataset from Assuralia reports the total insured losses from flood and storm events in Belgium over the past ten years, where each event caused damages exceeding 10 million euros.
Of course, I would have preferred a dataset spanning a thousand years, with hundreds of observations and variables. But we can’t change the data we have—so the smarter question is: What can we do with it?
In what follows, I’ll present an analysis using Extreme Value Theory (EVT)—a probabilistic framework designed specifically to model extreme events in continuous data.
The origins of Extreme Value Theory (EVT) lie in hydrology. Its foundational results for the univariate case were developed in the 1950s and found their way into financial and insurance applications in the 1990s. In my view, however, EVT remains underused, perhaps undervalued or not fully understood, despite the critical role extreme events play in risk assessment and the management of financial systems.
By definition, extreme values are rare. They lie in the far left or far right tails of a distribution. When the variable in question is a financial or economic loss, we are particularly interested in the right tail—the region of extreme losses. This is why you’ll often hear references to the tail behavior of a distribution.
In probabilistic modeling, the notion of “extreme” must be defined precisely. It may refer to:
the maximum observation over a given period (e.g. annual maxima),
an exceedance over a certain threshold (e.g. peaks over threshold, PoT),
or even an exceedance relative to another variable (e.g. in a multivariate setting).
For instance, if we’re interested in the so-called 50-year insurance loss, we may model a series of annual maxima—one value per year representing the highest loss observed in that year.
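To make these two notions concrete, here is a minimal sketch in Python, assuming the per-event losses sit in a pandas DataFrame with hypothetical columns date and loss_mln; the values are illustrative stand-ins, not the Assuralia figures.

```python
import pandas as pd

# Illustrative per-event losses in mln EUR (hypothetical values, not the Assuralia data).
events = pd.DataFrame({
    "date": pd.to_datetime(["2016-06-06", "2019-03-10", "2021-07-14", "2021-07-24", "2022-02-18"]),
    "loss_mln": [165.0, 45.0, 2311.1, 80.0, 500.0],
})

# Annual maxima: one value per year, the largest loss observed in that year.
annual_maxima = events.groupby(events["date"].dt.year)["loss_mln"].max()

# Peaks over threshold (PoT): all losses exceeding a chosen level, e.g. 10 mln EUR.
threshold = 10.0
exceedances = events.loc[events["loss_mln"] > threshold, "loss_mln"]

print(annual_maxima)
print(exceedances.to_list())
```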
Natural catastrophes are fundamentally different from typical meteorological events. A heavy storm producing 165 mm of rainfall isn’t just a more intense version of a rainy day with 5 mm—it results from entirely different physical mechanisms. Though we may observe the same metric (e.g. precipitation), the underlying processes are distinct. This is why extreme values should be analyzed separately from the rest of the data.
From an actuarial perspective, within the frequency–severity modeling framework, the distinct nature of natural catastrophe (NatCat) events can be explained by the dependence between individual claims.
Consider a portfolio of car damage insurance policies over a given time period. In the absence of a NatCat event, claims typically arise independently—accidents occur for unrelated reasons, at different space and times, and involve different drivers. However, during a NatCat event such as a flood or a severe storm, many vehicles may be damaged simultaneously by the same external event. In this case, claims are no longer independent: damages in a specific geographic area are likely to be similar in nature, timing, and magnitude.
This dependence among claims during a single NatCat event is what fundamentally differentiates these events from ordinary claim occurrences—and why they require specialized modeling techniques beyond standard frequency–severity approaches.
Consider the insured losses in Belgium over the past ten years, as illustrated in Fig. 1. below.
Visually, most events resulted in losses below €100 million. However, two notable spikes stand out—in 2021 and 2022—with the 2021 flood event being approximately four times larger than the next-highest loss. This pattern is typical for extreme events: a new record often greatly exceeds previous ones.
Generally, there is a positive correlation between the number of claims and the total loss per event. However, the July 2021 flood deviates from this trend, suggesting that fewer but significantly more severe claims contributed to the exceptionally high total loss.
Fig. 1. The total loss per NatCat event is shown with red bars and corresponds to the right-hand scale. The total number of individual claims associated with each event is plotted in blue and referenced on the left-hand scale.
The highest recorded value in the dataset is €2,311.1 million, representing the total insured loss from the July 2021 floods.
This event will be the central focus of our subsequent analysis. Specifically, we will estimate the return period of this event using several different statistical models.
We will consider three candidate distributions:
Log-normal distribution: The log-normal distribution is widely used in modeling financial quantities such as claims and returns. It features moderately heavy tails and is a natural first choice for modeling loss data. It is defined by two parameters, which can be easily estimated from the sample. It is doubtful, however, whether it is appropriate for NatCat events.
Pareto distribution (with positive shape parameter): The Pareto distribution has heavy tails, which implies a higher probability of extreme losses compared to the log-normal. This makes it particularly suitable for modeling rare but severe NatCat events.
Generalized Pareto distribution (GPD): A core tool of EVT, the GPD generalizes the Pareto distribution and is especially appropriate for modeling exceedances over a threshold, as is the case here, given the roughly EUR 10 million reporting limit. See Fig. 2.
Fig. 2. Observed (red dots) and simulated losses over time. In reality, meteorological events that generate claims occur far more frequently: every couple of days, small claims arise somewhere in Belgium due to some meteorological event. None of these small losses are represented in the Assuralia data (perhaps they are not even reported by insurers). The points marked with red dots are the only observed data; they represent values exceeding a threshold of roughly 10 mln. Note that, for better visual clarity, the y-axis is on a log scale.
In the first approach, we assume that the random variable “insured loss per event” follows a Log-normal distribution, and that the dataset represents a random sample from this distribution. The Log-normal model is fully characterized by two parameters, the mean and standard deviation of the logarithm of the original variable, both of which can easily be estimated from the sample of 29 observations.
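A minimal sketch of this step, assuming the losses are held in a NumPy array called losses_mln (a hypothetical name, with illustrative values standing in for the real 29-observation sample):

```python
import numpy as np
from scipy import stats

# Illustrative per-event losses in mln EUR (stand-in for the 29 Assuralia observations).
losses_mln = np.array([12.5, 15.0, 18.7, 22.0, 31.4, 45.0, 60.2, 80.0, 101.5, 165.0, 500.0, 2311.1])

# Log-normal parameters: mean and standard deviation of the log-losses.
mu_hat = np.log(losses_mln).mean()
sigma_hat = np.log(losses_mln).std(ddof=1)

# Exceedance probability of the July 2021 loss level under the fitted log-normal model.
p_exceed = stats.norm.sf(np.log(2311.1), loc=mu_hat, scale=sigma_hat)
print(mu_hat, sigma_hat, p_exceed)
```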
In the second approach, we assume that the insured loss per event follows a Pareto distribution with a positive shape parameter, again treating the dataset as a random sample from this distribution. The Pareto distribution also has two parameters, and both can be estimated using classical statistical methods with closed-form solutions—simple enough to compute with a basic calculator. In our case, the estimated shape parameter is 1.66, indicating quite a heavy-tailed distribution.
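One possible closed-form route, assuming a classical Pareto with the scale fixed at the sample minimum; the exact estimator and parametrization used above are not specified, so treat this as a sketch rather than a reproduction of the 1.66 estimate.

```python
import numpy as np

# Illustrative per-event losses in mln EUR (stand-in for the real sample).
losses_mln = np.array([12.5, 15.0, 18.7, 22.0, 31.4, 45.0, 60.2, 80.0, 101.5, 165.0, 500.0, 2311.1])

# Classical Pareto(alpha, x_m): fix the scale x_m at the sample minimum;
# the maximum likelihood estimate of the shape alpha then has a closed form.
x_m = losses_mln.min()
alpha_hat = len(losses_mln) / np.log(losses_mln / x_m).sum()

# Exceedance probability of a given level under the fitted Pareto: (x_m / level) ** alpha.
level = 2311.1
p_exceed = (x_m / level) ** alpha_hat
print(alpha_hat, p_exceed)
```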
In the third approach, we enter the realm of Extreme Value Theory (EVT). Here, we assume that the distribution of the random variable “insured loss per event” satisfies a fundamental condition in EVT, namely that it belongs to the domain of attraction of an extreme value distribution (referred to hereafter as the DA condition). Unlike the previous two models, this assumption does not require us to specify a particular distribution for the losses. It is a much weaker and more general assumption than those made under the Log-normal or Pareto models.
A key implication of the DA condition is that the distribution of excesses over a sufficiently high threshold can be approximated by the GPD. This is suitable in our case, since the available data consists of insured losses that exceed some threshold—precisely the type of data GPD is designed to model.
The GPD is defined by three parameters:
the threshold,
a scale parameter, and
a shape parameter (also called the tail index; it governs the heaviness of the tail).
For any fixed threshold, the scale and shape parameters can be estimated from the data. However, choosing the threshold is a critical and non-trivial step in the modeling process. While several diagnostic tools (e.g. mean excess plots, parameter over threshold plots) can help guide the threshold selection, there is no universal rule for identifying an “optimal” level.
Importantly, there is a trade-off:
A higher threshold leaves fewer data points, which increases estimation uncertainty and hence the variance of the estimators.
A lower threshold may include data points for which the GPD approximation is poor, resulting in biased parameter estimates.
The goal is to strike a balance—to choose a threshold high enough to justify the GPD approximation, while still retaining enough data to produce relatively stable and meaningful estimates.
We now turn our attention to the shape parameter, also called the tail index, which plays a crucial role in determining the heaviness of the tail in the GPD.
Fig. 3 below presents two diagnostic tools for identifying an optimal threshold.
On the left we have the estimates of the shape parameter plotted against various threshold levels. As a general rule of thumb, we look for a range of thresholds over which the shape parameter appears relatively stable. From the plot, it seems that for thresholds between 20 and 75 million euros the estimated shape parameter oscillates around a constant value. Taking the average over this range yields an estimate of approximately 0.9915, i.e. very close to 1.
On the right is the mean excess function over the threshold. Under the assumptions of the GPD, this function should exhibit approximately linear behavior beyond an appropriate threshold. We exclude the far-right portion (beyond a threshold of 60 mln.), where fewer than 10 observations contribute and the estimates become unstable.
We can argue that the mean excess function shows a slight kink around 23–30 mln. euros. We therefore propose a threshold of 25 mln. euros. At this threshold, the estimated shape parameter is 1.0034 (at a threshold of 23 mln. it is 0.9832), which is very close to our earlier estimate of 0.9915. This consistency supports the conclusion that the shape parameter is probably about 1.
Fig. 3. Left: estimates of the tail index against the threshold. Right: the mean excess function.
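The two diagnostics shown in Fig. 3 can be sketched along these lines, with losses_mln again an illustrative stand-in array and scipy.stats.genpareto used for the fits; this is one of several possible estimation routes, so it will not reproduce the exact figures quoted above.

```python
import numpy as np
from scipy.stats import genpareto

# Illustrative per-event losses in mln EUR (stand-in for the real sample).
losses_mln = np.array([12.5, 15.0, 18.7, 22.0, 31.4, 45.0, 60.2, 80.0, 101.5, 165.0, 500.0, 2311.1])

for u in np.arange(10.0, 80.0, 5.0):
    excesses = losses_mln[losses_mln > u] - u
    if len(excesses) < 5:          # too few exceedances for a meaningful fit
        continue
    # GPD fit to the excesses over u, with the location fixed at 0.
    shape, _, scale = genpareto.fit(excesses, floc=0)
    # Empirical mean excess function at threshold u.
    mean_excess = excesses.mean()
    print(f"u={u:5.1f}  n={len(excesses):2d}  shape={shape:6.3f}  mean_excess={mean_excess:8.1f}")
```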
With the parameters of the three models now estimated, we can use them to calculate the return period of the July 2021 flood event.
The return period of an event, expressed in years, represents the average time between two events with a magnitude equal to or greater than the level of that event. In hydrology, return periods and return levels are commonly used concepts; in statistical terms, they are closely related to exceedance probabilities and quantiles.
To estimate the return period for a loss of €2,311.1 million—the insured loss recorded during the July 2021 flood—we need to consider the probability of exceedance of that level in the distribution of the annual maxima. Since our models are based on per-event losses, and not annual maxima, we must also account for the average frequency of events per year.
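Under the GPD model, a minimal sketch of this computation might look as follows, with losses_mln the illustrative stand-in array and the 25 mln threshold proposed earlier; the event frequency uses the roughly ten-year observation window, and with hypothetical data the resulting number will of course differ from the figures reported below.

```python
import numpy as np
from scipy.stats import genpareto

# Illustrative per-event losses in mln EUR (stand-in for the real sample).
losses_mln = np.array([12.5, 15.0, 18.7, 22.0, 31.4, 45.0, 60.2, 80.0, 101.5, 165.0, 500.0, 2311.1])

u = 25.0                                   # chosen threshold (mln EUR)
excesses = losses_mln[losses_mln > u] - u
shape, _, scale = genpareto.fit(excesses, floc=0)

# Average number of threshold exceedances per year over the observation window.
years_observed = 10
rate = len(excesses) / years_observed

# Probability that a threshold exceedance also exceeds the July 2021 loss level.
level = 2311.1
p_exceed = genpareto.sf(level - u, shape, loc=0.0, scale=scale)

# Return period in years: reciprocal of the expected number of such events per year.
return_period = 1.0 / (rate * p_exceed)
print(return_period)
```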
We obtain the following return period estimates based on the three models:
Log-normal distribution: 423 years
Pareto distribution: 8.7 years
Generalized Pareto distribution (GPD): 31.6 years
The report Closing the Climate Insurance Protection Gap in Belgium notes (page 4) that, according to the Belgian Royal Meteorological Institute (RMI), the floods in Wallonia during the summer of 2021 have an estimated return period of 10 to 20 years. This aligns closely with our own estimates based on the Generalized Pareto and Pareto distributions.
Such similarity is not entirely surprising, as EVT is a common tool in hydrological studies—see for instance the paper Quantitative Rainfall Analysis of the 2021 Mid-July Flood Event in Belgium.
It’s important to recognize, however, that we are modeling different variables: our analysis focuses on insured losses per event, while the RMI estimates are based on rainfall intensity. While these variables are undoubtedly (tail) related, their return periods do not need to coincide.
That said, neither analysis claims absolute certainty—both could be off to some degree.
Still, it is noteworthy and reassuring that two independent approaches, based on different types of data, yield similar conclusions. This consistency strengthens the case for using EVT in modeling and managing natural catastrophe risk.
The Log-normal distribution has finite expectation and variance for all parameter values. This makes it suitable for insurance pricing and risk modeling when these metrics are used.
In contrast, for the Pareto distribution, our estimated shape parameter of 1.66 implies that neither the mean nor the variance is finite. This presents challenges for risk quantification and pricing methods.
The situation is similar with the GPD. With the shape parameter estimated at approximately 1 (1.0034 or 0.9915), the insured loss per event appears to follow a distribution with infinite mean and variance. In such cases, standard actuarial methods based on these moments break down.
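For reference, the textbook GPD moment formulas make this explicit (threshold $u$, scale $\sigma$, shape $\xi$; these are standard results, not outputs of our fit):

$$\mathbb{E}[X] = u + \frac{\sigma}{1-\xi} \quad (\text{finite only if } \xi < 1), \qquad \operatorname{Var}(X) = \frac{\sigma^{2}}{(1-\xi)^{2}(1-2\xi)} \quad (\text{finite only if } \xi < \tfrac{1}{2}),$$

so a shape parameter at or above 1 already rules out a finite mean, let alone a finite variance.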
So, what can be done?
The practical solution is to truncate the tail—that is, to limit potential extreme losses. This is typically achieved through risk transfer mechanisms such as reinsurance, catastrophe bonds, or even public-private partnerships involving the government. By capping the insurer’s exposure, the tail is effectively “cut,” bringing the distribution back to a regime with finite expectation and variance, and making risk manageable.
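A small numerical sketch of this effect, using hypothetical GPD parameters (shape 1.0 and scale 40, loosely inspired by the fit above) and a purely illustrative cap of 500 mln per event:

```python
from scipy.stats import genpareto
from scipy.integrate import quad

# Hypothetical GPD for the excess over the threshold (shape >= 1, so the uncapped mean is infinite).
u, shape, scale = 25.0, 1.0, 40.0

cap = 500.0  # illustrative per-event retention above the threshold, in mln EUR

# Expected capped excess: E[min(Y, cap)] equals the integral of the survival function over [0, cap].
expected_capped_excess, _ = quad(lambda y: genpareto.sf(y, shape, loc=0.0, scale=scale), 0.0, cap)

# Expected retained loss for an event exceeding the threshold, once the layer above the cap is transferred.
print(u + expected_capped_excess)
```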
We have already discussed how the choice of threshold influences the estimation of the tail index, the parameter that governs the heaviness of the tail and, consequently, the probability of extreme losses. However, EVT offers a broader toolkit that can be further explored, including:
Applying multiple estimators for the shape and scale parameters
Calculating confidence intervals for parameter estimates and return levels (a small bootstrap sketch follows this list)
Modeling and analyzing the distribution of annual maxima
Exploring estimation under the Domain of Attraction (DA) condition
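As an example of the second point, a simple non-parametric bootstrap for the shape parameter could look like this; losses_mln is again the illustrative stand-in array and the 25 mln threshold is the one proposed earlier, and with only a handful of exceedances the resulting interval will be very wide.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(42)

# Illustrative per-event losses in mln EUR (stand-in for the real sample).
losses_mln = np.array([12.5, 15.0, 18.7, 22.0, 31.4, 45.0, 60.2, 80.0, 101.5, 165.0, 500.0, 2311.1])

u = 25.0
excesses = losses_mln[losses_mln > u] - u

# Non-parametric bootstrap: refit the GPD shape on resampled excesses.
boot_shapes = []
for _ in range(1000):
    resample = rng.choice(excesses, size=len(excesses), replace=True)
    c, _, _ = genpareto.fit(resample, floc=0)
    boot_shapes.append(c)

lo, hi = np.percentile(boot_shapes, [2.5, 97.5])
print(f"95% bootstrap interval for the shape parameter: [{lo:.2f}, {hi:.2f}]")
```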
In conclusion, we have presented a simplified yet practical application of EVT using Belgian NatCat claims data, focusing on the variable “insured loss per event.”
One of our key findings is that the July 2021 flood may not be as rare as previously assumed—a conclusion supported by both our analysis and the return period estimate provided by the Belgian Royal Meteorological Institute.
EVT is not just a theoretical tool: it is a practical and transparent method for understanding and managing the financial impact of extreme events. It is a powerful statistical framework that provides (re)insurers with valuable insights into their NatCat risk, complementing traditional third-party catastrophe models.