1. Introduction
Ultimate loss ratio estimates change over time. The initial loss ratio estimate that emerges from the pricing analysis for a tranche of policies soon gives way to a new estimate as time passes and claims begin to emerge (or not). By the time all claims have been paid, the loss ratio is likely to have been re-estimated many times. The focus of this paper is on how to model the future revisions of these ultimate loss ratio estimates. We illustrate the approach using loss ratio estimates based on chain ladder and Bornhuetter-Ferguson methods underpinned by a simple stochastic model described by Hayne (1985).
There appears to be little, if any, actuarial literature on the subject of behavior of an ultimate loss ratio estimate between the time when it is made and the time when its final value becomes known, i.e., the point at which all claims have been paid. Various authors have sought to address uncertainty in the ultimate loss ratio estimate, but generally from the perspective of a single point in time.
For example, Hayne (1985) proposed a lognormal model of loss development that supports the construction of confidence intervals around the ultimate loss ratio estimate.[1] Kelly (1992) and Kreps (1997) also used a lognormal framework to explore issues of parameter estimation and parameter uncertainty, respectively. Hodes, Feldblum, and Blumsohn (1996) used a slightly different lognormal development model to quantify the uncertainty in workers compensation reserves. Mack, Venter, and Zehnwirth have all written extensively about stochastic modeling of the loss development process.[2] Others, including Van Kampen (2003), Wacek (2005), and the American Academy of Actuaries Property and Casualty Risk-Based Capital Task Force (1993), have sought to quantify the uncertainty in the ultimate loss ratio estimate used in pricing and reserving applications directly, without reference to the loss development process. The question on which all of these authors focused their attention is the potential variation in the final loss ratio at ultimate compared to the current ultimate loss ratio estimate, with no reference to how the ultimate loss ratio estimate might vary at intermediate points in time.
In contrast, in his acclaimed paper on solvency measurement, Butsic (1992) observed that loss estimates change in their march through time. He recognized that they, like stock prices, are governed by a diffusion process, a type of continuous stochastic process with a time-dependent probability structure. However, he did not propose a model of this stochastic process.
How ultimate loss ratio estimates change in the future depends in part on the method used to make the estimates. In this paper we assume that loss ratio estimates are derived from a consistently applied estimation process with minimal subjective overriding of the indicated result. We model the behavior of loss ratio estimates using stochastic versions of two loss development methods: the chain ladder method and the Bornhuetter-Ferguson method, both using paid development data. To model chain ladder estimates, we combine Hayne’s and Butsic’s ideas to synthesize a lognormal diffusion model for the path of the ultimate loss ratio. Then we adapt that model to the Bornhuetter-Ferguson method.
This conceptual framework, which could easily be adapted to handle other loss development models, provides actuaries with the means to give their clients more information about how much their loss ratio or reserve estimates may fluctuate from period to period. As such, it can be a useful tool for managing expectations about the variability of loss reserve estimates. It also has potential application in a number of other areas of actuarial analysis, as we will discuss later.
The paper comprises five sections, the first being this introduction. In Section 2 we outline Hayne’s lognormal model of chain ladder loss development and illustrate its application using industry Private Passenger Auto Liability data from the 2004 Schedule P. We illustrate the main benefit of a stochastic model for loss development, namely, the ability to measure the uncertainty in loss development factors and in the ultimate loss ratio estimate.
In Section 3 we discuss the effect of future loss emergence on future ultimate loss ratio estimates. We show how to use information implicit in Hayne’s model to determine the distribution of future estimates derived from our stochastic versions of the chain ladder and Bornhuetter-Ferguson methods, with particular attention to the loss ratio estimate one year out. We again use industry Private Passenger Auto Liability data to illustrate the process.
In Section 4 we adjust Hayne’s model to allow for parameter uncertainty and illustrate the effect. Because the adjusted distribution does not have the multiplicative properties of the lognormal, we illustrate the use of Monte Carlo simulation to model the distribution of future ultimate loss ratio estimates.
In Section 5 we conclude with an outline of potential applications of the framework for future ultimate loss ratio estimates in loss reserving and risk-based capital applications.
2. Hayne’s lognormal loss development model
Hayne presented two models of chain ladder loss development: one that assumed that development is independent from one period to the next, and a second one that relaxed the independence assumption. We will adopt the first model (and henceforth refer to it simply as “Hayne’s model”). Kelly (1992) argued that independence is more plausible for paid loss development than for case-incurred development. We will use paid losses to illustrate our framework, but a good case can also be made for its application to ultimate loss ratio estimates based on case-incurred losses, even if all of the underlying assumptions are not always met.
Hayne’s model is quite simple. He assumed that age-to-age development factors are lognormally distributed. The product of independent lognormal random variables is also lognormal, which implies that age-to-ultimate loss development factors are lognormal. Because the product of a constant and a lognormal random variable is lognormal, if we are given the cumulative paid loss ratio at any age and the estimated parameters of the matching age-to-ultimate factor, we can determine the parameter estimates of the ultimate loss ratio. Using these parameters we can estimate the expected loss ratio (which we will take as the “best” estimate) as well as confidence intervals around that estimate.
The lognormal parameters μ and σ of the age-to-age factors can be estimated by a variety of methods. Hayne used (and we also prefer) the unbiased estimators
\[ \begin{array}{l} \bar{y}=\frac{1}{n} \sum_{i=1}^{n} y_{i}=\frac{1}{n} \sum_{i=1}^{n} \ln \left(x_{i}\right) \quad \text { and } \\ s^{2}=\sum_{i=1}^{n} \frac{\left(y_{i}-\bar{y}\right)^{2}}{n-1} \end{array} \]
for μ and σ², respectively, where the \(x_{i}\) are the observed age-to-age factors.[3] \(\bar{y}\) is also a maximum likelihood estimator.
2.1. Illustration of model parameter estimation
We illustrate the parameter estimation for Hayne’s model using the real loss development data presented in Tables 1 and 2. Table 1 shows industry aggregate Schedule P net paid loss development data for Private Passenger Auto Liability for accident years 1995 through 2004 from the 2004 Annual Statement[4] together with the associated paid loss age-to-age development factors. The paid loss ratios at age one year are also included in the development factor table. Table 2 shows the natural logarithms of the age-to-age factors and the age one year paid loss ratios. The rows labeled “Mean” and “S.D.” in Table 2 show the unbiased estimators for μ and σ, respectively, given the data in the body of the column.[5]
For example, in Table 2 the mean and standard deviation of the natural logarithms of the observed age 1 to 2 development factors are 0.569 and 0.016, respectively. If we set μ = 0.569 and σ = 0.016,[6] these parameter estimates for prospective age 1 to 2 development imply a lognormal mean, defined as E(x) = exp[μ + 0.5σ²], of 1.767, which matches the mean loss development factor calculated by the traditional method in Table 1. The same is true for all of the other age-to-age factors. Similarly, the parameter estimates for the age one paid loss ratio are −1.246 and 0.069 for μ and σ, respectively, which imply a lognormal mean of 28.8%. This, too, matches the mean age one paid loss ratio shown in Table 1.[7]
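The estimation step can be sketched in a few lines of Python. This is only an illustration: the function names `fit_lognormal` and `lognormal_mean` are our own, and the sample factors are made-up values rather than the Table 1 data. The last two lines use the parameter values quoted above.

```python
import math
import statistics

def fit_lognormal(factors):
    """Return (mu, sigma): mean and sample standard deviation (n - 1 divisor) of ln(x_i)."""
    logs = [math.log(x) for x in factors]
    return statistics.mean(logs), statistics.stdev(logs)

def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution: exp(mu + 0.5 * sigma^2)."""
    return math.exp(mu + 0.5 * sigma ** 2)

# Hypothetical age 1 to 2 factors, NOT the Schedule P values in Table 1.
sample_factors = [1.74, 1.78, 1.76, 1.79, 1.77]
mu_hat, sigma_hat = fit_lognormal(sample_factors)

# Parameter values quoted in the text.
mean_1_to_2 = lognormal_mean(0.569, 0.016)      # about 1.767
mean_paid_age1 = lognormal_mean(-1.246, 0.069)  # about 0.288
```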
The parameter estimates for the prospective age-to-age factors can be combined using the multiplicative property of lognormal distributions to determine parameter estimates for prospective age-to-ultimate factors. The product of n independent lognormal random variables with respective parameter sets \((\mu_{i}, \sigma_{i})\) is a lognormal random variable with parameters
\[ \begin{array}{l} \mu=\sum_{i=1}^{n} \mu_{i} \quad \text { and } \\ \sigma=\left(\sum_{i=1}^{n} \sigma_{i}^{2}\right)^{1 / 2}. \end{array} \]
For example, treating age 10 as ultimate, in Table 2 the μ parameter estimate for the age 7 to ultimate development factor is the sum of the mean age-to-age factor natural logarithms for ages 7 to 8, 8 to 9, and 9 to 10: 0.005 + 0.003 + 0.001 = 0.009. The corresponding σ parameter estimate is the square root of the sum of the variances of the natural logarithms of the same age-to-age factors: \(\left(\sigma_{7-8}^{2}+\sigma_{8-9}^{2}+\sigma_{9-10}^{2}\right)^{1 / 2}\) = 0.001. Note that the lognormal means (labeled “LN Fit LDFs” in Table 2) implied by these age-to-ultimate parameters match the age-to-ultimate development factors shown in Table 1.
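A minimal sketch of the multiplicative rule follows. The helper name `combine_lognormal` is ours. The μ values are the ones quoted above for ages 7 to 8, 8 to 9, and 9 to 10; the σ values are hypothetical placeholders, since the individual figures are not quoted in the text.

```python
import math

def combine_lognormal(params):
    """Parameters of a product of independent lognormals: mus add, variances add."""
    mu = sum(m for m, _ in params)
    sigma = math.sqrt(sum(s ** 2 for _, s in params))
    return mu, sigma

# (mu, sigma) per age-to-age period; the sigma values are hypothetical.
age_to_age = [(0.005, 0.0008), (0.003, 0.0005), (0.001, 0.0003)]
mu_7_ult, sigma_7_ult = combine_lognormal(age_to_age)

# Lognormal mean of the age 7 to ultimate factor implied by these parameters.
ldf_7_ult = math.exp(mu_7_ult + 0.5 * sigma_7_ult ** 2)
```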
The ultimate chain ladder loss ratio estimates indicated by this analysis as of the end of 2004 for accident years 1995 through 2004 are summarized in Table 3. In this example, the lognormal loss development model produces the same loss ratio estimates as the traditional deterministic chain ladder loss development method. If we were interested only in these mean estimates, the traditional approach would suffice. However, we also want to measure the uncertainty in the loss ratio estimates, and for that purpose the richer lognormal model is superior.
2.2. Measurement of loss development uncertainty
If we assume μ = ȳ and σ = s based on the data for each age-to-age development period, we can calculate the lower and upper bounds of a two-sided 95% confidence interval for prospective age-to-age factors as exp[ȳ − N⁻¹(97.5%) · s] and exp[ȳ + N⁻¹(97.5%) · s], respectively, where N⁻¹(97.5%) is the value of the inverse standard normal cdf at a cumulative probability of 97.5% (approximately 1.96).[8] Similarly, using the parameter estimates for the age-to-ultimate factors, we can also determine confidence intervals for age-to-ultimate factors. We have tabulated these 95% confidence intervals based on the industry Private Passenger Auto Liability Schedule P data as of the end of 2004 in Table 4.[9]
Table 4 indicates that the age 1 to 2 development factor, which has an estimated mean of 1.767, should fall within a range of 1.710 to 1.824 95% of the time. The age 1 to ultimate development factor, which has an estimated mean of 2.508, can be expected to fall within a range of 2.423 to 2.595 95% of the time. Given the accident year 2004 paid loss ratio of 26.6% at age 1, these confidence intervals imply a paid loss ratio range at age 2 of 45.5% to 48.5% (47.0% ± 1.5%) and an ultimate loss ratio range of 64.4% to 69.0% (66.7% ± 2.3%).[10]
As we would expect, the development factors for more mature accident years have tighter confidence intervals. For example, the age 5 to 6 factor, which in a year end 2004 analysis would be applicable to accident year 2000, has an estimated mean of 1.020 and a 95% confidence range of 1.018 to 1.022, implying that 95% of the time the accident year 2000 paid loss ratio of 76.7% as of the end of 2004 will develop to a paid loss ratio of 78.1% to 78.4% by the end of 2005, a range of 0.3 points. The 95% confidence interval for the age 5 to ultimate factor, which has an estimated mean of 1.039, is a range of 1.034 to 1.043. That implies an ultimate loss ratio range of 79.3% to 80.0%, or 0.7 points.
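The interval calculation can be sketched as follows, using Python's standard-library `NormalDist.inv_cdf` for the normal quantile. The function name `lognormal_interval` is ours. Because the inputs are the rounded parameters quoted in the text (μ = 0.569, σ = 0.016), the bounds may differ from Table 4 in the third decimal place.

```python
import math
from statistics import NormalDist

def lognormal_interval(mu, sigma, level=0.95):
    """Two-sided interval exp(mu -/+ z * sigma) for a lognormal variable."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)

# Age 1 to 2 development factor, rounded parameters from the text.
low, high = lognormal_interval(0.569, 0.016)

# Implied range for the accident year 2004 paid loss ratio at age 2.
paid_age1 = 0.266
paid_age2_range = (paid_age1 * low, paid_age1 * high)
```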
All of these development factor, loss ratio, and confidence interval estimates are as of the end of 2004. They are all subject to change as new information in the form of actual future loss emergence becomes available. In the next section we will show how to use information implicit in Hayne’s approach to model the effect of future loss emergence on these estimates.
3. A model for future ultimate loss ratio estimates
Any estimate of the ultimate loss ratio for a particular accident year is quickly made obsolete by subsequent actual loss emergence. Because of this rapid obsolescence, the ultimate loss ratio must be re-estimated periodically in light of the loss development in the period since the previous evaluation. That loss development affects the new estimate in two ways.
3.1. Sources of variation in future loss ratio estimates
First, the actual accident year loss emergence replaces the expected emergence in the loss ratio projection. For example, in Table 3 the Private Passenger Auto Liability accident year 2004 ultimate loss ratio of 66.7%, estimated as of the end of 2004, was determined by applying an age-to-ultimate factor of 2.508 to the paid loss ratio of 26.6%. That age-to-ultimate factor reflected an expected age 1 to 2 development factor of 1.767 combined with an age 2 to ultimate factor of 1.420.
It is likely that actual age 1 to 2 loss development will vary from the expected. If, for example, the actual accident year 2004 emergence during 2005 (from age 1 to 2) corresponds to a development factor of 1.75, then in the ultimate loss ratio analysis conducted at the end of 2005 this actual development factor will replace the expected development factor of 1.767. If the age 2 to ultimate factor remains unchanged at 1.420, the chain ladder estimate of the ultimate loss ratio will become 26.6% × 1.75 × 1.42 = 66.1%. Assuming an expected ultimate loss ratio of 66.7%, the Bornhuetter-Ferguson loss ratio estimate will become 26.6% × (1.75 − 1.767) + 26.6% × 1.767 × 1.42 = 66.3%.[11],[12]
Of course, loss emergence with respect to older accident years might cause a revision in the prospective age 2 to ultimate factor. This potential for tail factor revision is a second source of uncertainty. For example, suppose the actual age 2 to 3 development on accident year 2003 during 2005 corresponds to a factor of 1.210. If that factor is averaged with the previous eight-point mean of 1.198 determined in Table 1 (using loss development data through 2004), the result is a revised age 2 to 3 development factor of 1.199. Assuming the same process is repeated for the other development periods, a revised age 2 to ultimate factor will be obtained. If the resulting age 2 to ultimate factor is 1.425, the revised chain ladder ultimate loss ratio estimate is given by 26.6% × 1.75 × 1.425 = 66.3%, a reduction of 0.4% from the year end 2004 ultimate loss ratio estimate of 66.7%. The revised Bornhuetter-Ferguson estimate in this case is given by 26.6% × (1.75 − 1.767) + 26.6% × 1.767 × 1.425 = 66.5%.
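The arithmetic of this scenario can be laid out in a short script. The function names are ours; the inputs are the figures used in the example (paid loss ratio 26.6%, expected factor 1.767, actual factor 1.75, tail factors 1.42 and 1.425).

```python
def chain_ladder(paid, actual_factor, tail):
    """Paid loss ratio developed by the actual factor, then by the tail factor."""
    return paid * actual_factor * tail

def bornhuetter_ferguson(paid, actual_factor, expected_factor, tail):
    """Actual minus expected emergence, plus expected emergence carried to ultimate."""
    return paid * (actual_factor - expected_factor) + paid * expected_factor * tail

paid, expected, actual = 0.266, 1.767, 1.75

# Tail factor unchanged at 1.42.
cl_same_tail = chain_ladder(paid, actual, 1.42)
bf_same_tail = bornhuetter_ferguson(paid, actual, expected, 1.42)

# Tail factor revised to 1.425.
cl_new_tail = chain_ladder(paid, actual, 1.425)
bf_new_tail = bornhuetter_ferguson(paid, actual, expected, 1.425)

for label, value in [("CL, tail 1.420", cl_same_tail), ("BF, tail 1.420", bf_same_tail),
                     ("CL, tail 1.425", cl_new_tail), ("BF, tail 1.425", bf_new_tail)]:
    print(f"{label}: {value:.1%}")
```

The Bornhuetter-Ferguson figures move less than the chain ladder figures because only the difference between actual and expected emergence enters the estimate, without being leveraged by the tail factor.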
The foregoing is an illustration of just one scenario of the loss development that might occur in 2005 and its effect on the ultimate loss ratio estimate. We can use information developed in Hayne’s framework to model these two effects generally.
3.2. Modeling the first source of variation: Accident year development
The first effect is captured by the lognormal random variable estimated for the next year of development with respect to the accident year under review. For example, for accident year 2004, which at the end of 2004 is age 1, the lognormal distribution with μ = 0.569 and σ = 0.016 models age 1 to 2 paid development. Then, since the age 1 paid loss ratio is 26.6%, the paid loss ratio distribution at age 2 is lognormal with parameters μ = ln26.6% + 0.569 = −0.756 and σ = 0.016, implying a mean of 47.0%.
If the mean age 2 to ultimate factor (the tail factor) of 1.42 does not change, then the distribution of the revised chain ladder ultimate loss ratio estimate at age 2 (i.e., one year out) has lognormal parameters μ = ln26.6% + 0.569 + ln1.42 = −0.406 and σ = 0.016. The random variable \(x_{CL}\) for this chain ladder estimate can be expressed as a function of the paid loss ratio random variable \(x_{P}\) and the expected value of the mean tail factor:
\[ x_{C L}=x_{P} \cdot E(\text { tail }) \tag{3.1} \]
The random variable \(x_{BF}\) for the comparable Bornhuetter-Ferguson estimate is a shifted version of the random variable for the age 2 paid loss ratio:
\[ x_{B F}=x_{P}-E\left(x_{P}\right)+E\left(x_{P}\right) \cdot E(t a i l) \tag{3.2} \]
As defined by Formulas 3.1 and 3.2, both \(x_{CL}\) and \(x_{BF}\) reflect the uncertain impact of accident year 2004 development during 2005 on the updated ultimate loss ratio estimate that will be made at the end of 2005, but do not reflect the potential impact of tail factor revision.
3.3. Modeling the second source of variation: Tail factor revision
The second effect, due to tail factor revision, is captured by measuring the effect of the lognormal loss development modeled for the next year on the existing mean age-to-age and age-to-ultimate factors. For example, the mean age 2 to 3 development factor shown in Table 1 is 1.198. This is a mean of eight data points. What will be the effect on the mean of adding a ninth data point (representing 2005 development on accident year 2003), given that it will arise from a lognormal distribution with parameters μ = 0.181 and σ = 0.005 (and mean of 1.198)? The uncertain ninth data point will contribute one-ninth weight to the revised mean age-to-age factor. There is no uncertainty about the existing mean age 2 to 3 factor; it is a constant. Therefore, the σ parameter of the distribution of the revised mean age 2 to 3 factor one year out, given an additional year of actual development, is given by 0.005/9 ≈ 0.001. The μ parameter is given by ln1.198 − 0.5 · 0.001² = 0.181. We can use the same process to estimate μ and σ parameters for the comparable distributions of mean age-to-age factors one year out for all such factors comprising the development tail.[13] We can then combine the revised mean age-to-age factor parameters to determine the parameters of the revised mean age-to-ultimate factor distributions. See Table 5 for a tabulation of the parameters of these revised mean age-to-age and age-to-ultimate distributions for all ages. The σ of the distributions of revised factors for age 3 to 4 and beyond is less than 0.0005 (and thus displayed as 0.000 in Table 5), indicating that for Private Passenger Auto Liability, the uncertainty arising from the potential for tail factor revision is very small. This is confirmed by the very narrow confidence intervals.
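As a rough numerical check on the one-ninth weighting argument (not the calculation behind Table 5), the following sketch simulates the ninth data point and measures the dispersion of the logarithm of the revised nine-point mean. The seed and trial count are arbitrary choices of ours.

```python
import math
import random
import statistics

random.seed(2004)  # arbitrary seed, for reproducibility only

existing_mean = 1.198             # mean of the eight observed age 2 to 3 factors
n_existing = 8
mu_new, sigma_new = 0.181, 0.005  # lognormal parameters of the ninth data point

log_revised = []
for _ in range(20_000):
    new_point = random.lognormvariate(mu_new, sigma_new)
    revised_mean = (n_existing * existing_mean + new_point) / (n_existing + 1)
    log_revised.append(math.log(revised_mean))

mu_revised = statistics.mean(log_revised)      # close to ln(1.198)
sigma_revised = statistics.stdev(log_revised)  # close to sigma_new / 9
```

The simulated σ should come out near 0.005/9 ≈ 0.0006, which rounds to the 0.001 shown in the text.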
3.4. Modeling the revised loss ratio estimate one year out
We can now combine these two effects to determine the distribution of the revised ultimate loss ratio estimate that will be determined in one year’s time based on the updated loss development experience that will then be available.
To determine the distribution of the revised chain ladder estimate, we start with the actual accident year paid loss ratio, which we then multiply by the lognormal random variables for (1) the age-to-age factor for the next year of development (obtaining the random variable \(x_{P}\) of the paid loss ratio one year out) and (2) the revised age-to-ultimate factor beyond the next year of development. Using accident year 2004 as an example, as of the end of 2004 the ultimate loss ratio estimate is 66.7%, which has been determined by multiplying the paid loss ratio of 26.6% first by an age 1 to 2 factor of 1.767 and then by an age 2 to ultimate factor of 1.420. In order to model the ultimate loss ratio estimate one year later, at the end of 2005, we replace the constant age 1 to 2 factor of 1.767 with the lognormal random variable with parameters μ₁ = 0.569 and σ₁ = 0.016. In addition, we replace the constant age 2 to ultimate factor of 1.420 with the lognormal random variable with parameters μ₂ = 0.350 and σ₂ = 0.001. The expected values of these two lognormal random variables are 1.767 and 1.420, respectively. The product of the paid loss ratio (a constant) and these two lognormal random variables is lognormal with parameters μ = lnP + μ₁ + μ₂ and σ = \(\left(\sigma_{1}^{2}+\sigma_{2}^{2}\right)^{1 / 2}\), where P represents the actual paid loss ratio at the end of 2004, which, in this example, implies μ = −1.325 + 0.569 + 0.350 = −0.406 and σ = \(\left(\sigma_{1}^{2}+\sigma_{2}^{2}\right)^{1 / 2}\) = 0.017.
Generally, we can express the random variable \(x_{CL}\) as the product of the two lognormal random variables \(x_{P}\) and tail, representing the paid loss ratio one year out and the mean tail factor:
\[ x_{C L}=x_{P} \cdot \text { tail } \tag{3.3} \]
Now we are in a position to determine confidence intervals for the revised chain ladder ultimate loss ratio estimate at the end of 2005. The endpoints of the two-sided 95% confidence interval are given by exp[μ − N⁻¹(97.5%) · σ] and exp[μ + N⁻¹(97.5%) · σ], which imply an estimated loss ratio range one year out for accident year 2004 of 64.5% to 68.8%, or approximately 66.7% ± 2.1%. Confidence intervals for ultimate loss ratio estimates one year out for the other accident years can be estimated in the same way and are tabulated together with those for accident year 2004 in Table 6.
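The one-year-out chain ladder calculation for accident year 2004 can be sketched as below. The inputs are the rounded parameters quoted in the text, so the results agree with the figures above only to within rounding; the variable names are ours.

```python
import math
from statistics import NormalDist

paid = 0.266                # accident year 2004 paid loss ratio at age 1
mu1, sigma1 = 0.569, 0.016  # age 1 to 2 factor
mu2, sigma2 = 0.350, 0.001  # revised mean age 2 to ultimate factor

# Product of a constant and two independent lognormals (Formula 3.3).
mu = math.log(paid) + mu1 + mu2
sigma = math.sqrt(sigma1 ** 2 + sigma2 ** 2)

z = NormalDist().inv_cdf(0.975)
lower = math.exp(mu - z * sigma)
upper = math.exp(mu + z * sigma)
expected = math.exp(mu + 0.5 * sigma ** 2)

print(f"expected {expected:.1%}, 95% interval {lower:.1%} to {upper:.1%}")
```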
To determine the distribution of the comparable revised Bornhuetter-Ferguson estimate, we replace the constant E(tail) in Formula 3.2 with the random variable tail:
\[ x_{B F}=x_{P}-E\left(x_{P}\right)+E\left(x_{P}\right) \cdot \text { tail } \tag{3.4} \]
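Formula 3.4 mixes a lognormal variable with a scaled lognormal variable, so one practical way to explore its distribution is by simulation. The sketch below is a minimal version under stated assumptions: it uses the rounded accident year 2004 parameters quoted earlier, treats \(x_{P}\) and tail as independent, and picks percentiles by simple order statistics. The seed and helper name are ours.

```python
import math
import random

random.seed(12345)  # arbitrary seed, for reproducibility only

paid = 0.266
mu_p, sigma_p = math.log(paid) + 0.569, 0.016  # paid loss ratio at age 2, x_P
mu_t, sigma_t = 0.350, 0.001                   # revised mean tail factor
expected_xp = math.exp(mu_p + 0.5 * sigma_p ** 2)

def simulate_bf():
    """One draw of Formula 3.4: x_P - E(x_P) + E(x_P) * tail."""
    x_p = random.lognormvariate(mu_p, sigma_p)
    tail = random.lognormvariate(mu_t, sigma_t)
    return x_p - expected_xp + expected_xp * tail

trials = sorted(simulate_bf() for _ in range(10_000))
lower = trials[int(0.025 * len(trials))]      # approximate 2.5th percentile
upper = trials[int(0.975 * len(trials)) - 1]  # approximate 97.5th percentile
```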
We can also determine confidence intervals for the revised Bornhuetter-Ferguson loss ratio estimate at the end of 2005. However, because the sum of two lognormal random variables, in this case \(x_{P}\) and tail, is not expressible in closed distributional form, the confidence intervals must be estimated using Monte Carlo simulation. The results of a simulation involving 10,000 trials are shown in Table 7. For each of the trials we randomly selected observations from the distributions of \(x_{P}\) and tail, assuming independence, and combined them according to Formula 3.4 to arrive at a simulated Bornhuetter-Ferguson estimate. After tabulating the results of 10,000 such trials, we determined the lower and upper bounds of the 95% confidence interval of the loss ratio estimate by identifying the 2.5 percentile and the 97.5 percentile of the trial values. Not surprisingly, the 95% confidence intervals for the revised Bornhuetter-Ferguson estimates are narrower in every case than the revised chain ladder estimates.
3.5. Modeling the revised loss ratio estimate: Other time horizons
We can extend this process to longer time horizons and determine the distribution of the ultimate loss ratio estimate two years out, three years out, and so on, until the time horizon encompasses the point when all claims are expected to have been settled. The modeling is conducted in essentially the same way as for the one-year time horizon. For example, in the case of a two-year horizon, the first source of uncertainty (accident year development) is modeled using the distribution of the age j to j + 2 development factor, where j is the age in years of the accident year under review. The second source of uncertainty (potential tail factor revision) is modeled by reference to the potential effect of two additional development data points on the mean tail factor for age j + 2 to ultimate development. The analysis of a three-year time horizon focuses on accident year development from age j to j + 3 and the tail factor from j + 3 to ultimate, but is otherwise identical to that for the one-year and two-year time horizons. The analysis of the ultimate loss ratio estimate at points further in the future proceeds in the same way.
Alternatively, we can model the path of the ultimate loss ratio estimate as a succession of annual revaluations. Figure 1 illustrates this by plotting the results of one simulation of the path of the accident year 2004 loss ratio estimates through time for estimates determined from both chain ladder and Bornhuetter-Ferguson methods. It represents just one path among many possibilities. The simulation was performed from the vantage point of the end of 2004. As such, it incorporates everything we know about actual loss development through that time as well as what we can infer about the structure of future development. We started with the accident year 2004 loss ratio estimate as of the end of 2004, which was 66.7%. Then, based on one random simulation of loss development during calendar year 2005, we made new chain ladder and Bornhuetter-Ferguson estimates of the ultimate loss ratio as of the end of 2005. We repeated the process for calendar years 2006 through 2013, using the cumulative loss development through each valuation date. Figure 1 is a plot of the results. A complete description of the probability structure of the path can be built up from a simulation involving a large number of random trials, or, in the chain ladder case, directly from the properties of the lognormal distribution.
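A single simulated path of this kind can be sketched as follows. This is a simplified illustration, not the simulation behind Figure 1: it ignores tail factor revision (the expected factors stay fixed at their initial values), and only the first two parameter pairs are quoted in the text; the rest are hypothetical placeholders. The function names are ours.

```python
import math
import random

# (mu, sigma) of each age-to-age factor, ages 1-2 through 9-10.
# Only the first two pairs are quoted in the text; the others are hypothetical.
PARAMS = [(0.569, 0.016), (0.181, 0.005), (0.085, 0.003), (0.046, 0.002),
          (0.020, 0.001), (0.010, 0.001), (0.005, 0.001), (0.003, 0.0005),
          (0.001, 0.0003)]

def expected_factor(mu, sigma):
    return math.exp(mu + 0.5 * sigma ** 2)

def simulate_path(paid, params, rng):
    """Return chain ladder and Bornhuetter-Ferguson estimates at each year end."""
    means = [expected_factor(m, s) for m, s in params]
    initial = paid * math.prod(means)
    cl_path, bf_path = [initial], [initial]
    expected_paid = paid
    for k, (mu, sigma) in enumerate(params):
        paid *= rng.lognormvariate(mu, sigma)  # simulated actual emergence
        expected_paid *= means[k]              # emergence expected at the outset
        tail = math.prod(means[k + 1:])        # remaining expected development
        cl_path.append(paid * tail)
        bf_path.append(paid - expected_paid + expected_paid * tail)
    return cl_path, bf_path

cl, bf = simulate_path(0.266, PARAMS, random.Random(1))
```

Both paths start at the same estimate and converge to the same final paid loss ratio once no further development is expected; in between, the Bornhuetter-Ferguson path reflects each deviation from expected emergence without leveraging it by the remaining tail factor.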
In practice, there might not be much benefit in determining the distribution of the chain ladder ultimate loss ratio estimate for time horizons between one year and the ultimate horizon (when all claims have been settled), at least for Private Passenger Auto Liability.[14] We see this in Table 8, the top half of which compares the 95% confidence intervals for the accident year 1995 through 2004 chain ladder loss ratio estimates one year out with confidence intervals for the accident year loss ratio estimates over the ultimate time horizon. If we contrast the 95% confidence interval for accident year 2004 for the one-year horizon with the 95% confidence interval for the chain ladder loss ratio estimate over the ultimate time horizon, we can see that the contribution from the out years is dwarfed by the contribution from the next 12 months. The 95% confidence interval for the ultimate time horizon indicates a range for the accident year 2004 loss ratio of 66.7% ± 2.3%, which is barely wider than the range for just one year out. This is true not only for accident year 2004, but also holds for accident years 1995 through 2003.
For example, the accident year 2003 confidence interval of approximately 67.8% ± 0.7% for a one-year time horizon is almost as wide as that for the time horizon to ultimate of 67.8% ± 0.8%. For all of the older accident years, the first year of future development accounts for more than half of the variation associated with the ultimate time horizon.
This phenomenon is not confined to loss ratio estimates over short vs. longer time horizons. The same effect is also seen in other situations not related to insurance, where variability is a function of time. For example, given the common assumption that future stock price movements are lognormally distributed and independent, the 95% confidence interval for a stock price one year out, given constant annualized volatility of σ = 20% and an expected value of $66.70, is $45.07 to $98.71, a range of $53.64. Assuming the same expected value of $66.70, the 95% confidence interval for the stock price two years out is $38.31 to $116.11, a range of $77.80. The confidence interval range for the one-year time horizon stock price is 69% of the price range for the two-year time horizon. The reason for the disproportionate impact of the first period is that the confidence interval is not a linear function of σ but rather of \(\sigma \sqrt{t}\), where t represents the time lag in years. In the case of chain ladder ultimate loss ratio estimation, where the age-to-age σ typically declines as the accident year ages, this effect can be even more pronounced.
Turning now to the Bornhuetter-Ferguson estimates, which are inherently less variable, the effect is smaller but still evident. The bottom half of Table 8 compares the 95% confidence intervals for accident year 1995 through 2005 loss ratio estimates one year out with the confidence intervals for the loss ratio estimates over the ultimate time horizon. In the Private Passenger Auto Liability example considered here, the 95% confidence interval for the accident year 2004 loss ratio estimate is approximately 66.7% ± 1.6%, which is about two-thirds of the range of the confidence interval for estimates at the ultimate time horizon. For all of the older accident years, as in the case of the chain ladder estimates, the first year of future development accounts for more than half of the variation associated with the ultimate time horizon.
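The stock price comparison can be reproduced with a few lines of Python. One assumption is ours: the interval is centered on $66.70 in log space (that is, ln 66.70 is used as the lognormal μ), which is the reading that reproduces the one-year figures quoted in the text.

```python
import math
from statistics import NormalDist

def price_interval(center, vol, years, level=0.95):
    """Interval center * exp(-/+ z * vol * sqrt(years))."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    spread = z * vol * math.sqrt(years)
    return center * math.exp(-spread), center * math.exp(spread)

one_year = price_interval(66.70, 0.20, 1)
two_year = price_interval(66.70, 0.20, 2)

# Width of the one-year interval relative to the two-year interval.
ratio = (one_year[1] - one_year[0]) / (two_year[1] - two_year[0])
```

Doubling the horizon multiplies the log-scale spread by √2 ≈ 1.41 rather than 2, which is why the first year accounts for well over half of the two-year range.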
3.6. Modeling the loss ratio estimate at inception
Up to this point we have focused on modeling the distribution of the ultimate loss ratio after losses have begun to emerge. However, there is no reason why we cannot extend essentially the same procedure backward to the inception of loss exposure at age 0. Indeed, the benefit of doing so is that we can obtain a complete model of the path of the ultimate loss ratio from inception to ultimate.
The main difference in the procedure is that the lognormal model for loss emergence between age 0 and 1 describes the behavior of the paid loss ratio rather than an age-to-age factor. The rest of the analysis is merely an application of Formula 3.3.
For example, assume for the sake of illustration that the age 1 paid loss ratios in Table 1 are lognormally distributed and reflect “on level” adjustments to the accident year 2005 level. The mean age 1 paid loss ratio is 28.8%, which we can take as an estimate of the 2005 “on level” age 1 paid loss ratio. The unbiased estimates of the parameters of the lognormal distribution representing the paid loss ratio at age 1 are μ = −1.246 and σ = 0.069. These parameters imply a lognormal mean paid loss ratio of 28.8% that matches the sample mean. The age 1 to ultimate development factor of 2.508 implies an ultimate loss ratio estimate at inception of 72.3%.
Applying the lognormal multiplicative rule described in Section 2, the parameters of the lognormally distributed ultimate loss ratio (at the ultimate time horizon) are μ = −1.246 + 0.919 = −0.327 and σ = \(\left(\sigma_{1}^{2}+\sigma_{2}^{2}\right)^{1 / 2}\) = 0.071, where σ₁ and σ₂ are the σ parameters of the age 1 paid loss ratio and the age 1 to ultimate factor, implying a 95% confidence interval of 62.8% to 82.9%, a range of 20.1%. The parameters of the ultimate loss ratio one year out are μ = −1.246 + 0.919 = −0.327 and σ = 0.069, obtained in the same way but with σ₂ taken from the revised mean age 1 to ultimate factor. The indicated 95% confidence interval is 63.0% to 82.6%, a range of 19.6%. These calculations are summarized in Table 6.
The comparable Bornhuetter-Ferguson estimate can be determined by applying Formula 3.4. Table 7 shows that the 95% confidence interval for the revised Bornhuetter-Ferguson estimate of the accident year 2005 loss ratio one year out is 68.6% to 76.4%, a range of 7.8%.
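The inception figures can be checked approximately with the rounded parameters quoted in the text. For the Bornhuetter-Ferguson interval, the sketch makes a simplifying assumption of ours: the tail factor is held at its expected value of 2.508, so the estimate is just a shifted paid loss ratio. This is not the Monte Carlo procedure behind Table 7, although it lands close to the quoted interval because the tail factor uncertainty is small.

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)

def interval(mu, sigma):
    """Two-sided 95% interval for a lognormal variable."""
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)

# Chain ladder: ultimate horizon and one year out.
cl_ultimate = interval(-0.327, 0.071)
cl_one_year = interval(-0.327, 0.069)

# Bornhuetter-Ferguson one year out, tail held constant (simplifying assumption).
mu_p, sigma_p = -1.246, 0.069
expected_paid = math.exp(mu_p + 0.5 * sigma_p ** 2)
expected_ultimate = expected_paid * 2.508
paid_low, paid_high = interval(mu_p, sigma_p)
bf_one_year = (paid_low - expected_paid + expected_ultimate,
               paid_high - expected_paid + expected_ultimate)
```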
4. Adjusting the model for parameter uncertainty
In Section 2 we explained that, given the observations \(x_{1}, \ldots, x_{n}\) arising from a lognormal process and the natural logarithms \(y_{1}, \ldots, y_{n}\) of the same observations (where \(y_{i}=\ln x_{i}\)), the mean \(\bar{y}\) and standard deviation s of the log-transformed sample are unbiased estimators of the lognormal process parameters μ and σ, respectively. The parameter selections μ = \(\bar{y}\) and σ = s define the lognormal distribution that best fits the data, using unbiasedness as the criterion for “best.”
However, while these are good estimates of the parameters, there is uncertainty about their true values. Fortunately, by combining information contained in the sample with results from sampling theory, it is possible to determine the mixed distribution ƒ(x) that reflects the probability weighted contribution of all of the potential parameter values.[15] Wacek (2005) showed that ƒ(x) defines a “log t” distribution[16] and in particular that the random variable y = lnx is Student’s t with n − 1 degrees of freedom, mean \(\bar{y}\) and variance s² · (n + 1)/n · (n − 1)/(n − 3).[17]
4.1. Log t confidence intervals
The bounds of the two-sided 95% confidence interval are given by exp(ȳ − T⁻¹ₙ₋₁(97.5%) · s · √((n + 1)/n)) and exp(ȳ + T⁻¹ₙ₋₁(97.5%) · s · √((n + 1)/n)), respectively, where T⁻¹ₙ₋₁(97.5%) is the value of the standard Student’s t distribution with n − 1 degrees of freedom corresponding to a cumulative probability of 97.5%.[18]
Two-sided 95% confidence intervals for Private Passenger Auto Liability age-to-age factors, based on the log t distribution, are shown in Table 9. Unfortunately, the log t distribution does not share the multiplicative property of the lognormal. As a result, we cannot specify the distribution of age-to-ultimate development factors in closed form. Instead, the age-to-ultimate factor distributions and related confidence intervals must be estimated using a Monte Carlo simulation procedure that determines the age-to-ultimate factor from the underlying age-to-age factors for each random trial.
In the top section of Table 9, we have tabulated the indicated log t 95% confidence intervals for age-to-age factors based on the industry Private Passenger Auto Liability 2004 Schedule P data, together with the ratios of these confidence interval bounds to the lognormal confidence interval bounds given in Table 4. In addition, we have tabulated the sample size n for each development period as well as T⁻¹ₙ₋₁(97.5%) and the degrees of freedom used in the calculations. At the risk of being seen as statistically less than rigorous, we set a minimum degrees of freedom value of 3 for purposes of calculating the confidence intervals to avoid using log t distributions with an undefined variance.
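The log t interval bounds can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's own calculation: the function name `log_t_interval` is ours, the Student's t quantile is supplied by the caller (2.306 is the standard tabulated 97.5th percentile for 8 degrees of freedom), and the inputs 0.569 and 0.016 are the rounded age 1 to 2 parameters quoted in the text.

```python
import math


def log_t_interval(ybar, s, n, t_quantile):
    """Return the bounds exp(ybar -/+ t_quantile * s * sqrt((n + 1) / n))."""
    scale = s * math.sqrt((n + 1) / n)
    return (math.exp(ybar - t_quantile * scale),
            math.exp(ybar + t_quantile * scale))


# Age 1 to 2 factor: n = 9 observations, hence 8 degrees of freedom.
lower, upper = log_t_interval(ybar=0.569, s=0.016, n=9, t_quantile=2.306)
print(f"age 1 to 2 log t interval: {lower:.3f} to {upper:.3f}")
```

With these rounded inputs the upper bound comes out at roughly 1.837, slightly below the 1.839 reported for Table 9; the gap is presumably due to rounding of the published parameters.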
The log t confidence intervals shown in Table 9 for age-to-age factors are very close to the lognormal confidence intervals given in Table 4. The largest difference is in the age 1 to 2 factor, where the upper bound of the log t interval is 1.839, which is only 0.8% larger than the lognormal upper bound of 1.824. The percentage differences for the other age-to-age factors are smaller.
In the lower section of Table 9, we have tabulated the 95% confidence intervals for age-to-ultimate factors indicated by a Monte Carlo simulation involving 10,000 trials. As was the case with the age-to-age factors, the differences between the log t confidence intervals and lognormal confidence intervals for the age-to-ultimate factors are quite small. The largest difference is in the age 1 to ultimate confidence interval, where the upper bound of the log t interval is 2.619. This is only 0.9% larger than the lognormal upper bound of 2.595. The percentage differences for the other age-to-ultimate factors are smaller. This suggests that, at least for Private Passenger Auto Liability at the industry level, the effect of parameter uncertainty is small enough that it can be ignored. However, that is probably not the case for individual insurers, particularly small ones, writing Private Passenger Auto Liability or for other lines of business.
4.2. Log t simulation of development factors
In the Monte Carlo simulation of age-to-ultimate factors, for each trial we randomly selected one age-to-age factor from each of the log t distributions representing development from age 1 to 2, age 2 to 3, . . . , age 9 to 10. Treating age 10 as ultimate, we then multiplied these age-to-age factors in the usual way to determine a set of age-to-ultimate factors for that trial. After the results of the 10,000 trials were tabulated, we determined the lower and upper bounds of the 95% confidence interval for each age-to-ultimate factor (age 1 to ultimate, age 2 to ultimate, etc.) by identifying the 2.5th percentile and the 97.5th percentile of the 10,000 trial values.
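The simulation procedure described above can be sketched as follows. Several choices here are our own assumptions and not the paper's: Student's t variates are generated as a standard normal divided by the square root of a scaled chi-square (a standard alternative to inverting the cdf), only four development periods are modeled, and the last two parameter rows in `PARAMS` are hypothetical placeholders. The first two rows use the rounded parameters quoted in the text.

```python
import math
import random

# (ybar, s, n) per development period. Rows 1 and 2 are quoted in the text;
# rows 3 and 4 are hypothetical placeholders for illustration only.
PARAMS = [(0.569, 0.016, 9), (0.181, 0.005, 8),
          (0.080, 0.004, 7), (0.040, 0.003, 6)]


def draw_log_t_factor(rng, ybar, s, n, min_df=3):
    """Draw one age-to-age factor from a log t distribution."""
    df = max(n - 1, min_df)  # floor on degrees of freedom, as in the text
    # Student's t variate built as Z / sqrt(chi-square / df).
    z = rng.gauss(0.0, 1.0)
    chi_square = rng.gammavariate(df / 2.0, 2.0)
    t = z / math.sqrt(chi_square / df)
    return math.exp(ybar + t * s * math.sqrt((n + 1) / n))


def simulate_age_to_ultimate(params, trials=10_000, seed=0):
    """Return one list of simulated age-to-ultimate factors per starting age."""
    rng = random.Random(seed)
    results = [[] for _ in params]
    for _ in range(trials):
        factors = [draw_log_t_factor(rng, *p) for p in params]
        cumulative = 1.0
        # Multiply backwards from the last period to build each factor.
        for k in range(len(factors) - 1, -1, -1):
            cumulative *= factors[k]
            results[k].append(cumulative)
    return results


def percentile_interval(values, lo=0.025, hi=0.975):
    """Approximate empirical percentiles taken from the sorted trial values."""
    ordered = sorted(values)
    return ordered[int(lo * len(ordered))], ordered[int(hi * len(ordered)) - 1]


intervals = [percentile_interval(v) for v in simulate_age_to_ultimate(PARAMS)]
for age, (low, high) in enumerate(intervals, start=1):
    print(f"age {age} to ultimate: {low:.3f} to {high:.3f}")
```

Fixing the seed makes the sketch reproducible. The factors for all starting ages are built from the same draws within a trial, which preserves the dependence between overlapping age-to-ultimate factors.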
To make the random age-to-age factor selections, we started with a random draw R from the uniform distribution defined on the interval (0, 1). Because R has a value between 0 and 1, it can be treated as though it is a cumulative probability. The number T⁻¹ₙ₋₁(R) that corresponds to a standard Student’s t cumulative probability of R is a random number from the standard Student’s t distribution with n − 1 degrees of freedom, which has a mean of zero and a variance of (n − 1)/(n − 3). More generally, the corresponding random number from the Student’s t distribution with n − 1 degrees of freedom, mean ȳ and variance s² · (n + 1)/n · (n − 1)/(n − 3) is given by ȳ + T⁻¹ₙ₋₁(R) · s · √((n + 1)/n), which corresponds to a random number of exp(ȳ + T⁻¹ₙ₋₁(R) · s · √((n + 1)/n)) from the related log t distribution. Substituting the appropriate values of μ for ȳ and σ for s, we obtain exp(μ + T⁻¹ₙ₋₁(R) · σ · √((n + 1)/n)) as the value of a randomly selected age-to-age factor.
Putting some numbers to it, a draw of R = 0.873 implies a random age 1 to 2 development factor from the corresponding log t with 8 degrees of freedom of exp(0.569 + 1.229 · 0.016 · √(10/9)) = 1.803.[19] If the next draw is R = 0.239, then the random age 2 to 3 factor, drawn from the corresponding log t with 7 degrees of freedom, is exp(0.181 + (−0.749) · 0.005 · √(9/8)) = 1.194. Random numbers corresponding to the other development periods are similarly obtained. Then the age 1 to ultimate factor, the age 2 to ultimate factor, age 3 to ultimate factor, and so on, are obtained by multiplication. Tabulation of these results completes the first trial. The process is repeated in the same way for 10,000 trials.
4.3. Log t simulation of future loss ratio estimates
Under conditions of parameter uncertainty, the distribution of future loss ratio estimates must also be modeled using Monte Carlo simulation. Each of the lognormal age-to-age development components identified in Section 3 must be replaced with corresponding log t components.
For example, to estimate the distribution of the updated chain ladder estimate of the accident year 2004 ultimate loss ratio at the end of 2005, given the year-end 2004 estimate of 66.7%, we tabulated 10,000 randomly obtained year-end 2005 loss ratio estimates. To determine each loss ratio estimate, we randomly selected from the log t distributions that represent the factors that contribute to the uncertainty in that estimate. For each trial we randomly selected one factor from the distribution of accident year 2004 development during 2005 and one factor from each of the age-to-age factor distributions that contribute to the revised tail factor. Then we multiplied all of these factors and the paid loss ratio as of year end 2004 to arrive at the ultimate loss ratio estimate for that trial.
This is illustrated in detail in Table 10 for one trial, where the simulated actual accident year 2004 age 1 to 2 development factor is 1.727 (compared to an expected factor of 1.767) and the revised tail factor is 1.418 (compared to an expected 1.420). The product of the year-end 2004 paid loss ratio and these two factors is the revised estimated ultimate loss ratio for accident year 2004 as of the end of 2005.
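The arithmetic for this single trial can be sketched as below. The year-end 2004 paid loss ratio of 26.6% is taken from the paper's footnote on the Bornhuetter-Ferguson estimate; the variable names are ours. The Bornhuetter-Ferguson line uses the conventional statement given in that footnote (emerged loss ratio plus expected ultimate loss ratio times one minus the reciprocal of the tail factor), with the expected ultimate loss ratio set to the year-end 2004 chain ladder estimate, so it should be read as an illustration and not as a reproduction of Table 10.

```python
# Year-end 2004 paid loss ratio for accident year 2004 (from the footnote).
paid_lr = 0.266

# Expected factors underlying the year-end 2004 estimate of about 66.7%.
expected_development, expected_tail = 1.767, 1.420
prior_estimate = paid_lr * expected_development * expected_tail

# Simulated factors for the single trial described in the text.
trial_development, trial_tail = 1.727, 1.418

# Revised chain ladder estimate: paid loss ratio times both trial factors.
chain_ladder = paid_lr * trial_development * trial_tail

# Conventional B-F statement: emerged LR + expected LR * (1 - 1/tail).
emerged_lr = paid_lr * trial_development
bornhuetter_ferguson = emerged_lr + prior_estimate * (1.0 - 1.0 / trial_tail)

print(f"prior estimate:     {prior_estimate:.1%}")
print(f"chain ladder trial: {chain_ladder:.1%}")
print(f"B-F trial:          {bornhuetter_ferguson:.1%}")
```

In this trial the lower-than-expected development pulls the chain ladder estimate down by about a point and a half, while the Bornhuetter-Ferguson estimate moves by less because only the emerged portion responds to the actual development.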
To arrive at approximate distributions of revised chain ladder ultimate loss ratio estimates for all of the accident years 1995 through 2004 as of the end of 2005, the process described in the preceding paragraph was repeated 10,000 times for each accident year.[20] The results of this process are summarized in Table 11, which, as the log t version of Table 8, compares the 95% confidence intervals for the accident year 1995–2004 loss ratio estimates one year out with the confidence intervals for the estimates over the ultimate time horizon. The chain ladder estimates are summarized in the top half of the table and the Bornhuetter-Ferguson estimates in the bottom half. As we observed in the lognormal case, much of the potential variation in the ultimate loss ratio estimates that is expected over the time horizon to ultimate is encompassed in the variation expected over a one-year time horizon. For example, the log t 95% confidence interval for the chain ladder estimate of the accident year 2004 loss ratio one year out of 66.7% ± 2.7% is nearly as wide as the 95% confidence interval of 66.7% ± 2.9% for the same loss ratio over the ultimate time horizon. Similarly, the accident year 2003 confidence interval for the chain ladder estimate of approximately 66.7% ± 0.9% for a one-year time horizon is also nearly as wide as that for the time horizon to ultimate of 67.8% ± 1.1%. For the older accident years, the proportion of the variation associated with the ultimate time horizon accounted for by the first year of future development is somewhat smaller, but the absolute size and significance of the confidence intervals for those years are much smaller.
Note that the log t confidence intervals are at least as wide in every case as the comparable lognormal confidence intervals shown in Table 7. In fact, in the case of the chain ladder estimates, for every accident year 1995–2004 the log t confidence intervals for the one-year time horizon are at least as wide as the lognormal confidence intervals for the ultimate time horizon!
5. Conclusions
There are a number of potential applications of the framework we have described for modeling future estimates of the ultimate loss ratio, ranging from loss reserving to pricing to analysis of risk-based capital. While a detailed discussion of these applications is beyond the scope of this paper, we will touch briefly on some examples.
5.1. Loss reserving
The framework presented in this paper gives reserve actuaries a way to manage their clients’ expectations. Reserve clients don’t like surprises and often express frustration that loss ratio or reserve estimates change significantly from one period to the next. We have shown in this paper that a large proportion of the potential variation in ultimate estimates can be present in the first year of future development. As we saw in the Private Passenger Auto Liability example we presented, this phenomenon is particularly pronounced when the estimates are determined using the chain ladder method, but it can also be present if the estimates are derived from the Bornhuetter-Ferguson approach. It seems likely that most reserve clients do not understand this phenomenon. Actuaries have done a good job in getting clients to understand that ultimate loss estimates are subject to large potential variation, but many clients seem to expect that variation to emerge only in the distant future, if at all.
We suggest that the uncertainty in loss ratio and reserve estimates be framed in terms of how these estimates might change at the next valuation by presenting the ultimate estimates together with confidence intervals consistent with the valuation time horizon. For example, if the next valuation will be in one year, then the results would be presented with one-year time horizon confidence intervals. Then, because the potential variation has been explained to them in advance, clients might be better able to accept the revised estimates produced at the next valuation. This framework also naturally facilitates the explanation of the reasons for estimate revisions in terms of the sources of variation. For example, how much of the revision is due to actual accident year development and how much is due to a tail factor revision caused by loss emergence on the older accident years?
While we have focused much of our discussion on historical accident years and thus implicitly on reserving, the framework extends easily to certain aspects of pricing and underwriting, where it can be used to assess risk load requirements and reinsurance risk transfer characteristics as well as to establish expectations for paid loss emergence during the first year after inception.
5.2. Risk-based capital
The framework described can also be applied to analysis of the issues outlined by Butsic (1992) in his paper on solvency measurement in risk-based capital applications. He advocated the use of a common time horizon for measurement of all kinds of risks on both sides of the balance sheet. He showed how long-term solvency protection could be achieved by periodic assessment and adjustment of risk-based capital using a short time horizon, e.g., one year. In particular, Butsic proposed that the risk-based capital charge at the beginning of each period be calibrated to a suitably small Expected Policyholder Deficit (EPD)[21] expressed as a ratio to expected unpaid losses. The capital charge would be reset at the beginning of each new period based on asset and/or liability changes during the period just ended. While he illustrated his approach with numerical examples, he did not describe a model for how claim liabilities change from one period to the next. The model presented in this paper, using parameters determined from Schedule P data, could be used together with Butsic’s approach to test and refine the capital charges employed in the NAIC and rating agency risk-based capital models.[22] Moreover, to the extent that these risk-based capital charges imply the minimum amount of capital needed by an underwriter to assume risk, the model potentially has application to the problem of capital allocation for pricing applications as well.
5.3. Other stochastic loss development models
We have used Hayne’s simple lognormal model to illustrate how to model the future behavior of loss ratio estimates. However, the same conceptual approach can be used with other stochastic models. If ultimate loss ratios are estimated using a different stochastic model, the path of future revisions to those ultimate loss ratio estimates can be determined using the ideas presented in this paper.
Abbreviations and notations
σ, second parameter of lognormal
EPD, expected policyholder deficit
distribution of x given known parameters
ƒ(x), distribution of x (unknown parameters)
n, number of points in sample
N⁻¹(prob), standard normal inverse distribution function
actual paid loss ratio
R, random number from unit uniform distribution
s, standard deviation of log-transformed sample
T⁻¹ₙ₋₁(prob), standard Student’s t inverse distribution function
tail, random variable for mean tail factor one year out
x, lognormal sample
Bornhuetter-Ferguson estimate of ultimate loss ratio
chain ladder estimate of ultimate loss ratio
cumulative paid loss ratio
y, log-transformed sample
ȳ, mean of log-transformed sample
[1] Conscious that the confidence intervals he derived were dependent on the lognormal model being the correct choice, he cautiously described his results as providing a “range of reasonableness.”
[2] For example, see Mack (1993, 1994), Venter (1994, 1998), Zehnwirth (1994), and Barnett and Zehnwirth (2000).
[3] We used unweighted unbiased estimators for μ and σ throughout this paper. For formulas for estimators using unequal weights for the observations, see Section 5.5 of Wacek (2005).
[4] Source: Highline Data LLC as reported in the statutory filings (OneSource).
[5] Note that the standard deviation for the age 9 to 10 development factor, which is undefined, has been selected to be equal to that of the age 8 to 9 development factor in both Table 1 and Table 2.
[6] These parameters define the lognormal distribution that best fits the data, using unbiasedness as the criterion for “best.” However, there is uncertainty about whether those parameters are correct. We address the issue of parameter uncertainty later in the paper.
[7] We point this out to emphasize that the lognormal model appears to fit this data well. However, it is important to note that the mean of the fitted lognormal does not necessarily equal the mean of the sample.
[8] N⁻¹(97.5%) is replicated in Excel by NORMSINV(0.975).
[9] Bear in mind that these confidence intervals are premised on the parameter estimates being correct and are narrower than confidence intervals that incorporate parameter uncertainty.
[10] While the lognormal is a skewed distribution, the skewness is imperceptible for small values of σ, and in those cases the confidence intervals are, for most practical purposes, symmetrical. In this example where the age 1 to 2 factor has an estimated σ = 0.016, the skewness coefficient is 0.05. In contrast, in the case of σ = 1 the skewness coefficient is 6.18 and the confidence interval is highly asymmetrical!
[11] This is algebraically equivalent to the conventional statement of the B-F estimate as Emerged LR + Expected Ultimate LR × (1 − 1/Ultimate LDF), which in this example would be expressed as 26.6% × 1.75 + 26.6% × 1.767 × 1.42 × (1 − 1/1.42). For the December 2005 valuation we will assume a B-F Expected Ultimate LR for each accident year equal to the corresponding chain ladder ultimate estimate as of December 2004 shown in Table 3. There are various ways to choose B-F Expected Ultimate LRs and ours may not be the one most commonly used in practice.
[12] Note that it is merely a coincidence that the chain ladder and B-F estimates in this example are both 66.1%.
[13] Bear in mind that these parameters refer to distributions of the mean age-to-age development factor one year out and not to distributions of the development factor itself. We are interested in the distribution of the mean development factor because changes in the mean directly affect the ultimate loss ratio estimate (which is also a mean).
[14] There might be value in doing so for other lines that display more loss development variability.
[15] This assumes that the historical data and future observations are samples from the same distribution. To the extent that future economic, regulatory, and market conditions are different from those of the past, there will be additional sources of parameter uncertainty beyond those measured here.
[16] The log t bears the same relationship to the Student’s t distribution that the lognormal bears to the normal.
[17] Note that if we used the maximum likelihood estimator s²ₘₗ = Σᵢ₌₁ⁿ (yᵢ − ȳ)²/n, the variance of this Student’s t distribution could be expressed as s²ₘₗ · (n + 1)/(n − 3).
[18] T⁻¹ₙ₋₁(97.5%) is replicated in Excel by TINV(0.05, n − 1).
[19] T⁻¹ₙ₋₁(R) is replicated in Excel by TINV(2(1 − R), n − 1) if R > 0.5, and by −TINV(2R, n − 1) if R ≤ 0.5. TINV assumes users are interested in two-tailed applications and therefore takes as its first argument the total two-tail probability. It returns values only from the right half of the distribution.
[20] Note that largely because their tail factors overlap, the accident year 1995 through 2004 ultimate loss ratio estimates are not independent, and for that reason their distributions were modeled simultaneously. To give one example of the tail factor overlap, the mean age 8 to 9 factor of 1.002 shown in Table 10, which illustrates the calculation of the accident year 2004 ultimate loss ratio estimate for one Monte Carlo trial, was also used in the calculation of ultimate loss ratio estimates for accident years 1996 through 2003.
[21] The EPD is defined as the expectation of losses exceeding available assets. It can be viewed as the expected value of the proportion of policyholder claims that will be unrecoverable because of insurer insolvency.
[22] For stress testing these solvency models it may make sense to use the chain ladder model, which produces more variable loss ratio estimates, rather than the Bornhuetter-Ferguson model.