1. Introduction
Insurance rating for property and casualty lines of business expanded greatly after World War II. That expansion laid the foundation for modern rating plans, which are fairly complex in that they typically consist of a wide range of rating factors. It also created a significant challenge for the insurance industry: how to determine the optimal value for each rating variable in the plan. For example, a typical personal automobile rating plan contains garage territory, driver age, driver gender, driver marital status, vehicle usage, driving distance, vehicle model year, vehicle symbols, driver history (accidents and violations), and a number of special credits and debits such as multi-car discounts, driving school discounts, and good student discounts.
To respond to the challenge of rating plan expansion, Bailey and Simon (Bailey and Simon 1960) and Bailey (Bailey 1963) proposed a “heuristic” iteration approach called “minimum bias models,” which uses an iterative procedure to determine simultaneously the “optimal” values for the rating variables. During the iteration, the procedure minimizes a target “bias” function. Compared to the traditional one-way or two-way analysis, such “multivariate procedures” can reduce estimation errors. Until recent interest in generalized linear models (GLMs), the minimum bias approach was the major technique used by property and casualty pricing actuaries in determining the rate relativities for a class plan with multiple rating variables.
We will illustrate how the minimum bias approach can be used to derive indicated class plan factors. Because multiplicative models are more popular than additive ones, we will focus first and primarily on multiplicative models. Later, we will also illustrate how to generalize the approach by developing additive and mixed additive-multiplicative models.
Assume that we are conducting a two-variable (X and Y) rating plan analysis based on loss cost. Variable X has a total of m categories of values and variable Y has a total of n categories of values, and the categories are represented by the subscripts i (from 1, 2, . . . , m) and j (from 1, 2, . . . , n). Define $r_{i, j}$ as the observed loss cost relativity and $w_{i, j}$ as the earned exposures or weight for the classification i and j for variables X and Y, respectively, and let $x_{i}$ and $y_{j}$ be the relativities for classification i and classification j, respectively. The multiplicative rating plan proposed by Bailey (Bailey 1963) is:
\[ \begin{array}{l} \mathrm{E}\left(r_{i, j}\right)=x_{i} y_{j}, \quad \text { where } \quad i=1,2, \ldots, m \\ \text { and } \quad j=1,2, \ldots, n . \end{array} \]
With the above multiplicative formula, one type of “error” proposed by Bailey measures the difference between the “estimated cost” and the “observed cost.” The errors across the variable X are
\[ \sum_{j} w_{i, j}\left(r_{i, j}-x_{i} y_{j}\right) \quad \text { for } \quad i=1,2, \ldots, m . \]
The errors across the variable Y are
\[ \sum_{i} w_{i, j}\left(r_{i, j}-x_{i} y_{j}\right) \quad \text { for } \quad j=1,2, \ldots, n . \]
When the above errors are set to “zero” for every i and j, it can be shown that the estimated relativities, $\hat{x}_{i}$ and $\hat{y}_{j}$, can be derived iteratively as follows:
\[ \begin{array}{l} \text { Algorithm 1: }\\ \begin{array}{l} \hat{x}_{i}=\frac{\sum_{j} w_{i, j} r_{i, j}}{\sum_{j} w_{i, j} y_{j}} \\ \hat{y}_{j}=\frac{\sum_{i} w_{i, j} r_{i, j}}{\sum_{i} w_{i, j} x_{i}} \end{array} \end{array} \tag{1.1} \]
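To make the iteration concrete, the following is a minimal Python sketch of Algorithm 1; the function name, array layout, iteration count, and toy data are ours for illustration, not from Bailey’s paper.

```python
import numpy as np

def bailey_minimum_bias(r, w, n_iter=20):
    """Algorithm 1: Bailey's multiplicative minimum bias iteration.

    r : (m, n) array of observed loss cost relativities r_{i,j}
    w : (m, n) array of weights (earned exposures) w_{i,j}
    Returns fitted relativities x (length m) and y (length n).
    """
    m, n = r.shape
    x, y = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        # x_i = sum_j w_ij r_ij / sum_j w_ij y_j
        x = (w * r).sum(axis=1) / (w @ y)
        # y_j = sum_i w_ij r_ij / sum_i w_ij x_i, reusing the updated x
        y = (w * r).sum(axis=0) / (x @ w)
    return x, y

# hypothetical 2x2 example: two driver ages by two vehicle uses
r = np.array([[1.20, 0.90], [1.05, 0.80]])
w = np.array([[100.0, 300.0], [200.0, 400.0]])
x, y = bailey_minimum_bias(r, w)
print(np.outer(x, y))  # fitted relativities x_i * y_j
```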
Strictly speaking, it is somewhat misleading to describe Bailey’s approach as “minimum bias models.” First, what Bailey proposed is essentially an iteration algorithm, not a set of statistical models. The iterative procedure is a “fixed point iteration technique” commonly employed in numerical analysis for root finding. Second, the “error” function above is not consistent with the bias concept in statistics. Bias generally refers to the difference between the mean of an estimator and the true value of the parameter being estimated. For example, suppose we are trying to estimate a relativity $x$ using an estimator $\hat{x}$ (which is some function of the observed data). Then the bias is defined as $\mathrm{E}(\hat{x})-x$. If $\mathrm{E}(\hat{x})=x$, $\hat{x}$ is called an unbiased estimator of $x$. Although Algorithm 1 does not measure the bias of the mean estimated relativity from the true value, the approach has long been recognized as the “minimum bias” method by actuaries. In fact, it is essentially a cross-classification estimation algorithm. In this paper, we describe our new and generalized approach, the “general iteration algorithm” (GIA), which has greater statistical rigor.

Brown (Brown 1988) was the first to introduce statistical models and link Bailey and Simon’s minimum bias approach to the maximum likelihood estimations of statistical theory:
\[ L_{i, j}=B r_{i, j}=B x_{i} y_{j}+\varepsilon_{i, j} \]
where $L_{i, j}$ is the observed loss cost, $B$ is the base, $\varepsilon_{i, j}$ is a random error, and $r_{i, j}$ follows a statistical distribution. Returning to Algorithm 1, it can be proven that Algorithm 1 is equivalent to applying the maximum likelihood (ML) method with the assumption that $r_{i, j}$ follows a Poisson distribution. Therefore, the results from Algorithm 1 are the same as those from the “ML Poisson model.”

With the introduction of statistical theories and statistical models to the minimum bias approach, Brown further expanded the approach with four more minimum bias algorithms (three multiplicative and one additive) by assuming different distributions for $r_{i, j}$ (or $\varepsilon_{i, j}$):
\[ \begin{array}{l} \text { Algorithm 2: }\\ \hat{x}_{i}=\frac{1}{n} \sum_{j} \frac{r_{i, j}}{y_{j}} . \end{array} \tag{1.2} \]
Algorithm 2 assumes that $r_{i, j}$ follows an exponential distribution.
\[ \begin{array}{l} \text { Algorithm 3: }\\ \hat{x}_{i}=\frac{\sum_{j} w_{i, j}^{2} r_{i, j} y_{j}}{\sum_{j} w_{i, j}^{2} y_{j}^{2}} \end{array} \tag{1.3} \]
Algorithm 3 is equivalent to an ML normal model.
\[ \begin{array}{l} \text { Algorithm 4: }\\ \hat{x}_{i}=\frac{\sum_{j} w_{i, j} r_{i, j} y_{j}}{\sum_{j} w_{i, j} y_{j}^{2}} \end{array} \tag{1.4} \]
Algorithm 4 results from the least squares model.
Another minimum bias algorithm proposed by Bailey and Simon (Bailey and Simon 1960) has a complicated format:
\[ \begin{array}{l} \text { Algorithm 5: }\\ \hat{x}_{i}=\left(\frac{\sum_{j} w_{i, j} r_{i, j}^{2} y_{j}^{-1}}{\sum_{j} w_{i, j} y_{j}}\right)^{1 / 2} \end{array} \tag{1.5} \]
Feldblum and Brosius (Feldblum and Brosius 2003) summarized these minimum bias algorithms into four categories: “balance principle,” “maximum likelihood,” “least squares,” and “χ-squared.”
- Algorithm 1 can be derived from the so-called “balance principle,” that is, “the sum of the indicated relativity = the sum of observed relativity.” Such a balance relationship can be formulated as follows (solving it for $x_{i}$ reproduces the first update in Algorithm 1):
\[ \sum_{j} w_{i, j} r_{i, j}=\sum_{j} w_{i, j} x_{i} y_{j} . \]
- Algorithms 1, 2, and 3 can be derived from the associated log-likelihood functions of observed pure premium relativities.
- Algorithm 4 can be derived by minimizing the sum of the squared errors:
\[ \operatorname{Min}_{x, y} \sum_{i, j} w_{i, j}\left(r_{i, j}-x_{i} y_{j}\right)^{2} . \]
- Algorithm 5 can be derived by minimizing the “χ-squared” error, the squared error divided by the indicated relativity:
\[ \operatorname{Min}_{x, y} \sum_{i, j} w_{i, j} \frac{\left(r_{i, j}-x_{i} y_{j}\right)^{2}}{x_{i} y_{j}} \]
In his milestone paper, Mildenhall (Mildenhall 1999) further demonstrated that classification rates determined by various linear bias functions are essentially the same as those from GLMs. One main advantage of using statistical models such as GLMs is that the characteristics of the models, such as the parameters’ confidence intervals and hypothesis tests, can be thoroughly studied and determined by statistical theory. Also, the contribution and significance of the variables in the models can be statistically evaluated. Another advantage is that GLMs are more efficient because they do not require actuaries to program the iterative process to determine the parameters.[1] However, this advantage can be discounted somewhat due to the powerful calculation capability of modern computers. Due to these advantages, GLMs have become more popular in recent years. Of course, actuaries need to acquire the necessary statistical knowledge to understand and apply GLMs, and they must rely on specific statistical modeling tools or software.
On the other hand, we believe that the formats and procedures of the minimum bias types of iteration algorithms are simple and straightforward. The approach is based on a target or error function along with an iterative procedure to minimize that function, without distribution assumptions. Actuaries have been using the approach for many decades. So, compared to GLMs, some advantages of the iteration approach are that it is easy to understand; easy to use; easy to program using many different software tools (for example, an Excel spreadsheet); and does not require advanced statistical knowledge, such as the maximum likelihood estimations and deviance functions of GLMs.
One issue associated with most previous work on the minimum bias approach and GLM is the model-selection limitation. GLMs assume the underlying distributions are from the exponential family. Also, commonly used statistical software typically provides limited selection of GLM distributions, such as Poisson, Gamma, normal, negative binomial, and inverse Gaussian. On the other hand, only five types of multiplicative models and four types of additive models are available from previous minimum bias work.[2] These limitations, we believe, may reduce estimation accuracy in practice since insurance and actuarial data are rarely perfect and may not fit well the exponential family of distributions or existing bias models.
In addition, there are two other common and practical issues that actuaries have to deal with in their daily pricing exercises. First, many real-world rating plans are essentially a mixed additive and multiplicative model. For example, for personal auto pricing, the primary class plan factor for age, gender, marital status, and vehicle use is often added with the secondary class plan factor for past accident and violation points, and then the result is multiplied with other factors. The commonly used GLM software, to our knowledge, does not provide options that can solve such mixed models because the identity link function implies an additive model, while the log link function implies a multiplicative model.
Second, it is possible that using either a GLM or a previous minimum bias iteration approach will result in parameters for some variables which are questionable or unacceptable by the marketplace. One practical way to deal with this issue is to select or constrain the factors for the variables based on business and competitive reasons while leaving other factors to be determined by multivariate modeling techniques. Since all the variables are connected in the multivariate analysis, any “constrained” factors should flow through the analysis, and the constraint will impact the results for the other “unconstrained” factors. We can call this issue a “constraint optimization” problem.
In this study, we propose a more flexible and comprehensive approach within the minimum bias framework, called a “general iteration algorithm” (GIA). The key features of GIA are:
- It significantly broadens the distribution assumptions in use and, to a certain degree, relaxes any specific form for the distributions. GIA can therefore provide a much wider array of models from which actuaries may choose, increasing model-selection flexibility.
- Its flexibility improves the accuracy and the goodness of fit of classification rates. We will demonstrate this result through a case study later.
- Similar to past minimum bias approaches, it is easy to understand and does not require advanced statistical knowledge. For practical purposes, GIA users only need to select the target functions and the iteration procedure because the approach is distribution-free.
- While GIA still requires an iterative process to determine the parameters, we believe the effort is not significant with today’s powerful computers.
- It can solve mixed additive-multiplicative models and constraint optimization problems.
In the following sections, we will first prove that all five existing multiplicative minimum bias algorithms are special cases of GIA. We will also propose several more multiplicative algorithms that actuaries may consider for ratemaking based on insurance data. Then, we will demonstrate how to apply GIAs to solve the mixed models and constraint optimization problems.
The numerical analysis of multiplicative and additive models given later is based on severity data for private passenger auto collision in Mildenhall (Mildenhall 1999) and McCullagh and Nelder (McCullagh and Nelder 1989). The results from selected algorithms will be compared to those from the GLM models. Following Bailey and Simon (Bailey and Simon 1960), the weighted absolute bias and the Pearson chi-square statistic are used to measure the goodness of fit. We also calculate the weighted absolute percentage bias, which indicates the magnitude of the errors relative to the predicted values.
The remainder of this paper is organized as follows:
- Section 2 discusses the details of two-parameter multiplicative, three-parameter multiplicative, constraint, additive, and mixed GIAs.
- Section 3 addresses the residual diagnosis of GIA.
- Section 4 investigates the calculation efficiency of GIA. It shows that GIA can converge rapidly and is not necessarily inefficient in numerical calculations.
- Section 5 reviews numerical results for two case studies using multiplicative and mixed models.
- Section 6 outlines our conclusions.
- The appendix reports the numerical results for the examples discussed in Section 5 with several selected multiplicative GIAs. It also shows the iterative convergence of selected multiplicative, additive, and mixed GIAs.
2. General iteration algorithm (GIA)
2.1. Two-parameter GIAs
Following the notation used previously, in the multiplicative framework for two rating factors, the expected relativity for cell (i, j) should be equal to the product of $x_{i}$ and $y_{j}$:
\[ \mathrm{E}\left(r_{i, j}\right)=\mu_{i, j}=x_{i} y_{j} \tag{2.1} \]
By (2.1), there are a total of n alternative estimates for $x_{i}$ and a total of m alternative estimates for $y_{j}$:
\[ \begin{array}{ll} \hat{x}_{i, j}=r_{i, j} / y_{j}, & j=1,2, \ldots, n \\ \hat{y}_{j, i}=r_{i, j} / x_{i}, & i=1,2, \ldots, m . \end{array} \tag{2.2} \]
Following actuarial convention, the final estimates of $x_{i}$ and $y_{j}$ can be calculated as weighted averages of $\hat{x}_{i, j}$ and $\hat{y}_{j, i}$. If we use the straight average to estimate the relativity:
\[ \hat{x}_i=\sum_j \frac{1}{n} \hat{x}_{i, j}=\frac{1}{n} \sum_j \frac{r_{i, j}}{y_j} .\tag{2.3} \]
Similarly, $\hat{y}_{j}=\frac{1}{m} \sum_{i} r_{i, j} / x_{i}$. This is Algorithm 2, the ML exponential model introduced by Brown (Brown 1988).

If the relativity-adjusted exposure, $w_{i, j} \mu_{i, j}$, is used as the weight in determining the estimates:
\[ \begin{aligned} \hat{x}_{i} & =\sum_{j} \frac{w_{i, j} \mu_{i, j}}{\sum_{j} w_{i, j} \mu_{i, j}} \hat{x}_{i, j}=\sum_{j} \frac{w_{i, j} y_{j}}{\sum_{j} w_{i, j} y_{j}} \frac{r_{i, j}}{y_{j}} \\ & =\frac{\sum_{j} w_{i, j} r_{i, j}}{\sum_{j} w_{i, j} y_{j}} \end{aligned} \tag{2.4} \]
Similarly,
\[ \begin{aligned} \hat{y}_{j} & =\sum_{i} \frac{w_{i, j} \mu_{i, j}}{\sum_{i} w_{i, j} \mu_{i, j}} \hat{y}_{j, i}=\sum_{i} \frac{w_{i, j} x_{i}}{\sum_{i} w_{i, j} x_{i}} \frac{r_{i, j}}{x_{i}} \\ & =\frac{\sum_{i} w_{i, j} r_{i, j}}{\sum_{i} w_{i, j} x_{i}} \end{aligned} \]
The resulting model is the same as Algorithm 1, the “balance principle” or ML Poisson model.
If the square of the relativity-adjusted exposure, $w_{i, j}^{2} \mu_{i, j}^{2}$, is used as the weight:
\[ \hat{x}_{i}=\sum_{j} \frac{w_{i, j}^{2} \mu_{i, j}^{2}}{\sum_{j} w_{i, j}^{2} \mu_{i, j}^{2}} \hat{x}_{i, j}=\frac{\sum_{j} w_{i, j}^{2} r_{i, j} y_{j}}{\sum_{j} w_{i, j}^{2} y_{j}^{2}} . \tag{2.5} \]
The resulting model is the same as Algorithm 3, the ML normal model.
If the exposure adjusted by the square of the relativity, $w_{i, j} \mu_{i, j}^{2}$, is used as the weight:
\[ \hat{x}_{i}=\sum_{j} \frac{w_{i, j} \mu_{i, j}^{2}}{\sum_{j} w_{i, j} \mu_{i, j}^{2}} \hat{x}_{i, j}=\frac{\sum_{j} w_{i, j} r_{i, j} y_{j}}{\sum_{j} w_{i, j} y_{j}^{2}} . \tag{2.6} \]
The resulting model is the same as Algorithm 4, the least-squares model.
From the above results, we propose the 2-parameter GIA approach, using $w_{i, j}^{p} \mu_{i, j}^{q}$ as the weight:
\[ \begin{array}{l} \text { 2-Parameter GIA: }\\ \hat{x}_{i}=\sum_{j} \frac{w_{i, j}^{p} \mu_{i, j}^{q}}{\sum_{j} w_{i, j}^{p} \mu_{i, j}^{q}} \hat{x}_{i, j}=\frac{\sum_{j} w_{i, j}^{p} r_{i, j} y_{j}^{q-1}}{\sum_{j} w_{i, j}^{p} y_{j}^{q}} . \end{array} \tag{2.7} \]
When
- p = q = 0, it is the ML exponential model, Algorithm 2;
- p = q = 1, it is the ML Poisson model, Algorithm 1;
- p = q = 2, it is the ML normal model, Algorithm 3;
- p = 1 and q = 2, it is the least-squares model, Algorithm 4.
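A hedged Python sketch of the update (2.7), with the special cases above recovered by the parameter choices in the closing comment; the function name, tolerance, and stopping rule are our own choices rather than anything prescribed by the algorithm:

```python
import numpy as np

def gia_2param(r, w, p, q, n_iter=100, tol=1e-10):
    """Two-parameter multiplicative GIA, Equation (2.7):
    x_i = sum_j w_ij^p r_ij y_j^(q-1) / sum_j w_ij^p y_j^q (and symmetrically for y_j)."""
    m, n = r.shape
    x, y = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        x_old, y_old = x, y
        x = (w**p * r * y**(q - 1)).sum(axis=1) / (w**p * y**q).sum(axis=1)
        # reuse the freshly updated x when re-estimating y
        y = (w**p * r * x[:, None]**(q - 1)).sum(axis=0) / (w**p * x[:, None]**q).sum(axis=0)
        if max(abs(x - x_old).max(), abs(y - y_old).max()) < tol:
            break
    return x, y

# p = q = 0: ML exponential (Algorithm 2); p = q = 1: ML Poisson (Algorithm 1);
# p = q = 2: ML normal (Algorithm 3); p = 1, q = 2: least squares (Algorithm 4).
```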
In addition, there are two more models that correspond to GLMs with the exponential-family gamma and inverse Gaussian distributions.[3] When the exposure is used as the weight, that is, p = 1 and q = 0, the GIA leads to a GLM gamma model and becomes:
\[ \begin{array}{l} \text { Algorithm 6: }\\ \hat{x}_{i}=\sum_{j} \frac{w_{i, j}}{\sum_{j} w_{i, j}} \hat{x}_{i, j}=\frac{\sum_{j} w_{i, j} r_{i, j} y_{j}^{-1}}{\sum_{j} w_{i, j}} \end{array} \tag{2.8} \]
When p = 1 and q = −1, the GIA leads to a GLM inverse Gaussian model and becomes:
\[ \begin{array}{l} \text { Algorithm 7: }\\ \hat{x}_{i}=\sum_{j} \frac{w_{i, j} y_{j}^{-1}}{\sum_{j} w_{i, j} y_{j}^{-1}} \hat{x}_{i, j}=\frac{\sum_{j} w_{i, j} r_{i, j} / y_{j}^{2}}{\sum_{j} w_{i, j} / y_{j}} . \end{array} \tag{2.9} \]
Equation (2.7) suggests that, in theory, there is no limitation on the values of p and q: they can take any real values. It is this feature that should greatly enhance the flexibility available to actuaries when they apply the algorithm to fit their data. Of course, in practice we do not expect extreme values of p and q to prove useful. In ratemaking applications, earned premium could be used as the weight if exposure is not available, and normalized premium (premium divided by relativity) is a reasonable option; this suggests that q could be negative. In general, p should be positive: the more exposure, claims, or premium, the more weight assigned.
2.2. Three-parameter GIAs
So far, we have used the 2-parameter GIA in Equation (2.7) to represent several commonly used models, Algorithms 1 to 4, but not Algorithm 5, the “χ-squared” multiplicative model. In order to represent Algorithm 5, we further expand the 2-parameter GIA to a 3-parameter GIA using the link function concept from GLM.
One way GLMs generalize the basic linear model is by introducing a link function that links the linear predictor to the response variable. Similarly, we introduce a relativity link function to link the GIA estimate to the relativity. The proposed relativity link function differs in several respects from the link function in GLMs. In GLMs, the link function determines the type of model: a log link implies a multiplicative model and an identity link implies an additive model. This is not the case for GIA. A multiplicative GIA, for example, could have a log, power, or exponential relativity link function.
For a 3-parameter GIA, instead of using (2.2), we first estimate the relativity link functions of $x_{i}$ and $y_{j}$, and then calculate $\hat{x}_{i}$ and $\hat{y}_{j}$ by inverting the relativity link function, $\hat{x}_{i}=f^{-1}\left(f\left(\hat{x}_{i}\right)\right)$ and $\hat{y}_{j}=f^{-1}\left(f\left(\hat{y}_{j}\right)\right)$. The functions $f\left(\hat{x}_{i, j}\right)$ and $f\left(\hat{y}_{j, i}\right)$ can be estimated by:
\[ \begin{array}{ll} f\left(\hat{x}_{i, j}\right)=f\left(r_{i, j} / y_{j}\right), & j=1,2, \ldots, n \\ f\left(\hat{y}_{j, i}\right)=f\left(r_{i, j} / x_{i}\right), & i=1,2, \ldots, m . \end{array} \tag{2.10} \]
Taking the weighted average using parameters p and q:
\[ \begin{array}{l} f\left(\hat{x}_{i}\right)=\sum_{j} \frac{w_{i, j}^{p} \mu_{i, j}^{q}}{\sum_{j} w_{i, j}^{p} \mu_{i, j}^{q}} f\left(\hat{x}_{i, j}\right)=\frac{\sum_{j} w_{i, j}^{p} y_{j}^{q} f\left(\frac{r_{i, j}}{y_{j}}\right)}{\sum_{j} w_{i, j}^{p} y_{j}^{q}}\\ f\left(\hat{y}_{j}\right)=\sum_{i} \frac{w_{i, j}^{p} \mu_{i, j}^{q}}{\sum_{i} w_{i, j}^{p} \mu_{i, j}^{q}} f\left(\hat{y}_{j, i}\right)=\frac{\sum_{i} w_{i, j}^{p} x_{i}^{q} f\left(\frac{r_{i, j}}{x_{i}}\right)}{\sum_{i} w_{i, j}^{p} x_{i}^{q}} . \end{array} \tag{2.11} \]
Thus,
\[ \begin{array}{l} \hat{x}_{i}=f^{-1}\left(\frac{\sum_{j} w_{i, j}^{p} y_{j}^{q} f\left(\frac{r_{i, j}}{y_{j}}\right)}{\sum_{j} w_{i, j}^{p} y_{j}^{q}}\right) \\ \hat{y}_{j}=f^{-1}\left(\frac{\sum_{i} w_{i, j}^{p} x_{i}^{q} f\left(\frac{r_{i, j}}{x_{i}}\right)}{\sum_{i} w_{i, j}^{p} x_{i}^{q}}\right) . \end{array} \tag{2.12} \]
One possible selection of the relativity link function is the power function, $f(x)=x^{k}$ and $f^{-1}(x)=x^{1 / k}$. In this case, Equation (2.12) leads to a 3-parameter GIA:
\[ \hat{x}_{i}=\left(\frac{\sum_{j} w_{i, j}^{p} r_{i, j}^{k} y_{j}^{q-k}}{\sum_{j} w_{i, j}^{p} y_{j}^{q}}\right)^{1 / k} \tag{2.13} \]
When k = 2, p = 1, and q = 1, Equation (2.13)[4] is equivalent to:
\[ \hat{x}_{i}=\left(\frac{\sum_{j} w_{i, j} r_{i, j}^{2} y_{j}^{-1}}{\sum_{j} w_{i, j} y_{j}}\right)^{1 / 2}, \tag{2.14} \]
and this is Algorithm 5, the “χ-squared” multiplicative model.
Another example of a new iterative algorithm occurs when k = 1/2, p = 1, and q = 1:
\[ \begin{array}{l} \text { Algorithm 8: }\\ \hat{x}_{i}=\left(\frac{\sum_{j} w_{i, j} r_{i, j}^{1 / 2} y_{j}^{1 / 2}}{\sum_{j} w_{i, j} y_{j}}\right)^{2} . \end{array} \tag{2.15} \]
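The 3-parameter extension simply wraps the same update in the power link; the following is a sketch under our own naming and stopping-rule conventions:

```python
import numpy as np

def gia_3param(r, w, k, p, q, n_iter=200, tol=1e-10):
    """Three-parameter multiplicative GIA with power relativity link f(x) = x^k,
    Equation (2.13): x_i = (sum_j w^p r^k y^(q-k) / sum_j w^p y^q)^(1/k)."""
    m, n = r.shape
    x, y = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        x_old, y_old = x, y
        x = ((w**p * r**k * y**(q - k)).sum(axis=1)
             / (w**p * y**q).sum(axis=1)) ** (1.0 / k)
        y = ((w**p * r**k * x[:, None]**(q - k)).sum(axis=0)
             / (w**p * x[:, None]**q).sum(axis=0)) ** (1.0 / k)
        if max(abs(x - x_old).max(), abs(y - y_old).max()) < tol:
            break
    return x, y

# k = 2, p = 1, q = 1 reproduces Algorithm 5 (the chi-squared model);
# k = 1/2, p = 1, q = 1 is Algorithm 8; k = 1 recovers the 2-parameter GIA (2.7).
```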
Mildenhall (Mildenhall 2005) indicated that the 3-parameter GIA is equivalent to a GLM with the response variable $r^{k}$, the weight $w^{p}$, and a distribution with variance function $\operatorname{Var}(\mu)=\mu^{2-q / k}$. When $k=1$ and $p=1$, we can conclude that:
- when q = 2, the normal GLM model is the same as the GIA Algorithm 4 in Equation (1.4);
- when q = 1, the Poisson GLM model is the same as the GIA Algorithm 1 in Equation (1.1);
- when q = 0, the gamma GLM model is the same as the GIA Algorithm 6 in Equation (2.8);
- when q = −1, the inverse Gaussian GLM model is the same as the GIA Algorithm 7 in Equation (2.9).
Also, for the “χ-squared” minimum bias model with k = 2, p = 1, and q = 1, the GIA theory indicates that $r^{2}$ follows a Tweedie distribution with a variance function $\operatorname{Var}(\mu)=\mu^{1.5}$.
In actuarial exercises, we often exclude extremely high and low values from the weighted average to yield more robust results. With several rating variables, there may be thousands of alternative estimates. Actuaries have the flexibility to use the weighted average within selected ranges (e.g., the average excluding the highest and the lowest 1% of estimates). This is similar to the concept of “trimmed” regression used with GLMs, whereby observations with undue influence on a fitted value are removed.
Finally, we would like to extend GIA to reserve applications. Mack (Mack 1991) discussed the connection between ratemaking models of auto insurance and IBNR reserve calculation because reserves can be estimated by a ratemaking model with two “rating” variables, accident year and development year. He showed that the minimum bias method produces the same result as the chain ladder loss development method. Recently, actuaries have applied GLMs to estimate reserves using the incremental loss as the response variable.
Let $P_{i, j}$ be the incremental paid loss in accident year i and development year j; that is, $P_{i, j}$ is the cell (i, j) of the incremental payment triangle. England and Verrall (England and Verrall 1999) used the following GLM with log link function and Poisson distribution to estimate the expected values of future payments:
\[ \begin{aligned} \mathrm{E}\left(P_{i, j}\right) & =m_{i, j} \quad \text { and } \quad \operatorname{Var}\left(P_{i, j}\right)=\phi m_{i, j}, \\ \log \left(m_{i, j}\right) & =C+\alpha_{i}+\beta_{j} \\ \alpha_{1} & =\beta_{1}=0 . \end{aligned} \]
Several other models have also been proposed for reserve estimates. For example, Renshaw and Verrall (Renshaw and Verrall 1998) applied the GLM with a gamma distribution. The only difference between the gamma and Poisson models is that the gamma model’s variance function is $\operatorname{Var}\left(P_{i, j}\right)=\phi m_{i, j}^{2}$.
Let $x_{i}=e^{C+\alpha_{i}}$ and $y_{j}=e^{\beta_{j}}$; then the above GLM reserve models can be similarly transferred to the multiplicative GIA algorithm by setting $r_{i, j}=P_{i, j}$ and $w_{i, j}=1$. So GIA can also be used to estimate reserves based on the triangles of incremental paid loss. When $k=1$, $p=1$, and $q=1$, GIA yields the same result as a Poisson GLM reserve model; when $k=1$, $p=1$, and $q=0$, GIA produces the same result as a gamma GLM model.

2.3. Constraint GIA
In real-world ratemaking applications, some factors need to be selected or capped within a certain range for business or competitive reasons. Since in a multivariate analysis, all the variables are related, other factors should be adjusted to reflect the impact of the subjective selections. When this issue arises, the standard GLM or other approaches may have limitations if the selected factors are outside of the fitted confidence interval.
For example, the multi-car discount used for private passenger auto pricing is typically between 5% and 25%. Any factor outside this range is not likely to be accepted by the market, no matter what the fitted value is for the “indicated” discount. In the following we will demonstrate how to apply GIA to solve the issue.
Specifically, let x1 and x2 be the single-car and multi-car factors, respectively, and suppose we cap the multi-car discount between 5% and 25%. The constraint can be represented by 0.75x1 ≤ x2 ≤ 0.95x1. Adding this constraint to (2.13), we can solve the problem by:
\[ \begin{array}{l} \hat{x}_{1}=\left(\frac{\sum_{j} w_{1, j}^{p} r_{1, j}^{k} y_{j}^{q-k}}{\sum_{j} w_{1, j}^{p} y_{j}^{q}}\right)^{1 / k},\\ \hat{x}_{2}=\max \left(0.75 \hat{x}_{1}, \min \left(0.95 \hat{x}_{1},\left(\frac{\sum_{j} w_{2, j}^{p} r_{2, j}^{k} y_{j}^{q-k}}{\sum_{j} w_{2, j}^{p} y_{j}^{q}}\right)^{1 / k}\right)\right) . \end{array} \tag{2.16} \]
With the constraint, we can continue the iteration process until the values for all other rating factors converge. This flexibility[5] associated with GIA will provide actuaries another benefit in dealing with their practical problems.
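As an illustration of (2.16), here is a sketch in which the multi-car factor is clamped after each update; the two-level setup, parameter names, and default caps are hypothetical choices of ours:

```python
import numpy as np

def gia_constrained(r, w, k, p, q, lo=0.75, hi=0.95, n_iter=200):
    """Constraint GIA per Equation (2.16): x[0] (single-car) is free, while
    x[1] (multi-car) is clamped to [lo*x[0], hi*x[0]] after every update,
    i.e., the discount is kept between 5% and 25%. r, w are (2, n) arrays."""
    n = r.shape[1]
    x, y = np.ones(2), np.ones(n)
    for _ in range(n_iter):
        x = ((w**p * r**k * y**(q - k)).sum(axis=1)
             / (w**p * y**q).sum(axis=1)) ** (1.0 / k)
        x[1] = min(hi * x[0], max(lo * x[0], x[1]))  # apply the business constraint
        # the capped x flows through to the unconstrained y factors
        y = ((w**p * r**k * x[:, None]**(q - k)).sum(axis=0)
             / (w**p * x[:, None]**q).sum(axis=0)) ** (1.0 / k)
    return x, y
```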
2.4. Additive GIA
Following the same notation as above, the expected cost for classification cell (i, j) in an additive model should be equal to the sum of $x_{i}$ and $y_{j}$:
\[ \mathrm{E}\left(r_{i, j}\right)=\mu_{i, j}=x_{i}+y_{j} . \tag{2.17} \]
Thus,
\[ \begin{array}{ll} \hat{x}_{i, j}=r_{i, j}-y_{j}, & j=1,2, \ldots, n \\ \hat{y}_{j, i}=r_{i, j}-x_{i}, & i=1,2, \ldots, m . \end{array} \tag{2.18} \]
In the multiplicative models, we use the relativity-adjusted exposure, $w_{i, j} \mu_{i, j}$, as the weighting function and introduce the power relativity link function. However, these weighting functions and relativity link functions cannot be applied in an additive process. For the additive GIA, we are limited to the following one-parameter model using $w_{i, j}^{p}$ as the weight:
\[ \hat{x}_{i}=\sum_{j} \frac{w_{i, j}^{p}}{\sum_{j} w_{i, j}^{p}} \hat{x}_{i, j}=\frac{\sum_{j} w_{i, j}^{p}\left(r_{i, j}-y_{j}\right)}{\sum_{j} w_{i, j}^{p}} . \tag{2.19} \]
When p = 1, it leads to the model introduced by Bailey (Bailey 1963) or the “Balance Principle” model in Feldblum and Brosius (Feldblum and Brosius 2003). Mildenhall (Mildenhall 1999) also proved that it is equivalent to an additive normal GLM model. When p = 2, it leads to the ML additive normal model introduced by Brown (Brown 1988). When p = 0, it leads to the least squares model by Feldblum and Brosius (Feldblum and Brosius 2003). There is no further generalization for the additive GIAs with additional parameters or link functions.
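A sketch of the one-parameter additive update (2.19); the immediate reuse of the freshly updated x anticipates the convergence adjustment discussed in Section 4, and the helper name and defaults are ours:

```python
import numpy as np

def gia_additive(r, w, p=1.0, n_iter=50):
    """One-parameter additive GIA, Equation (2.19):
    x_i = sum_j w_ij^p (r_ij - y_j) / sum_j w_ij^p.
    p = 1: Bailey's balance-principle model (additive normal GLM);
    p = 2: Brown's ML additive normal model; p = 0: least squares."""
    m, n = r.shape
    x, y = np.zeros(m), np.zeros(n)
    wp = w**p
    for _ in range(n_iter):
        x = (wp * (r - y)).sum(axis=1) / wp.sum(axis=1)
        # reuse the updated x at once (Gauss-Seidel style), which is the
        # adjustment that brings convergence down to a handful of iterations
        y = (wp * (r - x[:, None])).sum(axis=0) / wp.sum(axis=0)
    return x, y
```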
Except for the exponential family of distributions, the lognormal distribution is probably the most widely used distribution in actuarial practice.[6] If $r_{i, j}$ follows a lognormal distribution, $\log \left(r_{i, j}\right)$ will follow a normal distribution, and the multiplicative rating plan can be transformed to the additive form $\log \mathrm{E}\left(r_{i, j}\right)=\log \left(x_{i}\right)+\log \left(y_{j}\right)$. The additive GIA algorithms can then be used to derive the parameters under the lognormal distribution assumption.

2.5. Mixed additive and multiplicative GIAs
A simplified mixed additive and multiplicative model[7] can be illustrated as follows:
\[ r_{i, j, h}=\left(x_{i}+y_{j}\right) \times z_{h}+\varepsilon_{i, j, h} \tag{2.20} \]
where i = 1, 2, . . . , m; j = 1, 2, . . . , n; and h = 1, 2, . . . , l. There are n × l alternative estimates for $x_{i}$:
\[ \hat{x}_{i, j, h}=\frac{r_{i, j, h}}{z_{h}}-y_{j} \]
There are m × l alternative estimates for $y_{j}$:
\[ \hat{y}_{i, j, h}=\frac{r_{i, j, h}}{z_{h}}-x_{i} \]
There are m × n alternative estimates for $z_{h}$:
\[ \hat{z}_{i, j, h}=\frac{r_{i, j, h}}{x_{i}+y_{j}} . \]
Using $w_{i, j, h}^{p}$ as the weight:
\[ \begin{aligned} \hat{z}_{h} & =\frac{\sum_{i} \sum_{j} w_{i, j, h}^{p} \times \hat{z}_{i, j, h}}{\sum_{i} \sum_{j} w_{i, j, h}^{p}} \\ & =\frac{\sum_{i} \sum_{j} w_{i, j, h}^{p} \times\left(\frac{r_{i, j, h}}{x_{i}+y_{j}}\right)}{\sum_{i} \sum_{j} w_{i, j, h}^{p}}, \\ \hat{x}_{i} & =\frac{\sum_{j} \sum_{h} w_{i, j, h}^{p} \times \hat{x}_{i, j, h}}{\sum_{j} \sum_{h} w_{i, j, h}^{p}} \\ & =\frac{\sum_{j} \sum_{h} w_{i, j, h}^{p} \times\left(\frac{r_{i, j, h}}{z_{h}}-y_{j}\right)}{\sum_{j} \sum_{h} w_{i, j, h}^{p}} ; \\ \hat{y}_{j} & =\frac{\sum_{i} \sum_{h} w_{i, j, h}^{p} \times \hat{y}_{i, j, h}}{\sum_{i} \sum_{h} w_{i, j, h}^{p}} \\ & =\frac{\sum_{i} \sum_{h} w_{i, j, h}^{p} \times\left(\frac{r_{i, j, h}}{z_{h}}-x_{i}\right)}{\sum_{i} \sum_{h} w_{i, j, h}^{p}} . \end{aligned} \tag{2.21} \]
There is no unique solution for (2.21). For example, if a, b, and c are a solution to estimate the factors x, y, and z, then 2a, 2b, and 0.5c are another possible solution. In order to facilitate the iteration convergence, we need to add some constraints in the procedure. If we use the sample mean as the base, the weighted average of multiplicative factors from one rating variable should be close to one. So in each iteration we can adjust all the z’s proportionally so that the average is reset to one. Mathematically, the constraint is:
\[ \frac{\sum_{i} \sum_{j} \sum_{h} w_{i, j, h}^{p} \hat{z}_{h}}{\sum_{i} \sum_{j} \sum_{h} w_{i, j, h}^{p}}=1 \tag{2.22} \]
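A sketch of the mixed iteration (2.21) with the rescaling constraint (2.22) applied to the z factors at each pass; the array layout and names are our own:

```python
import numpy as np

def gia_mixed(r, w, p=1.0, n_iter=50):
    """Mixed model (x_i + y_j) * z_h of Equation (2.20), iterated per (2.21)
    with the normalization constraint (2.22) on the z factors.
    r, w : (m, n, l) arrays indexed by (i, j, h)."""
    m, n, l = r.shape
    x, y, z = np.ones(m), np.ones(n), np.ones(l)
    wp = w**p
    for _ in range(n_iter):
        s = x[:, None] + y[None, :]                      # (m, n) additive part
        z = (wp * (r / s[:, :, None])).sum(axis=(0, 1)) / wp.sum(axis=(0, 1))
        z *= wp.sum() / (wp * z).sum()                   # rescale so weighted mean(z) = 1
        x = (wp * (r / z - y[None, :, None])).sum(axis=(1, 2)) / wp.sum(axis=(1, 2))
        y = (wp * (r / z - x[:, None, None])).sum(axis=(0, 2)) / wp.sum(axis=(0, 2))
    return x, y, z
```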
3. Residual diagnosis
For a statistical data-fitting exercise, it is important to conduct a diagnostic test to validate the distribution assumption in use. Such diagnostic tests typically consist of a residual plot in which the residuals are the difference between the fitted values and the actual values. In this section, we will describe how to conduct such residual analysis for GIA, and the residual plot results for the case study are given in the next section.
We have discussed that a 3-parameter GIA is equivalent to a GLM, assuming the response variable $r^{k}$ follows a distribution with variance function $\operatorname{Var}(\mu)=\mu^{2-q / k}$.[8] The raw residuals from GIA do not asymptotically follow an independent and identical normal distribution because the variances of the residuals are positively correlated with the predicted values. As in GLM, we define the scaled Pearson residual of GIA as
\[ e_{i, j}=\frac{r_{i, j}^{k}-\hat{r}_{i, j}^{k}}{\sqrt{\operatorname{Var}(\mu)}}=\frac{r_{i, j}^{k}-\hat{x}_{i}^{k} \hat{y}_{j}^{k}}{\sqrt{\left(\hat{x}_{i}^{k} \hat{y}_{j}^{k}\right)^{2-q / k}}}=\frac{r_{i, j}^{k}-\hat{x}_{i}^{k} \hat{y}_{j}^{k}}{\sqrt{\hat{x}_{i}^{2 k-q} \hat{y}_{j}^{2 k-q}}}, \tag{3.1} \]
where $\hat{r}_{i, j}=\hat{x}_{i} \hat{y}_{j}$. The scaled residual $e_{i, j}$ is approximately independent and identically distributed since
\[ \begin{aligned} \operatorname{Var}\left(e_{i, j}\right) & =\operatorname{Var}\left(\frac{r_{i, j}^{k}-\hat{x}_{i}^{k} \hat{y}_{j}^{k}}{\sqrt{\left(\hat{x}_{i}^{k} \hat{y}_{j}^{k}\right)^{2-q / k}}}\right) \\ & =\frac{\operatorname{Var}\left(r_{i, j}^{k}\right)}{\left(\hat{x}_{i}^{k} \hat{y}_{j}^{k}\right)^{2-q / k}}=1 . \end{aligned} \tag{3.2} \]
We can use the scaled Pearson residuals to conduct the residual diagnosis for GIA, such as developing a scattered residuals plot and a quantile-to-quantile (Q-Q) plot. If the GIA algorithms fit the data well, scaled Pearson residuals are randomly scattered and the Q-Q plot is close to a straight line.
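For completeness, a sketch of the scaled Pearson residual (3.1) for a fitted multiplicative GIA; the plotting hint in the comment assumes standard scipy tooling rather than anything from the paper:

```python
import numpy as np

def pearson_residuals(r, x_hat, y_hat, k, q):
    """Scaled Pearson residuals of Equation (3.1):
    e_ij = (r_ij^k - (x_i y_j)^k) / sqrt((x_i y_j)^(2k - q))."""
    mu = np.outer(x_hat, y_hat)
    return (r**k - mu**k) / np.sqrt(mu**(2 * k - q))

# e.g., scipy.stats.probplot(pearson_residuals(r, x, y, 1, -0.5).ravel())
# yields the ingredients for a Q-Q plot such as Figure 4 in the Appendix.
```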
4. Calculation efficiency
One issue associated with GIA is calculation efficiency. Mildenhall (Mildenhall 1999) argued that one advantage of GLMs over the minimum bias models is calculation efficiency, because GLMs do not require actuaries to program an iterative process to estimate the parameters. He showed that the additive minimum bias model by Bailey (Bailey 1963), or GIA with p = 1, does not converge even after 50 iterations using the well-investigated data given by McCullagh and Nelder (McCullagh and Nelder 1989).
However, with several adjustments to the iteration methodology, we can show that GIA can converge very quickly. Using the same data, the additive GIA can complete the convergence in five iterations. One adjustment is to include as much updated information as possible—that is, the latest y’s should be used to estimate the next x’s and vice versa.
In GLMs and previous minimum bias models, a specific class is usually selected as the base (e.g., age 60+ and pleasure). For GIA, we suggest using the average as the base because, when a specific class is used as the base, the numerical value of the base varies from one iteration to the next, requiring additional iterations to force the factor for the base class to one.
Another well-known issue for the iteration procedure concerns how to set the starting point for the first iteration. The closer the starting point is to the final results, the faster the convergence. Using average frequency/severity/pure premium as the base, the average factor of a rating variable is one for multiplicative models and the average discount is zero for additive models. Therefore, in this study, we chose the starting values of $x_{i}$ and $y_{j}$ to be 1 for the multiplicative models.

5. Numerical analysis
The numerical analysis of testing various multiplicative GIAs is based on the severity data for private passenger auto collision given in Mildenhall (Mildenhall 1999) and McCullagh and Nelder (McCullagh and Nelder 1989). Using this well-researched data will help us to compare the empirical results of this paper with previous studies. The data includes 32 severity observations for two classification variables: eight age groups and four types of vehicle use. In this severity case study, the weight is the number of claims. Table 1 in the Appendix lists the data.
In order to test mixed additive and multiplicative GIAs, we need at least three variables in the data. The data in Mildenhall (Mildenhall 1999) and McCullagh and Nelder (McCullagh and Nelder 1989) contain only two variables. Therefore, we will use another collision pure premium dataset to demonstrate the mixed algorithm. In addition to age and vehicle use, this data includes credit score as a third variable, with four classifications from low to high. In this pure premium case study, the weight is the earned exposure. Table 2 in the Appendix displays the data.
Four criteria are used to evaluate the performance of these GIAs: the absolute bias, the absolute percentage bias, the Pearson chi-squared statistic, and the combination of absolute bias and the chi-squared statistic:
- The weighted absolute bias (wab) criterion was proposed by Bailey and Simon (Bailey and Simon 1960). It is the weighted average of the absolute dollar differences between the observations and the fitted values:
\[ w a b=\frac{\sum w_{i, j}\left|B r_{i, j}-B x_{i} y_{j}\right|}{\sum w_{i, j}} . \]
- The second, the weighted absolute percentage bias (wapb), measures the absolute bias relative to the predicted values:
\[ w a p b=\frac{\sum w_{i, j} \frac{\left|B r_{i, j}-B x_{i} y_{j}\right|}{B x_{i} y_{j}}}{\sum w_{i, j}} . \]
- The weighted Pearson chi-squared (wChi) statistic was also proposed by Bailey and Simon (Bailey and Simon 1960); it is appropriate for testing whether “differences between the raw data and the estimated relativities should be small enough to be caused by chance”:
\[ w C h i=\frac{\sum w_{i, j} \frac{\left(B r_{i, j}-B x_{i} y_{j}\right)^{2}}{B x_{i} y_{j}}}{\sum w_{i, j}} . \]
- Lastly, we combine the absolute bias and the Pearson chi-squared statistic into their product, $wab \times wChi$, as the fourth criterion for model selection. (These four statistics are sketched in code after this list.)
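The four statistics can be computed directly from the fitted factors; a minimal sketch, in which the function name and the base-rate argument are our own conventions:

```python
import numpy as np

def fit_statistics(r, w, x_hat, y_hat, base=1.0):
    """wab, wapb, wChi, and the combined wab*wChi of Section 5 (B = base)."""
    fitted = base * np.outer(x_hat, y_hat)
    observed = base * r
    wab = (w * np.abs(observed - fitted)).sum() / w.sum()
    wapb = (w * np.abs(observed - fitted) / fitted).sum() / w.sum()
    wchi = (w * (observed - fitted)**2 / fitted).sum() / w.sum()
    return wab, wapb, wchi, wab * wchi  # the fourth is the combined criterion
```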
Table 3 lists the relativities for Algorithms 1–8, and Table 4 displays the four performance statistics of those models: wab, wapb, wChi, and $wab \times wChi$.[9] In all cases, class “age 60+” and “pleasure” are used as the base.
To illustrate the residual diagnosis of GIA, we show the residual plots for GIA with k = 1, p = 1, and q = −0.5. Figure 1 in the Appendix reports the scattered residuals by observations; Figures 2 and 3 show the scattered residuals by age and by vehicle use, respectively; Figure 4 is the Q-Q plot. It is clear that the classification of age 17–20 and business use is an outlier.[10] This is not surprising because of the small sample size in the cell (five claims). A practical way to solve the problem is to cap the severity.
As stated before, we find that GLMs with common exponential family distribution assumptions are special cases of GIA (k = 1 and p = 1). Comparing the GIA factors in Table 3 with those from GLMs with normal, Poisson, gamma, and inverse Gaussian distributions, we confirm:
- when k = 1, p = 1, and q = 2, the “least squares” GIA has the same results as a GLM with a normal distribution;[11]
- when k = 1, p = 1, and q = 1, GIA is the same as a Poisson GLM;
- when k = 1, p = 1, and q = 0, GIA is the same as a gamma GLM; and
- when k = 1, p = 1, and q = −1, GIA is the same as a GLM with an inverse Gaussian distribution.
As discussed in Section 2, a GIA with k = 1 and p = 1,
\[ \left(\hat{x}_{i}=\frac{\sum_{j} w_{i, j} r_{i, j} y_{j}^{q-1}}{\sum_{j} w_{i, j} y_{j}^{q}}\right), \]
is equivalent to the multiplicative GLMs with the variance function $\operatorname{Var}(\mu)=\mu^{2-q}$ for an assumed exponential family distribution. It is well known that insurance and actuarial data are generally positively skewed. The skewness of the symmetric normal distribution is zero and is increasingly positive from Poisson, to gamma, to inverse Gaussian. For the multiplicative GIA algorithms, the skewness can therefore be indexed by the value of q. When q = 2, the GIA is the same as a normal GLM; when q = 1, it is the same as a Poisson GLM; it is the same as a gamma GLM when q = 0, and the same as an inverse Gaussian GLM when q = −1. Thus, smaller values of q should be selected when the GIA is applied to more skewed data.

The authors also attempted to find the “global minimum error” points.[12] In this case study, if wab is used to measure model performance, the weighted absolute error is minimized at k = 1.95, p = 3.15, and q = −14.06, with wab = 10.0765. If wapb is used, the weighted absolute percentage error is minimized at k = 1.98, p = 3.15, and q = −14.04, with wapb = 3.461%. These results suggest that the best-fit model, in this example, is not any of the commonly used minimum bias models or generalized linear models. It clearly demonstrates that insurance data may not fit predetermined distributions well.
On the other hand, if wChi is used, the “χ-squared” model (k = 2, p = 1, and q = 1) provides the best solution. This is expected because the “χ-squared” model is calculated by minimizing the Pearson chi-squared statistic.
If we use the criterion of $wab \times wChi$ to select models, the combined error is minimized at k = 2.45, p = 1.16, and q = −0.06, with $wab \times wChi = 3.3061$. Again, the five commonly used minimum bias algorithms are not the best solution when the absolute bias and the chi-squared statistic are considered simultaneously.

Based on the results of this research and our experience, we suggest the following ranges of values of k, p, and q for actuarial applications (a numerical search of the kind used above is sketched after the list):
- 1 ≤ k ≤ 3.
- p ≥ q, 0.5 ≤ p ≤ 4, and q ≤ 1.
- The higher the skewness of the data, the smaller the value of q should be.
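To search for “global minimum error” points like those reported above, a derivative-free optimizer can be wrapped around the GIA fit. The following is a sketch assuming the gia_3param and fit_statistics helpers defined in the earlier sketches; the starting point and guard on k are our own choices:

```python
import numpy as np
from scipy.optimize import minimize

def search_best_model(r, w, stat_index=0, start=(1.0, 1.0, 1.0)):
    """Minimize one of the fit statistics (0=wab, 1=wapb, 2=wChi, 3=wab*wChi)
    over (k, p, q), reusing gia_3param and fit_statistics from earlier sketches."""
    def objective(theta):
        k, p, q = theta
        if k <= 0:  # keep the power relativity link invertible
            return np.inf
        x_hat, y_hat = gia_3param(r, w, k, p, q)
        return fit_statistics(r, w, x_hat, y_hat)[stat_index]
    # Nelder-Mead avoids differentiating through the iterative fit
    return minimize(objective, start, method="Nelder-Mead")
```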
Finally, we use another collision pure premium dataset to demonstrate the results for the mixed algorithm. Table 11 reports the final factors of the model. For the purpose of illustration, we will only calculate the model with p = 1.
To show that GIAs can converge rapidly, in the Appendix we report the iteration processes of selected GIAs:
- Table 5 shows the multiplicative factors for the gamma GIA using average severity as the base.
- Table 6 translates those factors using the classification age 60+ and pleasure as the base.
- Table 7 reports the iterative process for the coefficients of a GLM with the gamma distribution and log link.
- Table 8 translates those coefficients to the multiplicative factors of a gamma GLM.
- Table 9 lists the additive factors for the GIA with p = 1.
- Table 10 shows the additive dollar values for the GIA with p = 1, using the classification age 60+ and pleasure as the base.
- Table 11 reports the convergence process of the mixed model with p = 1.
From Tables 5–8, the multiplicative gamma GIA converges in four iterations. This is as fast as the corresponding GLM model. As expected, the numerical solutions between the two models are identical, and the solutions are also identical to the previous results of Algorithm 6 given in Table 3 for k = 1, p = 1, and q = 0.
Tables 9 and 10 report the iterative process for the GIA additive algorithm with p = 1. Mildenhall (Mildenhall 1999) used this model as an example to show that the minimum bias approach is not efficient. He showed that the minimum bias model converges slowly to the GLM results, and that the dollar values at the 50th iteration are about two cents different from those by GLM. However, using our numerical algorithm, the GIA calculation converges completely in five iterations with solutions identical to GLM results.
Table 11 shows the iterative process for the GIA mixed model. Even though the algorithm is more complicated than the multiplicative and additive models, the convergence takes only six iterations.
The above example illustrates an optimization case with two and three variables. However, in typical rating plans, we need to optimize more than two variables. Our experience indicates that the improved numerical approach for GIA will converge fairly quickly for typical actuarial rating exercises with five to 15 variables.
6. Conclusions
In this research, we propose a general iteration algorithm by including different weighting functions and relativity link functions in the approach. As indicated by the severity example given previously, insurance and actuarial data are rarely perfect, so we expect that the best fitted results typically will not be based on a predetermined distribution, such as those in the exponential family of distributions. Therefore, GIA can provide actuaries a great deal of flexibility in data fitting and model selection. The case studies given in the paper indicate that the “best” fitted results occur when the underlying distribution assumptions are not commonly used distributions.
In theory, the parameters in GIA can take on any real values and there is no limitation on the relativity link functions when GIA is applied to a dataset. Therefore, GIA will provide actuaries many more options than previous minimum bias algorithms or GLMs. However, due to the fact that insurance and actuarial data is positively skewed in nature, we do not expect that a very wide range of weighting or relativity link functions needs to be used in practice.
For the severity example used in the study, we searched and identified the best models with the minimum fitted errors. One issue may exist: GIA uses an iterative process in determining the parameters, so when it further incorporates multiple distribution assumptions in the searching process, the approach may become even more time-consuming and inefficient. However, we do not believe this issue is significant because of the powerful computational capability of modern computers.
Mildenhall (Mildenhall 2005) indicates, in his comments on our prior work, that GLM can be extended to replicate the comprehensive GIA proposed in this study. However, since commonly used GLM software has limited selections for the statistical distribution assumptions, it is difficult to perform Mildenhall’s extension. In addition, we demonstrate how to extend GIA to solve mixed additive-multiplicative models and constraint optimization problems. To our knowledge, at this stage, there is no solution provided by GLM users to deal with these issues.
With the fast development of information technology, actuaries can analyze data in ways they could not imagine a decade ago. Currently there is a strong interest in data mining and predictive modeling in the insurance industry, and this calls for more powerful data analytical tools for actuaries. While some new tools, such as GLM, neural networks, decision trees, and MARS, have emerged recently and have received a great deal of attention, we believe that the decades-old minimum bias algorithms still have several advantages over other techniques, including being easy to understand and easy to use. We hope that our work in improving the flexibility and comprehensiveness of the minimum bias iteration approach is a timely effort and that this approach will continue to be a useful tool for actuaries in the future.
Acknowledgments
The authors thank Steve Mildenhall for his valuable comments.