1. Introduction
The idea behind a priori risk classification is to split an insurance portfolio into classes of similar risks, with all policyholders belonging to the same class paying the same premium. In view of the economic importance of motor third party liability (MTPL) insurance in developed countries, actuaries have made many attempts to find a probabilistic model for the distribution of the number and costs of claims reported by policyholders.
Recent actuarial research assumes that the risks can be rated a priori using generalized linear models (GLM; Nelder and Wedderburn 1972) and generalized additive models (GAM; Hastie and Tibshirani 1990). For motor insurance, typical response variables in these regression models are the number of claims (or claim frequency) and the corresponding severity. References for a priori risk classification include, for example, Dionne and Vanasse (1989, 1992), Dean, Lawless, and Willmot (1989), Denuit and Lang (2004), Yip and Yau (2005), and Boucher, Denuit, and Guillen (2007). Dionne and Vanasse used a negative binomial type I regression model. Dean, Lawless, and Willmot used a Poisson-inverse Gaussian regression model. Denuit and Lang used generalized additive models. Yip and Yau presented several parametric zero-inflated count distributions, and Boucher, Denuit, and Guillen compared various zero-inflated mixed Poisson and hurdle models. A review of actuarial models for risk classification and insurance ratemaking can be found in Denuit et al. (2007).
The models briefly described above assume that only the mean is modeled as a function of risk factors. However, any model for the mean in terms of a priori rating variables indirectly yields a model for scale and/or shape. Also, even though the mean is the most commonly used measure of the expected claim frequency and of the expected claim severity, it does not provide a good description of a distribution’s scale and shape. The scale and shape parameters are not adequately described because the unobserved heterogeneity changes with the explanatory variables. In this study, we extend this setup by assuming that all the parameters of the claim frequency/severity distributions can be modeled as functions of explanatory variables with parametric linear functional forms. Joint modeling of all the parameters in terms of covariates improves ratemaking and the estimation of the scale and shape of the claim frequency/severity distributions. For a priori ratemaking this approach offers a substantial benefit: by modeling all the parameters jointly, both the mean and the variance may be assessed by choosing a marginal distribution and building a predictive model using all the available ratemaking factors as independent variables. In this respect, risk heterogeneity is modeled by letting the distribution of the frequency and/or severity of claims change between classes of policyholders as a function of the levels of the ratemaking factors underlying the analyzed classes. We model the claim frequency using the Poisson, negative binomial type II, Delaporte, Sichel and zero-inflated Poisson models and the claim severity using the gamma, Weibull, Weibull type III, generalized gamma and generalized Pareto models. Our contribution focuses on the comparison of these models through their variance values and not only their mean values, as is usual in the risk classification literature.
To the best of our knowledge, this is the first time that the variance of the claim frequency and severity is modeled in the context of ratemaking. Furthermore, the variance of the claim frequency and severity is an important risk measure of the specific class of policyholders, as it can provide a measure of the uncertainty regarding the mean claim frequency and the mean claim severity of the specific class, and the difference in the premium that it implies can act as a cushion against adverse experience.
The difference between the premium and the mean loss is the premium loading. Estimates of variance values are produced by employing a parametric regression for the scale and/or the shape parameters in addition to the mean parameter. So far, however, the specification in which only the mean claim frequency/severity is modeled in terms of risk factors has been widely accepted for ratemaking. In this respect, a priori ratemaking is refined by taking into account the variance values yielded by jointly modeling all the parameters in terms of risk factors. Furthermore, the differences in the variance values significantly alter the premiums calculated through the standard deviation principle, since in this case the loading is related to the variability of the loss. Thus, joint modeling of location, scale and shape parameters is justified because it enables us to use all the available information in the estimation of these values through the use of the important explanatory variables for the claim frequency and severity, respectively.
The rest of this paper proceeds as follows. Section 2 introduces the alternative distributions we employ for modeling claim frequency and severity. Section 3 contains an application to a data set concerning car-insurance claims at fault. These classification models are compared on the basis of a sample of the automobile portfolio of a major company operating in Greece employing the generalized Akaike information criterion (GAIC) which is valid for both nested or non-nested model comparisons (as suggested by Rigby and Stasinopoulos 2005, 2009). The differences between these models are analyzed through the mean and the variance of the annual number of claims and the costs of claims of the policyholders who belong to different risk classes, which are formed by dividing the portfolio into clusters defined by the relevant ratemaking factors. Finally, the resulting premium rates are calculated via the expected value and standard deviation principles with independence between the claim frequency and severity components assumed.
2. Regression models for location, scale and shape
This section summarizes the characteristics of the various count and loss models used in this study. As mentioned, our setup extends recent a priori risk classification research by assuming that every parameter of the conditional response frequency/severity distribution is modeled in terms of covariates through the use of known monotonic link functions chosen to ensure a valid range for the distribution parameters.[1]
2.1. Frequency component
Consider policyholders i = 1, . . . , n, whose numbers of claims, denoted by Ki, are independent. The probability that policyholder i has reported k claims to the insurer, k = 0, 1, 2, . . . , is denoted by P(Ki = k). In this study, besides the traditional Poisson regression model, we model the claim frequency using negative binomial type II, Delaporte, Sichel and zero-inflated Poisson regression models for location, scale and shape.
-
The probability density function (pdf) of the Poisson distribution is given by[2]
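P\left(K_{i}=k\right)=\frac{e^{-\mu} \mu^{k}}{k!}. \tag{1}
We allow the μ parameter to vary from one individual to another. Let[3] μi = ei exp(c1iβ1), where ei denotes the exposure of policy i, c1i is the vector of the a priori rating variables, and βT1 is the 1 × J1 vector of the coefficients. The mean and the variance of Ki are given by
E\left(K_{i}\right)=\operatorname{Var}\left(K_{i}\right)=\mu_{i}=e_{i} \exp \left(c_{1 i} \beta_{1}\right). \tag{2}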
-
The pdf of negative binomial type II (NBII) distribution is given by[4]
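P\left(K_{i}=k\right)=\frac{\Gamma\left(k+\frac{\mu}{\sigma}\right) \sigma^{k}}{\Gamma\left(\frac{\mu}{\sigma}\right) \Gamma(k+1)[1+\sigma]^{k+\frac{\mu}{\sigma}}}, \tag{3}
for μ > 0 and σ > 0. Following Rigby and Stasinopoulos (2005, 2009), we assume that μi = ei exp(c1iβ1) and σi = exp(c2iβ2), where cji and βTj are the 1 × J′j vectors of the a priori rating variables and the coefficients respectively, for j = 1, 2. The mean and the variance of Ki are given by
E\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right) \tag{4}
and
\operatorname{Var}\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right)\left[1+\exp \left(c_{2 i} \beta_{2}\right)\right]. \tag{5}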
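As an illustration of the NBII formulas above, the following minimal sketch evaluates the pmf on the log scale and checks the mean and variance expressions numerically. The function names and the values of the linear predictors are hypothetical and are not taken from the fitted models.

```python
import math

def nbii_pmf(k, mu, sigma):
    """NBII probability P(K = k) for mu > 0 and sigma > 0 (log-scale evaluation)."""
    r = mu / sigma
    log_p = (math.lgamma(k + r) + k * math.log(sigma)
             - math.lgamma(r) - math.lgamma(k + 1)
             - (k + r) * math.log1p(sigma))
    return math.exp(log_p)

def nbii_moments(exposure, eta_mu, eta_sigma):
    """Mean and variance from the log-link predictors c1i*beta1 and c2i*beta2."""
    mu = exposure * math.exp(eta_mu)
    sigma = math.exp(eta_sigma)
    return mu, mu * (1.0 + sigma)

# Hypothetical predictor values for one policy with a full year of exposure.
eta_mu, eta_sigma = -2.0, -0.5
mean, var = nbii_moments(1.0, eta_mu, eta_sigma)

# Numerical check: moments recovered by summing the pmf over k = 0, ..., 199.
probs = [nbii_pmf(k, mean, math.exp(eta_sigma)) for k in range(200)]
num_mean = sum(k * p for k, p in enumerate(probs))
num_var = sum((k - num_mean) ** 2 * p for k, p in enumerate(probs))
print(mean, var, num_mean, num_var)
```

The variance exceeds the mean by the factor 1 + σi, so overdispersion in this model is linear in the mean.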
-
The pdf of the Delaporte distribution is given by[5]
P\left(K_{i}=k\right)=\frac{e^{-\mu \nu}}{\Gamma\left(\frac{1}{\sigma}\right)}[1+\mu \sigma(1-\nu)]^{-\frac{1}{\sigma}} S, \tag{6}
where σ > 0 and 0 ≤ ν < 1 and where
S=\sum_{m=0}^{k}\binom{k}{m} \frac{\mu^{k} \nu^{k-m}}{k!}\left[\mu+\frac{1}{\sigma(1-\nu)}\right]^{-m} \Gamma\left(\frac{1}{\sigma}+m\right). \tag{7}
Following Rigby, Stasinopoulos, and Akantziliotou (2008) and Rigby and Stasinopoulos (2009), we assume that μi = ei exp(c1iβ1), σi = exp(c2iβ2) and νi = exp(c3iβ3)/[1 + exp(c3iβ3)], where cji and βTj are the 1 × J′j vectors of the a priori rating variables and the coefficients respectively, for j = 1, 2, 3. The mean and variance of Ki are given by
E\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right) \tag{8}
and
\begin{array}{l} \operatorname{Var}\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right)+\left[e_{i} \exp \left(c_{1 i} \beta_{1}\right)\right]^{2} \\ \quad \exp \left(c_{2 i} \beta_{2}\right)\left[1-\frac{\exp \left(c_{3 i} \beta_{3}\right)}{1+\exp \left(c_{3 i} \beta_{3}\right)}\right]^{2}. \end{array} \tag{9}
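The Delaporte pmf and its variance can be checked against each other numerically. The sketch below is only illustrative: it assumes 0 < ν < 1, writes the bracketed term of the sum with σ(1 − ν) in the denominator, and uses hypothetical parameter values rather than estimates from the data.

```python
import math

def delaporte_pmf(k, mu, sigma, nu):
    """Delaporte probability P(K = k) for mu > 0, sigma > 0 and 0 < nu < 1."""
    inv_sigma = 1.0 / sigma
    base = mu + 1.0 / (sigma * (1.0 - nu))
    # Finite sum S over m = 0, ..., k.
    s = sum(
        math.comb(k, m) * mu ** k * nu ** (k - m) / math.factorial(k)
        * base ** (-m) * math.gamma(inv_sigma + m)
        for m in range(k + 1)
    )
    front = math.exp(-mu * nu) / math.gamma(inv_sigma)
    return front * (1.0 + mu * sigma * (1.0 - nu)) ** (-inv_sigma) * s

def delaporte_variance(mu, sigma, nu):
    """Variance mu + mu^2 * sigma * (1 - nu)^2."""
    return mu + mu ** 2 * sigma * (1.0 - nu) ** 2

# Hypothetical parameter values for a single risk class.
mu, sigma, nu = 0.15, 0.8, 0.3
probs = [delaporte_pmf(k, mu, sigma, nu) for k in range(60)]
num_mean = sum(k * p for k, p in enumerate(probs))
num_var = sum((k - num_mean) ** 2 * p for k, p in enumerate(probs))
print(num_mean, num_var, delaporte_variance(mu, sigma, nu))
```

Truncating the support at k = 59 is an assumption that is harmless for small μ; larger means would need a longer range or a log-scale evaluation.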
-
The pdf of the Sichel distribution is given by[6]
P\left(K_{i}=k\right)=\frac{\left(\frac{\mu}{c}\right)^{k} K_{k+\nu}(a)}{k!(a \sigma)^{k+\nu} K_{\nu}\left(\frac{1}{\sigma}\right)}, \tag{10}
where σ > 0 and −∞ < ν < ∞, and where
c=\frac{K_{\nu+1}\left(\frac{1}{\sigma}\right)}{K_{\nu}\left(\frac{1}{\sigma}\right)}.
Here
K_{\nu}(z)=\frac{1}{2} \int_{0}^{\infty} x^{\nu-1} \exp \left[-\frac{1}{2} z\left(x+\frac{1}{x}\right)\right] d x, \tag{11}
is the modified Bessel function of the third kind of order ν with argument z, and a² = σ⁻² + 2μ(cσ)⁻¹. Following Rigby, Stasinopoulos, and Akantziliotou (2008) and Rigby and Stasinopoulos (2009), we assume that μi = ei exp(c1iβ1), σi = exp(c2iβ2) and νi = c3iβ3, where cji and βTj are the 1 × J′j vectors of the a priori rating variables and the coefficients respectively, for j = 1, 2, 3. The mean and variance of Ki are given by
E\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right) \tag{12}
and
\begin{aligned} \operatorname{Var}\left(K_{i}\right)= & e_{i} \exp \left(c_{1 i} \beta_{1}\right)+\left[e_{i} \exp \left(c_{1 i} \beta_{1}\right)\right]^{2} \\ & \left\{\frac{2 \exp \left(c_{2 i} \beta_{2}\right)\left[c_{3 i} \beta_{3}+1\right]}{c_{i}}+\frac{1}{c_{i}^{2}}-1\right\}, \end{aligned} \tag{13}
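where
c_{i}=\frac{K_{\nu_{i}+1}\left(\frac{1}{\sigma_{i}}\right)}{K_{\nu_{i}}\left(\frac{1}{\sigma_{i}}\right)},
with σi = exp(c2iβ2) and νi = c3iβ3.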
-
The pdf of the zero-inflated Poisson (ZIP) distribution is given by[7]
P\left(K_{i}=k\right)=\left\{\begin{array}{l} \pi+(1-\pi) e^{-\mu}, \text { if } k=0 \\ (1-\pi) \frac{e^{-\mu} \mu^{k}}{k!}, \text { if } k=1,2,3, \ldots \end{array}\right. \tag{14}
Following Rigby and Stasinopoulos (2005, 2009), we assume that μi = ei exp(c1iβ1) and that πi depends on the linear predictor c2iβ2 as it appears in Eqs. (15) and (16), where cji and βTj are the 1 × J′j vectors of the a priori rating variables and the coefficients respectively, for j = 1, 2. The mean and the variance of Ki are given by
E\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right)\left[1-\exp \left(c_{2 i} \beta_{2}\right)\right] \tag{15}
and
\begin{array}{l} \operatorname{Var}\left(K_{i}\right)=e_{i} \exp \left(c_{1 i} \beta_{1}\right)\left[1-\exp \left(c_{2 i} \beta_{2}\right)\right] \\ \qquad {\left[1+e_{i} \exp \left(c_{1 i} \beta_{1}\right) \exp \left(c_{2 i} \beta_{2}\right)\right] .} \end{array} \tag{16}
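The ZIP moments can be illustrated directly in terms of μ and the zero-inflation probability π, with mean μ(1 − π) and variance μ(1 − π)(1 + μπ). The sketch below is a minimal check under hypothetical parameter values; how π is linked to the covariates is left outside the code.

```python
import math

def zip_pmf(k, mu, pi):
    """Zero-inflated Poisson probability P(K = k) for mu > 0 and 0 < pi < 1."""
    poisson = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zip_moments(mu, pi):
    """Mean mu(1 - pi) and variance mu(1 - pi)(1 + mu * pi)."""
    mean = mu * (1.0 - pi)
    return mean, mean * (1.0 + mu * pi)

# Hypothetical values: Poisson mean 0.2 and 30% structural zeros.
mu, pi = 0.2, 0.3
mean, var = zip_moments(mu, pi)
probs = [zip_pmf(k, mu, pi) for k in range(100)]
num_mean = sum(k * p for k, p in enumerate(probs))
num_var = sum((k - num_mean) ** 2 * p for k, p in enumerate(probs))
print(mean, var, num_mean, num_var)
```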
2.2. Severity component
In this section, we consider the claim severities. Let Xi,k be the cost of the kth claim reported by policyholder i, i = 1, . . . , n, and assume that the individual claim costs are independent and identically distributed (i.i.d.). Different models are used to describe the behavior of the costs of claims as a function of the explanatory variables, including gamma, Weibull, Weibull type III, generalized gamma, and generalized Pareto regression models for location, scale and shape.
-
The pdf of the gamma distribution is given by[8]
f(x)=\frac{1}{\left(s^{2} m\right)^{\frac{1}{s^{2}}}} \frac{x^{\frac{1}{s^{2}}-1} \exp \left(-\frac{x}{s^{2} m}\right)}{\Gamma\left(\frac{1}{s^{2}}\right)} ,\tag{17}
for x > 0, where m > 0 and s > 0. Following Rigby and Stasinopoulos (2009), we assume that mi = exp(d1iγ1) and si = exp(d2iγ2), where dji and γTj are the 1 × J′j vectors of the exogenous variables and the coefficients respectively, for j = 1, 2. The mean and variance of Xi,k are given by
E\left(X_{i, k}\right)=\exp \left(d_{1 i} \gamma_{1}\right) \tag{18}
and
\operatorname{Var}\left(X_{i, k}\right)=\left[\exp \left(d_{2 i} \gamma_{2}\right)\right]^{2}\left[\exp \left(d_{1 i} \gamma_{1}\right)\right]^{2}. \tag{19}
-
The pdf of the Weibull distribution is given by[9]
f(x)=\frac{s x^{s-1}}{m^{s}} \exp \left[-\left(\frac{x}{m}\right)^{s}\right], \tag{20}
where m > 0 and s > 0. Following Rigby and Stasinopoulos (2009), we assume that mi = exp(d1iγ1) and si = exp(d2iγ2), where dji and γTj are the 1 × J′j vectors of the exogenous variables and the coefficients respectively, for j = 1, 2. The mean and the variance of Xi,k are given by
E\left(X_{i, k}\right)=\exp \left(d_{1 i} \gamma_{1}\right) \Gamma\left(\frac{1}{\exp \left(d_{2 i} \gamma_{2}\right)}+1\right) \tag{21}
and
\begin{array}{l} \operatorname{Var}\left(X_{i, k}\right)=\left[\exp \left(d_{1 i} \gamma_{1}\right)\right]^{2} \\ \left\{\Gamma\left(\frac{2}{\exp \left(d_{2 i} \gamma_{2}\right)}+1\right)-\left[\Gamma\left(\frac{1}{\exp \left(d_{2 i} \gamma_{2}\right)}+1\right)\right]^{2}\right\} . \end{array} \tag{22}
-
The pdf of the Weibull type III (WEI3) distribution is given by[10]
\begin{aligned} f(x)=\frac{s}{m} \Gamma\left(\frac{1}{s}+1\right) & {\left[\frac{x}{m} \Gamma\left(\frac{1}{s}+1\right)\right]^{s-1} } \\ & \exp \left\{-\left[\frac{x}{m} \Gamma\left(\frac{1}{s}+1\right)\right]^{s}\right\}, \end{aligned} \tag{23}
where m > 0 and s > 0. Following Rigby and Stasinopoulos (2009), we assume that mi = exp(d1iγ1) and si = exp(d2iγ2), where dji and γTj are the 1 × J′j vectors of the exogenous variables and the coefficients respectively, for j = 1, 2. The mean and the variance of Xi,k are given by
E\left(X_{i, k}\right)=\exp \left(d_{1 i} \gamma_{1}\right) \tag{24}
and
\begin{array}{l} \operatorname{Var}\left(X_{i, k}\right)=\left[\exp \left(d_{1 i} \gamma_{1}\right)\right]^{2} \\ \left\{\Gamma\left(\frac{2}{\exp \left(d_{2 i} \gamma_{2}\right)}+1\right)\left[\Gamma\left(\frac{1}{\exp \left(d_{2 i} \gamma_{2}\right)}+1\right)\right]^{-2}-1\right\} . \end{array} \tag{25}
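The difference between the Weibull and Weibull type III parameterizations is easiest to see side by side: in the former m is a scale parameter, in the latter m is the mean itself. The following sketch, with hypothetical function names and parameter values, evaluates both sets of moment formulas and confirms that a WEI3 model with mean m matches a Weibull model with scale m/Γ(1/s + 1).

```python
import math

def weibull_moments(m, s):
    """Mean and variance of the Weibull model with scale m and shape s."""
    g1 = math.gamma(1.0 / s + 1.0)
    g2 = math.gamma(2.0 / s + 1.0)
    return m * g1, m ** 2 * (g2 - g1 ** 2)

def weibull3_moments(m, s):
    """Mean and variance of the Weibull type III model, where m is the mean."""
    g1 = math.gamma(1.0 / s + 1.0)
    g2 = math.gamma(2.0 / s + 1.0)
    return m, m ** 2 * (g2 / g1 ** 2 - 1.0)

# Hypothetical severity parameters (log-link predictors already exponentiated).
m, s = 400.0, 1.3
wei3_mean, wei3_var = weibull3_moments(m, s)

# The same distribution expressed in the Weibull (scale) parameterization.
scale = m / math.gamma(1.0 / s + 1.0)
wei_mean, wei_var = weibull_moments(scale, s)
print(wei3_mean, wei3_var, wei_mean, wei_var)
```

Because m is the mean in the type III form, regression coefficients in the m equation act directly on the expected claim cost.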
-
The pdf of the generalized gamma (GG) distribution is given by[11]
f(x)=\frac{|n| \theta^{\theta}\left(\frac{x}{m}\right)^{n \theta} \exp \left[-\theta\left(\frac{x}{m}\right)^{n}\right]}{\Gamma(\theta) x}, \tag{26}
where m > 0, s > 0, n ≠ 0 and θ = 1/(s²n²). Following Rigby, Stasinopoulos, and Akantziliotou (2008), we assume that mi = exp(d1iγ1), si = exp(d2iγ2) and ni = d3iγ3, where dji and γTj are the 1 × J′j vectors of the exogenous variables and the coefficients respectively, for j = 1, 2, 3. The mean and the variance of Xi,k are given by
E\left(X_{i, k}\right)=\frac{\exp \left(d_{1 i} \gamma_{1}\right) \Gamma\left(\theta_{i}+\frac{1}{d_{3 i} \gamma_{3}}\right)}{\theta_{i}^{\frac{1}{d_{3 i} \gamma_{3}}} \Gamma\left(\theta_{i}\right)} \tag{27}
and
\operatorname{Var}\left(X_{i, k}\right)=\frac{\left[\exp \left(d_{1 i} \gamma_{1}\right)\right]^{2}\left\{\Gamma\left(\theta_{i}\right) \Gamma\left(\theta_{i}+\frac{2}{d_{3 i} \gamma_{3}}\right)-\left[\Gamma\left(\theta_{i}+\frac{1}{d_{3 i} \gamma_{3}}\right)\right]^{2}\right\}}{\theta_{i}^{\frac{2}{d_{3 i} \gamma_{3}}}\left[\Gamma\left(\theta_{i}\right)\right]^{2}}, \tag{28}
where
\theta_{i}=\frac{1}{\left[\exp \left(d_{2 i} \gamma_{2}\right)\right]^{2}\left(d_{3 i} \gamma_{3}\right)^{2}}.
-
The pdf of the generalized Pareto distribution is given by[12]
f(x)=\frac{\Gamma(n+t)}{\Gamma(n) \Gamma(t)} \frac{m^{t} x^{n-1}}{(x+m)^{n+t}}, \tag{29}
where m > 0, n > 0 and t > 0. Following Rigby, Stasinopoulos, and Akantziliotou (2008), we assume that mi = exp(d1iγ1), ni = exp(d2iγ2) and ti = exp(d3iγ3), where dji and γTj are the 1 × J′j vectors of the exogenous variables and the coefficients respectively, for j = 1, 2, 3. The mean and variance of Xi,k are given by
E\left(X_{i, k}\right)=\frac{\exp \left(d_{1 i} \gamma_{1}\right) \exp \left(d_{2 i} \gamma_{2}\right)}{\exp \left(d_{3 i} \gamma_{3}\right)-1} \tag{30}
and
\begin{aligned} \operatorname{Var}\left(X_{i, k}\right) & =\frac{\left[\exp \left(d_{1 i} \gamma_{1}\right)\right]^{2} \exp \left(d_{2 i} \gamma_{2}\right)}{\exp \left(d_{3 i} \gamma_{3}\right)-1} \\ & \left\{\frac{\exp \left(d_{2 i} \gamma_{2}\right)+\exp \left(d_{3 i} \gamma_{3}\right)-1}{\left[\exp \left(d_{3 i} \gamma_{3}\right)-1\right]\left[\exp \left(d_{3 i} \gamma_{3}\right)-2\right]}\right\} . \end{aligned} \tag{31}
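The generalized Pareto moments only exist for sufficiently large t: the mean requires t > 1 and the variance t > 2. A minimal sketch of Eqs. (30) and (31) that makes these conditions explicit is given below; the function name and the example values are hypothetical.

```python
def generalized_pareto_moments(m, n, t):
    """Mean and variance of the generalized Pareto model with parameters m, n, t.

    The variance formula is only valid for t > 2, so smaller values are rejected.
    """
    if m <= 0 or n <= 0 or t <= 2:
        raise ValueError("requires m > 0, n > 0 and t > 2")
    mean = m * n / (t - 1.0)
    var = (m ** 2 * n / (t - 1.0)) * (n + t - 1.0) / ((t - 1.0) * (t - 2.0))
    return mean, var

# With n = 1 the model reduces to the Lomax (Pareto type II) distribution,
# whose mean is m / (t - 1) and whose variance is m^2 t / ((t - 1)^2 (t - 2)).
gp_mean, gp_var = generalized_pareto_moments(m=1.0, n=1.0, t=3.0)
print(gp_mean, gp_var)
```

In a fitted model, m, n and t would be replaced by exp(d1iγ1), exp(d2iγ2) and exp(d3iγ3), so the guard on t translates into a condition on the linear predictor d3iγ3.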
3. Application
The data were kindly provided by a Greek insurance company and concern a motor third party liability insurance portfolio observed during 3.5 years. The data set comprises 15641 policies. Both private cars and fleet vehicles have been considered in this sample.[13] The available a priori rating variables we employ are the Bonus Malus (BM) class,[14] the horsepower (HP) of the car and gender of the driver. Only policyholders with complete records, i.e., with availability of all the variables under consideration were considered. Records for fleet data were not available for the case of the claim frequency. Furthermore, in light of the heterogeneity which exists within the portfolio, consideration was given to grouping the levels of each explanatory variable with respect to risk profiles with similar number and costs of claims at fault reported to the company over the 3.5 years of observation. This was done in order to achieve ratemaking accuracy and homogeneity within rating cells, for the claim frequency and severity component respectively. Also, by balancing homogeneity and sufficiency of the volume of data in each cell credible patterns were provided. As a result of the aforementioned methodology, Bonus-Malus and horsepower variables were segmented into different categories for claim frequency and claim severity component. This will affect the a priori ratemaking, since the claim frequency and severity component will contain a different number of homogeneous classes, generating a ratemaking structure that is fair to the policyholders. Claim counts are modeled for all 15641 policies. 
For the claim frequency component, the Bonus-Malus class consists of four categories: A, B, C and D, where: A = “drivers who belong to BM classes 1 and 2,” B = “drivers who belong to BM classes 3–5,” C = “drivers who belong to BM classes 6–9 & 11–20” and D = “drivers who belong to BM class 10.” The horsepower of the car consists of three categories: A, B and C, where: A = “drivers who had a car with a HP between 0–33 & 100–132,” B = “drivers who had a car with a HP between 34–66” and C = “drivers who had a car with a HP between 67–99.” The gender consists of two categories: M = “male” and F = “female” drivers. Regarding the amount paid for each claim, there were 5590 observations that met our criteria. For the claim severity component, the Bonus-Malus class consists of three categories: A, B and C, where: A = “drivers who belong to BM classes 1 and 2,” B = “drivers who belong to BM classes 3–5 & 6–9 & 11–20” and C = “drivers who belong to BM class 10.” The horsepower of the car consists of four categories A, B, C and D, where: A = “drivers who had a car with a HP between 100–110 & 111–121 & 122–132,” B = “drivers who had a car with a HP between 0–33 & 34–44 & 45–55 & 56–66,” C = “drivers who had a car with a HP between 67–74” and D = “drivers who had a car with a HP between 75–82 & 83–90 & 91–99.” Finally, the gender consists of three categories: M = “male,” F = “female” and B = “both,” since in this case data were also available for fleet vehicles used by either male or female drivers, i.e., shared use.
The claim frequency and severity models presented in Sections 2.1 and 2.2 were estimated using the GAMLSS package in the R software.[15] The ratio of Bessel functions of the third kind of different orders was calculated using the HyperbolicDist package in R.
3.1. Modeling results
This subsection describes the modeling results of the Poisson, negative binomial type II (NBII), Delaporte (DEL), Sichel and zero-inflated Poisson (ZIP) regression models applied to claim frequency, and of the gamma (GA), Weibull (WEI), Weibull type III (WEI3), generalized gamma (GG) and generalized Pareto (GP) regression models applied to claim severity, each specified for location, scale and shape.
Claim frequency and severity models have been calibrated with respect to GAIC goodness of fit index as suggested by Rigby and Stasinopoulos (2005, 2009). We followed a model selection technique similar to the one presented in Heller et al. (2007).[16] Specifically, our variable selection started with the examination of the mean parameter of each frequency and severity model. This was achieved by adding all available explanatory variables and testing whether the exclusion of each one lowered the Global Deviance, AIC and SBC values. After having selected the best predictor for the mean parameter, we continued in determining the remaining predictors by testing which rating variable between those used in the mean parameter would lead to a further decrease of the GAIC when inserted in the scale and shape parameters of the claim frequency and severity models respectively. Furthermore, if between the same frequency/severity distributions with different parameter specifications several models have similar AIC and SBC values, we preferred the simpler model in order to avoid overfitting. Therefore, the scale and shape parameters of the models have fewer predictors than the mean parameter (see Tables 1 and 2). In the above respect, the final claim frequency and severity models we selected are those that yield the lowest Global deviance (DEV), Akaike information criterion (AIC), and Bayesian information criterion (BIC) values. Also, every explanatory variable they contain is statistically significant at a 5% threshold.
Tables 1 and 2 summarize our findings with respect to the aforementioned claim frequency and severity models respectively.[17]
From Table 1 we observe, for all frequency models, that BM category A, HP category A and male drivers are the reference categories of μ. HP category A and male drivers are the reference categories for σ in the case of the NBII model. HP category A is the reference category for σ in the case of the Delaporte and Sichel models. BM category A and male drivers are the reference categories for σ in the case of the ZIP model. Furthermore, we see that HP category appears in model equations for both μ and σ in the case of the NBII, Delaporte and Sichel models. Gender appears in model equations for both μ and σ in the case of the NBII and ZIP models. BM category appears in the models equation for both μ and σ in the case of the ZIP model. These a priori rating variables do not always have a similar effect (positive and/or negative) on μ and σ.
The results summarized in Table 2 show that BM category A, HP category A and fleet vehicles used by both male or female drivers are the reference categories for m and s in the case of gamma, Weibull, Weibull type III and generalized gamma models. BM category A, HP category A and fleet vehicles are the reference categories for m and n, and BM category A and HP category A are the reference categories for t in the case of the generalized Pareto model. Note also that BM category, HP category, and gender appear in the model equations for both m and s in the case of the gamma, Weibull and Weibull type III and generalized gamma models. Furthermore, in the case of the generalized gamma model, BM category and gender are also in the model equations for n. Finally, in the case of the generalized Pareto model we observe that BM category, HP category and gender appear in the model equations for both m and n, and BM category and HP category are in the model equations for t. These explanatory variables do not always have the same effect (positive and/or negative) on the parameters m, s, n and t.
Most of the models presented in Tables 1 and 2, their reparameterizations and special cases have already been employed for modeling claim frequency/severity data. However, as we have already mentioned, the commonly used specification that only the mean claim frequency/severity is modeled in terms of risk factors was widely accepted for ratemaking. Also, the results for the location parameter of the claim frequency/severity models are in line with the existing results, based on the examination of the relative data sets, in recent actuarial literature research. Specifically, as expected, the values of the estimated regression coefficients of the explanatory variables for this parameter will lead to mean claim frequency/severity values which will not differ much under different distributional assumptions. Within the framework we adopted, the systematic part of these models was expanded to allow modeling of all the parameters of the claim frequency/severity distribution as functions of a priori rating variables. This approach is especially suited to modeling insurance response data which often exhibit heterogeneity, i.e., a situation where the scale or shape of the distribution of the response variable changes with explanatory variables. Furthermore, joint modeling of all the parameters in an a priori ratemaking scheme breaks the nexus between the mean and variance implied by the standard procedure using GLM models, leading to a more complete comparison of these models through their variance values. Finally, in this way we will be able to use all the available information in the estimation of the claim frequency/severity distribution in order to group risks with similar risk characteristics and to establish fair premium rates. Furthermore, our analysis shows that the employment of more advanced models that capture the stylized characteristics of the data is beneficial for the insurance company.
3.2. Models comparison
So far, we have several competing models for the claim frequency and severity components. The differences between models produce different premiums. Consequently, to distinguish between these models, this section compares them so as to select the best for each case. As suggested by Rigby and Stasinopoulos (2005, 2009) the models have been calibrated with respect to generalized Akaike information criterion (GAIC) which is valid for both nested or non-nested model comparisons. The generalized Akaike information criterion (GAIC) is defined as
G A I C=\hat{D}+\kappa \times d f , \tag{32}
where D̂ = −2l̂ is the fitted (global) deviance, l̂ is the fitted log-likelihood, df is the degrees of freedom used in the model (i.e., the sum of the degrees of freedom used for the location, scale and shape parameters) and κ is a constant. The Akaike information criterion (AIC) and the Schwarz Bayesian criterion (SBC) are special cases of the GAIC. Specifically, if we let κ = 2 we have the AIC, while if we let κ = log(n) we have the SBC.
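The GAIC and its two special cases can be computed directly from a fitted log-likelihood. The following sketch is purely illustrative: the function names are hypothetical, the log-likelihood and degrees of freedom are made-up values, and only the sample size of 15641 policies is taken from the text.

```python
import math

def gaic(log_likelihood, df, kappa):
    """Generalized AIC: fitted global deviance plus kappa times the degrees of freedom."""
    deviance = -2.0 * log_likelihood
    return deviance + kappa * df

def aic(log_likelihood, df):
    """AIC is the GAIC with kappa = 2."""
    return gaic(log_likelihood, df, 2.0)

def sbc(log_likelihood, df, n_obs):
    """SBC is the GAIC with kappa = log(n)."""
    return gaic(log_likelihood, df, math.log(n_obs))

# Hypothetical fit: log-likelihood -5000 with 8 estimated coefficients.
example_aic = aic(-5000.0, 8)
example_sbc = sbc(-5000.0, 8, 15641)
print(example_aic, example_sbc)
```

Since log(n) exceeds 2 for any n above 7, the SBC penalizes additional scale and shape predictors more heavily than the AIC on a portfolio of this size.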
The resulting Global Deviance, AIC and SBC are given in Table 3 for the different claim frequency (Panel A) and claim severity (Panel B) fitted models.
Overall, with respect to the Global Deviance, AIC and SBC indices, from Panel A we observe the best fitted claim frequency model is the negative binomial type II model, followed closely by the Sichel and Delaporte models. From the claim severity models in Panel B we see that the best fitting performances are provided by the generalized gamma model followed by the generalized Pareto and gamma models. Negative binomial type II and generalized gamma capture more efficiently the stylized characteristics of the data, such as overdispersion of the number of claims and the tail behavior of losses and performed better than the other distributions.
3.3. A priori risk classification
In this subsection, differences between the claim frequency and severity models, presented in Sections 2.1 and 2.2 respectively, are analyzed through the mean and the variance of the number and costs of claims of the policyholders who belong to different risk classes, which are determined by the availability of the relevant a priori characteristics.
The final a priori ratemaking for the claim frequency models contains 24 classes. The estimated expected annual claim frequency and the variance for each risk class are obtained by Eqs (2, 4, 8, 12 and 15) and the Eqs (2, 5, 9, 13 and 16) for the case of the Poisson, negative binomial type II (NBII), Delaporte (DEL), Sichel and zero-inflated Poisson (ZIP) model respectively. The results are summarized in Table 4. As expected, the variance of the NBII, Delaporte, Sichel and ZIP model exceeds the mean and these models allow for overdispersion. Furthermore, we observe that the biggest differences lie in the variance values of these models. For example, the variance of the expected number of claims for a man who belongs to BM category A and has a car that belongs to HP category A, i.e., for the reference class, is equal to 0.1264, 0.2140, 0.1868, 0.1884 and 0.1391 while the variance of the expected number of claims for a woman who shares common characteristics is equal to 0.1354, 0.1964, 0.2100, 0.2128 and 0.1507 in the case of the Poisson, NBII, Delaporte, Sichel and ZIP model, respectively.
The final a priori ratemaking for the claim severity models contains 36 classes. Table 5 gives the estimated expected claim severity and the variance for each risk class obtained from the gamma (GA), Weibull (WEI), Weibull type III (WEI3), generalized gamma (GG) and generalized Pareto (GP) model according to the Eqs (18, 21, 24, 27 and 30) and the Eqs (19, 22, 25, 28 and 31) respectively. As expected, similarly to the case of the claim frequency models, we see that the biggest differences between the claim severity models lie in their variance values. For instance, the variance of the expected claim costs for a fleet vehicle that belongs to HP category A, used by both a man and a woman, and belongs to BM category A, i.e., for the reference class, is equal to 135347.30, 169637.36, 168267.90, 148196.45 and 142078.20, while the variance of the expected claim costs for a private car that belongs to HP category A and is used by a man who belongs to BM category A is equal to 78621.46, 110315.30, 111018.27, 72875.39 and 89891.64 in the case of the gamma, WEI, WEI3, generalized gamma and generalized Pareto model.
Overall, the results summarized in Tables 4 and 5 show the following trends by type of frequency/severity model as to which the lowest/highest variances are observed. First, from Table 4 we see that the NBII model has the highest variance values among all models in eleven risk classes. The Delaporte model has the highest variance values among all models in six risk classes, while it has the lowest variance value among all mixed Poisson models[18] in one risk class. The Sichel model has the highest variance values among all models in five risk classes, while it has the lowest variance values among all mixed Poisson models in eight risk classes. The ZIP model has the highest variance values among all models in two risk classes, while it has the lowest variance values among all mixed Poisson models in fifteen risk classes. Second, from Table 5 we observe that the gamma model has the highest variance value among all models in one risk class, while it has the lowest variance values among all models in fourteen risk classes. The Weibull model has the highest variance values among all models in five risk classes. The Weibull type III model has the highest variance values among all models in ten risk classes. The generalized gamma model has the lowest variance values among all models in nineteen risk classes. The generalized Pareto model has the highest variance value among all models in twenty risk classes, while it has the lowest variance values among all models in three risk classes.
The claim frequency and severity models are better compared through their variance values, which leads to a better classification of the policyholders. Modeling the location, scale and shape parameters jointly in terms of a priori rating variables is therefore justified, because it enables us to use all the available information in the estimation of these values through the important a priori rating variables for the number and the costs of claims, respectively.
3.4. Calculation of the premiums according to the expected value and standard deviation principles
Consider a policyholder i who belongs to a group of policyholders whose numbers of claims, denoted by Ki, are independent, for i = 1, . . . , n. Let Xi,k be the cost of the kth claim reported by policyholder i and assume that the individual claim costs are independent. It is assumed that the number of claims of each policyholder that belongs to a certain group is independent of the severity of each claim, in order to deal with the frequency and the severity components separately.
A premium principle is a rule for assigning a premium to an insurance risk. In this section the premium rates will be calculated via two well-known premium principles, the expected value and the standard deviation premium principles. More details about the use of the expected value premium principle in MTPL insurance can be found in Lemaire (1995). Furthermore, regarding the use of the standard deviation premium principle one can refer to Bühlmann (1970) and Lemaire (1995), who used the variance principle in MTPL insurance, which is closely related to the standard deviation principle. The standard deviation principle can be used as an alternative and a complement to the expected value principle. It provides a more complete picture to the actuary since it takes into account an additional characteristic of the distribution, i.e., the standard deviation of the number of claims and of losses.
- The premium rates calculated according to the expected value principle are given by
P_{1}=\left(1+w_{1}\right) E\left(K_{i}\right)\left(1+w_{2}\right) E\left(X_{i, k}\right), \tag{33}
where w1 > 0 and w2 > 0 are risk loads.
- The premium rates calculated according to the standard deviation principle are given by
P_{2}=\left[E\left(K_{i}\right)+\omega_{1} \sqrt{\operatorname{Var}\left(K_{i}\right)}\right]\left[E\left(X_{i, k}\right)+\omega_{2} \sqrt{\operatorname{Var}\left(X_{i, k}\right)}\right], \tag{34}
where ω1 > 0 and ω2 > 0 are risk loads.
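As a minimal sketch of how Eqs. (33) and (34) can be evaluated, the Python fragment below computes P1 and P2 from the mean and variance of the claim frequency and of the claim severity. The function names, the input moments and the common risk load of 0.1 are hypothetical and chosen only for illustration; they are not taken from Tables 4, 5 or 7.

```python
from math import sqrt


def expected_value_premium(mean_freq, mean_sev, w1, w2):
    """Eq. (33): P1 = (1 + w1) E(K_i) (1 + w2) E(X_{i,k})."""
    return (1.0 + w1) * mean_freq * (1.0 + w2) * mean_sev


def standard_deviation_premium(mean_freq, var_freq, mean_sev, var_sev,
                               omega1, omega2):
    """Eq. (34): each component is loaded by a multiple of its standard deviation."""
    loaded_freq = mean_freq + omega1 * sqrt(var_freq)
    loaded_sev = mean_sev + omega2 * sqrt(var_sev)
    return loaded_freq * loaded_sev


# Hypothetical moments for one risk class (illustrative values only).
mean_freq, var_freq = 0.10, 0.12      # E(K_i), Var(K_i)
mean_sev, var_sev = 300.0, 90000.0    # E(X_{i,k}), Var(X_{i,k})
load = 0.1                            # assumed common value of all four risk loads

p1 = expected_value_premium(mean_freq, mean_sev, load, load)
p2 = standard_deviation_premium(mean_freq, var_freq, mean_sev, var_sev,
                                load, load)
print(f"P1 = {p1:.2f}, P2 = {p2:.2f}")
```

Two models with the same means give the same P1, whereas P2 also responds to differences in the fitted variances, which is why the standard deviation principle separates the frequency/severity combinations more sharply.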
In the following example (Table 6), six different groups of policyholders have been considered. In Table 6 a YES indicates the presence of the characteristic corresponding to the column.
We will calculate the premiums P1 and P2 that must be paid by a specific group of policyholders based on the alternative models for assessing claim frequency and the various claim severity models. We assume that w1 = w2 = ω1 = […]. The premiums P1 and P2 are obtained in Table 7 by substituting into Eqs. (33) and (34) the corresponding E(Ki) and Var(Ki), and E(Xi,k) and Var(Xi,k) values for these six groups of policyholders, which were displayed in Tables 4 and 5 for the Poisson, NBII, Delaporte, Sichel and ZIP models, and for the gamma, Weibull, Weibull type III, generalized gamma and generalized Pareto regression models for location, scale and shape, respectively.
From Table 7 consider, for instance, a man who belongs to BM category A and has a car with a HP between 34–66. In the case of the Poisson model and the corresponding claim severity models, P1 is equal to 31.78, 31.77, 31.75, 31.65 and 32.03 euros, while P2 equals 35.95, 36.07, 36.05, 35.85 and 36.5 euros. In the case of the NBII model and the corresponding claim severity models, P1 is equal to 31.91, 31.90, 31.88, 31.78 and 32.16 euros, while P2 equals 37.35, 37.48, 37.46, 37.25 and 37.95 euros. In the case of the Delaporte model and the corresponding claim severity models, P1 is equal to 31.37, 31.36, 31.33, 31.24 and 31.61 euros, while P2 equals 36.14, 36.26, 36.24, 36.04 and 36.72 euros. In the case of the Sichel model and the corresponding claim severity models, P1 is equal to 31.37, 31.36, 31.33, 31.24 and 31.61 euros, while P2 equals 35.80, 35.93, 35.90, 35.70 and 36.38 euros. In the case of the ZIP model and the corresponding claim severity models, P1 is equal to 31.34, 31.33, 31.30, 31.20 and 31.58 euros, while P2 equals 35.84, 35.97, 35.94, 35.74 and 36.42 euros. Overall, we observe that every claim frequency model, when combined with the generalized gamma model for assessing claim severity, has lower P1 and P2 values than in its combinations with the other claim severity models. Also, PO-GP, NBII-GP, DEL-GP, SI-GP and ZIP-GP have the highest P1 and P2 values in groups 1, 2, 3 and 4, while PO-WEI3, NBII-WEI3, DEL-WEI3, SI-WEI3 and ZIP-WEI3 have the highest P1 and P2 values in groups 5 and 6 among their combinations with the other claim severity models. Finally, with respect to the NBII and GG models, which performed best, we see that NBII-GG has the lowest P1 values in groups 2, 4 and 6 and the lowest P2 values in groups 2 and 6 among all the combinations of the mixed Poisson models for approximating claim frequency and the claim severity models.
4. Conclusions
In this paper, we examined the use of regression models for location, scale and shape for pricing risks through ratemaking based on a priori risk classification. Specifically, we assumed that the number of claims was distributed according to a Poisson, negative binomial type II, Delaporte, Sichel or zero-inflated Poisson regression model and that the losses were distributed according to a gamma, Weibull, Weibull type III, generalized gamma or generalized Pareto regression model for location, scale and shape, respectively. These classification models were calibrated employing a generalized Akaike information criterion (GAIC), which is valid for both nested and non-nested model comparisons (as suggested by Rigby and Stasinopoulos 2005, 2009). The best fitted claim frequency model was the negative binomial type II model, followed closely by the Sichel and Delaporte models, while among the claim severity models the best fit was provided by the generalized gamma model, followed by the generalized Pareto and gamma models. Furthermore, the difference between these models was analyzed through the mean and the variance of the annual number of claims and of the severity of claims of the policyholders who belong to different risk classes. The resulting a priori premium rates were calculated via the expected value and standard deviation principles, with independence between the claim frequency and severity components assumed.
Extensions to other frequency/severity regression models for location scale and shape can be obtained in a similar straightforward way. Moreover, these models are parametric and a possible line of further research is to explore the semiparametric approach and go through the ratemaking exercise when functional forms other than the linear are included, based on the generalized additive models for location scale and shape (GAMLSS) approach of Rigby and Stasinopoulos (2001, 2005, 2009). Also see, for example, a recent paper by Klein et al. (2014) in which Bayesian GAMLSS models are employed for nonlife ratemaking and risk management.
Acknowledgments
The authors would like to thank the Variance Editor in Chief and the referees for their constructive comments and suggestions.
For more details about the claim frequency/severity models and the associated link functions used in this paper, we refer the reader to Rigby and Stasinopoulos (2005, 2009).
The Poisson regression model has been widely used by insurance practitioners for modeling claim count data. See, for example, Renshaw (1994).
Equidispersion implied by the Poisson distribution is usually corrected by the introduction of a random variable into the regression component. Then the marginal distribution of the number of claims is a mixed Poisson distribution. For well-known results applied to the above situation, we refer the interested reader to Gourieroux, Montfort, and Trognon (1984b, 1984a), Boyer, Dionne, and Vanasse (1992), Lemaire (1995), and Boucher, Denuit, and Guillen (2007, 2008).
This parameterization was used by Evans (1953) as pointed out by Johnson, Kotz, and Balakrishnan (1994). Note also that a negative binomial type I distribution arises if σ is reparameterized to σ1μ. A priori ratemaking using the NBI where regression is not only performed on the mean parameter has been recommended by, for example, Boucher, Denuit, and Guillen (2007, 2008).
This parameterization of Delaporte was given by Rigby, Stasinopoulos, and Akantziliotou (2008).
Parameterization (10) was given by Rigby, Stasinopoulos, and Akantziliotou (2008). The use of the Sichel distribution for modeling claim frequency where regression is only performed on the mean parameter has been recommended by Tzougas and Frangos (2014).
This parameterization was used by Johnson, Kotz, and Balakrishnan (1994) and Lambert (1992). The ZIP model is a special case of a mixed Poisson distribution. However, if overdispersion in the Poisson part is still present then all the distributions seen before can be used since a heterogeneity term may be incorporated in the model. For instance, see Yip and Yau (2005) for an application to insurance claim count data. For more details about zero-inflated count models see Lambert (1992) and Green and Silverman (1994).
We use the parameterization of the two parameter gamma distribution given by Rigby and Stasinopoulos (2009). Note also that a priori ratemaking using the gamma distribution where regression is not only performed on the mean parameter can be found in, for example, Denuit et al. (2007).
The specific parameterization of the two parameter Weibull distribution used here was that used by Johnson, Kotz, and Balakrishnan (1994).
This is a parameterization of the Weibull distribution where m is the mean of the distribution.
The parameterization of the generalized gamma distribution we use was that used by Lopatatzidis and Green (2000).
The above parameterization of the generalized Pareto distribution can be found, for example, in Klugman, Panjer, and Willmot (2004). Note that if we let n = 1 in Eq. (29), the generalized Pareto distribution reduces to the Pareto distribution. The use of the Pareto distribution for modeling claim severity where regression is not only performed on the mean parameter can be found in Frangos and Vrontos (2001).
All the characteristics we consider are observable.
A Bonus-Malus System (BMS) penalizes policyholders responsible for one or more claims with a premium surcharge (malus) and rewards policyholders who had a claim-free year with a premium discount (bonus).
Note that the same models can be fitted to larger data sets in order to study the effect of other rating factors such as age of driver, driving experience or driving zone, which have been traditionally used in MTPL insurance.
Heller et al. (2007) used generalized additive models for location scale and shape (GAMLSS) for the statistical analysis of the total amount of insurance paid out on a policy.
Note that in Tables 1 and 2 the p-values that are significant at the 5% level are included in parentheses.
The Poisson regression model has the lowest variance values among all models, since its variance values are equal to its mean values.