1. Introduction
The chain ladder is a widely used algorithm for loss reserving. It is formulated in Mack (1993). From its heuristic beginnings, it was shown to give maximum likelihood (ML) estimates of model parameters (Hachemeister and Stanard 1975; Mack 1991a; Renshaw and Verrall 1998) when:
- observations are independently Poisson distributed; and
- their means are modeled as the product of a row effect and a column effect.
This result was extended from the Poisson to the overdispersed Poisson (ODP) distribution by England and Verrall (2002).
Mack (1991a) considered another model in which observations were gamma distributed, and gave a number of earlier references to the same model. ML parameter estimates were obtained which, while not identical to chain ladder estimates, have sometimes been found by subsequent authors (e.g., Wüthrich 2003) to be numerically similar.
The ODP lies within the Tweedie family (Tweedie 1984), a subset of the exponential dispersion family (Nelder and Wedderburn 1972). Wüthrich (2003) made a numerical study of ML fitting in the case of Tweedie distributed observations. Again the results were similar to chain ladder estimation.
The purpose of the present very brief note is to consider ML estimation in this Tweedie case, to derive the earlier results as special cases of it, and to indicate the reasons for the numerical similarity of their results.
2. Preliminaries
2.1. Framework and notation
The data set will consist throughout of a triangle of insurance claims data. Let i = 1, 2, . . . , n denote period of origin, j = 1, 2, . . . , n denote development period, and Yij ≥ 0 the observation in the (i, j) cell of the triangle. The triangle consists of the set {Yij : i = 1, 2, . . . , n; j = 1, 2, . . . , n − i + 1} of incremental claims data (paid losses, claim counts, etc.). It is assumed that E[Yij] is finite for each (i, j).
Define cumulative row sums
\[ S_{i j}=\sum_{k=1}^{j} Y_{i k} . \tag{2.1} \]
Further, let Σ^{R(i)} denote summation over the entire row i of the triangle of quantities indexed by (i, j), i.e., over cells with fixed i and j = 1, 2, . . . , n − i + 1. Similarly, let Σ^{C(j)} denote summation over the entire column j of the triangle, and let Σ^{D(k)} denote summation over the entire diagonal i + j − 1 = k.
2.2. Chain ladder
The chain ladder model is formulated by Mack (1991b, 1993) as follows:
\[ \begin{array}{l} \mathrm{E}\left[S_{i, j+1} \mid S_{i 1}, S_{i 2}, \ldots, S_{i j}\right]=S_{i j} f_{j},\\ j=1,2, \ldots, n-1 \text {, independently of } i \end{array} \tag{2.2} \]
for some set of parameters fj; and also
Rows of the data triangle are stochastically independent, i.e., Yij and Ykl are independent for i ≠ k.
It may be observed that (2.2) implies
\[ \mathrm{E}\left[S_{i j} \mid S_{i 1}\right]=S_{i 1} f_{1} f_{2} \ldots f_{j-1} ,\tag{2.3} \]
which in turn implies
\[ \mathrm{E}\left[Y_{i j}\right]=\alpha_{i} \beta_{j} \tag{2.4} \]
for parameters αi, βj, where E[Yij] denotes the unconditional mean of Yij, and
\[ f_{j}=\sum_{k=1}^{j+1} \beta_{k} / \sum_{k=1}^{j} \beta_{k} . \tag{2.5} \]
The derivation of Equation (2.4) is as follows:
\[ \begin{aligned} \mathrm{E}\left[Y_{i j}\right] & =\mathrm{E}\,\mathrm{E}\left[S_{i j}-S_{i, j-1} \mid S_{i 1}\right] \\ & =\mathrm{E}\left[S_{i 1} f_{1} f_{2} \ldots f_{j-2}\left(f_{j-1}-1\right)\right] \end{aligned} \]
by (2.1) and (2.3), where the outer expectation is taken with respect to S_{i1}. Hence
\[ \mathrm{E}\left[Y_{i j}\right]=\mathrm{E}\left[S_{i 1}\right] f_{1} f_{2} \ldots f_{j-2}\left(f_{j-1}-1\right) \]
which is of form (2.4).
The chain ladder estimate of fj is
\[ F_{j}=\sum_{i=1}^{n-j} S_{i, j+1} / \sum_{i=1}^{n-j} S_{i j} . \tag{2.6} \]
The F_j may be converted to estimates of the α_i and β_j by means of the following relations:
\[ \hat{\beta}_{j}=\hat{\beta}_{1}\left[F_{1} \ldots F_{j-2}\left(F_{j-1}-1\right)\right] \tag{2.7} \]
subject to some linear constraint on the βj, such as
\[ \sum_{k=1}^{n} \beta_{k}=1 \tag{2.8} \]
and
\[ \hat{\alpha}_{i}=S_{i, n-i+1} / \sum^{R(i)} \hat{\beta}_{j} .\tag{2.9} \]
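The algorithm (2.6)–(2.9) is readily mechanized. The following sketch (Python with numpy; the 0-based array layout with NaN marking unobserved cells, and the function names, are illustrative conventions rather than anything prescribed above) computes the factors F_j and converts them to estimates of α_i and β_j under constraint (2.8):

```python
import numpy as np

def chain_ladder_factors(Y):
    """Development factors F_j of (2.6).  Y is an n x n array of
    incremental claims with NaN in unobserved cells (0-based: cell
    (i, j) is observed when i + j <= n - 1)."""
    n = Y.shape[0]
    S = np.cumsum(np.nan_to_num(Y), axis=1)       # cumulative row sums (2.1)
    F = np.empty(n - 1)
    for j in range(n - 1):
        rows = np.arange(n - j - 1)               # rows with S_{i,j+1} observed
        F[j] = S[rows, j + 1].sum() / S[rows, j].sum()
    return F

def chain_ladder_parameters(Y):
    """Estimates of alpha_i, beta_j via (2.7)-(2.9), normalized so that
    the beta_j sum to 1, as in (2.8)."""
    n = Y.shape[0]
    F = chain_ladder_factors(Y)
    S = np.cumsum(np.nan_to_num(Y), axis=1)
    # B[j] = beta_1 + ... + beta_{j+1}; B[n-1] = 1 and B[j] = B[j+1] / F_j by (2.5)
    B = np.ones(n)
    for j in range(n - 2, -1, -1):
        B[j] = B[j + 1] / F[j]
    beta = np.diff(np.concatenate(([0.0], B)))    # increments of B recover beta_j
    alpha = np.array([S[i, n - 1 - i] / B[n - 1 - i] for i in range(n)])  # (2.9)
    return alpha, beta, F
```

Applied to an exactly multiplicative triangle, the routine recovers the underlying row and column effects.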
2.3. Exponential dispersion and Tweedie families of distributions
2.3.1. Exponential dispersion family
The following family of log densities is called the exponential dispersion family (EDF) (Nelder and Wedderburn 1972):
\[ l(y ; \gamma, \lambda)=c(\lambda)[y \gamma-b(\gamma)]+a(y, \lambda) \tag{2.10} \]
for some functions a(.,.), b(.) and c(.) and parameters γ and λ.
It may be shown that, for Y subject to this log likelihood,
\[ \mu=\mathrm{E}[Y]=b^{\prime}(\gamma), \quad \operatorname{Var}[Y]=b^{\prime \prime}(\gamma) / c(\lambda) . \tag{2.11} \]
2.3.2. Tweedie family
A sub-family of the EDF is that defined by the relations:
\[ c(\lambda)=\lambda \tag{2.12} \]
\[ \operatorname{Var}[Y]=\mu^{p} / \lambda \quad \text { for some } \quad p \leq 0 \quad \text { or } \quad p \geq 1 . \tag{2.13} \]
This is the Tweedie family of exponential dispersion likelihoods (Tweedie 1984). The restriction on the moment relations (2.11) implies that
\[ b^{\prime}(\gamma)=[(1-p)(\gamma+k)]^{1 /(1-p)} \tag{2.14} \]
\[ b(\gamma)=(2-p)^{-1}[(1-p)(\gamma+k)]^{(2-p) /(1-p)} \tag{2.15} \]
for some constant k. This parameterization is found, for example, in Jorgensen and Paes de Souza (1994) and Wüthrich (2003) with k = 0.
Occasionally, the Tweedie family is defined as above but over the parameter range 1 < p < 2 (Mildenhall 1999; Kaas 2005). It is noteworthy that a member of the family with one of these values of p is a compound Poisson distribution (Jorgensen and Paes de Souza 1994) with a gamma severity distribution.
It follows from (2.11), (2.14), and (2.15) that
\[ \gamma=\mu^{1-p} /(1-p)-k \tag{2.16} \]
\[ b(\gamma)=\mu^{2-p} /(2-p) . \tag{2.17} \]
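These relations can be checked numerically. The following snippet (Python; the values p = 1.5 and μ = 4 are arbitrary admissible choices, with k = 0 as in Wüthrich (2003)) confirms that (2.16) inverts (2.14) and that (2.15) reduces to (2.17):

```python
# Check of (2.14)-(2.17) with k = 0, for arbitrary admissible p and mu.
p, mu = 1.5, 4.0
gamma = mu ** (1 - p) / (1 - p)                          # (2.16)
b_prime = ((1 - p) * gamma) ** (1 / (1 - p))             # (2.14)
b = ((1 - p) * gamma) ** ((2 - p) / (1 - p)) / (2 - p)   # (2.15)
assert abs(b_prime - mu) < 1e-12                         # b'(gamma) = mu, as in (2.11)
assert abs(b - mu ** (2 - p) / (2 - p)) < 1e-12          # (2.17)
```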
3. Maximum likelihood estimation for the Tweedie cross-classified model
Consider the model (2.4), together with the assumption that all Yij are stochastically independent. Note that this is not the same as the chain ladder model, as defined in Section 2, because the latter is formulated in terms of conditional expectations and does not make the same independence assumption. Indeed, the conditional expectation assumption (2.2) specifically postulates dependencies between observations within the same row.
Let Y denote the entire set {Yij} of observations, and let l(Y) denote the log likelihood of Y for some assumed distribution of the Yij, whose parameters have been suppressed for convenience. Suppose that each Yij has a Tweedie distribution defined by (2.12) and the following generalization of (2.13):
\[ \operatorname{Var}\left[Y_{i j}\right]=\mu_{i j}^{p} / \lambda w_{i j} \tag{3.1} \]
i.e., λ is replaced by λ/wij in (2.12). In common parlance wij is the weight associated with Yij. This model will be called the Tweedie cross-classified model.
While the Tweedie family allows a reasonably general representation of insurance data, its restrictions should be recognized. First, it has a short (exponential) tail for the case p ≤ 2. Second, all its cumulants, from the variance upward, are related through b(.), since the r-th cumulant is a multiple of b^{(r)}(γ), where the superscript (r) denotes r-fold differentiation (McCullagh and Nelder 1989, 44).
With the replacement λ ← λ/wij just given, and substitution of (2.16) and (2.17) into (2.10),
\[ \begin{aligned} l(Y)=\sum\{ & \lambda w_{i j}\left[y_{i j}\left[\mu_{i j}^{1-p} /(1-p)-k\right]-\mu_{i j}^{2-p} /(2-p)\right] \\ & \left.+a\left(y_{i j}, \lambda\right)\right\} \end{aligned} \tag{3.2} \]
where the summation runs over all observations in the data set Y.
The ML equations with respect to the αi are:
\[ \begin{array}{c} \partial l(Y) / \partial \alpha_{i}=\sum^{R(i)} \lambda w_{i j}\left[y_{i j} \mu_{i j}^{-p}-\mu_{i j}^{1-p}\right] \beta_{j}=0, \\ i=1, \ldots, n \end{array} \tag{3.3} \]
where use has been made of (2.4). This may be equivalently represented as follows:
Lemma 3.1 The ML equations with respect to the αi for the Tweedie cross-classified model are:
\[ \sum^{R(i)} w_{i j} \mu_{i j}^{1-p}\left[y_{i j}-\mu_{i j}\right]=0, \quad i=1, \ldots, n . \tag{3.4} \]
Similarly, the ML equations with respect to the βj are:
\[ \sum^{C(j)} w_{i j} \mu_{i j}^{1-p}\left[y_{i j}-\mu_{i j}\right]=0, \quad j=1, \ldots, n . \tag{3.5} \]
Note that p is taken here as fixed, rather than estimated. ML estimation of this parameter would require an additional equation.
Equations (3.4) and (3.5) are reminiscent of the estimating equations of Fu and Wu (2007), who were concerned with a cross-classified model in a ratemaking context.
Corollary 3.2 The case of ODP is represented by p = 1 with wij = 1. The ML equations are then
\[ \sum^{R(i)}\left[y_{i j}-\mu_{i j}\right]=0, \quad i=1, \ldots, n \tag{3.6} \]
\[ \sum^{C(j)}\left[y_{i j}-\mu_{i j}\right]=0, \quad j=1, \ldots, n . \tag{3.7} \]
These imply the chain ladder estimation of the αi, βj set out in (2.6)–(2.9).
Proof See Hachemeister and Stanard (1975), Mack (1991a), or Renshaw and Verrall (1998).
Corollary 3.3 The case of gamma Yij is represented by p = 2. The ML equations are then
\[ \sum^{R(i)} w_{i j}\left[y_{i j} / \mu_{i j}-1\right]=0, \quad i=1, \ldots, n \tag{3.8} \]
\[ \sum^{C(j)} w_{i j}\left[y_{i j} / \mu_{i j}-1\right]=0, \quad j=1, \ldots, n . \tag{3.9} \]
Substitution of αi βj for μij, followed by minor rearrangement, gives
\[ \alpha_{i}=w_{i .}^{-1} \sum^{R(i)} w_{i j} y_{i j} / \beta_{j}, \quad i=1, \ldots, n \tag{3.10} \]
\[ \beta_{j}=w_{\cdot j}^{-1} \sum^{C(j)} w_{i j} y_{i j} / \alpha_{i}, \quad j=1, \ldots, n \tag{3.11} \]
where
\[ w_{i .}=\sum^{R(i)} w_{i j} \tag{3.12} \]
\[ w_{. j}=\sum^{C(j)} w_{i j} . \tag{3.13} \]
These are essentially the results obtained by Mack (1991a) for gamma-distributed cells.
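Relations (3.10) and (3.11) suggest a simple alternating scheme: update the αi with the βj held fixed, then conversely, until a fixed point is reached. A sketch of this (Python with numpy; the function name, the NaN-padded triangle layout, and the imposition of normalization (2.8) at each pass are illustrative assumptions, not part of Mack's treatment):

```python
import numpy as np

def gamma_cc_fit(Y, w, n_iter=500):
    """Alternating solution of the gamma (p = 2) equations (3.10)-(3.11).
    Y, w: n x n arrays, with NaN in unobserved cells of Y."""
    obs = ~np.isnan(Y)
    n = Y.shape[0]
    alpha = np.ones(n)
    beta = np.ones(n) / n
    for _ in range(n_iter):
        for i in range(n):
            m = obs[i]
            alpha[i] = (w[i, m] * Y[i, m] / beta[m]).sum() / w[i, m].sum()  # (3.10)
        for j in range(n):
            m = obs[:, j]
            beta[j] = (w[m, j] * Y[m, j] / alpha[m]).sum() / w[m, j].sum()  # (3.11)
        s = beta.sum()          # impose constraint (2.8), preserving alpha_i beta_j
        beta /= s
        alpha *= s
    return alpha, beta
```

On an exactly multiplicative triangle the iteration reproduces the underlying row and column effects, since these satisfy (3.10) and (3.11) with zero residuals.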
Remark 3.4 Mack’s assumption of a gamma distribution is, in fact, an approximation to a compound Poisson distribution in each cell of the triangle in which each cell has a gamma severity distribution with the same shape parameter. Mack notes that the shape parameter would need to take a smallish value in order to attribute a non-negligible probability to Yij in the vicinity of zero.
As noted near the end of Section 2, the compound Poisson with gamma severity distribution may itself be accommodated within the Tweedie family (with 1 < p < 2), and so Mack's assumption of a gamma approximation in each cell could be replaced by the exact compound Poisson by means of a suitable choice of p (< 2).
Remark 3.5 The ML equations (3.6) and (3.7) also show that the chain ladder estimates are marginal sum estimates in the ODP case (see Mack 1991a; Schmidt and Wünsche 1998). In the general Tweedie case, the ML equations (3.4) and (3.5), while not equivalent to the chain ladder, yield weighted marginal sum estimates.
This provides an indication of the reason why past investigations have shown chain ladder estimates to be close to ML estimates in various Tweedie cases. For example, this was a finding of Wüthrich (2003).
To elaborate on this, write the general weighted marginal sum equation corresponding to (3.4) in the form
\[ \sum^{R(i)} \omega_{i j}\left[y_{i j}-\hat{\mu}_{i j}\right]=0 \tag{3.14} \]
where the ωij are general weights and the term μ̂ij recognizes that the solution of the equations provides only an estimate of μij. A parallel to the following argument about (3.4) may be given in relation to (3.5).
Now rewrite the left side of (3.14) as
\[ \sum^{R(i)} \omega_{i j}\left[\varepsilon_{i j}+\eta_{i j}\right] \tag{3.15} \]
where εij = yij − μij and ηij = μij − μ̂ij, both of which are random variables with zero means (assuming a correctly specified model).
Now consider the substitution of the solutions of (3.14) in the unweighted form of the same system of equations:
\[ \begin{aligned} \omega_{i} \sum^{R(i)} & {\left[y_{i j}-\hat{\mu}_{i j}\right] } \\ & =\omega_{i} \sum^{R(i)}\left[\varepsilon_{i j}+\eta_{i j}\right] \\ & =\sum^{R(i)} \omega_{i j}\left[\varepsilon_{i j}+\eta_{i j}\right]+\sum^{R(i)}\left(\omega_{i}-\omega_{i j}\right)\left[\varepsilon_{i j}+\eta_{i j}\right] \\ & =\sum^{R(i)}\left(\omega_{i}-\omega_{i j}\right)\left[\varepsilon_{i j}+\eta_{i j}\right] \end{aligned} \tag{3.16} \]
where ωi is any quantity constant over row i, e.g., the average of the ωij over the row; the final equality follows from (3.14).
The right side of (3.16) has a mean of zero and a variance of Σ^{R(i)} (ωi − ωij)² σij², where σij² = Var[εij + ηij]. Hence the value of (3.16) will be small if either or both of the following conditions hold:
- Weights vary little across a row;
- The variances of observations around values fitted by (3.14) are small.
In this case, the solutions to (3.4) will also be approximate solutions to the unweighted form:
\[ \sum^{R(i)}\left[y_{i j}-\hat{\mu}_{i j}\right]=0 \]
which is the chain ladder solution.
In summary, under the right conditions the chain ladder will approximate the solution to the weighted marginal sum estimates given by (3.4) and (3.5).
An example of this approximation is provided by Wüthrich (2003), who made a numerical study of ML fitting of the Tweedie cross-classified model in which the parameters αi, βj, λ, and p were all treated as free and the weights wij as known. In the example, the wij varied comparatively little with i and j, and p was estimated to be 1.17.
As pointed out just prior to Remark 3.5, this parameter value is consistent with the assumption of a compound Poisson distribution for each cell of the triangle.
For this numerical example the weights show not too much variation over the triangle and the ML estimates of the Tweedie cross-classified model are expected to approximate those of the standard chain ladder, as was indeed found by Wüthrich.
4. Maximum likelihood estimation for general Tweedie
Parameters of the general Tweedie cross-classified model may be estimated by the use of GLM software. However, an interesting special case arises under the sole constraint that the weights wij also have the multiplicative structure:
\[ w_{i j}=u_{i} v_{j} . \tag{4.1} \]
Note that this includes the unweighted case wij = 1.
The ML equations for estimation of the αi, βj were derived as (3.4) and (3.5). Rewrite these with the substitutions:
\[ Z_{i j}=w_{i j} \mu_{i j}^{1-p} Y_{i j} \tag{4.2} \]
\[ \nu_{i j}=w_{i j} \mu_{i j}^{2-p}=u_{i} v_{j}\left(\alpha_{i} \beta_{j}\right)^{2-p}=a_{i} b_{j} \tag{4.3} \]
where
\[ a_{i}=u_{i} \alpha_{i}^{2-p} \tag{4.4} \]
\[ b_{j}=v_{j} \beta_{j}^{2-p} . \tag{4.5} \]
This yields
\[ \sum^{R(i)}\left[z_{i j}-\nu_{i j}\right]=0, \quad i=1, \ldots, n \tag{4.6} \]
\[ \sum^{C(j)}\left[z_{i j}-\nu_{i j}\right]=0, \quad j=1, \ldots, n . \tag{4.7} \]
Note that these are the same equations as (3.6) and (3.7) in Corollary 3.2. That corollary therefore implies the following result.
Lemma 4.1 Consider the Tweedie cross-classified model with general (admissible) p and subject to (3.1) with constraint (4.1). ML estimates of ai, bj (and hence of αi, βj by (4.4) and (4.5)) are obtained by application of the chain ladder algorithm (2.6)-(2.9) to the data triangle Z = {Zij}.
In the application of this result, the μij must be known in order to formulate the "data" Zij, whereas the αi, βj (and hence the μij) are estimands of the theorem. However, a solution can be obtained by an iterative procedure.
Let a superscript (r) denote the r-th iteration of the estimate to which it is attached, e.g., μij^(r). Define
\[ Z_{i j}^{(r)}=w_{i j}\left[\mu_{i j}^{(r)}\right]^{1-p} Y_{i j} \tag{4.8} \]
\[ \nu_{i j}^{(r)}=w_{i j}\left[\mu_{i j}^{(r)}\right]^{2-p}=u_{i} v_{j}\left(\alpha_{i}^{(r)} \beta_{j}^{(r)}\right)^{2-p}=a_{i}^{(r)} b_{j}^{(r)} . \tag{4.9} \]
Then define ai^(r+1), bj^(r+1) as the estimates obtained in place of ai, bj when the chain ladder algorithm is applied to the data triangle Z^(r) = {Zij^(r)} in place of Z. By this iterative means, obtain the sequence of estimates αi^(r), βj^(r), r = 0, 1, 2, . . . , initiated at r = 0 by some simple choice, such as setting αi^(0), βj^(0) equal to the estimates of αi, βj given by the conventional chain ladder.
If this sequence converges, then the limit is taken as an estimate of the αi, βj.
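The iteration may be sketched as follows (Python with numpy; a sketch only, assuming multiplicative weights per (4.1) and 1 ≤ p < 2 so that 2 − p > 0, and substituting an alternating marginal-sum solver for an explicit chain ladder routine, the two being equivalent as noted above for equations (4.6) and (4.7)):

```python
import numpy as np

def _marginal_sum_fit(Z, obs, n_iter=200):
    """Solve the ODP-type marginal sum equations (4.6)-(4.7) for a_i, b_j
    by alternating updates (equivalent to the chain ladder)."""
    n = Z.shape[0]
    a, b = np.ones(n), np.ones(n) / n
    for _ in range(n_iter):
        for i in range(n):
            a[i] = Z[i, obs[i]].sum() / b[obs[i]].sum()
        for j in range(n):
            b[j] = Z[obs[:, j], j].sum() / a[obs[:, j]].sum()
    return a, b

def tweedie_cc_fit(Y, w, p, n_outer=50, tol=1e-10):
    """Iterative scheme of Section 4 (a sketch): apply marginal-sum
    estimation to the transformed triangle Z^(r) of (4.8)-(4.9),
    updating mu between passes.  Assumes weights w_ij = u_i v_j and
    1 <= p < 2.  Returns the fitted surface mu_ij = alpha_i beta_j."""
    obs = ~np.isnan(Y)
    Y0 = np.nan_to_num(Y)
    # r = 0: conventional (ODP / chain ladder) marginal sums on Y itself
    a, b = _marginal_sum_fit(Y0, obs)
    mu = np.outer(a, b)
    for _ in range(n_outer):
        Z = w * mu ** (1 - p) * Y0                 # (4.8)
        a, b = _marginal_sum_fit(Z, obs)
        nu = np.outer(a, b)                        # (4.9)
        mu_new = (nu / w) ** (1.0 / (2 - p))       # invert (4.3)-(4.5)
        if np.max(np.abs(mu_new[obs] - mu[obs])) < tol * np.max(mu_new[obs]):
            return mu_new
        mu = mu_new
    return mu
```

On an exactly multiplicative triangle, each transformed triangle Z^(r) is itself multiplicative, and the scheme converges immediately to the underlying cell means for any admissible p.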
This procedure has been applied to the data set in the Appendix with p = 2, and convergence of the estimated loss reserve to an accuracy of 0.05% was obtained in 5 iterations. Convergence becomes slower as p increases. For p = 2.4, 24 iterations were required to achieve an accuracy of 0.1%.
5. The “separation method”
Taylor (1977) introduced the procedure that subsequently became known as the “separation method.” This produces parameter estimates for a model of the form
\[ \mathrm{E}\left[Y_{i j}\right]=\alpha_{i+j-1} \beta_{j}, \tag{5.1} \]
which is the parallel of (2.4), but with the α parameter applying to diagonal i + j − 1 rather than row i.
The heuristic equations given by Taylor for parameter estimation were:
\[ \sum^{D(k)}\left[y_{i j}-\mu_{i j}\right]=0, \quad k=1, \ldots, n \tag{5.2} \]
\[ \sum^{C(j)}\left[y_{i j}-\mu_{i j}\right]=0, \quad j=1, \ldots, n . \tag{5.3} \]
It is evident that these equations yield marginal sum estimates. Taylor (1977) gives the explicit algorithm for generating estimates of the αk, βj. This will be referred to as separation method estimation, and is as follows:
\[ \alpha_{k}=\sum^{D(k)} Y_{i j} /\left[1-\sum_{j=k+1}^{n} \beta_{j}\right] \tag{5.4} \]
\[ \beta_{j}=\sum^{C(j)} Y_{i j} / \sum_{k=j}^{n} \alpha_{k}, \tag{5.5} \]
these equations being applied alternately for k = n, j = n, k = n − 1, etc.
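The alternating recursion (5.4)-(5.5) may be sketched directly (Python with numpy; the 0-based NaN-padded triangle layout and the function name are illustrative assumptions; the normalization Σβj = 1 is implicit in the denominator of (5.4)):

```python
import numpy as np

def separation_method(Y):
    """Separation method algorithm (5.4)-(5.5): alternately estimate
    diagonal effects alpha_k and development proportions beta_j,
    starting from the leading diagonal k = n."""
    n = Y.shape[0]
    alpha = np.zeros(n)
    beta = np.zeros(n)
    tail = 0.0                      # running sum beta_{k+1} + ... + beta_n
    for k in range(n, 0, -1):       # 1-based diagonal / column index
        d = k - 1                   # 0-based anti-diagonal i + j = d
        diag_sum = sum(Y[i, d - i] for i in range(d + 1))
        alpha[k - 1] = diag_sum / (1.0 - tail)          # (5.4)
        col_sum = np.nansum(Y[:, k - 1])                # column k
        beta[k - 1] = col_sum / alpha[k - 1:].sum()     # (5.5)
        tail += beta[k - 1]
    return alpha, beta
```

Applied to a triangle whose cells satisfy (5.1) exactly, the recursion recovers the diagonal and development effects.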
The model resulting from replacement of (2.4) by (5.1) in the Tweedie cross-classified model will be referred to as the Tweedie separation model. It is the same as the Tweedie cross-classified model except for the interchange of rows and diagonals, and so a result parallel to each of those of Sections 3 and 4 is obtainable.
Lemma 5.1 The ML equations with respect to the αk, βj for the Tweedie separation model are:
\[ \sum^{D(k)} w_{i j} \mu_{i j}^{1-p}\left[y_{i j}-\mu_{i j}\right]=0, \quad k=1, \ldots, n \tag{5.6} \]
\[ \sum^{C(j)} w_{i j} \mu_{i j}^{1-p}\left[y_{i j}-\mu_{i j}\right]=0, \quad j=1, \ldots, n . \tag{5.7} \]
Corollary 5.2 The case of ODP is represented by p = 1 with wij = 1. The equations are then
\[ \sum^{D(k)}\left[y_{i j}-\mu_{i j}\right]=0, \quad k=1, \ldots, n \tag{5.8} \]
\[ \sum^{C(j)}\left[y_{i j}-\mu_{i j}\right]=0, \quad j=1, \ldots, n . \tag{5.9} \]
These imply the separation method estimation of the αk, βj set out in (5.4) and (5.5).
Remark 5.3 This result has been known for the simple Poisson case since Verbeek (1972), in fact earlier than the corresponding result for the chain ladder (Corollary 3.2).
Corollary 5.4 The case of gamma Yij is represented by p = 2. The ML equations are then
\[ \sum^{D(k)} w_{i j}\left[y_{i j} / \mu_{i j}-1\right]=0, \quad k=1, \ldots, n \tag{5.10} \]
\[ \sum^{C(j)} w_{i j}\left[y_{i j} / \mu_{i j}-1\right]=0, \quad j=1, \ldots, n . \tag{5.11} \]
Remark 5.5 In the case of the general Tweedie separation model, the separation method algorithm (5.4) and (5.5) will approximate the ML solution (5.6) and (5.7) if either or both of the following conditions hold:
- Weights vary little over the triangle;
- The variances of observations around values fitted by (5.6) and (5.7) are small.
Lemma 5.6 Consider the Tweedie separation model with general (admissible) p and subject to (3.1) with constraint
\[ w_{i+j-1, j}=u_{i+j-1} v_{j} . \tag{5.12} \]
Define Zij by (4.2), and also define
\[ \begin{aligned} \nu_{i+j-1, j} & =w_{i+j-1, j} \mu_{i+j-1, j}^{2-p} \\ & =u_{i+j-1} v_{j}\left(\alpha_{i+j-1} \beta_{j}\right)^{2-p}=a_{i+j-1} b_{j} \end{aligned} \tag{5.13} \]
where
\[ a_{k}=u_{k} \alpha_{k}^{2-p} \tag{5.14} \]
\[ b_{j}=v_{j} \beta_{j}^{2-p} . \tag{5.15} \]
ML estimates of ak, bj (and hence of αk, βj) are obtained by application of the separation method algorithm (5.4) and (5.5) to the data triangle Z = {Zij}.
6. Conclusion
As noted in the statement of purpose at the end of Section 1, the purpose of this paper is largely expository. In operational terms, however, Section 4 provides a numerical procedure for obtaining parameter estimates for a Tweedie cross-classified model for known p.
This procedure will often be numerically efficient. A parallel numerical procedure produces parameter estimates for the Tweedie separation model.
A referee suggested that ML estimation might be carried out with respect to p as well as parameters αi, βj. This would extend ML estimation to the case of the Tweedie cross-classified model for unknown p.
The procedure in this case would consist of:
- Application of a univariate numerical search procedure to maximize the likelihood (3.2) with respect to p; where
- for each trial value of p in this search, the parameter set {αi, βj} is fixed as ML for that p.
Acknowledgment
Thanks are due to Hugh Miller, who provided the numerical detail reported in Section 4. Helpful comments were also provided by referees.