Portfolio Claims Reserving with Univariate and Multivariate Generalized Link Ratios

Luis Portugal

1. Introduction

In this paper, we extend our previous article on univariate and multivariate claims reserving with generalized link ratios, Portugal, Pantelous, and Verrall (2021), to the case where we want to estimate reserves with several triangles at the same time. We call it portfolio claims reserving. In the literature (see, for example, Merz and Wüthrich 2007), this is called a multivariate approach because it considers correlations between the triangles. Correlations between regressions appear in the data for several reasons, such as an increase or decrease of claims in some development years, or an increase or decrease in the speed of payment of claims in certain development years.

We extend our previous article because it considered correlations inside triangles but not correlations between lines of business. That is, it assumed that claims triangles are independent of each other, which is how insurers typically use these triangles in practice. However, there are situations where correlations exist between triangles.

For example, if comprehensive policies’ claim payments increase or decrease, third-party liability claims may move in the same direction because they share a common driver: the country’s economic activity, for instance, may raise or lower claims frequency. Another example comes from claims development: if we increase the speed of payment of claims in one line of business, that may also affect other lines of business, because the same claims department may manage both.

Correlations measure all these effects and are important in reserving because they extend our knowledge of total reserves. It is worth mentioning that IFRS 17, the international financial reporting standard for insurance contracts that became effective on 1 January 2023 (see IASB 2017), incentivizes insurers to consider correlations because it demands an economic balance sheet that considers all risks and diversification effects. As we will see in the numerical examples provided, considering correlations may reduce the overall level of reserves.

In our case, we are going to have correlations between each triangle regression and between the triangles estimated at the same time. This means that we may have correlations inside one triangle (Portugal, Pantelous, and Verrall 2021), giving a multivariate method, and when we do that with several triangles at the same time, we have multivariate methods with portfolio data. This is equivalent to what happens in econometrics, where they call the latter panel data; see, for example, Fomby, Johnson, and Hill (1984). Panel data always have correlations between different sets of data, as happens with insurers’ triangles, which, by definition, must have correlations when estimated together. This means that we save the multivariate feature for the existence of correlations inside each triangle.

Multivariate papers in the literature (see Section 2) address this problem but with two limitations:

only considering the chain-ladder method or the additive method, and for all the triangles, increasing the possibility of model error;
not considering correlations inside each triangle, which may be important in practice due to the existence of correlations in triangle data.

Finally, considering all lines of business at the same time will allow us to have a single prediction error for the whole portfolio instead of one prediction error per line of business. This is important because, when we work with more than one triangle at the same time, we apply the same method to all triangles. In this paper we do that, but we do not restrict ourselves to one method. We will use the method from the link ratios family that minimizes the portfolio prediction error.

This study contributes to the reserving literature in the following three distinctive ways. Firstly, we develop the foundations of the univariate and multivariate generalized link ratios methods in the context of portfolio data with more than one triangle. Current literature usually restricts the approach to chain-ladder and does not consider correlations inside each triangle. Secondly, an analytical formula is presented for the prediction error, which is general to any generalized link ratio method and not restricted to chain-ladder. Finally, to demonstrate clearly the advantages of our approach, this paper contains an empirical investigation using real data. In this context, a comparison with Zhang (2010) is also presented.

The remainder of the paper is organized as follows. In Section 2, a brief overview of other trends in the existing literature on claims reserving with more than one triangle is presented. Section 3 presents the portfolio generalized link ratios framework, which is developed in Section 4 for the univariate portfolio model and in Section 5 for the multivariate case. In Section 6, numerical examples are provided for both methods and compared with the Zhang (2010) method. Finally, Section 7 contains the main conclusions.

2. Models in the literature with more than one triangle

The literature considers several models where authors calculate claims reserves using several triangles at the same time, known as multivariate, due to correlations between triangles. For example, Holmberg (1994), Halliwell (1997), Brehm (2002), Kirschner, Kerley, and Isaacs (2002), Quarg and Mack (2004), Merz and Wüthrich (2007) and Taylor and McGuire (2007) include cases where the development of one triangle might depend upon past information from other triangles. Braun (2004), Kremer (2005), Prohl and Schmidt (2005), Hess, Schmidt, and Zocher (2006), Schmidt (2006), and Merz and Wüthrich (2007, 2008) consider joint development with contemporaneous correlations among triangles. Zhang (2010) proposes a general multivariate chain-ladder method with contemporaneous correlations and structural connections among the triangles. Merz, Wüthrich, and Hashorva (2012) present a multivariate approach using the lognormal distribution and also give a closed formula for claims uncertainty. As an alternative to correlations, some copula-based methodologies have been proposed (Shi and Frees 2011; Shi 2014).

Recently, Avanzi et al. (2020) have developed multivariate models with several triangles in the context of generalized linear models, and Winarta, Novita, and Nurrohmah (2021) extend multivariate models to credibility theory.

However, these papers do not include correlations within each triangle, restrict their use to one method (chain-ladder, additive, or some parametric approaches with probability distributions), and usually do not present an analytical formula for the prediction error. Two examples where prediction errors are presented are Merz and Wüthrich (2007, 2008).

The present paper aims to overcome all these issues in estimating reserves for portfolio data (which means considering the data’s correlations) but using generalized link ratios within a non-parametric context and presenting a general formula for prediction error, which may be applied to any generalized link ratio method, when considering correlations inside each triangle or not.

3. Portfolio generalized link ratios formulation

We extend here, to several triangles, the framework presented in Portugal, Pantelous, and Verrall (2021) for one triangle. Now we have data from $t = 1,\ldots,N$ triangles. For each of these triangles, we have $k = 1,\ldots,T - 1$ equations (regressions) and for each of these equations we have $T - k$ observations (years of origin).

Considering now $t = 1,\ldots,N$ triangles with $T - 1$ equations for each triangle, the estimations for the $N$ triangles will be provided simultaneously, with the $y_{t,i,j}$ explained by the adjacent triangle column $x_{t,i,j - 1}$ . This means that the claims payments in column $j$ from triangle $t$ , $y_{t,i,j}$ , are a function (a regression through the origin) of the claims payments in column $j - 1$ from the same triangle $t$ , $x_{t,i,j - 1}$ . Both variables represent cumulative payments, but $y_{t,i,j}$ is a random variable and $x_{t,i,j - 1}$ is a non-random variable.

We define $\beta_{t,j}$ as the slope (loss development factor) of equation $j$ from triangle $t$ . Also, each $\varepsilon_{t,i,j}$ is the error from year of origin $i$ , development year $j$ , and triangle $t$ . For $t = 1,\ldots,N$ , $i = 1,\ldots,T - j + 1$ , and $j = 2,\ldots,T$ , the cumulative payments dependent variable, $y_{t,i,j}$ , is given by

$y_{t,i,j} = \beta_{t,j}x_{t,i,j - 1} + \varepsilon_{t,i,j}$ .

In matrix format we will have

$\mathbf{Y = X\beta + \varepsilon}\mathbf{,}\tag{3.3}$

where $\mathbf{Y}$ is the block-vector that includes the Y from each triangle case but now for $N$ triangles. $\mathbf{Y}$ has dimensions ( $m \times N)\ \times 1$ , with generic $\mathbf{Y}_{t}$ for triangle t and $t = 1,\ldots,N$ ,

$\mathbf{Y} = \begin{bmatrix} \mathbf{Y}_{1} \\ \ldots \\ \mathbf{Y}_{N} \end{bmatrix}.$

$\mathbf{Y}_{t}\mathbf{=}$ $\begin{bmatrix} Y_{t,1} \\ \ldots \\ Y_{t,T - 1} \end{bmatrix}$ represents the dependent variables of the set of $T - 1$ equations for the triangle $t$ , where the generic equation $Y_{t,k} = \begin{bmatrix} y_{t,1,k + 1} \\ \ldots \\ y_{t,T - k,k + 1} \end{bmatrix}$ includes the random variables $y_{t,i,k + 1}$ for $t = 1,\ldots,N$ , $k = 1,\ldots,T - 1$ , and $i = 1,\ldots,T - k$ .

$\mathbf{X}$ is defined by a diagonal block matrix. $\mathbf{X}$ has dimensions $(m\ \times N) \times \left\lbrack N \times (T - 1) \right\rbrack$ and can be represented by

$\mathbf{X} = \begin{bmatrix} \mathbf{X}_{1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \mathbf{X}_{N} \end{bmatrix}.$

$\mathbf{X}_{t}\mathbf{=}$ $\begin{bmatrix} X_{t,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & X_{t,T - 1} \end{bmatrix}$ , where each generic element $X_{t,k}$ = $\begin{bmatrix} x_{t,1,k} \\ \ldots \\ x_{t,T - k,k} \end{bmatrix}$ belongs to equation $k$ and includes the non-random variables $x_{t,i,k}$ for $t = 1,\ldots,N$ , $k = 1,\ldots,T - 1$ , and $i = 1,\ldots,T - k$ .

$\mathbf{\beta}$ is defined by a block-vector that includes the previous $\beta$ from the one-triangle model, but now for $N$ triangles. $\mathbf{\beta}$ has dimensions $\left\lbrack N \times (T - 1) \right\rbrack \times 1$ and can be represented by

$\mathbf{\beta} = \begin{bmatrix} \mathbf{\beta}_{1} \\ \ldots \\ \mathbf{\beta}_{N} \end{bmatrix}.$

$\mathbf{\beta}_{t}\mathbf{=}$ $\begin{bmatrix} \beta_{t,1} \\ \ldots \\ \beta_{t,T - 1} \end{bmatrix}$ , where the generic $\beta_{t,k}$ is the non-random parameter that represents the slope (loss development factor) from triangle $t = 1,\ldots,N$ and equation $k = 1,\ldots,T - 1$ .

$\mathbf{\varepsilon}$ is the block-vector that includes the $\varepsilon$ from the one-triangle case, but now for $N$ triangles. $\mathbf{\varepsilon}$ has dimensions ( $m \times N)\ \times 1$ . The $\varepsilon$ from the one-triangle case is now $\mathbf{\varepsilon}_{t}$ and $t = 1,\ldots,N$ ,

$\mathbf{\varepsilon} = \begin{bmatrix} \mathbf{\varepsilon}_{1} \\ \ldots \\ \mathbf{\varepsilon}_{N} \end{bmatrix},$

$\mathbf{\varepsilon}_{t}\mathbf{=}$ $\begin{bmatrix} \mathbf{\varepsilon}_{t,1} \\ \ldots \\ \mathbf{\varepsilon}_{t,T - 1} \end{bmatrix}$ , where the generic $\mathbf{\varepsilon}_{t,k} = \begin{bmatrix} \varepsilon_{t,1,k + 1} \\ \ldots \\ \varepsilon_{t,T - k,k + 1} \end{bmatrix}$ includes the random variables $\mathbf{\varepsilon}_{t,i,k + 1}$ for $t = 1,\ldots,N$ , $k = 1,\ldots,T - 1$ , and $i = 1,\ldots,T - k$ .
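The stacking of $\mathbf{Y}$ and $\mathbf{X}$ described above can be sketched numerically. The following is a minimal illustration, not the author's implementation: the two $3 \times 3$ triangles are made-up numbers, the helper names `stack_triangle` and `stack_portfolio` are hypothetical, and we assume that $m = T(T - 1)/2$ is the number of observations per triangle (the sum of the $T - k$ observations over the $T - 1$ equations).

```python
import numpy as np

def stack_triangle(tri):
    """Return (Y_t, X_t) for one T x T cumulative triangle (NaN below the diagonal)."""
    T = tri.shape[0]
    m = T * (T - 1) // 2                 # assumed: observations per triangle
    Y_t = np.zeros(m)
    X_t = np.zeros((m, T - 1))           # one column per equation (block diagonal)
    row = 0
    for k in range(1, T):                # equation k links column k to column k + 1
        n_obs = T - k                    # origin years i = 1, ..., T - k
        Y_t[row:row + n_obs] = tri[:n_obs, k]            # y_{t,i,k+1}
        X_t[row:row + n_obs, k - 1] = tri[:n_obs, k - 1] # x_{t,i,k}
        row += n_obs
    return Y_t, X_t

def stack_portfolio(triangles):
    """Stack N triangles into the block vector Y and the block-diagonal matrix X."""
    parts = [stack_triangle(tri) for tri in triangles]
    N = len(parts)
    m, n_eq = parts[0][1].shape
    Y = np.concatenate([Y_t for Y_t, _ in parts])
    X = np.zeros((N * m, N * n_eq))
    for t, (_, X_t) in enumerate(parts):
        X[t * m:(t + 1) * m, t * n_eq:(t + 1) * n_eq] = X_t
    return Y, X

# Made-up cumulative payments for two lines of business (illustrative only).
tri_a = np.array([[100.0, 150.0, 165.0],
                  [110.0, 170.0, np.nan],
                  [120.0, np.nan, np.nan]])
tri_b = np.array([[200.0, 260.0, 273.0],
                  [210.0, 280.0, np.nan],
                  [190.0, np.nan, np.nan]])

Y, X = stack_portfolio([tri_a, tri_b])
print(Y.shape, X.shape)
```

With $N = 2$ and $T = 3$ the sketch produces a $\mathbf{Y}$ of length $m \times N = 6$ and an $\mathbf{X}$ with $N \times (T - 1) = 4$ columns, matching the dimensions stated in the text.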

We define the true unknown future observations of the dependent variables as

$\mathbf{Y}_{\mathbf{F}} = \mathbf{X}_{\mathbf{F}}\mathbf{\beta} + \mathbf{\varepsilon}_{\mathbf{F}}$

where $\mathbf{X}_{\mathbf{F}}$ and $\mathbf{\varepsilon}_{\mathbf{F}}$ are, respectively, the future values of $\mathbf{X}$ and the future errors. $\mathbf{Y}_{\mathbf{F}}$ is a block-vector with size $(N \times m)\ \times \ 1$ , given by

$\mathbf{Y}_{\mathbf{F}} = \begin{bmatrix} {\mathbf{Y}^{\mathbf{F}}}_{1} \\ \ldots \\ {\mathbf{Y}^{\mathbf{F}}}_{N} \end{bmatrix},$

with each element $\mathbf{Y}^{\mathbf{F}}_{t} = \begin{bmatrix} {\mathbf{Y}^{F}}_{t,1} \\ \ldots \\ {\mathbf{Y}^{F}}_{t,T - 1} \end{bmatrix}$ and the generic ${\mathbf{Y}^{F}}_{t,k} = \begin{bmatrix} {y^{F}}_{t,T - k + 1,k + 1} \\ \ldots \\ {y^{F}}_{t,T,k + 1} \end{bmatrix}$ for $t = 1,\ldots,N$ and $k = 1,\ldots,T - 1$ .

$\mathbf{X}_{\mathbf{F}}$ is given by the current diagonal of payments from each triangle and by the estimated payments of the lower triangle from each triangle. It is a block matrix given by

$\mathbf{X}_{\mathbf{F}} = \begin{bmatrix} {\mathbf{X}^{F}}_{1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {\mathbf{X}^{F}}_{N} \end{bmatrix}$

where each element ${\mathbf{X}^{F}}_{t} = \begin{bmatrix} {X^{F}}_{t,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {X^{F}}_{t,T - 1} \end{bmatrix}$ and ${X^{F}}_{t,k}$ = $\begin{bmatrix} x_{t,T - k + 1,k} \\ \ldots \\ x_{t,T,k} \end{bmatrix}$ for $t = 1,\ldots,N$ and $k = 1,\ldots,T - 1$ .

$\mathbf{\varepsilon}_{\mathbf{F}}$ is a block-vector with size $(N \times m)\ \times \ 1$ given by $\mathbf{\varepsilon}_{\mathbf{F}} = \begin{bmatrix} {\mathbf{\varepsilon}_{\mathbf{F}}}_{1} \\ \ldots \\ {\mathbf{\varepsilon}_{\mathbf{F}}}_{N} \end{bmatrix}$ ,

with each element ${\mathbf{\varepsilon}_{\mathbf{F}}}_{t} = \begin{bmatrix} {\mathbf{\varepsilon}^{F}}_{t,1} \\ \ldots \\ {\mathbf{\varepsilon}^{F}}_{t,T - 1} \end{bmatrix}$ and the generic ${\mathbf{\varepsilon}^{F}}_{t,k} = \begin{bmatrix} {\varepsilon^{F}}_{t,T - k + 1,k + 1} \\ \ldots \\ {\varepsilon^{F}}_{t,T,k + 1} \end{bmatrix}$ for $t = 1,\ldots,N$ and $k = 1,\ldots,T - 1$ .

The estimated values of the dependent variables are obtained from $\widehat{\mathbf{Y}_{\mathbf{F}}}\mathbf{=}\mathbf{X}_{\mathbf{F}}\widehat{\mathbf{\beta}}$ .
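Since $\mathbf{X}_{\mathbf{F}}$ contains both the current diagonal and previously estimated payments, the forecast $\widehat{\mathbf{Y}_{\mathbf{F}}} = \mathbf{X}_{\mathbf{F}}\widehat{\mathbf{\beta}}$ can be evaluated column by column. The sketch below illustrates this recursion for one triangle; the triangle and the development factors in `betas` are made-up values, and the function name `complete_triangle` is hypothetical.

```python
import numpy as np

def complete_triangle(tri, betas):
    """Fill the lower triangle recursively: each missing cell is beta_k times the cell to its left."""
    T = tri.shape[0]
    full = tri.copy()
    for j in range(1, T):                # 0-based target column j (equation k = j)
        for i in range(T - j, T):        # origin years whose column j is not yet observed
            # The regressor is the latest diagonal for the first missing cell,
            # and an earlier estimate for the cells further right.
            full[i, j] = betas[j - 1] * full[i, j - 1]
    return full

# Made-up cumulative triangle and hypothetical loss development factors.
tri = np.array([[100.0, 150.0, 165.0],
                [110.0, 170.0, np.nan],
                [120.0, np.nan, np.nan]])
betas = np.array([1.5, 1.1])

full = complete_triangle(tri, betas)
T = tri.shape[0]
latest_diagonal = np.array([tri[i, T - 1 - i] for i in range(T)])
reserve = full[:, -1].sum() - latest_diagonal.sum()
print(full)
print(reserve)
```

The reserve is computed here as estimated ultimate payments minus the latest observed diagonal, which is the usual convention for cumulative triangles.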

4. Portfolio univariate generalized link ratios

4.1. Assumptions

Having defined the framework of these methods, we present in this section the Portfolio Univariate Generalized Link Ratios (PUGLR) assumptions with the following proposition.

Proposition 4.1 Considering equation (3.3) we assume for PUGLR

$\mathbb{E}\left( \mathbf{\varepsilon}|\mathbf{X} \right)\mathbb{= E}\left( \mathbf{\varepsilon} \right) = \mathbf{0}\tag{4.1.1}$

$\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right) = \mathbf{\sigma}^{2}\mathbf{W} = \mathbf{\Psi}\tag{4.1.2}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right) = \mathbf{\sigma}^{\mathbf{2}}\mathbf{W}_{\mathbf{F}}\mathbf{=}\mathbf{\Psi}_{\mathbf{F}}\tag{4.1.3}$

where $\mathbf{0}$ is a vector of zeros of size $(N \times m)\ \times \ 1$ , and W is an $(N \times m) \times (N \times m)$ diagonal weighting matrix, which depends on the parameter $\alpha$ in each non-zero cell. This parameter is related to the heteroscedasticity level in the triangles and will be crucial to identifying which link ratio method minimizes the prediction error; for more details, see Portugal, Pantelous, and Verrall (2021).

W is given by equation (4.1.4), where the $diag$ operator transforms a vector into a diagonal matrix. W’s diagonal elements are given by the elements of the transformed vector:

$\begin{aligned}\mathbf{W} &= diag\left( \mathbf{X}^{\alpha}\ \right) \\&= \begin{bmatrix} diag\left( {\mathbf{X}_{1}}^{\alpha} \right) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & diag\left( {\mathbf{X}_{N}}^{\alpha} \right) \end{bmatrix} \\&= \begin{bmatrix} {x_{1,1,1}}^{\alpha} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{N,T - 1,T - 1}}^{\alpha} \end{bmatrix}.\end{aligned}\tag{4.1.4}$

The matrix $\mathbf{W}_{F}$ is the future $\mathbf{W}$ and has the same structure as $\mathbf{W}$ . However, its elements are the ${\mathbf{X}_{\mathbf{F}}}^{\alpha}$ instead of the $\mathbf{X}^{\alpha}$ . $\mathbf{W}$ corresponds to a specific structure of heteroscedasticity through the choice of parameter α, and $\mathbf{W}_{\mathbf{F}}$ has the same structure as $\mathbf{W}$ but is based on the predicted payments.

$\begin{aligned}\mathbf{W}_{\mathbf{F}} &= diag\left( {\mathbf{X}_{\mathbf{F}}}^{\alpha} \right) \\&= \begin{bmatrix} diag\left( {{\mathbf{X}_{\mathbf{F}}}_{1}}^{\alpha} \right) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & diag\left( {{\mathbf{X}_{\mathbf{F}}}_{N}}^{\alpha} \right) \end{bmatrix} \\&= \begin{bmatrix} {x_{1,T,1}}^{\alpha} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{N,T,T - 1}}^{\alpha} \end{bmatrix}.\end{aligned}\tag{4.1.5}$

$\mathbf{\sigma}^{2}$ is a block-diagonal matrix with $N$ blocks and, when expanded, of size $(N \times m) \times (N \times m)$ :

$\mathbf{\sigma}^{\mathbf{2}} = \begin{bmatrix} {\mathbf{\sigma}^{\mathbf{2}}}_{\mathbf{1,1}} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {\mathbf{\sigma}^{\mathbf{2}}}_{\mathbf{N,N}} \end{bmatrix}\tag{4.1.6}$

where each block ${\mathbf{\sigma}^{\mathbf{2}}}_{\mathbf{t,t}} = diag\ \begin{bmatrix} {\sigma^{2}}_{t,1} \\ \ldots \\ {\sigma^{2}}_{t,T - 1} \end{bmatrix}$ , with generic element ${\sigma^{2}}_{t,k}$ for $t = 1,\ldots,N$ and $k = 1,\ldots,T - 1$ .

From equations (4.1.4) and (4.1.5), we can see that the method will be homoscedastic in each triangle when $\alpha = 0.$ Otherwise, it will be heteroscedastic.

4.2. Parameter estimation

The following two lemmas allow us to have estimators for $\mathbf{\beta}$ and $\mathbf{\sigma}^{\mathbf{2}}$ .

Lemma 4.2.1 Following Fomby, Johnson, and Hill (1984), we can obtain the estimation of $\mathbf{\beta}$ , the loss development factors vector of all the equations from all the triangles. $\widehat{\mathbf{\beta}}$ is obtained using the Aitken generalized least squares method with $\mathbf{\Psi}$ as the weights matrix and is the best linear unbiased estimator of $\mathbf{\beta}$ ,

$\widehat{\mathbf{\beta}} = \left( \mathbf{X'}\mathbf{\Psi}^{\mathbf{- 1}}\mathbf{X} \right)^{\mathbf{- 1}}\mathbf{X'}\mathbf{\Psi}^{\mathbf{- 1}}\mathbf{Y}.\tag{4.2.1}$

The parameter α from equations (4.1.4) and (4.1.5) will be estimated as the value that minimizes the prediction error. This α parameter is a method choice parameter and, as in Portugal, Pantelous, and Verrall (2021), we select the model with the lowest prediction error.
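The generalized least squares estimator can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's code: the data are made up, the function name `gls_beta` is hypothetical, and $\mathbf{W}$ is used in place of $\mathbf{\Psi}$ because, with a diagonal $\mathbf{\Psi}$ and a block-diagonal $\mathbf{X}$, the per-equation variance ${\sigma^{2}}_{t,k}$ is constant within each equation and cancels in the estimator.

```python
import numpy as np

def gls_beta(X, Y, alpha):
    """GLS loss development factors with diagonal weights W = diag(x ** alpha)."""
    x = X.sum(axis=1)            # each row of X holds exactly one non-zero regressor
    w_inv = x ** (-alpha)        # diagonal of W^{-1}
    XtWi = X.T * w_inv           # X' W^{-1}, scaling column j of X' by w_inv[j]
    return np.linalg.solve(XtWi @ X, XtWi @ Y)

# Made-up data: two equations from one 3 x 3 triangle (illustrative only).
X = np.array([[100.0, 0.0],
              [110.0, 0.0],
              [0.0, 150.0]])
Y = np.array([150.0, 170.0, 165.0])

beta_vp = gls_beta(X, Y, alpha=0.0)   # vector projection weights
beta_cl = gls_beta(X, Y, alpha=1.0)   # chain-ladder (volume-weighted) weights
beta_sa = gls_beta(X, Y, alpha=2.0)   # simple average of link ratios
print(beta_vp, beta_cl, beta_sa)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical design choice; the result is the same estimator as in (4.2.1).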

Lemma 4.2.2 Following Srivastava and Giles (1987), we can estimate ${{\widehat{\sigma}}^{2}}_{t,k}$ as the sum of squared errors of equation $k$ from triangle $t$ , ${SSR}_{t,k},$ divided by this equation’s degrees of freedom, that is, the number of observations $T_{t,k}$ for this equation minus the number of parameters in the equation (in this case, one):

${{\widehat{\sigma}}^{2}}_{t,k} = \frac{{SSR}_{t,k}}{T_{t,k} - 1}\tag{4.2.2}$
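A minimal sketch of this estimator for a single equation follows. The text does not state whether the residuals entering ${SSR}_{t,k}$ are weighted by $x^{\alpha}$, so the `alpha` argument below is an assumption: with `alpha=0` the function returns the plain sum of squared residuals over the degrees of freedom. The function name `sigma2_hat` and the data are hypothetical. Note that the last equation of a triangle has a single observation, so its degrees of freedom are zero and the sketch returns NaN in that case.

```python
import numpy as np

def sigma2_hat(x, y, beta, alpha=0.0):
    """SSR / (n_obs - 1) for one equation; residuals optionally scaled by x ** alpha (assumption)."""
    n_obs = x.size
    if n_obs < 2:
        return float("nan")      # zero degrees of freedom: not estimable from this equation
    resid = y - beta * x
    ssr = np.sum(resid ** 2 / x ** alpha)
    return float(ssr) / (n_obs - 1)

# Made-up observations for one equation and a chain-ladder factor.
x = np.array([100.0, 110.0])
y = np.array([150.0, 170.0])
beta = y.sum() / x.sum()

s2 = sigma2_hat(x, y, beta)
s2_single = sigma2_hat(np.array([150.0]), np.array([165.0]), 1.1)
print(s2, s2_single)
```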

4.3. Prediction error

We also need an expression for the prediction error, which is given by the following theorem.

Theorem 4.3.1 The prediction error (i.e., the root of the mean square error) is the root of the expected value of the product of $\widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}}$ and its transpose, $\mathbb{E}\left\lbrack \left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)^{\prime} \right\rbrack.$ The mean square error of prediction (MSEP) is then obtained from the following expression:

$\mathbb{E}\left\lbrack \mathbf{X}_{\mathbf{F}}\left( \widehat{\mathbf{\beta}} - \mathbf{\beta} \right)\left( \widehat{\mathbf{\beta}} - \mathbf{\beta} \right)'\mathbf{X'}_{\mathbf{F}} \right\rbrack + \mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right).\tag{4.3.1}$

The estimation variance is given by $\mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}\mathbf{\Psi}^{- 1}\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right)\mathbf{\Psi}^{- 1}\mathbf{X}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}_{\mathbf{F}}$

and the process variance comes from $\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right)$ .

Altogether, this means that the MSEP will be obtained from

$\scriptsize\mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}\mathbf{\Psi}^{- 1}\mathbb{E}\left( \mathbf{\varepsilon}\mathbf{\varepsilon}' \right)\mathbf{\Psi}^{- 1}\mathbf{X}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}{\mathbf{X'}}_{\mathbf{F}} + \mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right).\tag{4.3.2}$

Proof. This can be obtained by following the same steps as are presented in Portugal, Pantelous, and Verrall (2021) for the single-triangle case.

Proposition 4.3.1 Following expression (4.3.2), and assumptions (4.1.2) and (4.1.3), the MSEP is obtained from

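$\small\begin{aligned}&\mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}{\mathbf{X'}}_{\mathbf{F}} + \mathbf{\Psi}_{\mathbf{F}} \\&= \mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}{\mathbf{X'}}_{\mathbf{F}} + \mathbf{\Psi}_{\mathbf{F}}.\end{aligned}\tag{4.3.3}$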
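As a numerical sanity check, the sketch below verifies the standard generalized least squares result that, when $\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right) = \mathbf{\Psi}$, the sandwich form of the estimation variance collapses to $\mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}\mathbf{\Psi}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}_{\mathbf{F}}$. All numbers, including the variance parameters and the future regressors, are made up for illustration.

```python
import numpy as np

alpha = 1.0

# Made-up past regressors: two equations from one 3 x 3 triangle.
X = np.array([[100.0, 0.0],
              [110.0, 0.0],
              [0.0, 150.0]])
# Hypothetical future regressors: latest diagonal and one estimated payment.
X_F = np.array([[120.0, 0.0],
                [0.0, 170.0],
                [0.0, 180.0]])

# Hypothetical variance parameters, repeated per observation of each equation.
sigma2 = np.array([4.0, 4.0, 2.0])
sigma2_F = np.array([4.0, 2.0, 2.0])

Psi = np.diag(sigma2 * X.sum(axis=1) ** alpha)        # sigma^2 W
Psi_F = np.diag(sigma2_F * X_F.sum(axis=1) ** alpha)  # sigma^2 W_F

Psi_inv = np.linalg.inv(Psi)
A_inv = np.linalg.inv(X.T @ Psi_inv @ X)

# Sandwich form with E(ee') = Psi, followed by the process variance.
sandwich = X_F @ A_inv @ X.T @ Psi_inv @ Psi @ Psi_inv @ X @ A_inv @ X_F.T + Psi_F
# Simplified form.
simplified = X_F @ A_inv @ X_F.T + Psi_F

print(np.allclose(sandwich, simplified))
cell_prediction_errors = np.sqrt(np.diag(simplified))
print(cell_prediction_errors)
```

The square roots of the diagonal give a prediction error per future cell; how cells are aggregated into a portfolio total is not covered by this sketch.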

As with univariate generalized link ratios and multivariate generalized link ratios, see Portugal, Pantelous, and Verrall (2021), and following the results of Theorem 4.3.1, we need to know which weighting matrices we are using, namely $\mathbf{\Psi}$ and $\mathbf{\Psi}_{\mathbf{F}}$ . For the latter, we need the parameter α and we can obtain it by searching for the α that minimizes the prediction error presented in expression (4.3.3).

The parameter α also corresponds to a specific structure of heteroscedasticity. If α is zero, we will get homoscedastic errors inside each triangle. This means that the way $\mathbf{\Psi}$ is defined will provide us with several claims reserving methods for estimating the loss development factors.

Analytically, we get several portfolio data methods: vector projection (VP; see Portugal, Pantelous, and Assa 2017) for $\alpha = 0$ , chain-ladder (CL) for $\alpha = 1$ , simple average (SA) for $\alpha = 2$ , and other methods for different values of $\alpha$ . To obtain them, we just need to change $\alpha$ to get a different $\mathbf{\Psi}$ matrix. For the VP, we will have homoscedastic errors; for the CL and the SA, we will have heteroscedastic errors.
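To see why these values of $\alpha$ recover the familiar methods, note that $\mathbf{X}$ is block diagonal with one regressor per equation and $\mathbf{\Psi}$ is diagonal, so expression (4.2.1) decouples by equation. The following worked derivation is an illustration under these assumptions, not a formula quoted from the source:

$\begin{aligned}{\widehat{\beta}}_{t,k} &= \frac{\sum_{i = 1}^{T - k}{x_{t,i,k}}^{1 - \alpha}\, y_{t,i,k + 1}}{\sum_{i = 1}^{T - k}{x_{t,i,k}}^{2 - \alpha}}, \\ \alpha = 0: \quad {\widehat{\beta}}_{t,k} &= \frac{\sum_{i}x_{t,i,k}\, y_{t,i,k + 1}}{\sum_{i}{x_{t,i,k}}^{2}} \quad \text{(VP)}, \\ \alpha = 1: \quad {\widehat{\beta}}_{t,k} &= \frac{\sum_{i}y_{t,i,k + 1}}{\sum_{i}x_{t,i,k}} \quad \text{(CL)}, \\ \alpha = 2: \quad {\widehat{\beta}}_{t,k} &= \frac{1}{T - k}\sum_{i}\frac{y_{t,i,k + 1}}{x_{t,i,k}} \quad \text{(SA)}.\end{aligned}$

The variance parameter ${\sigma^{2}}_{t,k}$ does not appear because it is constant within each equation and cancels between numerator and denominator.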

Thus, the main advantage of this approach is that we choose the $\alpha$ that minimizes the prediction error for $N$ triangles at the same time. With $\alpha$ different from 0, 1, or 2, we would get other methods: the optimal choice for the weights of the link ratios is obtained as the prediction error is minimized.

All the link ratios methods considered here (see the next section for special cases) depend on $\alpha$ , which represents the level of heteroscedasticity, and we want to choose the $\alpha$ that minimizes the prediction error. The lower the prediction error, the better the error analysis and back-testing results; see Portugal, Pantelous, and Assa (2017).
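The search for $\alpha$ can be illustrated with a deliberately reduced example: one equation, one future observation, and a grid of candidate values. This is a sketch under several assumptions that are not taken from the source: the data are made up, the variance parameter is estimated from residuals scaled by $x^{\alpha}$, the grid on $[0, 2]$ is arbitrary, and the function name `one_step_prediction_error` is hypothetical. A portfolio version would apply the same idea to the full MSEP matrix.

```python
import numpy as np

def one_step_prediction_error(x, y, x_future, alpha):
    """Root MSEP of the forecast x_future * beta_hat for a single equation."""
    w = x ** alpha                                    # diagonal of W
    xtwx = np.sum(x * x / w)                          # X' W^{-1} X (a scalar here)
    beta = np.sum(x * y / w) / xtwx                   # GLS slope
    resid = y - beta * x
    sigma2 = np.sum(resid ** 2 / w) / (x.size - 1)    # assumed: weighted residuals
    estimation_var = x_future ** 2 * sigma2 / xtwx    # x_F Var(beta_hat) x_F'
    process_var = sigma2 * x_future ** alpha          # sigma^2 W_F
    return float(np.sqrt(estimation_var + process_var))

# Made-up observations for one development step and one future regressor.
x = np.array([100.0, 110.0, 120.0, 95.0])
y = np.array([150.0, 170.0, 175.0, 148.0])
x_future = 105.0

grid = np.linspace(0.0, 2.0, 21)                      # candidate alphas, step 0.1
errors = np.array([one_step_prediction_error(x, y, x_future, a) for a in grid])
best_alpha = float(grid[np.argmin(errors)])
print(best_alpha, errors.min())
```

A grid search is used here only for transparency; any one-dimensional minimizer could replace it.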

4.4. Special cases

Special cases of the method are considered with the next three corollaries. Obviously, the proofs of these corollaries are linked to Theorem 4.3.1 and Proposition 4.3.1. Thus, they are omitted.

Corollary 4.4.1 If $\alpha = 0,$ the triangle’s variances are homoscedastic and, looking at expressions (4.1.2) and (4.1.3), we get

$\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right) = \mathbf{\sigma}^{2}\mathbf{I}_{(N \times m) \times (N \times m)}\mathbf{=}\mathbf{\Psi}_{\mathbf{VP}}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right) = \mathbf{\sigma}^{2}\mathbf{I}_{(N \times m) \times (N \times m)}\mathbf{=}\mathbf{\Psi}_{\mathbf{VP,F}}$

Here, $\mathbf{I}_{(N \times m) \times (N \times m)}$ is an identity matrix of size $(N \times m) \times (N \times m)$ . With $\alpha = 0$ , the loss development factors are the ones from the VP applied in a portfolio context, that is, the portfolio vector projection (PVP) (see expression (4.2.1)), where $\mathbf{\Psi}_{\mathbf{VP}}$ is $\mathbf{\Psi}$ with $\mathbf{W} = \mathbf{I}_{(N \times m) \times (N \times m)}$ . Then, the MSEP is obtained from

$\small\begin{aligned}\mathbb{E}&\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)^{\prime} \\= & \mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}{\mathbf{\Psi}_{\mathbf{VP}}}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}{\mathbf{\Psi}_{\mathbf{VP}}}^{- 1}\mathbf{X}\left( \mathbf{X'}{\mathbf{\Psi}_{\mathbf{VP}}}^{- 1}\mathbf{X} \right)^{- 1}{\mathbf{X'}}_{\mathbf{F}}\\&+\mathbf{\Psi}_{\mathbf{VP,F}}\end{aligned}\tag{4.4.1}$

(where $\mathbf{\Psi}_{\mathbf{VP,F}}$ is $\mathbf{\Psi}_{\mathbf{F}}$ with $\mathbf{W}_{\mathbf{F}} = \mathbf{I}_{(N \times m) \times (N \times m)}$ ).

Corollary 4.4.2 If $\alpha = 1,$ the triangle’s variances are heteroscedastic, and we get

$\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right) = \mathbf{\sigma}^{2}\mathbf{W}_{CL} = \mathbf{\Psi}_{CL}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right) = \mathbf{\sigma}^{2}\mathbf{W}_{CL,F} = \mathbf{\Psi}_{CL,F}$

with

$\mathbf{W}_{CL} = \ \begin{bmatrix} x_{1,1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x_{N,T - 1,T - 1} \end{bmatrix}$ and $\mathbf{W}_{CL,F}$ = $\begin{bmatrix} x_{1,T,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x_{N,T,T - 1} \end{bmatrix}$ .

With $\alpha = 1$ , the loss development factors are the ones from the CL applied in a portfolio context, that is, the portfolio chain-ladder (PCL) (see (4.2.1)), where $\mathbf{\Psi}_{CL}$ is $\mathbf{\Psi}$ with $\mathbf{W} = \mathbf{W}_{CL}$ . Then, the MSEP is obtained from

$\scriptsize\begin{aligned}\mathbb{E}&\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)^{\prime} \\&= \mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}{\mathbf{\Psi}_{CL}}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}{\mathbf{\Psi}_{CL}}^{- 1}\mathbf{X}\left( \mathbf{X'}{\mathbf{\Psi}_{CL}}^{- 1}\mathbf{X} \right)^{- 1}{\mathbf{X'}}_{\mathbf{F}} + \mathbf{\Psi}_{CL,F}\end{aligned}\tag{4.4.2}$

(where $\mathbf{\Psi}_{CL,F}$ is $\mathbf{\Psi}_{\mathbf{F}}$ with $\mathbf{W}_{\mathbf{F}} = \mathbf{W}_{CL,F}$ ).

Corollary 4.4.3 If $\alpha = 2,$ the triangle’s variances are heteroscedastic, and we get

$\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right) = \mathbf{\sigma}^{\mathbf{2}}\mathbf{W}_{SA} = \mathbf{\Psi}_{SA}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right) = \mathbf{\sigma}^{2}\mathbf{W}_{SA,F} = \mathbf{\Psi}_{SA,F}$

with

$\mathbf{W}_{SA} = \ \begin{bmatrix} {x_{1,1,1}}^{2} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{N,T - 1,T - 1}}^{2} \end{bmatrix}$ and $\mathbf{W}_{SA,F}$ = $\begin{bmatrix} {x_{1,T,1}}^{2} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{N,T,T - 1}}^{2} \end{bmatrix}$ .

With $\alpha = 2$ , the loss development factors are the ones from the SA applied in a portfolio context, that is, the portfolio simple average (PSA) (see (4.2.1)), where $\mathbf{\Psi}_{SA}$ is $\mathbf{\Psi}$ with $\mathbf{W} = \mathbf{W}_{SA}$ .

Then, the MSEP is obtained from

$\scriptsize\begin{aligned}\mathbb{E}&\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)\left( \widehat{\mathbf{Y}_{\mathbf{F}}} - \mathbf{Y}_{\mathbf{F}} \right)^{\prime} \\&= \mathbf{X}_{\mathbf{F}}\left( \mathbf{X'}{\mathbf{\Psi}_{SA}}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X'}{\mathbf{\Psi}_{SA}}^{- 1}\mathbf{X}\left( \mathbf{X'}{\mathbf{\Psi}_{SA}}^{- 1}\mathbf{X} \right)^{- 1}{\mathbf{X'}}_{\mathbf{F}} + \mathbf{\Psi}_{SA,F}\end{aligned}\tag{4.4.3}$

(where $\mathbf{\Psi}_{SA,F}$ is $\mathbf{\Psi}_{\mathbf{F}}$ with $\mathbf{W}_{\mathbf{F}} = \mathbf{W}_{SA,F}$ ).

5. Portfolio multivariate generalized link ratios

In this section, we develop the Section 4 method for the case where there are contemporaneous correlations between equations inside the same triangle and between triangles. The method considered here is the same as that presented in Section 4. However, we will change the assumptions, introducing a more complex structure for the errors: seemingly unrelated regression (SUR) (Srivastava and Giles 1987). Our method will become multivariate as an SUR and may also use the heteroscedastic structure from the generalized link ratios, also including VP, CL, and SA. In this portfolio multivariate generalized link ratios (PMGLR) method, we are going to maintain the entire framework presented in Section 4 but change assumptions (4.1.2) and (4.1.3).

5.1. Assumptions

We are going to assume contemporaneous correlations between the errors of the different equations and between the triangles. In doing so, we obtain a portfolio multivariate method.

$\mathbf{\Sigma}$ is a block matrix of block-size $\left\lbrack N \times (T - 1) \right\rbrack\ \times \left\lbrack N \times (T - 1) \right\rbrack$ that summarizes the variances and covariances between the $k = 1,\ldots,T - 1$ regressions in each of the $t\ = \ 1,\ldots,N$ triangles, and between each triangle, for observations in the same origin year. Expanding each block, we get a matrix with dimensions $(N \times m)\ \times (N \times m)$ ,

$\mathbf{\Sigma} = \begin{bmatrix} \mathbf{\Sigma}_{1,1,1} & \cdots & \mathbf{\Sigma}_{N,1,T - 1} \\ \vdots & \ddots & \vdots \\ \mathbf{\Sigma}_{1,T - 1,1} & \cdots & \mathbf{\Sigma}_{N,T - 1,T - 1} \end{bmatrix}\tag{5.1.1}$

The generic component of (5.1.1), $\mathbf{\Sigma}_{t,k,k}$ is given by a matrix of size $\left\lbrack N \times (T - k) \right\rbrack\ \times \left\lbrack N \times (T - k) \right\rbrack$ and by $s_{t,k,k}$ , the variance parameter from triangle $t$ and regression $k$ :

$\mathbf{\Sigma}_{t,k,k} = s_{t,k,k}\begin{bmatrix} {x_{1,1,k}}^{\alpha} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{N,T - k,k}}^{\alpha} \end{bmatrix}.\tag{5.1.2}$

The generic component of (5.1.1), $\mathbf{\Sigma}_{t,k,j}\ with\ k \neq j$ , is given by a matrix of size $\left\lbrack N \times (T - k) \right\rbrack\ \times \left\lbrack N \times (T - k) \right\rbrack$ and by $s_{t,k,j}$ , the covariance parameter for triangle $t$ between regressions $k$ and $j$ , where $k \neq j:$

$\mathbf{\Sigma}_{t,k,j} = s_{t,k,j}\mathbf{I}_{N \times (T - k)}.\tag{5.1.3}$

$\mathbf{\Sigma}^{F}$ is a block matrix of block-size $\left\lbrack N \times (T - 1) \right\rbrack\ \times \left\lbrack N \times (T - 1) \right\rbrack$ that summarizes the future variances and covariances between the $k = 1,\ldots,T - 1$ regressions. Expanding each block, we get a matrix of dimensions $(N \times m)\ \times (N \times m),$

$\mathbf{\Sigma}^{F} = \begin{bmatrix} \mathbf{\Sigma}_{1,1,1}^{F} & \cdots & \mathbf{\Sigma}_{N,1,T - 1}^{F} \\ \vdots & \ddots & \vdots \\ \mathbf{\Sigma}_{1,T - 1,1}^{F} & \cdots & \mathbf{\Sigma}_{N,T - 1,T - 1}^{F} \end{bmatrix}.\tag{5.1.4}$

The generic component of (5.1.4), $\mathbf{\Sigma}_{t,k,k}^{F},$ is given by a matrix of size $\left\lbrack N \times (T - k) \right\rbrack\ \times \left\lbrack N \times (T - k) \right\rbrack$ ,

$\mathbf{\Sigma}_{t,k,k}^{F} = s_{t,k,k}\begin{bmatrix} {x_{1,T - k + 1,k}}^{\alpha} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{N,T,k}}^{\alpha} \end{bmatrix}.\tag{5.1.5}$

The generic component of (5.1.4), $\mathbf{\Sigma}_{t,k,j}^{F}$ with $k \neq j$ , is given by a matrix of size $\left\lbrack N \times (T - k) \right\rbrack\ \times \left\lbrack N \times (T - k) \right\rbrack$ ,

${\mathbf{\Sigma}^{F}}_{t,k,j} = s_{t,k,j}\mathbf{I}_{N \times (T - k)}.\tag{5.1.6}$

Proposition 5.1 Considering (3.3) we assume, for the PMGLR method,

$\mathbb{E}\left( \mathbf{\varepsilon}|\mathbf{X} \right)\mathbb{= E}\left( \mathbf{\varepsilon} \right) = \mathbf{0}\tag{5.1.7}$

$\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right) = \mathbf{\Sigma}\tag{5.1.8}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right) = \mathbf{\Sigma}^{F}\tag{5.1.9}$

5.2. Parameter estimation

The parameter estimation can be obtained from the following Lemma 5.2.1.

Lemma 5.2.1 (Srivastava and Giles 1987) The estimator $\widehat{\mathbf{\beta}}$ of $\mathbf{\beta}$, the vector of loss development factors from all the equations, is obtained using the SUR generalized least squares for panel data with heteroscedasticity and contemporaneous correlations, and is the best linear unbiased estimator of $\mathbf{\beta}$,

$\widehat{\mathbf{\beta}} = \left( \mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X} \right)^{- 1}\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{Y}.\tag{5.2.1}$
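
A minimal sketch of the generalized least squares estimator in (5.2.1), written in plain Python and assuming a known $\mathbf{\Sigma}$. The helper names (`transpose`, `matmul`, `solve`, `gls`), the data, and the covariance value 0.5 are hypothetical. With $\mathbf{\Sigma}$ equal to the identity, the estimator reduces to OLS through the origin for each regression.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(u * v for u, v in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ z = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def gls(x, sigma, y):
    """beta_hat = (X' S^-1 X)^-1 X' S^-1 Y, with y given as a column matrix."""
    xt = transpose(x)
    sinv_x = solve(sigma, x)  # S^-1 X
    sinv_y = solve(sigma, y)  # S^-1 Y
    return solve(matmul(xt, sinv_x), matmul(xt, sinv_y))


# Hypothetical data: regression 1 has three observations, regression 2 has two.
x1, y1 = [100.0, 120.0, 90.0], [150.0, 185.0, 130.0]
x2, y2 = [150.0, 185.0], [165.0, 200.0]
X = [[v, 0.0] for v in x1] + [[0.0, v] for v in x2]
Y = [[v] for v in y1 + y2]

identity = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
beta_ols = gls(X, identity, Y)

# Assumed covariance of 0.5 between same-origin-year errors of the two regressions.
sigma = [row[:] for row in identity]
for i, j in [(0, 3), (1, 4)]:
    sigma[i][j] = sigma[j][i] = 0.5
beta_sur = gls(X, sigma, Y)
```

Solving linear systems rather than forming $\mathbf{\Sigma}^{-1}$ explicitly is a design choice of the sketch; multiplying $\mathbf{\Sigma}$ by a constant leaves the estimate unchanged.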

We also need an expression for the prediction error, which will be given in the following paragraphs. Clearly, the parameters $s_{t,k,k}$ and $s_{t,k,j}$ are not known and must be estimated. The following Lemma provides the estimators ${\widehat{s}}_{t,k,k}$ and ${\widehat{s}}_{t,k,j}$.

Lemma 5.2.2 (Srivastava and Giles 1987) Estimators for the parameters of the variance and covariance matrix, from a multivariate regression with panel data, are given by

${\widehat{s}}_{t,k,k} = \frac{1}{T - 1}{SSR}_{t,k} \quad {\widehat{s}}_{t,k,j} = \frac{1}{T}{SSR}_{t,k}\tag{5.2.2}$

For each triangle $t$, ${SSR}_{t,k}$ is the sum of the squared errors from the ordinary least squares (OLS) fit of regression $k$.
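
The first-stage calculation can be sketched as follows, assuming each regression is a line through the origin fitted by OLS. The function names, the data, and the value of `T` are hypothetical; only the variance estimator from (5.2.2) is shown.

```python
def ols_through_origin(x, y):
    """Return the OLS slope and the sum of squared errors for y = beta * x."""
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    ssr = sum((b - beta * a) ** 2 for a, b in zip(x, y))
    return beta, ssr


def variance_estimate(ssr, T):
    """Variance estimator SSR / (T - 1), following the first part of (5.2.2)."""
    return ssr / (T - 1)


# Hypothetical cumulative claims for one regression of one triangle.
beta, ssr = ols_through_origin([100.0, 120.0, 90.0], [150.0, 185.0, 130.0])
s_hat = variance_estimate(ssr, T=4)  # T = 4 is an assumed triangle dimension
```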

5.3. Prediction error

The following theorem follows from Theorem 4.3.1 and gives us a general analytical formula for obtaining the prediction error.

Theorem 5.3.1 The MSEP for the method presented in (3.3) and Proposition 5.1 is obtained from

$\scriptsize\mathbf{X}_{F}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbb{E}\left( \mathbf{\varepsilon}\mathbf{\varepsilon}' \right)\mathbf{\Sigma}^{- 1}\mathbf{X}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}\mathbf{X}_{F}' + \mathbb{E}\left( \mathbf{\varepsilon}_{F}\mathbf{\varepsilon}_{F}' \right)\tag{5.3.1}$

The proof follows directly from Theorem 4.3.1 when (5.1.7), (5.1.8), and (5.1.9) are considered.

Following on from the results of Theorem 5.3.1, the procedures are like those of the univariate method, presented in the previous section. In the PMGLR, we need to obtain the following:

The ${\widehat{s}}_{t,j,j}$ and ${\widehat{s}}_{t,l,j}$ to estimate the $\mathbf{\Sigma}$ matrix, which requires a first regression, with OLS, to get the sum of the squared errors.
The parameter α, so as to have the $\mathbf{\Sigma}$ (5.1.1) and $\mathbf{\Sigma}_{F}$ (5.1.4) matrices. Our suggestion, as in Portugal, Pantelous, and Verrall (2021), is to choose the α that minimizes the prediction error.

This will also give us the vector of the loss development factors, given by (5.2.1) and with that we will have $\mathbf{X}_{\mathbf{F}}$ . Having $\mathbf{\Sigma}$ and $\mathbf{\Sigma}_{F}$ , we then have $\mathbb{E}\left( \mathbf{\varepsilon\varepsilon}' \right)$ and $\mathbb{E}\left( \mathbf{\varepsilon}_{\mathbf{F}}\mathbf{\varepsilon}_{\mathbf{F}}' \right)$ and we can calculate the prediction error.

Proposition 5.3.1 Using (5.3.1) and the assumptions from Proposition 5.1, the MSEP is obtained from

$\scriptsize\mathbb{E}\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)^{\prime} = \mathbf{X}_{F}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}{\mathbf{X}'}_{F} + \mathbf{\Sigma}_{F}.\tag{5.3.2}$
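
As a worked step, and only under the assumptions of Proposition 5.1, substituting (5.1.8) and (5.1.9) into (5.3.1) collapses the first term, because $\mathbf{\Sigma}^{- 1}\mathbf{\Sigma}\mathbf{\Sigma}^{- 1} = \mathbf{\Sigma}^{- 1}$:

$\begin{aligned}&\mathbf{X}_{F}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{\Sigma}\mathbf{\Sigma}^{- 1}\mathbf{X}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}\mathbf{X}_{F}' + \mathbf{\Sigma}_{F} \\&= \mathbf{X}_{F}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X}){(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}\mathbf{X}_{F}' + \mathbf{\Sigma}_{F} \\&= \mathbf{X}_{F}{(\mathbf{X}'\mathbf{\Sigma}^{- 1}\mathbf{X})}^{- 1}\mathbf{X}_{F}' + \mathbf{\Sigma}_{F}.\end{aligned}$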

5.4. Special cases

As in the univariate method in Section 4, we choose the $\alpha$ that minimizes the prediction error. Analytically, we no longer obtain the loss development factors from VP, for $\alpha = 0,$ CL, for $\alpha = 1$ , and SA, for $\alpha = 2$ . The reason is the consideration of contemporaneous correlations between the regressions that change the loss development factors, see (4.2.1), which is different from expression (5.2.1). However, we can say that, when $\alpha = 0$ , we get a Portfolio Multivariate VP, when $\alpha = 1$ we get a Portfolio Multivariate CL and when $\alpha = 2$ we get a Portfolio Multivariate SA. This is due to the heteroscedasticity level. What defines and differentiates these three methods is the weights given to the link ratios, which defines the heteroscedasticity level. In VP, it is zero, with $\alpha = 0$ , in CL it is one, with $\alpha = 1,$ and in SA it is two, with $\alpha = 2$ .

As with the univariate portfolio data method, we can obtain other methods that give other weights to the link ratios, through the selection of $\alpha$ . As with the univariate method from Section 4, the optimal $\alpha$ is the one that minimizes the prediction error.
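
To illustrate how $\alpha$ sets the weights, the sketch below uses the simplest setting: a single regression through the origin with no correlations, so it is not the PMGLR estimator itself. Under that assumption, $\alpha = 0$, $1$ and $2$ give the VP, CL and SA link ratios, and a grid search picks the $\alpha$ with the smallest scalar prediction error. The function names, the data, the grid, and the weighted-residual estimator of $s$ are all assumptions of the sketch.

```python
def link_ratio(x, y, alpha):
    """Weighted least squares slope through the origin when Var(error_i) = s * x_i^alpha."""
    num = sum(xi ** (1 - alpha) * yi for xi, yi in zip(x, y))
    den = sum(xi ** (2 - alpha) for xi in x)
    return num / den


def msep(x, y, x_future, alpha):
    """Scalar analogue of (5.3.2): estimation error plus process error."""
    beta = link_ratio(x, y, alpha)
    # Assumption: s is estimated from the weighted residuals with n - 1 degrees of freedom.
    s = sum((yi - beta * xi) ** 2 / xi ** alpha for xi, yi in zip(x, y)) / (len(x) - 1)
    den = sum(xi ** (2 - alpha) for xi in x)
    return x_future ** 2 * s / den + s * x_future ** alpha


def best_alpha(x, y, x_future, grid):
    """Return the grid value of alpha with the smallest MSEP."""
    return min(grid, key=lambda a: msep(x, y, x_future, a))


# Hypothetical cumulative claims at two consecutive development years.
x = [100.0, 200.0]
y = [150.0, 250.0]
vp, cl, sa = (link_ratio(x, y, a) for a in (0, 1, 2))

grid = [0.0, 0.5, 1.0, 1.5, 2.0]
chosen = best_alpha(x, y, 150.0, grid)
```

Here `vp` is the ratio of the sum of cross products to the sum of squares, `cl` is the ratio of the column sums, and `sa` is the simple average of the individual link ratios.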

Corollary 5.4.1 If $\alpha = 0,$ the variances are homoscedastic, and the regressions correlated. We get

$\mathbb{E}\left( \mathbf{\varepsilon}\mathbf{\varepsilon}^{\prime} \right) = \mathbf{\Sigma}_{\mathbf{VP}}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{F}\mathbf{\varepsilon}_{F}' \right) = \mathbf{\Sigma}_{\mathbf{VP,F}}$

$\mathbf{\Sigma}_{\mathbf{VP}}$ and $\mathbf{\Sigma}_{\mathbf{VP}\mathbf{,}\mathbf{F}}$ are respectively the $\mathbf{\Sigma}$ and $\mathbf{\Sigma}_{\mathbf{F}}$ defined in expressions (5.1.1) and (5.1.4), with the following relations,

$\mathbf{\Sigma}_{t,j,j} = s_{t,j,j}\mathbf{I}_{N \times (T - j)}$

$\mathbf{\Sigma}_{t,l,j} = s_{t,l,j}\mathbf{I}_{N \times (T - j)}$

Here, $\mathbf{I}_{N \times (T - j)}$ is an identity matrix of size $N \times (T - j)$. With $\alpha = 0$, the loss development factors are the ones from the VP within a portfolio multivariate context (PMVP). Also, $\mathbf{\Sigma} = \mathbf{\Sigma}_{\mathbf{VP}}$.

Then, the MSEP comes from

$\begin{aligned}\mathbb{E}&\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)^{\prime} \\=& \mathbf{X}_{F}{(\mathbf{X}'{\mathbf{\Sigma}_{\mathbf{VP}}}^{- 1}\mathbf{X})}^{- 1}{\mathbf{X}'}_{F} \\&+ \mathbf{\Sigma}_{\mathbf{VP,F}}.\end{aligned}\tag{5.4.1}$

Corollary 5.4.2 If $\alpha = 1,$ the variances are heteroscedastic, and the regressions correlated. We get

$\mathbb{E}\left( \mathbf{\varepsilon}\mathbf{\varepsilon}^{\prime} \right) = \mathbf{\Sigma}_{\mathbf{CL}}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{F}\mathbf{\varepsilon}_{F}' \right) = \mathbf{\Sigma}_{\mathbf{CL,F}}$

The $\mathbf{\Sigma}_{\mathbf{CL}}$ and $\mathbf{\Sigma}_{\mathbf{CL,F}}$ are respectively the $\mathbf{\Sigma}$ and $\mathbf{\Sigma}_{\mathbf{F}}$ defined in expression (5.1.1) and (5.1.4) with

$\mathbf{\Sigma}_{t,j,j} = s_{t,j,j}\begin{bmatrix} x_{1,1,j} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x_{t,T - j,j} \end{bmatrix}$

$\mathbf{\Sigma}_{t,j,j}^{F} = s_{t,j,j}\begin{bmatrix} x_{1,T,j} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x_{t,T + j,j} \end{bmatrix}$

With $\alpha = 1$, the loss development factors are the ones from the CL within a portfolio multivariate context (PMCL).

Then, the MSEP comes from

$\begin{aligned}\mathbb{E}&\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)^{\prime} \\&= \mathbf{X}_{F}{(\mathbf{X}'{\mathbf{\Sigma}_{\mathbf{CL}}}^{- 1}\mathbf{X})}^{- 1}{\mathbf{X}'}_{F} + \mathbf{\Sigma}_{\mathbf{CL,F}}\end{aligned}\tag{5.4.2}$

Corollary 5.4.3 If $\alpha = 2,$ the variances are heteroscedastic, and the regressions correlated. We get

$\mathbb{E}\left( \mathbf{\varepsilon}\mathbf{\varepsilon}^{\prime} \right) = \mathbf{\Sigma}_{\mathbf{SA}}$

$\mathbb{E}\left( \mathbf{\varepsilon}_{F}\mathbf{\varepsilon}_{F}' \right) = \mathbf{\Sigma}_{\mathbf{SA,F}}$

$\mathbf{\Sigma}_{\mathbf{SA}}$ and $\mathbf{\Sigma}_{\mathbf{SA}\mathbf{,}\mathbf{F}}$ are respectively the $\mathbf{\Sigma}$ and $\mathbf{\Sigma}_{\mathbf{F}}$ defined in expressions (5.1.1) and (5.1.4) with

$\mathbf{\Sigma}_{t,j,j} = s_{t,j,j}\begin{bmatrix} {x_{1,1,j}}^{2} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{t,T - j,j}}^{2} \end{bmatrix}$

$\mathbf{\Sigma}_{t,j,j}^{F} = s_{t,j,j}\begin{bmatrix} {x_{1,T,j}}^{2} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {x_{t,T + j,j}}^{2} \end{bmatrix}$

With $\alpha = 2$, the loss development factors are the ones from the SA within a portfolio multivariate context (PMSA).

Then, the MSEP comes from

$\begin{aligned}\mathbb{E}&\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)\left( \widehat{\mathbf{Y}_{F}} - \mathbf{Y}_{F} \right)^{\prime} \\&= \mathbf{X}_{F}{(\mathbf{X}'{\mathbf{\Sigma}_{\mathbf{SA}}}^{- 1}\mathbf{X})}^{- 1}{\mathbf{X}'}_{F} + \mathbf{\Sigma}_{\mathbf{SA,F}}\end{aligned}\tag{5.4.3}$

6. Application to standard data

We consider for the numerical results three paid claims triangles from the literature. We call them triangle 1, triangle 2, and triangle 3:

triangle 1, Mack (1993);
triangle 2, Taylor and Ashe (1983);
triangle 3, Taylor and McGuire (2016).

Once again, the results obtained confirm VP as the method that minimizes the prediction error (see Portugal et al. 2017; Portugal, Pantelous, and Verrall 2021). We present results for the PUGLR (see Section 4) and PMGLR (see Section 5). We also compare these results with those obtained from an aggregated triangle, those we get if we apply generalized link ratios methods separately to each triangle, and those obtained using the multivariate chain-ladder from Zhang (2010).

6.1. Portfolio univariate generalized link ratios

The results obtained are presented in Table 6.1. The α that minimized the prediction error was zero, confirming once again that VP was the best solution, according to this criterion. The prediction error obtained was 8.9% and the total reserves estimated 18 896 187.

Table 6.1. Portfolio univariate generalized link ratios results

Column	Reserves per Column	Prediction Error	Prediction Error %

2	869 981	240 331	28%
3	1 952 143	347 595	18%
4	3 435 107	475 663	14%
5	2 419 986	567 808	23%
6	2 021 461	658 827	33%
7	2 319 736	837 251	36%
8	1 847 867	698 526	38%
9	3 141 606	594 305	19%
10	888 300	321 286	36%

Total	18 896 187	1 675 306	8.9%

Had we considered just one triangle that corresponded to the sum of the three triangles, the results would have been those presented in Table 6.2.

Table 6.2. Generalized link ratios aggregated triangle results

Column	Reserves per Column	Prediction Error	Prediction Error %

2	865 608	230 665	27%
3	1 937 534	334 062	17%
4	3 397 751	459 846	14%
5	2 406 814	553 000	23%
6	2 016 933	657 000	33%
7	2 309 872	837 757	36%
8	1 844 373	700 222	38%
9	3 138 605	712 769	23%
10	886 667	571 490	64%

Total	18 804 158	1 772 148	9.4%

When compared with the Table 6.1 results, the aggregated triangle results are similar: the prediction error increases to 9.4% and the reserves decrease to 18 804 158. The difference seems small, but we must be aware that triangle 2 has far greater reserves than the other two.

As expected, the reserves obtained in Table 6.1 correspond to the sum of the reserves from the three triangles when the generalized link ratios method is applied to each triangle; see Table 6.3. The prediction error increases slightly, from 8.9% to 9.0%.

Table 6.3. Portfolio generalized link ratios results - Totals from the three triangles

Column	Reserves per Column	Prediction Error	Prediction Error %

2	869 981	244 074	28%
3	1 952 143	360 140	18%
4	3 435 107	491 128	14%
5	2 419 986	585 065	24%
6	2 021 461	668 160	33%
7	2 319 736	846 658	36%
8	1 847 867	706 287	38%
9	3 141 606	600 946	19%
10	888 300	324 446	37%

Total	18 896 187	1 707 793	9.0%

This same level of total reserves was obtained because we did not consider any correlations between triangles (nor between equation regressions). We just used portfolio data to estimate all the regressions and triangles at the same time.

The prediction error obtained is a weighted average of the prediction errors of the three triangles. The weights are the estimated reserves.

The individual results from the three triangles are presented in Table 6.4. Here, we can see that the 9.0% prediction error obtained is a weighted average of those for the individual triangles:

$\small9.0\% = \frac{\text{43 772} \times 35.8\% + \ \text{18 479 500} \times 9.1\% + \text{372 915} \times 4.6\%}{\text{18 896 187}}$

Also, the sum of the prediction errors from all the triangles (in monetary units) (see Table 6.4) is equal to the same indicator obtained from Table 6.3:

$\text{1 707 793} = \text{15 651 + 1 675 147 + 16 995}$
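
The two identities above can be checked with a few lines of Python, using only the totals reported for the three triangles in Table 6.4. The variable names are illustrative.

```python
# Totals per triangle, copied from Table 6.4.
reserves = {"triangle 1": 43_772, "triangle 2": 18_479_500, "triangle 3": 372_915}
pred_err = {"triangle 1": 15_651, "triangle 2": 1_675_147, "triangle 3": 16_995}

total_reserves = sum(reserves.values())
total_error = sum(pred_err.values())

# Reserve-weighted average of the individual prediction error percentages.
weighted_pct = sum(
    reserves[k] * (pred_err[k] / reserves[k]) for k in reserves
) / total_reserves

print(total_reserves, total_error, round(100 * weighted_pct, 1))
```

Because each weight is the triangle's reserve, the weighted average equals the total prediction error divided by the total reserves.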

Table 6.4. Portfolio generalized link ratios results from the three triangles

Triangle 1

Column	Reserves per Column	Prediction Error	Prediction Error %

2	2 511	3 773	150%
3	5 672	5 359	94%
4	7 501	6 627	88%
5	7 867	7 021	89%
6	7 208	5 801	80%
7	4 283	6 015	140%
8	4 412	4 488	102%
9	2 620	3 957	151%
10	1 698	1 775	105%

Total	43 772	15 651	35.8%

Triangle 2

Column	Reserves per Column	Prediction Error	Prediction Error %

2	831 767	240 301	29%
3	1 901 782	347 477	18%
4	3 373 994	475 532	14%
5	2 363 113	567 670	24%
6	1 970 809	658 792	33%
7	2 276 227	837 223	37%
8	1 806 584	698 504	39%
9	3 102 951	594 286	19%
10	852 273	321 278	38%

Total	18 479 500	1 675 147	9.1%

Triangle 3

Column	Reserves per Column	Prediction Error	Prediction Error %

2	35 703	0	0%
3	44 689	7 304	16%
4	53 611	8 969	17%
5	49 007	10 373	21%
6	43 445	3 567	8%
7	39 225	3 420	9%
8	36 871	3 295	9%
9	36 035	2 703	7%
10	34 329	1 393	4%

Total	372 915	16 995	4.6%

6.2. Portfolio multivariate generalized link ratios

For the PMGLR, the lowest prediction error is again obtained with α = 0. The prediction error of 2.7% represents an important decrease relative to the PUGLR result (8.9% with α = 0). The reserves increase to 23 616 959 (they were 18 896 187 with PUGLR). See Table 6.5 for the PMGLR results and Table 6.1 for PUGLR.

Table 6.5. Portfolio multivariate generalized link ratios results

Column	Reserves per Column	Prediction Error	Prediction Error %

2	806 936	240 633	30%
3	1 992 818	347 132	17%
4	2 452 163	588 011	24%
5	3 162 513	521 512	16%
6	3 675 198	557 398	15%
7	3 838 679	695 253	18%
8	1 886 760	240 575	13%
9	2 338 183	151 203	6%
10	3 463 709	109 640	3%

Total	23 616 959	630 782	2.7%

Using an aggregated triangle would decrease the reserves to 19 889 001, but with an important increase in the prediction error to 5.3%, as shown in Table 6.6.

Table 6.6. Multivariate generalized link ratios aggregated triangle results

Column	Reserves per Column	Prediction Error	Prediction Error %

2	858 647	230 665	27%
3	1 971 159	327 504	17%
4	3 421 429	557 641	16%
5	2 426 784	529 668	22%
6	2 098 779	583 189	28%
7	2 765 878	624 811	23%
8	1 898 505	155 946	8%
9	3 542 188	225 593	6%
10	905 634	488 026	54%

Total	19 889 001	1 052 298	5.3%

The reason for the increase in the reserves level between the PUGLR and the PMGLR lies in the change of the loss development factors, $\mathbf{\beta}$ , mainly those for triangle 2.

Several loss development factors increase and some decrease, but the 5% increase in the factor corresponding to j = 9 has a 5% impact on the ultimate factors of all the origin years, and justifies the increase in the reserves of around 25%.
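
The leverage of a single factor can be illustrated with a small sketch. The loss development factors below are hypothetical, not those of the three triangles. Raising the last factor by 5% raises every ultimate factor by 5%, and since a chain-ladder type reserve is proportional to the ultimate factor minus one, the relative change in each reserve is larger than 5%.

```python
from math import prod


def ultimate_factors(ldf):
    """Ultimate factor at each development stage: product of the remaining factors."""
    return [prod(ldf[j:]) for j in range(len(ldf))]


# Hypothetical loss development factors for j = 1, ..., 9.
ldf = [1.80, 1.40, 1.20, 1.10, 1.05, 1.03, 1.02, 1.01, 1.005]
bumped = ldf[:-1] + [ldf[-1] * 1.05]  # raise the factor for j = 9 by 5%

before = ultimate_factors(ldf)
after = ultimate_factors(bumped)

# Relative change in each ultimate factor.
ratios = [b / a for a, b in zip(before, after)]

# Relative change in each reserve, for a reserve of the form C * (ultimate factor - 1).
reserve_ratios = [(b - 1.0) / (a - 1.0) for a, b in zip(before, after)]
```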

The change in the loss development factors is a consequence of the change in the weights matrix, as the latter is now considering the contemporaneous correlations between the triangles. The changes in these factors are presented in Table 6.7.

Table 6.7. Changes in loss development factors with PMGLR

Triangle Number	Loss Development Factor Number	Variation

1	1	99.1%
1	2	12.8%
1	3	17.8%
1	4	-8.1%
1	5	-38.6%
1	6	-19.2%
1	7	-6.1%
1	8	6.1%
1	9	0.1%
2	1	-4.4%
2	2	0.9%
2	3	-9.1%
2	4	6.0%
2	5	7.8%
2	6	4.7%
2	7	-0.3%
2	8	-2.3%
2	9	5.0%
3	1	-19.8%
3	2	19.0%
3	3	-2.9%
3	4	-0.9%
3	5	1.7%
3	6	0.8%
3	7	0.0%
3	8	-0.2%
3	9	0.2%

We now compare our results with those obtained by applying the multivariate chain-ladder of Zhang (2010); see Table 6.8.

Table 6.8. Zhang model results

Column	Reserves per Column	Prediction Error	Prediction Error %

2	886 549	222 622	25%
3	1 947 950	594 171	31%
4	3 380 539	1 015 477	30%
5	2 503 361	1 293 832	52%
6	2 150 766	976 630	45%
7	2 304 673	1 142 572	50%
8	1 824 189	938 833	51%
9	3 086 871	840 964	27%
10	889 646	611 639	69%

Total	18 974 544	2 707 360	14.3%

As we can see, the reserves decrease significantly, from 23 616 959 to 18 974 544. However, this is due to a lack of fit between the chain-ladder and these data. Indeed, the prediction error increases from 2.7% to 14.3%.

Finally, we present another result for a variant of the PMGLR: we assume that there are correlations between triangles but no correlations between each triangle’s equations. This should correspond to the PUGLR method. However, the results will be different from those of the PUGLR. The reason is that the PMGLR methods estimate the correlations between triangles in a different way; see Lemmas 4.2.1 and 5.2.1. Despite this, we may compare the PMGLR results with this calculation. We can see in Table 6.9 that the prediction error increases from 2.7% (see Table 6.5) to 6.4%. We conclude from these figures that, with these triangles, the correlations between equations are more important than the correlations between triangles. Also, the level of estimated reserves drops from 23 616 959 to 19 114 443.

Table 6.9. PMGLR with no correlations between the equations of each triangle

Column	Reserves per Column	Prediction Error	Prediction Error %

2	890 992	240 214	27%
3	1 935 591	324 597	17%
4	3 492 070	515 633	15%
5	2 367 356	549 370	23%
6	1 907 664	640 951	34%
7	2 191 227	662 738	30%
8	1 839 764	187 908	10%
9	3 292 299	61 799	2%
10	1 197 480	128 837	11%

Total	19 114 443	1 225 642	6.4%

To obtain results comparable with the ones presented above, we also apply Zhang (2010) with correlations inside each triangle; see Table 6.10. The difference in reserves is now much smaller, 19 114 443 against 18 974 545, and the prediction errors are close, 6.4% against 8.0%.

Table 6.10. Zhang model results with correlations inside each triangle

Column	Reserves per Column	Prediction Error	Prediction Error %

2	886 549	222 622	25%
3	1 947 950	319 591	16%
4	3 380 540	600 569	18%
5	2 503 361	552 862	22%
6	2 150 766	694 711	32%
7	2 304 673	682 973	30%
8	1 824 189	212 400	12%
9	3 086 871	473 315	15%
10	889 646	494 062	56%

Total	18 974 545	1 510 083	8.0%

7. Conclusions

From these methods, using portfolio data, we obtained several conclusions in respect of the theory and from the numerical examples given.

Firstly, it is straightforward to move from one triangle to several triangles, when regression techniques are considered.

Secondly, prediction error formulas come very easily from well-known regression models, using the same approach as for the one-triangle case, adapted to several triangles.

Thirdly, considering several triangles allows us to estimate generalized link ratios adapted to our data. Using prediction error minimization helps us to have more accuracy in the reserves calculation for all triangles. Otherwise, important errors might arise due to the method leverage effect when it is applied to several triangles without accuracy.

Fourthly, the use of a portfolio of triangles confirms the use of VP as the solution that minimizes the prediction error. This happens in both the univariate (PUGLR) and multivariate (PMGLR) cases.

Fifthly, the use of data shows that differences may arise between the models considered. When we use the PMGLR, the prediction error decreases when compared to the PUGLR. It seems that it is worth working with more information to predict the reserves. However, using such information also produces an increase in the level of reserves, due to the correlations between triangles. This shows that the latter have a role to play in claims reserving.

Sixthly, as expected, the level of reserves differs from the one obtained when all the triangles are aggregated. Aggregating the triangles into just one triangle gives us a lower level of estimated reserves, but the prediction errors are higher. This is a good example of the danger of not using homogeneous triangles in claims reserving, even if the prediction error is low.
