1. Introduction and model assumptions
Often in P claims reserving problems, the claims settlement process goes beyond the latest development period available in the observed claims development triangle. This means that there is still an unobserved part of the insurance claims for which one needs to build claims reserves. In such situations, claims reserving actuaries apply socalled tail development factors to the last column of the claims development triangle which account for the settlement that goes beyond this latest development period. Typically, one has only limited information for the estimation of such tail development factors. Therefore, various techniques are applied to estimate these tail development factors. Most of these estimation methods are ad hoc methods that do not fit into any stochastic modeling framework. Popular estimation techniques, for example, fit parametric curves to the data using the righthand corner of the claims development triangle (Mack 1999; Boor 2006; Verrall and Wüthrich 2012). In practice, one often does a simultaneous study of claims payments and claims incurred data, i.e., incurredpaid ratios are used to determine tail development factors (see Section 3 in Boor 2006).
In this paper we review the paidincurred chain (PIC) reserving method. The lognormal PIC reserving model introduced in Merz and Wüthrich (2010) can easily be extended so that it allows for the inclusion of tail development factors in a natural and mathematically consistent way. Similar to common practice, the tail development factor estimates will then be based on incurredpaid ratios within our PIC reserving framework.
In the following, we denote accident years by Figure 1). Cumulative payments in accident year after development years are denoted by and the corresponding claims incurred by Moreover, for the ultimate claim we assume with probability 1 (see Figure 1). This means that we assume thatafter several development periods beyond the latest observed development year the cumulative payments and the claims incurred lead to the same ultimate claim amount. That is, ultimately, when all claims of accident year are settled, and must coincide.
and development years by Development year refers to the latest observed development year of accident year and the step from to refers to the tail development factors (seeModel Assumptions 1.1
Lognormal PIC reserving model, Merz and Wüthrich (2010)

Conditionally, given parameters
we have 
the random vector
has a multivariate Gaussian distribution with uncorrelated components given by
ξi,j∼N(Φj,σ2j) for i∈{0,…,J} and j∈{0,…,J+1}, and ζi,j∼N(Ψj,τ2j) for i∈{0,…,J} and j∈{0,…,J};

cumulative payments
are given by the recursion, with initial value 
claims incurred
are given by the (backwards) recursion, with initial value
For an extended model discussion we refer to Merz and Wüthrich (2010). Basically, the PIC Model Assumptions 1.1 are a combination of Hertig’s (1985) lognormal model (applied to cumulative payments) and Gogol’s (1993) Bayesian claims reserving model (applied to claims incurred). In contrast to the PIC reserving model in Merz and Wüthrich (2010), we now add an extra development period from J to J + 1. This is exactly the crucial step that allows for the consideration of tail development factors and it leads to the study of incurredpaid ratios for the inclusion of such tail development factors.
The PIC Model Assumptions 1.1 may be criticized because of two restrictive assumptions. We briefly discuss how these can be relaxed.

Assumption
for all If there are known (prior) differences between different accident years this can easily be integrated by setting with constants describing these prior differences. 
Independence between Happ and Wüthrich (2013). To keep the analysis simple, we refrain from studying this more complex model in the present paper.
and : This is probably the main weakness of the model. However, this assumption can easily be relaxed in the spirit of
2. Estimation of tail development factors
At time J one has observed data given by the set
DJ={Pi,j,Ii,j:i+j≤J,0≤i≤J,0≤j≤J},
and one needs to predict the ultimate claim amounts Merz and Wüthrich 2010). In this section we discuss how to modify the general outline of Model Assumptions 1.1 to incorporate tail development estimation.
conditional on these observations On the one hand, this involves the calculation of the conditional expectations and, on the other hand, it involves Bayesian inference on the parameters given (see Theorems 2.4 and 3.4 in2.1. Ultimate claims prediction conditional on parameters
We apply Model Assumptions 1.1 to the tail development factor estimation problem. Therefore, we need to specify the prior distribution of the parameter vector Θ.
Often, there is subjectivity in claims incurred data I_{i, j} because the use of different claims adjusters with different estimation methods and changing reserving guidelines. Therefore, for the present setup we have decided to consider claims incurred data I_{i, j} only for the estimation of tail development factors, i.e., we work under the assumption of having incomplete claims incurred triangles (see also Dahms (2008) and Happ and Wüthrich (2013) for claims reserving methods on incomplete data). It is not difficult to extend the model to incorporate all claims incurred information, but in the present work this would detract from the tail development factor estimation discussion.
The prediction based on incomplete claims incurred data is done as follows. Assume there exists J* ∈ {0, . . . , J} such that with probability 1
ΨJ≡τ2J/2
and if J* J
τJ∗=τJ∗+1=⋯=τJ−1≡τ,ΨJ∗=ΨJ∗+1=⋯=ΨJ−1≡τ2/2.
Note that if
we simply assume These assumptions imply that there is no substantial claims incurred development after claims development period i.e., there is no systematic drift in the claims incurred development after This is seen as follows, forE[exp{−ξi,j}]=E[E[exp{−ζi,j}∣Θ]]=E[exp{−Ψj+τ2j/2}]=1
This implies that on average the claims incurred prediction is correct (and we have only pure random fluctuations around this prediction), i.e., for j ∈ {J* + 1, . . . , J + 1}
E[Ii,j−1∣Ii,j]=Ii,j,Vco(Ii,j−1∣Ii,j)=(exp{τ2j−1}−1)1/2
where Vco(·) denotes the coefficient of variation. The fact that we allow τ_{J} to differ from τ corresponds to the difficulty that the tail development factor may cover several development years beyond the last observed column in the claims development triangle and therefore we may allow for standard deviation parameters
for the development period from to (possibly covering more than one period).Remark. If there is expert judgment about a drift term in the claims incurred development Verrall and Wüthrich (2012).
this can easily be integrated by adjusting assumptions (2.1)–(2.2). This also allows one to consider parametric curves, as mentioned in Section 1, but in this case it is more appropriate to treat this knowledge as informative as to prior distributions specifying prior uncertainty in this expert judgment, similar toThus, assumptions (2.1)–(2.2) imply that there is no systematic drift in {J* + 1, . . . J + 1}, and under these assumptions we consider tail factor estimation under the restricted observations given by
D∗J={Pi,j,Ik,l:i+j≤J,k+l≤J,l≥J∗}=DJ∩{Pi,j,Ik,l:l≥J∗}.
In this spirit, we consider all cumulative payment observations but only claims incurred observations from development year J* on. That is, only the claims incurred I_{i,j} from the latest J − J* + 1 development periods J*, J* + 1, . . . , J are used to estimate tail development factors and the claims reserves. We define the following parameters
ηj=∑jm=0Φm and w2j=∑jm=0σ2m, for j=0,…,J+1,μl=ηJ+1−∑Jn=lΨn and v2l=w2J+1+∑Jn=lτ2n, for l=J∗,…,J.
Moreover, we define the parameters
βj={w2J+1−w2jv2j−w2j>0 for j=J∗,…,J0 for j=0,…,J∗−1.
The following result shows that β_{j} can be interpreted as the credibility weight for the claims incurred observations: Theorem 2.1.Under Model Assumptions 1.1 we have, conditional on Θ and D_{J} *,
E[Pi,J+1∣D∗J,Θ]=P1−βJ−ii,J−iIβJ−ii,J−iexp{(1−βJ−i)J+1∑l=J−i+1(Φl+σ2l/2)+βJ−iJ∑l=J−iΨl}.
For the conditional variance we obtain
Var(Pi,J+1∣D∗J,Θ)=E[Pi,J+1∣D∗J,Θ]2(exp{(1−βJ−i)∑J+1l=J−i+1σ2l}−1).
For (1985) presented in Section 2.1 of Merz and Wüthrich (2010)]
there holds and, therefore, we obtain a purely claims payment based prediction [see also Hertig’s modelPi,J−iexp{J+1∑l=J−i+1(Φl+σ2l/2)}
For
there holds and, therefore, we obtain a correction term to the purely claims payment based prediction which is based on the claims incurredpaid ratio i.e., for a large incurredpaid ratio we get a higher expected ultimate claim as can be seen fromP1−ββJ−ii,J−iIβJ−ii,J−i=exp{(1−βJ−i)logPi,J−i+βJ−ilogIi,J−i}=Pi,J−iexp{βJ−ilogIi,J−iPi,J−i}.
2.2. Parameter estimation, the general case
The likelihood function of the restricted observations D_{J}* is given by [see also (3.5) in Merz and Wüthrich (2010)]
lD∗j(Θ)∝J∏J=0J−j∏i=01σjexp{−12σ2j(Φj−logPi,jPi,j−1)2}×J−J∗∏i=01√v2J−i−w2J−iexp{−12(v2J−i−w2J−i)(μJ−i−ηJ−i−logIi,J−iPi,J−i)2}×J−1∏J=J−1J−1∏t=01τjexp{−12τ2j(Ψj+logIi,jIi,j+1)2},
where ∝ means that only relevant terms dependent on Θ are considered. The first line describes the claims payment development, the last line describes the claims incurred development, and the middle line describes the gap between the diagonal claims incurred and the diagonal claims payment observations.
In order to perform a Bayesian inference analysis on the parameters we need to specify the prior distribution of Θ.
Model Assumptions 2.2. PIC tail development factor model
We assume Model Assumptions 1.1 hold true with positive constants
and Moreover, it holdsΦm∼N(ϕm,s2m) for m∈{0,…,J+1}
with prior parameters _{m} ∈ ℝ and s_{m} 0. □
Under Model Assumptions 2.2 the posterior distribution
of given is given byu(Φ∣D∗J)∝lD∗J(Θ)J+1∏m=0exp{−12s2m(Φm−ϕm)2}
This immediately implies the following theorem:
Theorem 2.3. Under Model Assumptions 2.2 the posterior of is a multivariate Gaussian distribution with posterior mean and posterior covariance matrix Define the posterior standard deviation by
spost j=(s−2j+(J−j+1)σ−2j)−1/2 for j=0,…,J+1
Then, the inverse covariance matrix
is given byan,m=(spost n)−21{n=m}+[(n−1)∧(m−1)∑i=J∗(v2i−w2i)−1]1{n,m≥J∗+1}
The posterior mean
is obtained by(ϕpost 0,…,ϕpost J+1)′=∑(D∗J)(c0,…,cj+1)′
with vector
given bycj=ϕjs2j+1σ2jJ−j∑i=0logPi,jPi,j−1+[J−J∗∑i=J−j+11v2J−i−w2J−i(logIi,J−iPi,J−i+iτ2+τ2J2)]1{j,J∗+1}.
Note that the last term in the definition of
Corollary 2.4. Under Model Assumptions 2.2 the posterior of is a multivariate Gaussian distribution with being independent with
Φj{D∗j}∼N(ϕpost j=γjˉϕj+(1−γj)ϕj,(spost j)2)
for j ≤ J* and credibility weight and empirical mean defined by
γj=J−j+1J−j+1+σ2j/s2j and ¯ϕj=1J−j+1∑J−ji=0logPi,jPi,j−1 for j=0,…,J∗
Henceforth, Corollary 2.4 shows that for development years
we obtain the wellknown credibility weighted average between the prior mean and the average observation The case is more involved: one basically obtains a weighted average between the prior mean the average observation and the incurredpaid ratiosRemark. Model Assumptions 2.2 specify a Bayesian model with multivariate Gaussian distributions. This setup allows for closedform solutions. For other distributional assumptions the problem can only be solved numerically using Markov chain Monte Carlo methods. Bayesian statistics, like the Bayesian information criterion BIC, would then allow for model testing and model selection. If one restricts to linear credibility estimators, see Bühlmann and Gisler (2005), then _{j}^{post} given in (2.4) corresponds to the linear credibility estimator in more general models.
2.3. Parameter estimation, special case J* = J
We consider the special case
Corollary 2.5. Choose Under Model Assumptions 2.2 , the posterior distribution of is a multivariate Gaussian distribution with being independent. For the posterior distribution of is given by (2.4). The posterior of is given by
ΦJ+1{D∗J}∼N(ϕpost J+1=γJ+1(logI0,JP0,J+τ2J2)+(1−γj+1)ϕJ+1,a−1J+1,J+1),
with inverse variance given by
aJ+1,J+1=s−2J+1+(σ2J+1+τ2J)−1 and credibility weight given by γJ+1=11+(σ2J+1+τ2J)/s2J+1.
This means that in the case
we obtain a credibilityweighted average between the prior tail development factor and the observation Henceforth, in this case only the latest incurredpaid ratio is considered for the estimation of the tail development factor.3. Posterior claims prediction and prediction uncertainty
3.1. General case
In view of Theorems 2.1 and 2.3 we can now predict the ultimate claim
conditional on the restricted observations under Model Assumptions 2.2.Proposition 3.1. Bayesian ultimate claims predictor. Under Model Assumptions 2.2 we predict the ultimate claim
given byE[Pi,J+1∣D∗J]=P1−βJ−ii,J−iIβJ−ii,J−iexp{(1−βJ−i)J+1∑l=J−i+1σ2l2+βJ−iiτ2+τ2J2}×exp{(1−βJ−i)J+1∑j=J−i+1ϕpostj+(1−βJ−i)2e′J−i+1∑(D∗J)eJ−i+12},
where
with the first components equal to .Next we determine the prediction uncertainty. Model Assumptions 2.2 and Theorem 2.3 constitute a full distributional model which allows for the calculation of any risk measure (using Monte Carlo simulations) under the posterior distribution, given D_{J}*. Here, we use the most popular measure for the prediction uncertainty in claims reserving, the socalled conditional mean square error of prediction (MSEP). The conditional MSEP has the advantage that we can calculate it analytically. Analytical solutions have the advantage that they allow for more basic sensitivity analysis. The conditional MSEP is given by (see also Section 3.1 in Wüthrich and Merz (2008))
msep∑Ji=0Pi,J+1∣p∗(E[∑Ji=0Pi,J+1∣D∗J])=E[(∑Ji=0Pi,J+1−E[∑Ji=0Pi,J+1∣D∗J])2∣D∗J]=Var(∑Ji=0Pi,J+1∣D∗J),
i.e., in this Bayesian setup the conditional MSEP is equal to the posterior variance. This posterior variance allows for the usual decoupling into average processes error and average parameter estimation error; see (A.3). The conditional MSEP satisfies
Var(J∑i=0Pi,J+1∣D∗J)=J∑i,k=0Cov(Pi,J+1,Pk,J+1∣D∗J).
We obtain the following theorem:
Theorem 3.2. Under Model Assumptions 2.2 the conditional MSEP of the Bayesian predictor for the aggregate ultimate claim is given by
msep∑Ji=0Pi,J+1∣D∗J(E[J∑i=0Pi,J+1∣D∗J])=∑0≤i,k≤J(e(1−βJ−i)(1−βJ−k)e−i−i+1Σ(D∗J)eJ−k+1+1+1i=kξ(1−βJ−i)∑J+1l=J−i+1+σ2l−1)×E[Pi,J+1∣D∗J]E[Pk,J+1∣D∗J].
3.2. Special case J* = J with noninformative priors
We revisit the special case J* = J and we also assume noninformative priors meaning that s_{j}^{2} → ∞. In that case we obtain that the posterior distributions of Φ_{0}, . . . , Φ_{*J*+1} are independent Gaussian distributions with
Φj{Dj}∼N(ϕpost j=¯ϕj=1J−j+1∑J−ji=0logPi,jPi,j−1(spost j)2=σ2jJ−j+1),
for j ≤ J, and
Φj+1{D∗j}∼N(ϕpost J+1=logI0,JP0,J+τ2J2,(spost J+1)2=a−1J+1,J+1=σ2J+1+τ2J).
This implies for the ultimate claim prediction for i 0
E[Pi,J+1∣D∗J]=Pi,J−iexp{J+1∑l=J−i+1ϕpost l+σ2l2+(spost l)22}=Pi,J−iJ∏l=−i+1ˆflˆf(ult)J+1
with chainladder factors
ˆfl=exp{ϕpost l+(1+1J−l+1)σ2l2}
ˆf(ult)J+1=I0,JP0,Jexp{σ2J+1+τ2J}
That is, the first terms in the product on the righthand side of (3.1) are the classical chainladder factors for Hertig’s lognormal model (1985); see also (5.11)–(5.12) in Wüthrich and Merz (2008). The last term in (3.1), however, describes the tail development factor (adjusted for the variance).
For i = 0 we have
E[P0,J+1∣D∗J]=P0,Jˆf(ult)J+1=I0,Jexp{σ2J+1+τ2J}
4. Example
In this section we provide an example. We assume that J = 9 and that the claims payment data P_{i,j} and the claims incurred data I_{i,j} for i + j ≤ J are given by Tables 1 and 2, respectively.
We first need to determine Table 3. In the upper right triangle in Table 3 (with the individual chain ladder factors for years ) we see no further systematic development, so we concentrate on possible choices
We choose the value such that there is no substantial claims incurred development (no systematic drift) after development period This choice is made based on actuarial judgment. We therefore look at the individual chainladder factors and These are provided inThe standard deviation parameters
and should be determined with prior knowledge only. In our example we assume that we have noninformative priors, which means that we set For and we take an empirical Bayesian point of view and estimate them from the data. For j = 0, . . . , J − 1 we setˆσ2j=1J−jJ−j∑i=0(logPi,jPi,j−1−¯ϕj)2
Unfortunately,
and cannot be estimated from the data, because we do not have sufficient observations. Therefore, we make the ad hoc choiceˆσJ+1=ˆσJ=min{ˆσJ−1,ˆσJ−2,ˆσ2J−1/ˆσJ−2}.
We estimate the parameter Table 3). Finally, for we do the ad hoc (expert) choice This suggests that we have (approximately) another three uncorrelated development periods beyond until all claims are finally settled. Of course, additional information about (if available) should be used here. These choices provide the standard deviation parameters given in Table 4. Now we are ready to calculate the claims reserves and the corresponding prediction uncertainty in our model according to Proposition 3.1 and Theorem 3.2. We do this for J* ∈ {6, . . . , 9}. The results are provided in Table 5.
with the empirical standard deviation of for and (because we assume that there is no systematic claims incurred development after development period 6; seeInterpretations

The analysis shows that in the presence of tail development, Hertig’s model (1985) may substantially underestimate the outstanding loss liabilities compared to the PIC tail development factor models for J* = 9, 8, 7. Only the PIC tail development factor model for J* = 6 gives similar reserves. This comes from the fact that the incurred development factors still give a downward trend to incurred losses in development periods 6 and 7 (see average in Table 3), which contradicts our model assumptions (2.1)–(2.2) and suggests to choose J* = 8 or 9. Of course, as mentioned above, this expert choice is based on the rationale that there is no systematic drift after J*, and statistical methods could justify this hypothesis/choice.

Including tail development factors for J* = 8, 9 also gives a higher prediction uncertainty msep^{1/2} compared to Hertig’s model (1985) without tail development factors. This finding is in line with the ones in Verrall and Wüthrich (2012) and shows that prediction uncertainty needs a careful evaluation in the presence of tail development.

Note that for J* = 9 we simultaneously consider claims payments and claims incurred information for accident year i = 0. For J* = 8 we simultaneously consider claims payments and claims incurred information for accident years i = 0, 1. This results in a much lower prediction uncertainty in these accident years (above the horizontal line in the corresponding columns of Table 5). The reason is that the claims incurred information has only little uncertainty (since we assume Ψ_{j} to be constant for j ≥ J*). This substantially reduces the prediction uncertainty.
We may question whether there is so much information in these last claims incurred observations. If this is not the case, we should either increase τ and τ_{J} or we should use less informative priors in (2.1)–(2.2). The latter would bring us back to the model of Merz and Wüthrich (2010) and Happ and Wüthrich (2013) with the additional assumption that there is no systematic drift after J*. Moreover, this latter model would also allow us to consider more information than just the restricted one given by D_{J}*. In the present work we have decided to work with the restricted information D_{J}* only because then we can fully concentrate on tail factor estimation. Otherwise tail factor estimation would be more hidden in the data and analysis.
5. Conclusion
We have modified the PIC reserving model from Merz and Wüthrich (2010) so that it allows for the incorporation of tail development factors. These tail development factors are estimated considering claims incurredpaid ratios in an appropriate way. This extends the ad hoc methods used in practice and because we perform our analysis in a mathematically consistent way we also obtain formulas for the prediction uncertainty. These are obtained analytically for the conditional MSEP and these can be obtained numerically for other uncertainty measures using Monte Carlo simulations (because we work in a Bayesian setup). The case study highlights the need to incorporate tail development factors in the presence of tail development, since otherwise both the outstanding loss liabilities and the prediction uncertainty are underestimated.