1. Introduction
My congratulations to Mr. Leigh J. Halliwell on this paper that clearly presents the mathematics of excess losses with an interesting example. I agree with him that the mathematics of excess losses is beautiful and powerful. However, the mathematics of excess losses also contains several subtle points that are not mentioned in the paper. This discussion note complements the article by clarifying some of these points. To be clear, it is not my intention to be critical of Mr. Halliwell. The purpose of this note is two-fold:
- To clarify some important hidden points in the mathematics of excess losses;
- To give references to some uncredited results.
For those ambitious actuaries who want to dig deeper for a full understanding of the rigorous mathematics of excess losses, this note also provides some directions for further studies.
For the convenience of readers, we will adopt the notation of Halliwell (2013). Throughout this note, X will denote a nonnegative random variable. F and G will denote the cumulative distribution function (CDF) and survival function of X, respectively.
2. The reason why we need to watch our steps
Halliwell (2013) argues that points of probability are allowed due to four properties of the CDF: (1) non-decreasing; (2) total probability; (3) continuity from the right; and (4) non-negative. Indeed, we will also need one more important property of the CDF: the left-hand limit of a CDF exists at each point (see, for example, Shiryaev 1996). This is clear from the second equality of the following equation.
$$P[X = a] = \lim_{n\to\infty} P\left[a - \tfrac{1}{n} < X \le a\right] = F(a) - F(a^-). \tag{1}$$
Since a CDF is always non-decreasing, its left-hand limit and right-hand limit exist at each point in the domain (see, for example, Rudin 1976). This means P[X = a] need not be zero. Precisely, we should define X to be continuous if F is continuous, that is, P[X = a] = 0 for each a. In particular, if F admits a density, then it is continuous. Unfortunately, some elementary probability textbooks do not clarify this and simply define a random variable to be continuous if its CDF admits a density. It is possible but quite difficult to construct a continuous random variable which has no density function.[1] Equation (1) and the definition of Riemann-Stieltjes integrals[2] show that Riemann-Stieltjes integrals with respect to a CDF integrator[3] will count probability masses at any given point (not just at zero!). Ordinary Riemann integrals do not, because in that case we are integrating with respect to the continuous function F(x) = x, i.e., F(x) − F(x−) = x − (x−) = 0. In words, our experience with ordinary Riemann integrals can be misleading when we need to deal with Riemann-Stieltjes integrals. This is why we need to watch our steps!
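The contrast between the two kinds of integral can be made concrete with a small numerical sketch. The mixed distribution below (a Uniform(0, 1) component with weight 0.6 plus a point mass of 0.4 at x = 2) and all function names are assumptions chosen for illustration; they do not come from Halliwell (2013). The exact mean is 0.6 · 0.5 + 0.4 · 2 = 1.1, whereas integrating x against the density alone gives only 0.3.

```python
def mixed_cdf(x):
    """CDF of a hypothetical mixed variable: Uniform(0, 1) with weight 0.6
    plus a point mass of 0.4 at x = 2."""
    continuous_part = 0.6 * min(max(x, 0.0), 1.0)
    point_mass = 0.4 if x >= 2.0 else 0.0
    return continuous_part + point_mass


def stieltjes_sum(g, cdf, lower, upper, n):
    """Right-endpoint Riemann-Stieltjes sum of g with respect to cdf on (lower, upper]."""
    total = 0.0
    previous = cdf(lower)
    for i in range(1, n + 1):
        x = lower + (upper - lower) * i / n
        current = cdf(x)
        total += g(x) * (current - previous)  # uses the increment F(x_i) - F(x_{i-1})
        previous = current
    return total


def density_only_sum(g, lower, upper, n):
    """Ordinary Riemann sum of g(x) * f(x), where f = 0.6 on (0, 1) and 0 elsewhere."""
    width = (upper - lower) / n
    total = 0.0
    for i in range(1, n + 1):
        x = lower + (upper - lower) * i / n
        density = 0.6 if 0.0 < x < 1.0 else 0.0
        total += g(x) * density * width
    return total


mean_stieltjes = stieltjes_sum(lambda x: x, mixed_cdf, 0.0, 3.0, 30000)
mean_density = density_only_sum(lambda x: x, 0.0, 3.0, 30000)
print(round(mean_stieltjes, 3))  # close to the exact mean 1.1
print(round(mean_density, 3))    # close to 0.3: the point mass at 2 is missed
```

The Stieltjes sum picks up the jump of the CDF at x = 2 through the increment F(x_i) − F(x_{i−1}); the density-based sum has no term that can see it.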
3. A subtle point both casualty and non-casualty actuaries might want to know
There is a subtle point hidden in the derivation of the following formula in Halliwell (2013).
$$\mathrm{Excess}_X(r) = 0 - 0 + \int_{x=r}^{\infty} G_X(x)\,dx \tag{2}$$

The step $\lim_{x\to\infty} xG_X(x) = 0$ is unjustified in the derivation. A close scrutiny reveals that the derivations of several other formulas in Halliwell (2013) will need the justification of this, too. This important formula has been frequently used in both life and nonlife insurance. See, for example, Cunningham, Herzog, and London (2008); Dickson, Hardy, and Waters (2009); Klugman, Panjer, and Willmot (1998). We point out that this formula is not as trivial as it might seem, because a $0 \cdot \infty$ form appears here and L’Hôpital’s rule does not seem to be helpful. (We challenge readers to give a correct justification on their own.) To the best of our knowledge, this crucial point has been missed by the actuarial community for a long time. For example, neither Klugman, Panjer, and Willmot (1998) nor Cunningham, Herzog, and London (2008) provides a justification of it. Dickson, Hardy, and Waters (2009, 20) are aware of this subtlety, but they impose cumbersome assumptions. Indeed, their assumptions 2 and 3 can be simply replaced by the one that X has finite second moment. Following Hong (2012), we give a correct and simple justification. First, note that

$$0 \le rG_X(r) = rP\{X > r\} = r\int_{r}^{\infty} dF_X(t) \le \int_{r}^{\infty} t\,dF_X(t).$$

Because E[X] is finite, the right-hand side tends to zero as r → ∞, and the result follows.
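The bound above can be checked on a heavy-tailed example. The sketch below assumes a Lomax (Pareto type II) variable with shape 2 and scale 1, chosen only because both sides of the inequality have closed forms; this distribution has mean 1 and infinite variance. The function names are hypothetical.

```python
def survival(r):
    """G(r) = P{X > r} for a hypothetical Lomax variable with shape 2 and scale 1."""
    return (1.0 + r) ** -2


def tail_first_moment(r):
    """Closed form of the integral of t dF(t) over (r, infinity) for the same variable."""
    return 2.0 / (1.0 + r) - (1.0 + r) ** -2


for r in (1.0, 10.0, 100.0, 1000.0, 1e6):
    left = r * survival(r)
    right = tail_first_moment(r)
    # r * G(r) is squeezed between 0 and the tail of the first moment.
    assert 0.0 <= left <= right
    print(r, left, right)
```

Both columns shrink toward zero as r grows, which is the squeeze the argument relies on. Note that `tail_first_moment(0.0)` returns the mean, 1.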
4. References for some uncredited results
There are many formulas derived in Halliwell (2013). It seems to us that the author might be unaware of some part of the existing literature. We respectfully point out that quite a few of these formulas are special cases of well-known equations in probability theory, written in new notation. Here we give references to these known results. Formula (2) and the second moment formula on p. 35 of Halliwell (2013) are both given in Cunningham, Herzog, and London (2008); Dickson, Hardy, and Waters (2009); and Ross (2010); the formula on p. 43 and the formula about Excess(x)dh(x) on p. 35 of Halliwell (2013) are given in Klebaner (2005); the formula of layered losses on p. 37 of Halliwell (2013) is given in Wang (1996, 2000). The result in footnote 11 of Halliwell (2013) is also a well-known result that is documented in Klebaner (2005), Royden (1988), and Rudin (1976). We feel that the proof in Rudin (1976) is shorter and cleaner.
In addition, the derivation of the first formula on p. 36 of Halliwell (2013) is not necessary. The equation follows trivially from the definition of the Riemann-Stieltjes integral (cf. p. 43 of Halliwell 2013) and the fact that adding a constant to a function will not change its variation, i.e., d(h(x) + c) = dh(x).
5. For ambitious actuaries
Finally, we would like to give ambitious readers the big picture. The Riemann-Stieltjes integral is a generalization of the ordinary Riemann integral, since the integrator is allowed to be a function F instead of the variable x. Indeed, Riemann-Stieltjes integrals can be defined for a much wider class of integrators than the class of CDFs. But the most interesting (and arguably the most useful) case is the one where the integrator F is a function of bounded variation.[4] In particular, the Riemann-Stieltjes integral with respect to a nondecreasing integrator (hence a CDF integrator) can be defined. Halliwell (2013) makes heavy use of Riemann-Stieltjes integrals. While Riemann-Stieltjes integrals may be a useful tool for studying CDFs, in general they are not the favored choice in probability. One of the main reasons is that they may not preserve limits of increasing sequences of loss random variables. For example, suppose $X_1 \le X_2 \le \cdots$ is an increasing sequence of loss random variables that converges to a loss random variable $X$ with probability one. A desirable situation for an insurer would be $\lim_{n\to\infty} E[X_n] = E[X]$. However, this is not true under the framework of Riemann-Stieltjes integrals. On the other hand, Lebesgue integrals do preserve limits in such a case. This explains why most advanced monographs on probability favor Lebesgue integrals. For more details, readers can consult Billingsley (1995), Chow and Teicher (1997) and Shiryaev (1996).

The mathematics of excess losses mainly addresses the probabilistic part of excess losses. In practice, an actuary will need to use loss data for his/her work. Therefore, one important direction of future research on this topic could be finding better ways to estimate various excess loss formulas. Efforts along this line are expected to involve survival analysis. Readers can consult Aalen, Borgan, and Gjessing (2008), Andersen et al. (1993), Fleming and Harrington (1991) and Klein and Moeschberger (2005).
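As a naive baseline for the estimation problem mentioned above, the excess-loss function can be estimated from a sample by plugging in the empirical distribution. The sketch below is an illustration under stated assumptions, not a method from the cited survival-analysis literature: the loss amounts are invented, there is no censoring or truncation, and the function names are hypothetical. It computes the estimate two ways, as the average of (x − r)+ and as the integral of the empirical survival function from r upward, and the two agree.

```python
def empirical_excess_direct(losses, r):
    """Average of max(x - r, 0) over the sample."""
    return sum(max(x - r, 0.0) for x in losses) / len(losses)


def empirical_excess_via_survival(losses, r):
    """Integral of the empirical survival function over (r, infinity).

    The empirical survival function is a step function, so the integral is a
    finite sum over the intervals between consecutive sample points above r.
    """
    n = len(losses)
    points = [r] + sorted(x for x in losses if x > r)
    total = 0.0
    for left, right in zip(points, points[1:]):
        survivors = sum(1 for x in losses if x > left)  # n * G_hat(left)
        total += (right - left) * survivors / n
    return total


# Hypothetical sample, including two zero losses (a point mass at zero).
sample = [0.0, 0.0, 120.0, 350.0, 800.0, 1500.0, 4000.0]
retention = 500.0
direct = empirical_excess_direct(sample, retention)
via_survival = empirical_excess_via_survival(sample, retention)
print(direct, via_survival)
```

With censored or truncated data the empirical survival function would have to be replaced by an estimator suited to that setting, which is where the survival-analysis references become relevant.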
6. Conclusion
A well-written paper on the mathematics of excess losses is a needed service for our actuarial community. I congratulate Mr. Leigh J. Halliwell again on providing such a paper. I hope actuaries will find his paper and this discussion note useful.
Response by the author, LEIGH J. HALLIWELL
This discussion is a valuable adjunct to my 2012 Variance paper, “The Mathematics of Excess Losses.” Dr. Hong states that the value of his discussion is twofold: (1) to clarify “subtle” or “hidden” points, and (2) to provide scholarly references. The latter is especially welcome, since my formal mathematical education left off in the 1970s. Since then, I’ve learned on my own and from the actuarial syllabus and literature. I’ve never believed my work to be original, and Dr. Hong has shown where in the academic literature others have gone before me. Truly, according to Ecclesiastes, “There is nothing new under the sun.”
1. My Background
My work at NCCI in the early 1990s on retrospective rating and Table M introduced me to the excess-loss function. Table M consists of ninety-nine columns, 01–99. The value of Table M at entry ratio 1.00 equals the column number as a percentage. For example, eighty percent of a loss whose distribution accords with column 80 is in excess of its expected value. Higher column numbers indicate greater variance, or greater variance in relation to expected value. Column 00, were it published, would be the distribution of a constant random variable, none of whose loss is in excess of its expected value. At that point I began to imagine what appears as Figure 1 in my paper, which then led to the double-integral proof of Section 3 that
I’ve been away from Table M for nearly twenty years, and don’t know how NCCI currently calculates it. But in the mid-1990s I programmed as an Excel 4 macro the then complicated Table-M function. By experiment I found that excess-loss functions of gamma distributions with appropriate parameters fairly approximated the Table-M formulas. Appendix A below shows how to do this.
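As a rough illustration of the idea of a gamma excess-loss function evaluated at an entry ratio, the sketch below computes E[(Y − r)+] for a gamma variable Y scaled to have mean 1. It is not the NCCI Table M calculation, and the parameter choices are assumptions: the shape is restricted to a positive integer so that the gamma survival function has a closed form (the Erlang case) and only the standard library is needed. The function names are hypothetical.

```python
import math


def erlang_survival(shape, z):
    """P{Gamma(shape, rate 1) > z} for a positive integer shape."""
    return math.exp(-z) * sum(z ** j / math.factorial(j) for j in range(shape))


def gamma_charge(entry_ratio, shape):
    """E[(Y - r)+] for Y ~ Gamma(shape, rate = shape), so that E[Y] = 1.

    Uses E[(Y - r)+] = E[Y; Y > r] - r * P{Y > r}, where
    E[Y; Y > r] = P{Gamma(shape + 1, rate = shape) > r}.
    """
    z = shape * entry_ratio
    return erlang_survival(shape + 1, z) - entry_ratio * erlang_survival(shape, z)


# Charge at entry ratio 1.00 for several shapes: a smaller shape means a larger
# coefficient of variation and a larger share of loss in excess of the mean.
for shape in (1, 2, 5, 20):
    print(shape, round(gamma_charge(1.0, shape), 4))
```

For shape 1 the variable is exponential and the charge at entry ratio 1.00 is e^{-1}, roughly 0.37; the charge falls as the shape grows and the distribution concentrates around its mean.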
About that time I was also studying for CAS Exam 5, which then covered Risk Theory. It was from the syllabus reading Risk Theory (Chapman and Hall, 1984), by Beard, Pentikäinen, and Pesonen, that I first learned about Stieltjes integrals, a subject on which Dr. Hong rightfully concentrates. I will take this up in Section 3. But for now I will note only that I should have proofread my paper more carefully and corrected several errors in its Appendix A. The cryptic ‘[1, p. 12]’ in its third sentence had originally been a reference to the page of Risk Theory on Stieltjes integrals.
2. Two Subtleties
Dr. Hong deems two points subtle enough to deserve clarification. First, in addition to the four properties that I attributed in Section 2 to the cumulative distribution function, he adds “one more important property of the CDF: the left-hand limit of a CDF exists at each point.” But this is not a separate property; rather, it is implicit in the nature of the real numbers. A fundamental theorem of the real numbers, based as they are on “Dedekind cuts” of the rational numbers, is that any upper-bounded subset of the real numbers has a least upper bound.[5] But since $F_X(x)$ is non-decreasing (property 1), $F_X(x) \le F_X(a)$ for every $x < a$. And since $F_X(a)$ is an upper bound to such values, there must be a least upper bound, which may be symbolized as $F_X(a^-)$. Of course, if $F_X(a^-) < F_X(a)$, there is a mass of probability at $x = a$. All this is clear in a quotation from Section 2 of my paper:
Two appealing properties of the excess-loss function are (1) that it is everywhere continuous, and (2) that if it is positive, it strictly decreases. Moreover, its derivative at r, if it exists, equals –GX(r). Even if it does not exist, at least the left and right derivatives exist, and the difference of the left derivative from the right is the probability mass at r.
The other subtle point concerns the equation in Section 2:
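$$\begin{aligned}
\mathrm{Excess}_X(r) &= \int_{x=r}^{\infty} (x-r)\,dF_X(x) \\
&= (x-r)G_X(x)\Big|_{\infty}^{r} + \int_{x=r}^{\infty} G_X(x)\,dx \\
&= 0 - \lim_{x\to\infty}(x-r)G_X(x) + \int_{x=r}^{\infty} G_X(x)\,dx
\end{aligned}$$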
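In his Section 3 Dr. Hong notes that $\lim_{x\to\infty} xG_X(x)$, which equals $\lim_{x\to\infty}(x-r)G_X(x)$, does not necessarily equal zero. I had glossed over this, presuming the reader to understand that if $\lim_{x\to\infty}(x-r)G_X(x) > 0$, then E[X] is infinite, in which case X has no excess-loss function. Dr. Hong’s “justification” that $\lim_{x\to\infty} xG_X(x) = 0$ relies on the inference that if $E[X] = \int_{x=0}^{\infty} x\,dF_X(x)$ converges to a real number, then $\lim_{a\to\infty}\int_{x=a}^{\infty} x\,dF_X(x) = 0$. However, the proof of this inference is complicated by the need to work with nested limits. The complete proof is:

$$\begin{aligned}
\lim_{a\to\infty}\int_{x=a}^{\infty} x\,dF_X(x)
&= \lim_{a\to\infty}\left\{\lim_{M\to\infty}\int_{x=a}^{M} x\,dF_X(x)\right\} \\
&= \lim_{a\to\infty}\left\{\lim_{M\to\infty}\left(\int_{x=0}^{M} x\,dF_X(x) - \int_{x=0}^{a} x\,dF_X(x)\right)\right\} \\
&= \lim_{a\to\infty}\left\{\lim_{M\to\infty}\int_{x=0}^{M} x\,dF_X(x) - \int_{x=0}^{a} x\,dF_X(x)\right\} \\
&= \lim_{a\to\infty}\left\{E[X] - \int_{x=0}^{a} x\,dF_X(x)\right\} \\
&= E[X] - \lim_{a\to\infty}\int_{x=0}^{a} x\,dF_X(x) \\
&= E[X] - E[X] = 0
\end{aligned}$$

A similar argument will prove that if $E[X] = \mathrm{Excess}_X(0)$ is finite, then $\lim_{r\to\infty}\mathrm{Excess}_X(r) = 0$.

Nonetheless, I believe the following proof to be more elegant and insightful. And because Dr. Hong rightly says that “. . . other formulas . . . will need the justification of this too,” I will generalize from E[X] to E[h(X)] by revisiting (and correcting) the formula for E[h(X)] in my Appendix A. The usual derivation, using integration by parts, is: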
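$$\begin{aligned}
E[h(X)] &= h(0)\operatorname{Prob}[X=0] + \int_{x=0}^{\infty} h(x)\,dF_X(x) \\
&= h(0)\operatorname{Prob}[X=0] - \int_{x=0}^{\infty} h(x)\,dG_X(x) \\
&= h(0)\operatorname{Prob}[X=0] - h(x)G_X(x)\Big|_{0}^{\infty} + \int_{x=0}^{\infty} G_X(x)\,dh(x) \\
&= h(0)\operatorname{Prob}[X=0] - \lim_{x\to\infty} h(x)G_X(x) + h(0)G_X(0) + \int_{x=0}^{\infty} G_X(x)\,dh(x) \\
&= h(0)\operatorname{Prob}[X\ge 0] - \lim_{x\to\infty} h(x)G_X(x) + \int_{x=0}^{\infty} G_X(x)\,dh(x) \\
&= h(0) - \lim_{x\to\infty} h(x)G_X(x) + \int_{x=0}^{\infty} G_X(x)\,dh(x)
\end{aligned}$$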
But another derivation uses the inversion-of-a-double-integral technique in my Section 3:
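$$\begin{aligned}
E[h(X)] &= h(0)\operatorname{Prob}[X=0] + \int_{x=0}^{\infty} h(x)\,dF_X(x) \\
&= h(0)\operatorname{Prob}[X=0] + \int_{x=0}^{\infty}\left\{h(0) + \int_{y=0}^{x} dh(y)\right\}dF_X(x) \\
&= h(0)\operatorname{Prob}[X=0] + h(0)\int_{x=0}^{\infty} dF_X(x) + \int_{x=0}^{\infty}\int_{y=0}^{x} dh(y)\,dF_X(x) \\
&= h(0)\operatorname{Prob}[X=0] + h(0)\operatorname{Prob}[X>0] + \int_{y=0}^{\infty}\int_{x=y}^{\infty} dF_X(x)\,dh(y) \\
&= h(0)\cdot\operatorname{Prob}[X\ge 0] + \int_{y=0}^{\infty}\operatorname{Prob}[X>y]\,dh(y) \\
&= h(0) + \int_{x=0}^{\infty} G_X(x)\,dh(x)
\end{aligned}$$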
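The last line of this derivation can be checked numerically on a distribution with a point mass at zero. The sketch below is only an illustration under assumptions that are not in the paper: X equals 0 with probability 0.3 and is otherwise exponential with mean 1, h(x) = (x + 1)^2 is differentiable so that dh(x) = h'(x) dx, and the infinite integral is truncated at 50. The direct value is 0.3 · 1 + 0.7 · 5 = 3.8, because E[(Y + 1)^2] = 5 for an exponential Y with mean 1.

```python
import math

MASS_AT_ZERO = 0.3  # hypothetical probability that X = 0


def survival(x):
    """G_X(x) = P{X > x} for x >= 0: the exponential component carries weight 0.7."""
    return (1.0 - MASS_AT_ZERO) * math.exp(-x)


def h(x):
    return (x + 1.0) ** 2


def h_prime(x):
    return 2.0 * (x + 1.0)


def integral_of_survival_dh(upper=50.0, steps=200000):
    """Midpoint rule for the integral of G_X(x) h'(x) dx over (0, upper)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += survival(x) * h_prime(x)
    return total * width


via_survival = h(0.0) + integral_of_survival_dh()
direct = MASS_AT_ZERO * h(0.0) + (1.0 - MASS_AT_ZERO) * 5.0
print(via_survival, direct)
```

The term h(0) in the formula absorbs the point mass at zero, so no separate Prob[X = 0] term is needed on the survival-function side.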
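Comparison of the last lines of both derivations leads to the conclusion that if the integral for E[h(X)] converges, then $\lim_{x\to\infty} h(x)G_X(x)$ must be zero.

But one must not succumb to the fallacy of affirming the consequent. Even if $\lim_{x\to\infty} h(x)G_X(x) = 0$, the integral for E[h(X)] will not converge, unless $h(x)G_X(x)$ approaches zero quickly enough. Assume that $\lim_{x\to\infty} h(x) = \infty$; so, for large enough M, h(x > M) is positive. Restate the integral as:

$$E[h(X)] = h(0) + \int_{x=0}^{\infty} G_X(x)\,dh(x) = h(0) + \int_{x=0}^{M} G_X(x)\,dh(x) + \int_{x=M}^{\infty} h(x)G_X(x)\,\frac{dh(x)}{h(x)}$$

If $h(x)G_X(x)$ approaches zero on the order of an inverse power curve, i.e., $h(x)G_X(x) \approx 1/h(x)^{\varepsilon}$ with $\varepsilon > 0$:

$$\int_{x=M}^{\infty} h(x)G_X(x)\,\frac{dh(x)}{h(x)} \approx \int_{x=M}^{\infty} \frac{dh(x)}{h(x)^{1+\varepsilon}} = -\frac{1}{\varepsilon}\,\frac{1}{h(x)^{\varepsilon}}\Big|_{M}^{\infty} = \frac{1}{\varepsilon}\,\frac{1}{h(x)^{\varepsilon}}\Big|_{\infty}^{M} = \frac{1}{\varepsilon\,h(M)^{\varepsilon}}$$

Then the integral for E[h(X)] converges. But if $h(x)G_X(x)$ approaches zero on the order of an inverse logarithm, i.e., $h(x)G_X(x) \approx 1/\ln h(x)$, the integral will not converge:

$$\int_{x=M}^{\infty} h(x)G_X(x)\,\frac{dh(x)}{h(x)} \approx \int_{x=M}^{\infty} \frac{1}{\ln h(x)}\,\frac{dh(x)}{h(x)} = \ln(\ln h(x))\Big|_{M}^{\infty} = \infty - \ln(\ln h(M)) = \infty$$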
Therefore, $\lim_{x\to\infty} h(x)G_X(x) = 0$ is necessary, but not sufficient, for E[h(X)] to be a real number.
3. Stieltjes Integrals and Cardinality
Knowing just enough about Stieltjes integrals to be dangerous,[6] I used them in the paper only because the Stieltjes integral, unlike the classical or Riemann integral, allows for discontinuities in its integrator. In other words, the formulas using them accommodate discrete and mixed distributions. I am intrigued by Dr. Hong’s claim that “It is possible but quite difficult to construct a continuous random variable which has no density function.”[7]
Although the subtleties of measure theory are beyond me, the proof in my Footnote 11 was enough to justify the Stieltjes integral as a shorthand for the expectation of a mixed distribution:
$$E[g(X)] = \int_{x=-\infty}^{\infty} g(x)\,f_X(x)\,dx + \sum_{i=1}^{\infty} g(x_i)\operatorname{Prob}[X = x_i]$$
But to use this formula, one must prove that the number of points at which a random variable has positive probability is countable (hence indexable in the sigma operator). Although I knew that this was well-known to mathematicians, Dr. Hong’s remark surprised me: “We feel that the proof in Rudin (1976) is shorter and cleaner.” My response is: What can be simpler and cleaner than Cantor’s equation ℵ0 × ℵ0 = ℵ0? In words, a countable union of countable sets is countable. In regard to probability distributions this means that if the number of the probability masses of a random variable were uncountable, there would exist a positive value $\varepsilon$ such that the number of points whose probability is greater than $\varepsilon$ would be uncountable. But then the total probability would be infinite, rather than the required unity. Hence no random variable may have an uncountable number of mass points. This argument is so powerful that in Appendix B I will use it to prove the theorem of analytic continuation.
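The counting step behind this argument is that fewer than n points can each carry probability greater than 1/n, since otherwise the total would exceed one; the mass points are then the union over n of these finite sets. The sketch below checks that bound on a hypothetical discrete distribution with masses 2^{-k} for k = 1, ..., 40 plus a leftover mass of 2^{-40}, using exact fractions; the distribution and names are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical point masses: 1/2, 1/4, ..., 2**-40, plus a leftover 2**-40 so they sum to 1.
masses = [Fraction(1, 2 ** k) for k in range(1, 41)] + [Fraction(1, 2 ** 40)]
assert sum(masses) == 1


def atoms_heavier_than(point_masses, threshold):
    """Point masses strictly greater than the threshold."""
    return [m for m in point_masses if m > threshold]


for n in (2, 3, 10, 1000, 10 ** 9):
    heavy = atoms_heavier_than(masses, Fraction(1, n))
    # If n or more atoms each exceeded 1/n, their total would exceed 1.
    assert len(heavy) < n
    print(n, len(heavy))
```

The bound does not depend on this particular distribution; it follows from the total probability being one.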
In conclusion, I thank Dr. Hong for his discussion, and hope that the “ambitious actuaries” to whom at the end he appeals will continue to integrate the mathematics of excess losses into the broader wealth of modern mathematics.