Ludkovski, Michael, and Howard Zail. 2022. “Gaussian Process Models for Incremental Loss Ratios.” Variance 15 (1).
  • Figure 1. Distribution of ILRs for each of the six business lines
  • Figure 2. Training a GP model for a toy one-dimensional example
  • Figure 3. Three-dimensional view of the loss square for a representative comauto data set with red dots indicating the training $L_{p,q}$'s and black dots indicating the bottom triangle to be completed
  • Figure 4. Predictive distribution of the incremental loss ratios $L_{p,q}$ for three representative accident years
  • Figure 5. Left: Percentile rank of realized ultimate losses in terms of the predictive distribution of the ILR-Hurdle+Virt model across 57 wkcomp triangles. Right: Kolmogorov-Smirnov test across three models for wkcomp
  • Figure 6. Lengthscales $\rho_{AY}$ and $\rho_{DL}$ (left panel), observation variance $\sigma_q$ (middle panel), and hurdle probability $h_q$ (right panel) for three representative business lines
  • Figure 7. Top row: 1,000 conditional simulations of future cumulative losses $CC_{p,q}$ for AY = 1995, $q = 1, \ldots, 10$, and a representative comauto triangle. The solid cyan line is the predictive mean of $CC_{p,q}$, and the dashed red line represents the actual realized losses. (Left: ILR-Plain model; Right: ILR-Hurdle+Virt model.) Bottom row: Predictive density of $R^{ult} = \sum_p CC_{p,Q}$, together with the realized ultimate losses (vertical line) for the same triangle. (Left: GP ILR-Plain model; Right: Partial Bayesian versus Full Bayesian for ILR-Hurdle+Virt)
  • B. Step-Ahead RMSE
  • Figure 8. RMSE of cumulative loss ratios as a function of step-ahead n across the six business lines


We develop Gaussian process (GP) models for incremental loss ratios in loss development triangles. Our approach brings a machine learning, spatial-based perspective to stochastic loss modeling. GP regression yields a nonparametric predictive distribution for future losses and quantifies uncertainty across three distinct layers: model risk, correlation risk, and extrinsic uncertainty due to randomness in observed losses. To handle the statistical features of loss development analysis (spatial nonstationarity, convergence to ultimate claims, and heteroskedasticity), we develop several novel implementations of fully Bayesian GP models. We perform extensive empirical analyses over the NAIC loss development database across six business lines, comparing our models and demonstrating their strong performance. Our computational work is performed using the R and Stan programming environments and is publicly shareable.
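
To make the spatial view of a loss triangle concrete, the following is a minimal sketch of plain GP regression over the (accident year, development lag) grid. It is not the authors' R/Stan implementation: it is written in Python with NumPy, uses a zero-mean GP with a squared-exponential kernel, and fixes the hyperparameters by hand rather than inferring them in a fully Bayesian way. The triangle is synthetic, and every name and value (`sq_exp_kernel`, `gp_predict`, `rho_ay`, `rho_dl`, `eta`, `sigma`, the decay curve) is an assumption made for illustration. The hurdle and virtual-observation features mentioned in the figure captions are not modeled.

```python
import numpy as np


def sq_exp_kernel(X1, X2, rho_ay, rho_dl, eta):
    """Squared-exponential kernel with separate lengthscales for AY and DL."""
    d_ay = (X1[:, None, 0] - X2[None, :, 0]) / rho_ay
    d_dl = (X1[:, None, 1] - X2[None, :, 1]) / rho_dl
    return eta**2 * np.exp(-0.5 * (d_ay**2 + d_dl**2))


def gp_predict(X_train, y_train, X_test,
               rho_ay=3.0, rho_dl=2.0, eta=0.2, sigma=0.02):
    """Posterior mean and covariance of a zero-mean GP at X_test.

    Hyperparameters are fixed, hypothetical values (no Bayesian inference).
    """
    K = sq_exp_kernel(X_train, X_train, rho_ay, rho_dl, eta)
    K += sigma**2 * np.eye(len(X_train))          # observation noise
    K_s = sq_exp_kernel(X_test, X_train, rho_ay, rho_dl, eta)
    K_ss = sq_exp_kernel(X_test, X_test, rho_ay, rho_dl, eta)

    L = np.linalg.cholesky(K)                     # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    return mean, cov


# Synthetic 10 x 10 loss square: rows are accident years p, columns are lags q.
n = 10
rng = np.random.default_rng(0)
grid = np.array([(p, q) for p in range(1, n + 1) for q in range(1, n + 1)],
                dtype=float)

# Upper-left triangle (p + q <= n + 1) is observed; the rest must be completed.
observed = grid[:, 0] + grid[:, 1] <= n + 1
X_train, X_test = grid[observed], grid[~observed]

# Made-up incremental loss ratios that decay with development lag, plus noise.
y_train = 0.3 * np.exp(-0.5 * X_train[:, 1]) + rng.normal(0.0, 0.01, len(X_train))

mean, cov = gp_predict(X_train, y_train, X_test)
sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))    # pointwise predictive sd
```

Here `mean` and `sd` give a predictive mean and standard deviation for each cell of the unobserved lower triangle. Using one lengthscale per coordinate mirrors the separate AY and DL lengthscales named in the Figure 6 caption, but the stationary kernel and constant noise level are simplifications.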

Accepted: June 16, 2020 EDT