Embedded Predictive Analysis of Misrepresentation Risk in GLM Ratemaking Models
By Michelle Xia, Lei Hua, Gary James Vadnais
Misrepresentation is a type of insurance fraud that occurs frequently in policy applications. Because data on such fraud are rarely available, it is usually expensive or difficult to detect. Based on the distributional structure of regular ratemaking data, we propose a generalized linear model (GLM) framework that allows for an embedded predictive analysis of misrepresentation risk. In particular, we treat binary misrepresentation indicators as latent variables under GLM ratemaking models for rating factors that are subject to misrepresentation. Using a latent logistic regression model for the prevalence of misrepresentation, the model identifies characteristics of policies that carry a high risk of misrepresentation. The method allows for multiple factors that are subject to misrepresentation while accounting for other correctly measured risk factors. From the observed claim outcome and rating factors, we derive a mixture regression model structure that is identifiable. Identifiability ensures valid inference on the parameters of interest, including the rating relativities and the prevalence of misrepresentation. The usefulness of the method is demonstrated through simulation studies and a case study using the Medical Expenditure Panel Survey data.
Keywords: Misrepresentation, ratemaking, predictive analysis, generalized linear models, Bayesian inference, Markov chain Monte Carlo
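To make the mixture structure described in the abstract concrete, the following is a minimal sketch of an observed-data log-likelihood, not the implementation used in the paper. It rests on several illustrative assumptions that the abstract does not state: a single binary rating factor, a Poisson claim count with log link, misrepresentation in one direction only (a policyholder with true status 1 may report 0, never the reverse), and a latent logistic model for the probability that a reported 0 is in fact a 1. The names `observed_loglik`, `beta`, `alpha`, `x`, and `z` are hypothetical.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability P(Y = y) for mean mu > 0."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def observed_loglik(data, beta, alpha):
    """Log-likelihood of claim counts given the *reported* rating factor.

    data  : list of (y, v_star, x, z) tuples, where y is the claim count,
            v_star the reported binary status, x a correctly measured
            covariate, and z a covariate driving misrepresentation.
    beta  : (b0, b_v, b_x), log-link Poisson GLM coefficients (assumed form).
    alpha : (a0, a_z), latent logistic coefficients for the prevalence of
            misrepresentation among policies reporting 0 (assumed form).
    """
    b0, b_v, b_x = beta
    a0, a_z = alpha
    total = 0.0
    for y, v_star, x, z in data:
        mu0 = math.exp(b0 + b_x * x)        # mean if true status is 0
        mu1 = math.exp(b0 + b_v + b_x * x)  # mean if true status is 1
        if v_star == 1:
            # Assumption: a reported 1 is always truthful.
            lik = poisson_pmf(y, mu1)
        else:
            # A reported 0 is a two-component mixture over the latent status.
            q = expit(a0 + a_z * z)
            lik = (1.0 - q) * poisson_pmf(y, mu0) + q * poisson_pmf(y, mu1)
        total += math.log(lik)
    return total


# Hypothetical toy records: (claims, reported status, x, z)
toy = [(0, 0, 0.2, 1.0), (3, 0, -0.5, 0.0), (2, 1, 0.1, 1.0), (1, 1, 0.0, 0.0)]
ll = observed_loglik(toy, beta=(0.1, 0.8, 0.3), alpha=(-1.0, 0.5))
```

Under these assumptions, the mixture weight for a policy reporting 0 plays the role of the prevalence of misrepresentation, and `b_v` corresponds to the rating relativity on the log scale. When `b_v` is zero the two components coincide and the weight drops out of the likelihood, which gives some intuition for why identifiability needs to be established. A log-likelihood of this form could be combined with priors for Bayesian inference by Markov chain Monte Carlo, in line with the keywords.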