Spedicato, Giorgio Alfredo, and Marco De Virgilis. 2023. “A Review of ML Techniques for Vehicle Symbols Determination.” Variance 16 (2).
  • Figure 1. Autoencoder
  • Figure 2. Optics Terminology, Ankerst et al. (1999)
  • Figure 3. PCA screeplot
  • Figure 4. PCA two dimension projections
  • Figure 5. Principal components vs frequency, one-way analysis
  • Figure 6. t-SNE Deciles vs frequency, one-way analysis
  • Figure 7. Mixed data H2O Kmeans
  • Figure 8. GLRM two dimension projections
  • Figure 9. GLRM archetype Decile vs frequency, one-way analysis
  • Figure 10. Mixmod clustering structure
  • Figure 11. OPTICS clustering structure
  • Figure 12. MOB tree model
  • Figure 13. MOB groups
  • Figure 14. Vehicles representation in Deep Features space
  • Figure 15. Claim frequency vs deep feature


Vehicle symbols group vehicle types into homogeneous clusters for motor insurance ratemaking. Despite their relevance to the motor insurance industry, the actuarial literature has paid little attention to how they are determined. This paper reviews existing approaches and presents machine learning algorithms that can be used to define vehicle symbols using claim frequency as the risk measure. The methods are compared in terms of predictive performance. An empirical section illustrates the application of the discussed techniques using open-source software and freely available data sources.

The analysis found that a supervised approach based on Classification and Regression Trees (CART) outperformed the unsupervised approaches; however, some latent features extracted by Generalized Low Rank Models (GLRM) and autoencoders showed relatively competitive performance.

Accepted: September 20, 2021 EDT