Kuo, Kevin, and Ronald Richman. 2026. “Embeddings and Attention in Predictive Modeling.” Variance 19 (April). https://doi.org/10.66573/001c.159901.
Figures
  • Figure 1. Architecture of Models 2 and 4.
  • Figure 2. Learned embeddings of occupancy type visualized on a number line.
  • Figure 3. Learned embeddings of flood_zone by prefix.
  • Figure 4. First two principal components of learned embeddings of flood_zone in Model 4.
  • Figure 5. t-SNE plots of learned embeddings of flood_zone in Model 4, with various perplexity values. From left to right, then top to bottom: 2, 3, 5, and 10.
  • Figure 6. Architecture of Model 5.
  • Figure 7. Architecture of Model 6.
  • Figure 8. Embedding versus contextual embedding, flood zone, first PCA components.
  • Figure 9. Embedding versus contextual embedding, flood zone, first PCA components, points colored based on crawl space variable.
  • Figure 10. Histogram of the log claim amount.
  • Figure 11. Log-log plot of claim amount.
  • Figure 12. Distribution of observations across the primary residence covariate.
  • Figure 13. Distribution of observations across the occupancy type covariate.
  • Figure 14. Distribution of observations across the basement enclosure and crawl space covariate.
  • Figure 15. Distribution of observations across the number of floors covariate.
  • Figure 16. Distribution of observations across the flood zone covariate.
  • Figure 17. Histogram of the building coverage covariate.
  • Figure 18. Distribution of the community rating system discount covariate.

Abstract

We explore in depth how categorical data can be processed with embeddings in the context of claim severity modeling. We develop several models that range in complexity from simple neural networks to state-of-the-art attention-based architectures that utilize embeddings. We illustrate the utility of learned embeddings from neural networks as pretrained features in generalized linear models and discuss methods for visualizing and interpreting embeddings.
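As a minimal sketch of this workflow (assuming a Keras/TensorFlow implementation, a hypothetical flood_zone covariate with 60 integer-encoded levels, a two-dimensional embedding, and an illustrative loss; none of these specifics come from the paper itself), a severity network with an embedding layer might look like the following, with the learned embedding matrix extracted afterwards for use as pretrained features:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical setup: one categorical covariate (flood_zone, 60 levels,
# integer-encoded) plus two continuous covariates; sizes are illustrative.
n_levels, emb_dim = 60, 2

cat_in = layers.Input(shape=(1,), dtype="int32", name="flood_zone")
num_in = layers.Input(shape=(2,), name="continuous")

# Map each flood_zone level to a learned 2-dimensional vector.
emb = layers.Embedding(n_levels, emb_dim, name="flood_zone_embedding")(cat_in)
emb = layers.Flatten()(emb)

x = layers.Concatenate()([emb, num_in])
x = layers.Dense(16, activation="relu")(x)
# Exponential output keeps the predicted severity mean positive.
out = layers.Dense(1, activation="exponential")(x)

model = Model([cat_in, num_in], out)
model.compile(optimizer="adam", loss="mse")  # loss choice is an assumption

# After model.fit(...), the learned embedding matrix (n_levels x emb_dim)
# can serve as pretrained numeric features for flood_zone in a GLM.
pretrained = model.get_layer("flood_zone_embedding").get_weights()[0]
```

The rows of `pretrained` can also be projected to two dimensions with PCA or t-SNE for visualization and interpretation, in the spirit of Figures 4 and 5.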

Finally, we explore how attention-based models can contextually augment embeddings, leading to enhanced predictive performance.
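A hedged sketch of the contextual-augmentation idea (again assuming Keras, with hypothetical covariate cardinalities and a shared embedding width; the paper's actual attention architectures, such as Models 5 and 6, may differ): each categorical covariate is embedded into a common dimension, the embeddings are stacked into a sequence, and self-attention lets each embedding be adjusted by the others before prediction.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical cardinalities for four categorical covariates and a shared
# embedding width; both are illustrative, not taken from the paper.
cardinalities, d = [2, 4, 5, 60], 4

inputs, tokens = [], []
for i, n in enumerate(cardinalities):
    inp = layers.Input(shape=(1,), dtype="int32", name=f"cat_{i}")
    tokens.append(layers.Embedding(n, d)(inp))  # shape (batch, 1, d)
    inputs.append(inp)

# Stack the per-covariate embeddings into a sequence of "tokens".
seq = layers.Concatenate(axis=1)(tokens)  # shape (batch, 4, d)

# Self-attention: each covariate's embedding attends to the others, yielding
# contextual embeddings that depend on the rest of the risk's profile.
ctx = layers.MultiHeadAttention(num_heads=2, key_dim=d)(seq, seq)

x = layers.Flatten()(ctx)
out = layers.Dense(1, activation="exponential")(x)
model = Model(inputs, out)
```

Compared with the plain embedding model above, the same flood zone level can now map to different effective feature vectors depending on the other covariates, which is what "contextually augment" refers to.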

Accepted: June 19, 2023 EDT