Rebalancing the Off-Balance Factor with the Complement of Credibility

Joseph Boor

Boor, Joseph. 2022. “Rebalancing the Off-Balance Factor with the Complement of Credibility.” Variance 15 (1).

Abstract

The ratemaking algorithms used to calculate class factors, territory factors, allocations of rate changes to coverages and other types of rating values often require either an off-balance factor (off-balance correction), or, when capping is involved, a test correction factor. The current approach is to multiply a common off-balance or test correction factor by all the post-credibility rates, so that the weighted average of the final rates matches the overall rate indication. The present correction algorithm appears to have disparate impacts on some classes, though. For example, when correction factors are applied uniformly to the post-credibility rates, rates for already fully credible classes may be artificially raised or lowered.

However, mathematical equity does not specifically require that the current correction algorithm be used. Given that flexibility, it is important to use the most effective correction process possible. This paper considers an alternative process: applying the correction factor to the complement of credibility instead. It presents two ratemaking scenarios. In the limited fluctuation credibility scenario applying the off-balance correction to the complement of credibility appears to result in rates that are more reasonable than those the present process creates. Rates of fully credible classes are not altered, and the off-balance is split among the lower credibility classes that generate it. Further, this process does not sacrifice or mitigate the reduction in volatility that credibility provides. This process also creates the most plausible rates possible given the circumstances. Further, the mathematics of optimization suggest that the complement of credibility should also be used to distribute the off-balance in best estimate ratemaking. If one begins with best estimate credibility rates for each class, the paper shows that the final off-balance adjusted class rates have the minimum expected squared error in predicting the loss costs. Thus, that simple change in the ratemaking formula appears to be helpful in a wide variety of situations.

1. Introduction

Off-balance and test correction factors are pervasive in class ratemaking and other circumstances where an overall change must be allocated to subsets of a book of business, but some of the subsets are not fully credible. For example, the off-balance is specifically discussed in Modlin and Werner (2016) and the test correction factor is mentioned in Daley (2009). The two types of factors work similarly, but test correction factors are special off-balance factors that also compensate for rate capping. In some situations, the correction factor is minor, but in other cases it is a substantial part of the pricing. When the off-balance is significant, using the best correction algorithms available will help compute the most effective rates.

As part of fine tuning the off-balance algorithm, this paper presents formulas that eliminate some undesirable aspects of the current algorithm. Additionally, these new correction algorithms are based on the theories underlying each of the credibility formulas commonly used in ratemaking. So, they are individually attuned to the mathematics underlying the corresponding approaches to credibility. Therefore, one would expect them to generate more accurate class, etc. indications.

Of course, those statements must be supported in the body of this paper. As a starting point, the properties of the current algorithm are relevant. Usually, when individual class rates are made for a large group of classes, the resulting exposure-weighted average of the post-credibility rates does not match the raw overall average rate in the loss and exposure data. So, the rates (or loss costs) must be adjusted so that they do average to the (presumably fully credible or “almost”^[1] fully credible) overall average rate. Actuaries currently rebalance the rates resulting from the credibility process by multiplying all those post-credibility class rates by either a common off-balance factor or a common test correction factor to match the overall average rate. Of note, when the data elements underlying some of the individual rates are also fully credible (or almost fully credible), this will result in rates that are clearly inconsistent with the underlying data. In effect, because those classes are so credible by themselves, one may be fairly certain that, after altering the rates with the off-balance factor, the resulting rates will be either too high or too low. So, the challenge is to develop methods that simultaneously avoid this unreasonable behavior and best fit the rationale for each credibility method.

To illustrate how significant this issue can be, the next section will present two examples that could occur in common actuarial practice where the current algorithm creates some very significant distortions in some of the rates. Following some background to support the notation and certain formulas, a corrected off-balance algorithm (“Method 1”) for limited fluctuation credibility is presented in the article. Next, a similar algorithm (“Method 2”) for best estimate credibility is presented. Note that the methods involve exactly the same formula. Therefore, given the same final credibilities (an unlikely event given the differences in their basic goals) and the same data, the results of the two methods are identical. Then, the special concerns and relevance of the methods in a test correction situation are discussed. In each section, examples are provided. Lastly, further support and extensions of Method 2 are provided in appendices.

2. The Current Off-Balance Factor Algorithm in Typical Class and Subline Ratemaking Schemes

Two examples of how the current off-balance factor works in practice are provided in this section. This should lend some perspective to issues present in the current algorithm. Each begins with some sample data and shows all the resulting calculations and final indications. Reviewing those, one may evaluate the effectiveness of the current algorithm.

The first case, in Table 1, involves the subsets of a small personal lines book of business (although the volatility might also match a medium sized commercial book), where the complement of credibility is assigned to no rate change. The subsets are listed as coverages, but they could also be states, producers, major classes, or any other relevant split of the book of business into subsets. The credibility of the individual coverages ranges from modest to full credibility, presumably per a limited fluctuation credibility approach. Although this is not the type of workers compensation or general liability class ratemaking problem where one would see a correction factor in the actuarial literature, this often arises in practice. One may see that the credibility adjusted changes for individual coverages combine together to produce a 6.9% increase. Consequently, they do not generate a sufficient overall change to match the 14.4% increase that the all classes combined dataset indicates is necessary. Therefore, standard ratemaking technique would introduce an additional loading factor, the off-balance factor, which would be multiplied by all the indicated “change ratios” (1.0+the rate change) to obtain the final rate changes by coverage. As one may see, the fully credible +16.9% for Coverage E becomes +25.1% after off-balance correction. That would appear to be excessive compared to the fully credible +16.9% indication. The other rate indications appear to be somewhat plausible, but given the limited credibility it is difficult to assess them critically. Considering class E, this approach appears to distort the rates in some classes (in this case, a class) considerably.

Next, calculations using the typical class rate ratemaking algorithm with an off-balance factor are shown below in Table 2. This tableau is an abstract version of what one might see in an individual company’s internal rate analysis or rate filing, or in rating/advisory organization ratemaking. So, in this case, the complement of credibility is assigned to the overall average rate.

A few aspects of the calculations should be disclosed, though. First, in this context, expense loadings create unnecessary calculations. Therefore, all the examples in this paper (except Tables 1 and 3) use “rate” to denote what are actually “loss costs”. Second, generalized exposures are used, but this could just as easily have used current level earned premium and loss ratios rather than rates. Third, the full credibility standard should be mentioned. Of course, given freedom to choose any confidence level and accuracy level, the full credibility standard may take any value. But, in order to use the limited fluctuation credibility process, some standard must be chosen. Since it is already present in the literature, a full credibility standard of 683 expected claims was used to generate the credibilities.

Table 1. Allocation of an Overall Change to a Few Subsets with Sample Loss Ratio Data

	(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)
	(Data)	(Data)	(2)/(1)	(Data)	(Data)	[((3)/(5))-1.0] $\times$ (4)	[Combined(6)]/[Weighted(6)]	(1.0+(6)) $\times$ (7)-1.0
Coverage	On-Level Earned Premium	Trended Ultimate Losses	Loss Ratio	Credibility	Permissible Loss Ratio	Indicated Changes	Off-Balance Factor	Revised Indicated Changes
A	$ 450,000	$ 900,000	200.0 %	25 %	65 %	52.3 %		63.0 %
B	$ 500,000	$ 500,000	100.0 %	27 %	65 %	14.3 %		22.3 %
C	$ 1,000,000	$ 800,000	80.0 %	38 %	65 %	8.7 %		16.3 %
D	$ 3,000,000	$ 1,400,000	46.7 %	65 %	65 %	-18.3 %		-12.6 %
E	$ 5,000,000	$ 3,800,000	76.0 %	100 %	65 %	16.9 %		25.1 %
Weighted Average						6.9 %		14.4 %
Combined Total	$ 9,950,000	$ 7,400,000	74.4 %	100 %	65 %	14.4 %	1.070
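
The arithmetic behind Table 1 can be retraced in a few lines of Python. This is only an illustrative sketch: the variable names are invented for the example, and the credibilities are the rounded percentages printed in the table, so results may differ from the table in the last digit.

```python
# Illustrative sketch of the current (uniform) off-balance factor using the
# Table 1 inputs. All names are hypothetical; the credibilities are the
# rounded percentages printed in the table.
premium = {"A": 450_000, "B": 500_000, "C": 1_000_000, "D": 3_000_000, "E": 5_000_000}
losses = {"A": 900_000, "B": 500_000, "C": 800_000, "D": 1_400_000, "E": 3_800_000}
credibility = {"A": 0.25, "B": 0.27, "C": 0.38, "D": 0.65, "E": 1.00}
PERMISSIBLE_LR = 0.65

# Column (6): credibility-weighted indicated change (complement = no change).
indicated = {
    k: (losses[k] / premium[k] / PERMISSIBLE_LR - 1.0) * credibility[k]
    for k in premium
}

total_premium = sum(premium.values())
weighted_avg = sum(premium[k] * indicated[k] for k in premium) / total_premium
combined = sum(losses.values()) / total_premium / PERMISSIBLE_LR - 1.0

# Column (7): one common factor applied to every change ratio.
off_balance_factor = (1.0 + combined) / (1.0 + weighted_avg)

# Column (8): revised indications after the uniform correction.
revised = {k: (1.0 + indicated[k]) * off_balance_factor - 1.0 for k in premium}

for k in premium:
    print(f"{k}: indicated {indicated[k]:+.1%}, revised {revised[k]:+.1%}")
print(f"off-balance factor: {off_balance_factor:.3f}")
```

Because the same factor multiplies every change ratio, the fully credible Coverage E is moved just as much, proportionally, as the low-credibility coverages.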

Table 2. Ratemaking Scenario with Off-Balance Factor Using Sample Class Ratemaking Data

Main Calculations for Class Rates
	(1)	(2)	(3)	(4)	(5)	(6)	(7)
	(Data)	(Data)	(2)/(1)	(1) $\times$ C	$\sqrt{(4)/683}$	(3) $\times$ (5)+A(1.0-(5))	(6) $\times$ E
Class	Exposure	Losses	Raw Rate "L"	Expected Claim Count	Credibility of Class	Credibility Adjusted Rate	Corrected Rate
1	25	$ 78,427	$ 3,137	65	31 %	$ 1,324	$ 1,564
2	30	$ 40,687	$ 1,356	78	34 %	$ 800	$ 946
3	36	$ 65,073	$ 1,808	93	37 %	$ 994	$ 1,174
4	43	$ 35,837	$ 830	112	40 %	$ 644	$ 761
5	52	$ 59,918	$ 1,156	134	44 %	$ 800	$ 946
6	62	$ 72,435	$ 1,164	161	49 %	$ 832	$ 983
7	75	$ 63,990	$ 857	193	53 %	$ 698	$ 825
8	90	$ 57,059	$ 637	232	58 %	$ 587	$ 694
9	107	$ 90,110	$ 838	278	64 %	$ 722	$ 853
10	129	$ 46,934	$ 364	334	70 %	$ 410	$ 485
11	155	$ 47,281	$ 305	401	77 %	$ 355	$ 420
12	186	$ 54,427	$ 293	481	84 %	$ 329	$ 389
13	223	$ 69,726	$ 313	577	92 %	$ 329	$ 389
14	267	$ 64,108	$ 240	692	100 %	$ 240	$ 283
15	321	$ 86,197	$ 269	831	100 %	$ 269	$ 317

Total	1,801	$ 932,211	$ 518	4,661		$ 438	$ 518
Reference Values for All Classes
	A =Needed Overall Average Rate					$518
	B = Severity (Computed Outside the Table)					$200
	C = A/B = Expected Claims/Exposure					2.588
	D = Average Z-Adjusted Rate					$438
	E = A/D = Off-Balance Factor					1.1814

The general setup of the table is as follows. The top portion of the table (“Main Calculations for Class Rates”) has the main analysis. The bottom portion (“Reference Values for All Classes”) has computations of various overall values that are needed for the main analysis. The table is annotated with the calculation formulas that describe each column. In order to compute the credibility values, the average severity is needed. The severity value used in the table, $200, is an input needed in addition to the class data. Of course, actuaries compute average severities fairly often, so computing this should not create a burden.

Note that with this type of data and complement of credibility a very large off-balance may be required. In this example it is 1.1814. Consequently, the off-balance appears to generate clearly excessive rates for nearly fully credible classes in addition to fully credible classes. The schemes in Tables 1 and 2 are similar, and as the examples suggest, both generate high corrections. Corrections this substantial may not arise in all ratemaking situations, but the examples illustrate how they can come about. So, there are situations where the off-balance factor is both very significant and very prone to distort fully credible rates.
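
The Table 2 calculation can be sketched in Python as follows. The names are hypothetical, and the exposures and losses are the rounded values printed in the table, so the output may differ slightly from the table entries.

```python
import math

# Class data as printed in Table 2 (rounded); names are illustrative only.
exposure = [25, 30, 36, 43, 52, 62, 75, 90, 107, 129, 155, 186, 223, 267, 321]
losses = [78427, 40687, 65073, 35837, 59918, 72435, 63990, 57059,
          90110, 46934, 47281, 54427, 69726, 64108, 86197]
SEVERITY = 200.0          # reference value B in Table 2
FULL_CRED_CLAIMS = 683.0  # full credibility standard

total_exposure = sum(exposure)
overall_rate = sum(losses) / total_exposure    # reference value A
claims_per_exposure = overall_rate / SEVERITY  # reference value C

adjusted = []
for e, l in zip(exposure, losses):
    raw_rate = l / e                                             # column (3)
    expected_claims = e * claims_per_exposure                    # column (4)
    z = min(1.0, math.sqrt(expected_claims / FULL_CRED_CLAIMS))  # column (5)
    adjusted.append(z * raw_rate + (1.0 - z) * overall_rate)     # column (6)

# Reference values D and E: average adjusted rate and off-balance factor.
avg_adjusted = sum(e * r for e, r in zip(exposure, adjusted)) / total_exposure
off_balance_factor = overall_rate / avg_adjusted

# Column (7): every class rate is scaled by the same factor.
corrected = [r * off_balance_factor for r in adjusted]

print(f"off-balance factor: {off_balance_factor:.4f}")
```

Since classes 14 and 15 are fully credible, their corrected rates exceed their own raw rates by the full off-balance factor.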

A few things may be said about when such a large off-balance may arise. It appears that when the data multiplied by the complement of credibility is very similar to the new pre-credibility indications the off-balance will have a smaller size (up or down). Also, when the credibility of each class or other rating element is high, there is a lesser off-balance. Further, the situation is amplified when the smaller classes generally have lower indications than the larger classes. So, the amount of concern for the outsized impact of the off-balance on the high credibility classes varies from ratemaking situation to ratemaking situation. However, as the examples in Tables 1 and 2 illustrate, the impact can be quite significant.

3. Background: Notation and an Optimization Technique

Later sections deal with alternate views of the off-balance. To make the discussion in those sections easier to follow, this section contains some notation for various quantities used in rate calculations, as well as a reminder about the mathematics associated with finding values that simultaneously provide a best estimate and fulfill certain limitations (a “constraint”).

The notation is:

$\begin{align} n &= \mbox{ the number of classes (more generally} \\ &\quad \ \mbox{ subsets, etc.) included in the line of} \\ &\quad \ \mbox{ business or program;}\\ i &= \mbox{ index running through the individual} \\ &\quad \ \mbox{ classes;}\\ e_i &= \mbox{ exposures for class $i$;} \\ l_i &= \mbox{ historical losses for class $i$;} \\ L_i &= \frac{l_i}{e_i} = \mbox{ unadjusted, pre-credibility loss} \\ &\quad \ \mbox{ rate for class $i$;} \\ Z_i &= \mbox{ credibility for class $i$;} \\ M &= \ \frac{\sum_1^n l_i}{\sum_1^n e_i} = \mbox{ overall loss rate (grand} \\ & \ \quad \mbox{ mean), (so }Z_iL_i+(1-Z_i)M \mbox{ is the} \\ &\quad \ \mbox{ post-credibility loss rate for class }i \mbox{);} \\ T_i &= \mbox{ off-balance add-in for class $i$;} \\ r_i &= L_iZ_i+(1-Z_i)M +T_i \\ &= \mbox{ final off-balance corrected loss rate} \\ &\quad \ \mbox{ for class $i$;} \\ \mu_i &= \mbox{ true (but unknown) underlying loss} \\ &\quad \ \mbox{ rate for class $i$ (target of the} \\ &\quad \ \mbox{ ratemaking process); and} \\ C &= \mbox{ multiplier, constant across all classes,} \\ &\quad \ \mbox{ to be used in off-balance and, later,} \\ &\quad \ \mbox{ test correction.} \\\end{align} \tag{3.1}$

Note that the off-balance or test correction is expressed as an add-in, rather than its common form as a factor. This will facilitate later computations (which, in all the cases in this paper, do eventually result in factors).

Notation associated with a specific method, the “Lagrange Multiplier” method from calculus, is needed. One method for calculating the off-balance add-ins requires the computation of the best estimate when the $T_i$ ’s are limited to values which fulfill a constraint. For example, a method may target that the $r_i$ ’s generated by the $T_i$ ’s minimize error in some way. But all off-balance approaches also require that the constraint $\sum e_ir_i/\sum e_i = M$ (that the average rate resulting from the process matches the average rate in the data) be satisfied.

The notation in this case uses an expression, $S(T_1, ... , T_n),$ whose minimum sets the criterion (for example, “least squared error”) for the best estimate of the true underlying means $\mu_i.$ However, that minimum must only consider the values that fulfill the constraint. So, the calculations also require a Lagrange multiplier, $\lambda,$ times a constraint term requiring that the off-balance-corrected rates weight to the overall mean $M.$ In effect, the combined function, defined here as “ $Q$ ”, would look like

$\begin{align} &Q(T_1, ... ,T_n) \\ &\ = S(T_1, ... ,T_n) \\ &\quad \ -\lambda \left[\frac{\sum_1^n e_i(Z_iL_i+(1-Z_i)M+T_i)}{\sum_1^ne_i }- M \right]. \end{align} \tag{3.2}$

Then, one must compute the corresponding partial derivatives with respect to each of the $T_i$ ’s (and $\lambda$ as well). The mathematics of Lagrange multipliers then dictates that the values of the $T_i$ ’s and $\lambda$ where all those derivatives are simultaneously zero will yield the minimum possible value of $S,$ subject to the constraint.
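
Written out for the function $Q$ in (3.2), these conditions take the following form. This is a restatement of the standard first-order conditions, with $j$ introduced here only as the index of the variable being differentiated:

$\begin{align} \frac{\partial Q}{\partial T_j} &= \frac{\partial S}{\partial T_j} - \lambda \frac{e_j}{\sum_1^n e_i} = 0, \quad j = 1, \ldots, n; \\ \frac{\partial Q}{\partial \lambda} &= -\left[\frac{\sum_1^n e_i(Z_iL_i+(1-Z_i)M+T_i)}{\sum_1^n e_i} - M\right] = 0. \end{align}$

The second condition simply restates the balance constraint, while the first ties each $\partial S/\partial T_j$ to that class's share of the total exposures.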

4. Method 1: Leave the Credible Data Alone—Spreading the Off-Balance Correction Across the Complement of Credibility

The first step in establishing an off-balance algorithm is to carefully articulate the goal of the algorithm. Among all possible ways to allocate the overall off-balance, this establishes a way to determine which one is best. As may be seen in the remainder of the paper, different off-balance approaches result from different definitions of why an off-balance approach is “best”. Once each such target is set, the problem may be expressed in mathematical terms and the optimum algorithm may be calculated.

This case begins with the theory of “square root” or “limited fluctuation” credibility (and similar credibility calculations). In that methodology, the experience data is the focus of the calculation, and the statistic receiving the complement of credibility is essentially ballast. So, from that viewpoint, it makes sense to decide that the $Z_iL_i$ ’s should be unaffected under the correction algorithm. Rather, the aggregate off-balance

$M\sum_{i = 1}^n e_i- \sum_{i = 1}^n e_i \times \left[Z_iL_i+(1-Z_i)M\right] \tag{4.1}$

should be pro-rated among the $e_i(1-Z_i)M$ ’s. This is still a multiplier approach. But here the multiplicative factor is multiplied by a different “basis” for the correction^[2]. In this case the factor is multiplied by the complement of credibility-based portion of each rate, or $T_i=C\times(1-Z_i)M.$ This approach preserves the basic credibility process. Further, since the large, fully credible or almost fully credible classes have very small complements of credibility, the correction algorithm will barely affect them. As a result, this off-balance correction algorithm eliminates one of the main problems with the current approach.
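
The multiplier $C$ follows directly from the balance constraint. As a short worked step (assuming at least one class has $Z_i<1,$ so that the denominator is not zero), substituting $T_i=C\times(1-Z_i)M$ into $\sum e_ir_i = M\sum e_i$ and solving for $C$ gives

$\begin{align} C &= \frac{M\sum_{i=1}^n e_i-\sum_{i=1}^n e_i\left[Z_iL_i+(1-Z_i)M\right]}{\sum_{i=1}^n e_i(1-Z_i)M}, \\ r_i &= Z_iL_i+(1+C)(1-Z_i)M. \end{align}$

The numerator is the aggregate off-balance in (4.1), and the denominator is the total of the losses attributable to the complement of credibility. A class with $Z_i=1$ has $r_i=L_i$ regardless of the value of $C.$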

Considering the details underlying limited fluctuation credibility, one may make a stronger statement about this method. Specifically, this method creates the most compliance with the underlying theory of limited fluctuation credibility. Therefore, it is optimal. The reasons for that begin with the basic theory of limited fluctuation credibility. The original theory held that one was limiting the probability that randomness in the loss data would cause a spurious increase or decrease beyond some chosen threshold, limiting say, the probability of a spurious 15% increase to under 5%. The current usage seems to focus instead on spurious differences from some benchmark (“ $A$ ” in Table 2). However, the current off-balance factor, when it is above unity, magnifies the size of all the rates, including any rates that are close to the threshold. Thus, it enhances the probability that the criteria underlying the credibility will not be fulfilled. Since this “Method 1” adds to the benchmark or complement term and not the random loss rate $L_i,$ it could still swing a rate beyond the threshold. However, it does not disproportionately affect the rates with larger changes that are somewhat more prone to exceed the threshold.

Other issues must be considered as well. Just like most of the insurance premium, considerations of equity would require that the off-balance be allocated using expected losses. Since the only available unbiased estimators of $\mu_i$ are $L_i$ and $M,$ any allocation of the off-balance that tracks evenly across expected losses must use the same linear combination of the $L_i$ ’s and $M$ for each of the $i$ ’s. Since the $L_i$ ’s are more volatile, assigning all of the off-balance to $M$ would clearly be optimal. Thus, it is an optimal method.

Further, given that the only unbiased estimators for each class rate are the raw data rate and the overall average rate, and that the credibility is to be maintained (eliminating the possibility of varying the mix of the two unbiased estimators from class to class^[3]), this method generates the most plausible rates given the data and credibility values.

For illustration, the two examples in Section 2 may be reworked with the off-balance spread across the complement of credibility rather than the entire rate. Both the final indications resulting from applying the off-balance to the complement of credibility and the values from Table 1 in Section 2 that they replace are shown in Table 3, so the final results may be readily compared.

As one may see, the rate for the fully credible coverage E is now proper, and the smaller classes bear most of the weight of the off-balance they are primarily responsible for. The largest effects of the credibility process are on coverages A and B, with their pre-credibility 200% and 100% loss ratios and lower credibility. However, the resulting indications for those coverages are still very reasonable in comparison to the raw data. The combined credibility and off-balance computations in Table 3 did not affect the indications for those classes as much as the process in Table 1 reduced them. However, spreading the off-balance without using Coverage E meant that Coverages C and D were more affected than in Table 1. From a standpoint of overall fairness, though, this appears to be much more equitable than the present approach.
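
The Table 3 allocation can be sketched in Python as follows. Again this is only an illustration: the names are hypothetical and the credibilities are the rounded percentages printed in the table, so the results may differ from the table by a few tenths of a percentage point.

```python
# Illustrative sketch of the Table 3 allocation: the dollar deficiency is
# pro-rated by the dollars in each coverage's complement of credibility.
# Names are hypothetical; credibilities are the rounded printed percentages.
premium = {"A": 450_000, "B": 500_000, "C": 1_000_000, "D": 3_000_000, "E": 5_000_000}
losses = {"A": 900_000, "B": 500_000, "C": 800_000, "D": 1_400_000, "E": 3_800_000}
credibility = {"A": 0.25, "B": 0.27, "C": 0.38, "D": 0.65, "E": 1.00}
PERMISSIBLE_LR = 0.65

# Column (6): credibility-weighted indicated change.
indicated = {
    k: (losses[k] / premium[k] / PERMISSIBLE_LR - 1.0) * credibility[k]
    for k in premium
}
total_premium = sum(premium.values())
weighted_avg = sum(premium[k] * indicated[k] for k in premium) / total_premium
combined = sum(losses.values()) / total_premium / PERMISSIBLE_LR - 1.0

# Columns (7)-(8): dollars attributable to the complement of credibility.
complement_dollars = {
    k: premium[k] * (1.0 - credibility[k]) * PERMISSIBLE_LR for k in premium
}
total_complement = sum(complement_dollars.values())

# Column (9): dollar deficiency of the credibility-weighted indications.
deficiency = total_premium * (combined - weighted_avg)

# Column (10): deficiency pro-rated by the complement dollars.
correction = {k: deficiency * complement_dollars[k] / total_complement for k in premium}

# Column (11): final indication = base indication + correction as a rate.
final = {k: indicated[k] + correction[k] / premium[k] for k in premium}

for k in premium:
    print(f"{k}: base {indicated[k]:+.1%}, final {final[k]:+.1%}")
```

Coverage E has no complement dollars, so it receives none of the correction and keeps its fully credible indication.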

Sample calculations using this method with the by-class data from Table 2 are contained in Table 4. The table is annotated with the calculation formulas as well as the mathematical formulas that describe each column. As with Table 3, the results of Table 2 are shown in column (12) for comparison.

As one might expect, this generates much improved rates for classes 14 and 15. The class 13 rate is also much improved. Chance generated little difference in the class 12 rate. Lastly, the rates for the other classes are generally increased, in keeping with their loss experience generally being higher than the complement of credibility. So, once again, applying the off-balance correction to the complement of credibility term appears to generate a better result than the present system.
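
A minimal Python sketch of Method 1 on the class data follows. The names are hypothetical and the inputs are the rounded exposures and losses printed in Table 2, so small differences from Table 4 are to be expected.

```python
import math

# Class data as printed in Table 2 (rounded); names are illustrative only.
exposure = [25, 30, 36, 43, 52, 62, 75, 90, 107, 129, 155, 186, 223, 267, 321]
losses = [78427, 40687, 65073, 35837, 59918, 72435, 63990, 57059,
          90110, 46934, 47281, 54427, 69726, 64108, 86197]
SEVERITY = 200.0
FULL_CRED_CLAIMS = 683.0

total_exposure = sum(exposure)
M = sum(losses) / total_exposure  # overall loss rate
claims_per_exposure = M / SEVERITY

# Limited fluctuation credibility and raw rate for each class.
Z = [min(1.0, math.sqrt(e * claims_per_exposure / FULL_CRED_CLAIMS)) for e in exposure]
L = [l / e for l, e in zip(losses, exposure)]

# Losses implied by the credibility-adjusted rates, and the portion of those
# losses that comes from the complement of credibility.
adjusted_losses = sum(e * (z * r + (1.0 - z) * M) for e, z, r in zip(exposure, Z, L))
complement_losses = sum(e * (1.0 - z) * M for e, z in zip(exposure, Z))

# Off-balance in dollars, spread over the complement term only:
# T_i = C * (1 - Z_i) * M.
shortfall = sum(losses) - adjusted_losses
C = shortfall / complement_losses
final = [z * r + (1.0 + C) * (1.0 - z) * M for z, r in zip(Z, L)]

for i, (raw, r) in enumerate(zip(L, final), start=1):
    print(f"class {i:2d}: raw {raw:8.0f}  final {r:8.0f}")
```

In this sketch the two fully credible classes come out exactly at their raw rates, while the exposure-weighted average of the final rates still equals the overall loss rate.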

Thus, the strengths of this approach are its mitigation of off-balance effects on fully credible and nearly fully credible classes, its focus of the correction on the classes generating most of the off-balance, and the number of circumstances in which it may be used. When the off-balance matters, this can improve the quality of the rates.

5. Method 2: An Approach for Best Estimate Credibility: Minimizing the Expected Squared Error After Off-Balance

As discussed in Section 4, best estimates are driven by the mathematical definition chosen for “best”^[4]. This section relates to the “ $P/(P+K)$ ” $(e_i/(e_i+K)$ in the notation of this paper) best estimate credibility developed in Bailey (1945), where $K$ is the ratio of the expected process variance of the losses in a single unit of $P$ or $e$ divided by the variance of the hypothetical (or possible) values of the mean loss rate. Most actuaries would say that the $P/(P+K)$ formula and formulas derived from it are the most commonly used best estimate credibility formulas. Hence, the assumptions and approach underlying that formula are the foundation of the off-balance approach in this section.

Table 3. Complement of Credibility Off-Balance Allocations for Limited Fluctuation Credibility with the Few Subsets/Classes in Table 1

	(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)	(9)	(10)	(11)	(12)
	(Data)	(Data)	(2)/(1)	(Data)	(Data)	[((3)/(5))-1.0] $\times$ (4)	(1.0-(4)) $\times$ (5)	(1) $\times$ (7)	(1) $\times$ ([Combined (6)]-[Weighted (6)])	(9) $\times$ (8)/[Total (8)]	(6)+[(10)/(1)]	Table 1 Col. (8)
Coverage	On-Level Earned Premium	Trended Ultimate Losses	Loss Ratio	Credibility	Permissible Loss Ratio	Indicated Changes	Complement of Credibility Term	Dollar Amount of Complement	Dollar Deficiency in Base Calculation	Off-Balance Correction Pro-Rated by Dollars in Complement	Final Off-Balance Corrected Indications	Indications Under Old Off-Balance Approach
A	$ 450,000	$ 900,000	200.0 %	25 %	65 %	52.3 %	48.6	$ 218,865		$ 105,474	75.7 %	63.0 %
B	$ 500,000	$ 500,000	100.0 %	27 %	65 %	14.3 %	47.8	$ 238,758		$ 115,061	37.3 %	22.3 %
C	$ 1,000,000	$ 800,000	80.0 %	38 %	65 %	8.7 %	40.6	$ 406,070		$ 195,691	28.2 %	16.3 %
D	$ 3,000,000	$ 1,400,000	46.7 %	65 %	65 %	-18.3 %	22.8	$ 682,500		$ 328,906	-7.4 %	-12.6 %
E	$ 5,000,000	$ 3,800,000	76.0 %	100 %	65 %	16.9 %	0.0	$ 0			16.9 %	25.1 %
Weighted Average						6.9 %					14.4 %	14.4 %
Combined Total	$ 9,950,000	$ 7,400,000	74.4 %	100 %	65 %	14.4 %		$ 1,546,192	$ 745,132	$ 745,132

Table 4. Complement of Credibility Off-Balance Allocations for Limited Fluctuation Credibility on Class Data from Table 2

Main Calculations for Class Rates
	(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)	(9)	(10)	(11)	(12)
$i$	$e_i$	$l_i$	$L_i$	$c_i$	$Z_i$	$Z_iL_i+(1-Z_i)M$	$[Z_iL_i+(1-Z_i)M]\times e_i$	$e_i(1-Z_i)M$	$e_i(1-Z_i)M\times (1+C)$	$e_i\times r_i$	$final \ r_i$	$old \ r_i$
	(Data)	(Data)	(2)/(1)	(1) $\times$ C	$\sqrt{(4)/683}$	(3) $\times$ (5)+A(1.0-(5))	(6) $\times$ (1)	(1) $\times$ (1.0-(5)) $\times$ A	(8) $\times$ H	(5) $\times$ (3) $\times$ (1)+(9)	(10)/(1)	Table 2 Col. (7)
Class	Exposure	Losses	Raw Rate "L"	Expected Claim Count	Credibility of Class	Credibility Adjusted Rate	Losses in Adjusted Rates	Losses From Complement of Credibility	Complement Losses After Off-Balance Correction	Off-Balance Corrected Total Losses	Off-Balance Corrected Rate	Indications Under Old Off-Balance
1	25	$ 78,427	$ 3,137	65	31 %	$ 1,324	$ 33,097	$ 8,958	$ 15,465	$ 39,605	$ 1,584	$ 1,564
2	30	$ 40,687	$ 1,356	78	34 %	$ 800	$ 24,012	$ 10,293	$ 17,771	$ 31,489	$ 1,050	$ 946
3	36	$ 65,073	$ 1,808	93	37 %	$ 994	$ 35,787	$ 11,752	$ 20,290	$ 44,324	$ 1,231	$ 1,174
4	43	$ 35,837	$ 830	112	40 %	$ 644	$ 27,814	$ 13,314	$ 22,986	$ 37,486	$ 868	$ 761
5	52	$ 59,918	$ 1,156	134	44 %	$ 800	$ 41,498	$ 14,941	$ 25,795	$ 52,352	$ 1,010	$ 946
6	62	$ 72,435	$ 1,164	161	49 %	$ 832	$ 51,736	$ 16,567	$ 28,602	$ 63,771	$ 1,025	$ 983
7	75	$ 63,990	$ 857	193	53 %	$ 698	$ 52,124	$ 18,089	$ 31,231	$ 65,265	$ 874	$ 825
8	90	$ 57,059	$ 637	232	58 %	$ 587	$ 52,598	$ 19,353	$ 33,413	$ 66,657	$ 744	$ 694
9	107	$ 90,110	$ 838	278	64 %	$ 722	$ 77,642	$ 20,130	$ 34,753	$ 92,265	$ 858	$ 853
10	129	$ 46,934	$ 364	334	70 %	$ 410	$ 52,902	$ 20,088	$ 34,681	$ 67,495	$ 523	$ 485
11	155	$ 47,281	$ 305	401	77 %	$ 355	$ 54,971	$ 18,759	$ 32,386	$ 68,598	$ 443	$ 420
12	186	$ 54,427	$ 293	481	84 %	$ 329	$ 61,145	$ 15,482	$ 26,728	$ 72,392	$ 390	$ 389
13	223	$ 69,726	$ 313	577	92 %	$ 329	$ 73,422	$ 9,338	$ 16,122	$ 80,206	$ 360	$ 389
14	267	$ 64,108	$ 240	692	100 %	$ 240	$ 64,108	-	-	$ 64,108	$ 240	$ 283
15	321	$ 86,197	$ 269	831	100 %	$ 269	$ 86,197	-	-	$ 86,197	$ 269	$ 317

Total	1,801	$ 932,211	$ 518	4,661		$ 438	$ 789,053	$ 197,065	$ 340,223	$ 932,211	$ 518	$ 518
Reference Values for All Classes
	A =Needed Overall Average Rate					$ 518
	B = Severity (Computed Outside the Table)					$ 200
	C = A/B = Expected Claims/Exposure					2.588
	D = Total Losses in Data					$ 932,211
	E = Total Losses in Adjusted Rates					$ 789,053
	F = D-E = Off-Balance in $					$ 143,158
	G = Total Losses in $(1-Z_i)$ Term					$ 197,065
	H = 1.0+F/G = Off-Balance Factor					1.726

Therefore, the assumptions and definition of the best estimate in this section will match those of that article. The first assumption is that the overall mean rate in the data $(M)$ contains so little process or parameter variance that it can be functionally treated as an exact measure of the overall average expected loss rate. The remainder of the Bailey (1945) assumptions are: that each true but unknown class mean $\mu_i$ is an independent sample from a common distribution with mean equal to the overall mean $M$ and parameter variance $\sigma^2$ ; and, that all individual exposures have a common, independent from everything else and all other exposures, process variance of $s^2$ per exposure. Thus, each raw loss rate $L_i$ is affected by the “observation” error $(L_i-\mu_i)$ with process variance $s^2/e_i,$ that interferes with predicting the underlying but unknown true loss rates $(\mu_i$ ’s). Then, to estimate each $\mu_i$ using $L_i$ and $M,$ the Bailey formula chooses each credibility $Z_i$ to be $e_i/(e_i+K),$ where $K=\frac{\mbox{expected process variance}}{\mbox{variance of the hypothetical means}}=\frac{s^2}{\sigma^2},$ to minimize^[5] each expected squared error term

$E\left[ \{Z_iL_i+(1-Z_i)M-\mu_i\}^2|\sigma^2,s^2\right]. \tag{5.1}$
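
As a brief worked check of that claim under the assumptions just listed (a sketch, not the derivation in Bailey (1945)), write the error as $Z_i(L_i-\mu_i)-(1-Z_i)(\mu_i-M).$ The observation error $(L_i-\mu_i)$ has mean zero given $\mu_i,$ so the cross term vanishes and

$\begin{align} E\left[ \{Z_iL_i+(1-Z_i)M-\mu_i\}^2\right] &= Z_i^2\frac{s^2}{e_i}+(1-Z_i)^2\sigma^2, \\ \frac{d}{dZ_i}: \quad 2Z_i\frac{s^2}{e_i}-2(1-Z_i)\sigma^2 &= 0, \\ Z_i &= \frac{\sigma^2}{\sigma^2+s^2/e_i}=\frac{e_i}{e_i+K}. \end{align}$
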

The $T_i$ ’s should fit within that framework. Of course, the left hand side inside the brackets is only an estimate of $\mu_i.$ Since the post-credibility rate is the optimum estimate of the rate for each $\mu_i$ in isolation (given that $M$ is the true underlying overall mean), the estimates are point-by-point optimal when $T_i=0.$ However, in that scenario the rates will not average to the overall mean. Thus, if the rates are to match the overall mean, one must initially rephrase the expression to

$\begin{align} &E\left[ \{Z_iL_i+(1-Z_i)M+T_i-\mu_i\}^2\right] \\ &= minimum\mbox{ (for each }i \mbox{),} \end{align} \tag{5.2}$

given the overall constraint, or

$\begin{align} &\sum_{i=1}^n \left[E\left[ \{Z_iL_i+(1-Z_i)M+T_i-\mu_i\}^2\right] \right] \\ &= minimum, \end{align} \tag{5.3}$

subject, of course, to the rates averaging to the overall mean:

$\frac{ \sum_1^n e_i (Z_iL_i+(1-Z_i)M+T_i)}{ \sum_1^n e_i } = M. \tag{5.4}$

Under those assumptions and criteria, Appendix A also shows that all of the $T_i$ ’s should be in proportion to the “same” (similar formula, but with different underlying credibilities) $(1-Z_i)M$ ’s as in Method 1. Thus, up to differences in the underlying credibilities, Method 1 and Method 2 are the same. That illustrates how robust the formula is.

As with the Method 1 analysis, this method should be checked, in this case conceptually, to see whether it generates more reasonable results than the current method. As expected, this method eliminates the problems with the current off-balance factor method. Just as with Method 1, as the credibility of a class approaches unity, use of $(1-Z_i)M$ as a basis will prevent large changes to the nearly fully credible post-credibility rates. Further, it also assigns more of the off-balance to the classes that generate more off-balance, without bypassing the credibility process. Considering the improvements in both accuracy and reasonableness this approach offers, this can generate significant^[6] improvement in the accuracy of the final rates.

Importantly, this method is “scalable”. Any needed amount of correction can be obtained by adjusting the multiplier $C$ without changing the basis $(1-Z_i)M$ for off-balance correction. If more correction than that needed to balance to $M$ is required (for example, to compensate for capping), then one need only increase the multiplier $C.$

It is, however, relevant to ask whether or not increasing the multiplier in Method 2 additionally to offset capping (or other processes) will still generate optimal results. As discussed in Appendix A, the basis $(1-Z_i)M,$ without any changes, is still an optimal basis when the amount of correction changes. So, it provides a scalable formula for the optimum correction values. Thus, when extra correction is needed, all the off-balance formulas are flexible enough to generate appropriate results.

Table 5 illustrates the algorithm flow from raw data to final rates in a step-by-step fashion. The table is annotated with the calculation formulas as well as the mathematical formulas that describe each column. Except as noted below, all the calculations from raw data to conclusion are shown. As one may see, the calculations are not overly difficult. The calculations are split among those that determine the underlying parameters needed for the credibility process in the main calculations (the “First Step”, with values denoted by Roman numerals); those that comprise the main part of the analysis (the “Main Step”, with columns denoted by numbers); and those in a “Reference Values” table containing the overall totals and other overall quantities (denoted by letters) needed for the computations in the Main Step.
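
The main step can be sketched in Python as follows. This is an illustration under stated assumptions: the class data are the rounded values printed in Table 2, the credibility constant $K$ is taken as given from item V. of Table 5, and all variable names are hypothetical.

```python
# Illustrative sketch of the Method 2 main step. K is taken as given from
# item V. of Table 5; the class data are the rounded Table 2 values.
exposure = [25, 30, 36, 43, 52, 62, 75, 90, 107, 129, 155, 186, 223, 267, 321]
losses = [78427, 40687, 65073, 35837, 59918, 72435, 63990, 57059,
          90110, 46934, 47281, 54427, 69726, 64108, 86197]
K = 19.84

total_exposure = sum(exposure)
M = sum(losses) / total_exposure  # overall loss rate

# Best estimate credibility e/(e+K) and raw rate for each class.
Z = [e / (e + K) for e in exposure]
L = [l / e for l, e in zip(losses, exposure)]

# Credibility-adjusted rates and the losses they imply.
adjusted = [z * r + (1.0 - z) * M for z, r in zip(Z, L)]
adjusted_losses = sum(e * a for e, a in zip(exposure, adjusted))

# Off-balance correction basis e_i * (1 - Z_i), and the multiplier C that
# restores the balance to M when T_i = C * (1 - Z_i) * M.
basis = [e * (1.0 - z) for e, z in zip(exposure, Z)]
C = (sum(losses) - adjusted_losses) / (M * sum(basis))

final = [a + C * (1.0 - z) * M for a, z in zip(adjusted, Z)]

for i, (a, r) in enumerate(zip(adjusted, final), start=1):
    print(f"class {i:2d}: adjusted {a:8.0f}  final {r:8.0f}")
```

If more correction is needed than the amount that balances to $M,$ only the multiplier `C` changes; the basis stays the same.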

Various formulas for estimating the variance structure $s^2$ and $\sigma^2$ have been published. The example uses the nonparametric variance structure estimation method for Bühlmann-Straub data documented in Dean (2005). The calculations begin with the $\alpha^2$ and $\beta^2$ described in the paper^[7]. Due to the number of required computations, $\alpha^2$ is presumed to have been computed outside the table. Note that the value of $\beta^2$ matches the data in the main table but the computations are not shown. Details of the calculation formulas for the variance structure are presented in Appendix B. However, on reviewing the formulas, $\alpha^2$ and $\beta^2$ should not be overly difficult to compute with modern tools.

Table 5. Off-Balance Factor Computations for Best Estimate Credibility (Method 2)

First Step: Calculation of Basic Variance Parameters
		I. Sum of Squared Differences from Sample Means Within Classes " $i$ " = $\alpha^2$ =							7.379E+9
		II. Sum of Exposures Times Squared Differences Between Class Sample Means " $L_i$ 's" and Overall Mean " $M$ " = $\beta^2$ =							3.941E+8
		III. = I./(total(1)-15 (=" $n$ ")) = Process Variance = $s^2$ =							4,131,869
		IV. = [II.- ( $n$ -1.0)III.]/[total(1)- $\sum_{i=1}^{15}e_i^2$ /total(1)]= Variance of Hypothetical Means = $\sigma^2$ =							208,304
		V. = III./IV. = Credibility Constant = $K$ =							19.84
Second Step: Main Calculations for Class Rates
	(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)	(9)	(10)
$i$	$e_i$	$l_i$	$L_i$	$Z_i$	$Z_iL_i+(1-Z_i)M$	$[Z_iL_i+(1-Z_i)M]\times e_i$	$e_i(1-Z_i)$	$e_i(1-Z_i)M\times C$	$e_i\times r_i$	$final \ r_i$
	(Data)	(Data)	(2)/(1)	(1)/((1)+V.)	(3) $\times$ (4)+A $\times$ (1.0-(4))	(5) $\times$ (1)	(1) $\times$ (1.0-(4))	(7) $\times$ F	(6)+(8)	(9)/(1)
Class	Exposure	Losses	Raw Rate "L"	Credibility of Class	Credibility Adjusted Rate	Losses in Adjusted Rates	Off-Balance Correction Basis	Additional Losses From Off-Balance Correction	Off-Balance Corrected Total Losses	Off-Balance Corrected Rate
1	25	$ 78,427	$ 3,137	56 %	$ 1,978	$ 49,456	$ 11	$ 3,162	$ 52,617	$ 2,105
2	30	$ 40,687	$ 1,356	60 %	$ 1,022	$ 30,674	$ 12	$ 3,413	$ 34,087	$ 1,136
3	36	$ 65,073	$ 1,808	64 %	$ 1,349	$ 48,576	$ 13	$ 3,656	$ 52,232	$ 1,451
4	43	$ 35,837	$ 830	69 %	$ 731	$ 31,596	$ 14	$ 3,886	$ 35,482	$ 821
5	52	$ 59,918	$ 1,156	72 %	$ 979	$ 50,762	$ 14	$ 4,101	$ 54,863	$ 1,058
6	62	$ 72,435	$ 1,164	76 %	$ 1,008	$ 62,708	$ 15	$ 4,299	$ 67,007	$ 1,077
7	75	$ 63,990	$ 857	79 %	$ 786	$ 58,669	$ 16	$ 4,480	$ 63,149	$ 846
8	90	$ 57,059	$ 637	82 %	$ 615	$ 55,121	$ 16	$ 4,642	$ 59,764	$ 667
9	107	$ 90,110	$ 838	84 %	$ 788	$ 84,741	$ 17	$ 4,787	$ 89,528	$ 833
10	129	$ 46,934	$ 364	87 %	$ 384	$ 49,578	$ 17	$ 4,914	$ 54,492	$ 422
11	155	$ 47,281	$ 305	89 %	$ 330	$ 51,012	$ 18	$ 5,026	$ 56,038	$ 362
12	186	$ 54,427	$ 293	90 %	$ 315	$ 58,453	$ 18	$ 5,123	$ 63,576	$ 342
13	223	$ 69,726	$ 313	92 %	$ 330	$ 73,457	$ 18	$ 5,207	$ 78,664	$ 353
14	267	$ 64,108	$ 240	93 %	$ 259	$ 69,241	$ 18	$ 5,279	$ 74,520	$ 279
15	321	$ 86,197	$ 269	94 %	$ 283	$ 90,850	$ 19	$ 5,340	$ 96,191	$ 300

Total	1,801	$ 932,211	$ 518		$ 480	$ 864,895	$ 235	$ 67,315	$ 932,211	$ 518
Reference Values for All Classes
	A = Overall Average Rate					$ 518
	B = Total Loss in Credibility Adjusted Rates					$ 864,895
	C= Total Losses in Data					$ 932,211
	D = C-B = Shortfall					$ 67,315
	E =Total Off-Balance Correction Basis					$ 235
	F = D/E = Off-Balance Factor					285.86
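The Main Step of Table 5 can be traced with a short script. The following is a minimal sketch rather than the paper's own implementation: the function name `method2_rates` and its return layout are hypothetical, the credibility constant $K$ is taken as given from row V, and the inputs are the exposures and losses as displayed in the table. Because the displayed inputs are rounded, the output should only be expected to agree with Table 5 approximately.

```python
# Exposures and losses as displayed in Table 5 (the published values are rounded).
EXPOSURES = [25, 30, 36, 43, 52, 62, 75, 90, 107, 129, 155, 186, 223, 267, 321]
LOSSES = [78427, 40687, 65073, 35837, 59918, 72435, 63990, 57059,
          90110, 46934, 47281, 54427, 69726, 64108, 86197]
K = 19.84  # credibility constant, row V of the First Step (taken as given)


def method2_rates(exposures, losses, k):
    """Return (credibility-adjusted rates, off-balance corrected rates, factor F)."""
    total_losses = sum(losses)                                   # reference value C
    overall_mean = total_losses / sum(exposures)                 # reference value A (M)
    z = [e / (e + k) for e in exposures]                         # column (4)
    raw = [l / e for l, e in zip(losses, exposures)]             # column (3)
    adjusted = [zi * li + (1 - zi) * overall_mean                # column (5)
                for zi, li in zip(z, raw)]
    adjusted_losses = sum(e * r for e, r in zip(exposures, adjusted))  # reference value B
    shortfall = total_losses - adjusted_losses                   # reference value D
    basis = [e * (1 - zi) for e, zi in zip(exposures, z)]        # column (7)
    factor = shortfall / sum(basis)                              # reference value F = D/E
    # Column (10): each class rate rises by F times its complement of credibility.
    final = [r + factor * (1 - zi) for r, zi in zip(adjusted, z)]
    return adjusted, final, factor


adjusted, final, factor = method2_rates(EXPOSURES, LOSSES, K)
for number, (before, after) in enumerate(zip(adjusted, final), start=1):
    print(f"class {number:2d}: {before:9.2f} -> {after:9.2f}")
```

In this sketch the factor `F` plays the role of the product $C \times M$, so adding `F * (1 - Z_i)` to each credibility-adjusted rate is the same as adding $C(1-Z_i)M$.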

Since a comparison of final rates using Method 1 to those obtained under the current algorithm was already included in Section 4, this section does not include a similar comparison. Further, the example of the current method only used limited fluctuation credibility, so no directly comparable results are available within this paper^[8]. One may note, though, that since there is no full credibility under best estimate credibility, the values for the almost fully credible classes may experience some minor adjustments from the off-balance.

As a whole, this section explains why the $(1-Z_i)M$ ’s make the best basis and shows complete calculations to create rates using best estimate credibility and this best estimate off-balance algorithm. In summary, this off-balance approach, derived from best estimate credibility, provides a very effective off-balance correction method.

6. Enhancements to the Best Estimate Formula

Additional results and information are provided in other appendices. Appendix C shows that the formula above still works when the process error variance per exposure varies from class to class. It also provides a similar, but not quite identical, formula for the basis when the parameter variance differs from class to class. One may conclude that, as long as the best estimate $T_i$ ’s are defined in terms of minimum expected squared error and also when something other than a best estimate underlies the rates, the complement of credibility term^[9] $(1-Z_i)M$ usually forms the best basis for distributing the off-balance. This very general result, involving quantities that are already in the rate computation, should simplify any conversion to distributing the off-balance across the complement of credibility terms.

Two other appendices with relevant topics are included as well. Appendix D explains how, in the absence of capping, replacing $M$ with the Bühlmann credibility-weighted mean can eliminate the need for off-balance correction in the best estimate scenario. Considering the Central Limit Theorem, Appendix E shows that when the various probability distributions are approximately normal, then the formula of Section 5 approximates the maximum likelihood estimate. Considering the amount of losses in many ratemaking scenarios, that would suggest that one may often expect an approximate maximum likelihood estimate when this approach is used with best estimate credibility.

7. Comparison of the Two Methods

One cannot help but notice that the formula for distributing the off-balance is exactly the same in Method 1 and Method 2. Of course, the credibilities used in that formula could be expected to vary considerably between limited fluctuation credibility and best estimate credibility. However, the algorithm employed following the calculation of the initial post-credibility class rates is exactly the same.

Also, while scalability was discussed for Method 2, it is apparent that Method 1 is scalable as well. Its basis is the set of complement of credibility portions of the rates, and clearly the multiplier may be adjusted as needed. Even the current algorithm, where the basis is the set of post-credibility rates, allows for the basis to be multiplied by whatever off-balance factor is needed to achieve whatever overall rate level is needed.

Overall, Method 1 and Method 2 are very similar, suggesting that the algorithm in this article may be used in a broad variety of situations.

8. Testing the Test Correction Factor: Off-Balance Corrections for Credibility and Capping Combined

When capping rates for individual classes, the off-balance process becomes a test correction process. For example, one might begin with a class ratemaking system that generates a group of rates, but then specify that no rate receive more than a 25 percent increase or 20 percent decrease. If more rates^[10] are capped from above than from below, then the average rate after the capping will be lower than before the capping, even when the initial off-balance before capping is allocated using the $(1-Z_i)M$ ’s. Present practice is to successively increase or decrease the multiplier $C$ until the needed overall average rate is achieved.

So, in this test correction algorithm, the first step is to recompute the aggregate off-balance for test correction. The new aggregate off-balance is equal to the total rate dollars in the overall mean less the aggregate rate dollars in uncapped classes, less again the aggregate rate dollars in capped classes:

$\begin{align} &M\sum_{i = 1}^n e_i- \sum_{\matrix{uncapped \\ classes \ j}}^n \left\{e_j \times \left[Z_jL_j+(1-Z_j)M\right]\right\} \\ &\ - \sum_{\matrix{capped \\ classes \ k}}^n \left\{e_k \times (capped \ rate \ for \ class \ k)\right\}. \end{align} \tag{8.1}$

Since the $(1-Z_i)M$ algorithm is scalable under both the Method 1 and Method 2 assumptions, that balance would be pro-rated according to the $e_j(1-Z_j)M$ ’s above. That would adjust the common multiplier $C,$ which, before capping, would now apply to both sets of class rates. Then the caps would be applied to the new rates. It is likely that the resulting changes in rates will now place some rates formerly outside the caps inside them (or vice versa). So, a different group of classes may be capped after this step. When that occurs, the value in equation (8.1) will change, and the needed overall test correction cannot be achieved without changing the $C$ in

$\begin{align} &M\sum_{i = 1}^n e_i- \sum_{\matrix{uncapped \\ classes \ j}}^n \left\{e_j \times \left[Z_jL_j+C(1-Z_j)M\right]\right\} \\ &\ -\sum_{\matrix{capped \\ classes \ k}}^n \left\{e_k \times (capped \ rate \ for \ class \ k)\right\}. \end{align} \tag{8.2}$

Consequently, additional iterations of the test correction algorithm are often needed. Since each test correction has the potential to result in a need to cap additional rates, or to move previously capped class rates away from the capping limits, the process often flows through a series of iterations before producing a final result. So, one effectively spreads the remaining aggregate off-balance correction among the classes whose rates are not capped, possibly with partial effects on the capped rates. In limited fluctuation and best estimate credibility alike, the basis for correction remains $(1-Z_i)M$ through all the iterations.

Since one must increase or decrease the multiplier applied to the basis values (again, the “ $C$ ” in equation (A.12)) in order to accommodate capping, this requires flexibility that the Bühlmann best estimate complement of credibility does not have. Thus, the off-balance approach of this paper is more robust.

Tables 6 and 7 illustrate two steps of test correction using the best estimate data from Table 5 and caps of ±15%. The internal process within each iteration proceeds as follows. In the part of the chart titled “First Step”, the loss cost rates resulting from the last test correction are compared against the caps and the presently capped rates are identified. In the “Second Step”, the initial values are adjusted by applying additional test correction to the pre-capping rates from the previous iteration (column 10 in Table 5 is input as the previous iteration for Table 6, then column 15 of Table 6 is the previous iteration used in Table 7). Key reference values applicable to all classes (including a new test correction factor) are computed and shown at the bottom of each table^[11]. Following through the calculations, Tables 6 and 7 fully illustrate the test correction process.

Table 6. Test Correction Factor Computations for Best Estimate Credibility (Method 2) Under Capping - Iteration 1

First Step: Calculations Using Uncapped Rates
	(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)	(9)	(10)
	(Data)	Table 5 col.(5)	Table 5 col.(10)	(1)*(3)	(Data)	.85*(5)	1.15*(5)	(3) within (6),(7)	(1)*(8)	Y=Knocked Out
Class	Exposures	Credibility Adjusted Rate	Pre-Cap Off-Balance Corrected Rate (Set 0)	Losses in Pre-Cap O-B Corrected Rate	Present Rate	Cap Below	Cap Above	Capped Rates (Set 0)	Total Losses in Capped Rates	Is Rate Knocked Out of TCF by Capping?
1	25	$ 1,978	$ 2,105	$ 52,617	$ 2,000	$ 1,700	$ 2,300	$ 2,105	$ 52,617
2	30	$ 1,022	$ 1,136	$ 34,087	$ 1,500	$ 1,275	$ 1,725	$ 1,275	$ 38,250	Y
3	36	$ 1,349	$ 1,451	$ 52,232	$ 1,200	$ 1,020	$ 1,380	$ 1,380	$ 49,680	Y
4	43	$ 731	$ 821	$ 35,482	$ 800	$ 680	$ 920	$ 821	$ 35,482
5	52	$ 979	$ 1,058	$ 54,863	$ 1,000	$ 850	$ 1,150	$ 1,058	$ 54,863
6	62	$ 1,008	$ 1,077	$ 67,007	$ 1,200	$ 1,020	$ 1,380	$ 1,077	$ 67,007
7	75	$ 786	$ 846	$ 63,149	$ 850	$ 723	$ 978	$ 846	$ 63,149
8	90	$ 615	$ 667	$ 59,764	$ 500	$ 425	$ 575	$ 575	$ 51,508	Y
9	107	$ 788	$ 833	$ 89,528	$ 750	$ 638	$ 863	$ 833	$ 89,528
10	129	$ 384	$ 422	$ 54,492	$ 375	$ 319	$ 431	$ 422	$ 54,492
11	155	$ 330	$ 362	$ 56,038	$ 300	$ 255	$ 345	$ 345	$ 53,404	Y
12	186	$ 315	$ 342	$ 63,576	$ 300	$ 255	$ 345	$ 342	$ 63,576
13	223	$ 330	$ 353	$ 78,664	$ 350	$ 298	$ 403	$ 353	$ 78,664
14	267	$ 259	$ 279	$ 74,520	$ 250	$ 213	$ 288	$ 279	$ 74,520
15	321	$ 283	$ 300	$ 96,191	$ 300	$ 255	$ 345	$ 300	$ 96,191

Total	1,801	$ 480	$ 518	$ 932,211	$ 489	$	$	$ 512	$ 922,932
Second Step: Test Correction Step Post Capping
	(11)	(12)	(13)	(14)	(15)	(16)	(17)
	Table 5 col.(7)	(11) Less Knockouts	F*(11)	(4)+(13)	(14)/(1)	(15) within (6),(7)	Y=Knocked Out
Class	Original Test Correction Basis	Test Correction Basis Less Knockouts	Additional Losses for Test Correction	Revised Test Corrected Total Losses	Test Corrected Rate (Set 1)	Capped Rates (Set 1)	Is Rate Knocked Out of TCF by Capping?
1	$ 11	$ 11	$ 580	$ 53,197	$ 2,128	$ 2,128
2	$ 12	$	$ 626	$ 34,713	$ 1,157	$ 1,275	Y
3	$ 13	$	$ 671	$ 52,902	$ 1,470	$ 1,380	Y
4	$ 14	$ 14	$ 713	$ 36,195	$ 838	$ 838
5	$ 14	$ 14	$ 752	$ 55,616	$ 1,073	$ 1,073
6	$ 15	$ 15	$ 789	$ 67,796	$ 1,090	$ 1,090
7	$ 16	$ 16	$ 822	$ 63,970	$ 857	$ 857
8	$ 16	$	$ 852	$ 60,615	$ 677	$ 575	Y
9	$ 17	$ 17	$ 878	$ 90,406	$ 841	$ 841
10	$ 17	$ 17	$ 902	$ 55,394	$ 429	$ 429
11	$ 18	$	$ 922	$ 56,960	$ 368	$ 345	Y
12	$ 18	$ 18	$ 940	$ 64,516	$ 347	$ 345	Y
13	$ 18	$ 18	$ 955	$ 79,619	$ 357	$ 357
14	$ 18	$ 18	$ 968	$ 75,488	$ 282	$ 282
15	$ 19	$ 19	$ 980	$ 97,170	$ 303	$ 303

Total	$ 235	$ 177	$ 12,349	$ 944,560	$ 524	$ 517
Reference Values for All Classes
	A = (Last Table) Overall Average Rate in Data					$ 518
	B = Total Loss in Set 0 Rates					$ 922,932
	C= Total Losses in Data					$ 932,211
	D = C-B = Shortfall					$ 9,279
	E =Total Test Correction Basis on Non-Capped Classes (12)					$ 177
	F = D/E = Test Correction Factor					52.44

Table 7. Test Correction Factor Computations for Best Estimate Credibility (Method 2) Under Capping - Iteration 2 (Final)

First Step: Calculations Using Rates from First Iteration
	(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)	(9)	(10)
	(Data)	Table 5 col.(5)	Table 6 col.(15)	(1)*(3)	(Data)	.85*(5)	1.15*(5)	(3) within (6),(7)	(1)*(8)	Y=Knocked Out
Class	Exposures	Credibility Adjusted Rate	Pre-Cap Test Corrected Rate (Set 1)	Losses in Pre-Cap Test Corrected Rate	Present Rate	Cap Below	Cap Above	Capped Rates (Set 1)	Total Losses in Capped Rates	Is Rate Knocked Out of TCF by Capping?
1	25	$ 1,978	$ 2,128	$ 53,197	$ 2,000	$ 1,700	$ 2,300	$ 2,128	$ 53,197
2	30	$ 1,022	$ 1,157	$ 34,713	$ 1,500	$ 1,275	$ 1,725	$ 1,275	$ 38,250	Y
3	36	$ 1,349	$ 1,470	$ 52,902	$ 1,200	$ 1,020	$ 1,380	$ 1,380	$ 49,680	Y
4	43	$ 731	$ 838	$ 36,195	$ 800	$ 680	$ 920	$ 838	$ 36,195
5	52	$ 979	$ 1,073	$ 55,616	$ 1,000	$ 850	$ 1,150	$ 1,073	$ 55,616
6	62	$ 1,008	$ 1,090	$ 67,796	$ 1,200	$ 1,020	$ 1,380	$ 1,090	$ 67,796
7	75	$ 786	$ 857	$ 63,970	$ 850	$ 723	$ 978	$ 857	$ 63,970
8	90	$ 615	$ 677	$ 60,615	$ 500	$ 425	$ 575	$ 575	$ 51,508	Y
9	107	$ 788	$ 841	$ 90,406	$ 750	$ 638	$ 863	$ 841	$ 90,406
10	129	$ 384	$ 429	$ 55,394	$ 375	$ 319	$ 431	$ 429	$ 55,394
11	155	$ 330	$ 368	$ 56,960	$ 300	$ 255	$ 345	$ 345	$ 53,404	Y
12	186	$ 315	$ 347	$ 64,516	$ 300	$ 255	$ 345	$ 345	$ 64,084	Y
13	223	$ 330	$ 357	$ 79,619	$ 350	$ 298	$ 403	$ 357	$ 79,619
14	267	$ 259	$ 282	$ 75,488	$ 250	$ 213	$ 288	$ 282	$ 75,488
15	321	$ 283	$ 303	$ 97,170	$ 300	$ 255	$ 345	$ 303	$ 97,170

Total	1,801	$ 480	$ 524	$ 944,560	$ 489			$ 517	$ 931,779
Second Step: Test Correction Step Post Capping
	(11)	(12)	(13)	(14)	(15)	(16)	(17)
	Table 5 col.(7)	(11) Less Knockouts	F*(11)	(4)+(13)	(14)/(1)	(15) within (6),(7)	Y=Knocked Out
Class	Original Test Correction Basis	Test Correction Basis Less Knockouts	Additional Losses for Test Correction	Revised Test Corrected Total Losses	Test Corrected Rate (Set 2)	Capped Rates (Set 2)	Is Rate Knocked Out of TCF by Capping?
1	$ 11	$ 11	$ 30	$ 53,227	$ 2,129	$ 2,129
2	$ 12	$	$ 32	$ 34,746	$ 1,158	$ 1,275	Y
3	$ 13	$	$ 35	$ 52,937	$ 1,470	$ 1,380	Y
4	$ 14	$ 14	$ 37	$ 36,232	$ 839	$ 839
5	$ 14	$ 14	$ 39	$ 55,654	$ 1,074	$ 1,074
6	$ 15	$ 15	$ 41	$ 67,837	$ 1,090	$ 1,090
7	$ 16	$ 16	$ 43	$ 64,013	$ 858	$ 858
8	$ 16	$	$ 44	$ 60,659	$ 677	$ 575	Y
9	$ 17	$ 17	$ 45	$ 90,452	$ 841	$ 841
10	$ 17	$ 17	$ 47	$ 55,440	$ 430	$ 430
11	$ 18	$	$ 48	$ 57,008	$ 368	$ 345	Y
12	$ 18	$	$ 49	$ 64,564	$ 348	$ 345	Y
13	$ 18	$ 18	$ 49	$ 79,669	$ 357	$ 357
14	$ 18	$ 18	$ 50	$ 75,538	$ 282	$ 282
15	$ 19	$ 19	$ 51	$ 97,221	$ 303	$ 303

Total	$ 235	$ 159	$ 639	$ 945,199	$ 525	$ 518
Reference Values for All Classes
	A = (Last Table) Overall Average Rate in Data					$ 518
	B = Total Loss in Set 1 Rates					$ 931,779
	C= Total Losses in Data					$ 932,211
	D = C-B = Shortfall					$ 431
	E =Total Test Correction Basis on Non-Capped Classes (12)					$ 159
	F = D/E = Test Correction Factor					2.71
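The iteration illustrated in Tables 6 and 7 can be sketched as a loop. This is an illustrative sketch under stated assumptions, not the paper's own code: the function name `capped_test_correction`, the dollar tolerance `tol`, and the iteration limit `max_iter` are hypothetical, and the inputs are the rounded exposures, losses, and present rates displayed in the tables, so intermediate values will differ slightly from the published ones. Following the tables, each new test correction factor is the remaining shortfall divided by the basis of the classes that are not knocked out, and it is then applied to the pre-cap rates of every class before the caps are re-applied.

```python
EXPOSURES = [25, 30, 36, 43, 52, 62, 75, 90, 107, 129, 155, 186, 223, 267, 321]
LOSSES = [78427, 40687, 65073, 35837, 59918, 72435, 63990, 57059,
          90110, 46934, 47281, 54427, 69726, 64108, 86197]
PRESENT_RATES = [2000, 1500, 1200, 800, 1000, 1200, 850, 500,
                 750, 375, 300, 300, 350, 250, 300]
K = 19.84  # credibility constant from Table 5 (taken as given)


def capped_test_correction(exposures, losses, present, k,
                           cap=0.15, tol=0.01, max_iter=50):
    """Return (capped rates, knocked-out class numbers, test corrections applied)."""
    total_losses = sum(losses)
    mean = total_losses / sum(exposures)
    comp = [k / (e + k) for e in exposures]                 # 1 - Z_i
    floors = [(1 - cap) * p for p in present]
    ceilings = [(1 + cap) * p for p in present]
    # Credibility-adjusted rates, before any off-balance correction.
    rates = [(1 - c) * l / e + c * mean
             for c, l, e in zip(comp, losses, exposures)]
    # Uncapped off-balance correction (Table 5): every class is in the basis.
    shortfall = total_losses - sum(e * r for e, r in zip(exposures, rates))
    factor = shortfall / sum(e * c for e, c in zip(exposures, comp))
    rates = [r + factor * c for r, c in zip(rates, comp)]
    for step in range(max_iter):
        capped = [min(max(r, lo), hi)
                  for r, lo, hi in zip(rates, floors, ceilings)]
        # A class is knocked out when capping changed its rate.
        knocked_out = [i + 1 for i, (r, cr) in enumerate(zip(rates, capped))
                       if r != cr]
        shortfall = total_losses - sum(e * cr for e, cr in zip(exposures, capped))
        if abs(shortfall) <= tol:
            return capped, knocked_out, step
        free_basis = sum(e * c
                         for i, (e, c) in enumerate(zip(exposures, comp))
                         if i + 1 not in knocked_out)
        if free_basis == 0:
            raise ValueError("every class is capped; the target cannot be reached")
        factor = shortfall / free_basis                     # test correction factor
        # Applied to all pre-cap rates; the caps are re-applied on the next pass.
        rates = [r + factor * c for r, c in zip(rates, comp)]
    raise RuntimeError("test correction did not converge")


capped, knocked_out, steps = capped_test_correction(
    EXPOSURES, LOSSES, PRESENT_RATES, K)
print("knocked-out classes:", knocked_out, "after", steps, "test corrections")
```

Because the caps are re-applied from scratch on every pass, a class that was knocked out in one iteration can re-enter the basis in a later one, which is the behavior described in the text.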

9. Summary

A choice of how off-balances are split among classes establishes an approach to off-balance correction. However, it is important that the off-balance approach be coordinated with the credibility process that generated the off-balance. Two different goals for off-balance correction were presented. The first approach, leaving the credible data alone when using limited fluctuation credibility, was seen to create effective results. Moreover, that approach was also robust enough to use in virtually any multi-class ratemaking scenario involving a credibility-induced off-balance. The last approach, the optimum off-balance arising from minimizing the expected squared error with respect to best estimate credibility, is by definition the best companion approach for best estimate credibility. As this paper shows, ultimately the limited fluctuation and best estimate credibility approaches use exactly the same allocation formula (multiplying a constant $C$ by the complement of credibility terms $(1-Z_i)M).$ That formula should significantly improve the off-balance correction algorithm.

This formula eliminates the disparity associated with correcting classes that are already fully (or almost fully) credible. Further, it tends to spread the off-balance more heavily among the classes that generate the off-balance. Since different sets of assumptions resulted in the same algorithm, it also has broad applicability for best estimate credibility situations. Last, the paper showed that the algorithm extends easily to accommodate the capping of class rates under both sets of assumptions. So, the use of $(1-Z)M,$ or just $(1-Z),$ as a basis for distributing the off-balance appears to work in a very wide variety of situations.

It is hoped that this method will lead to much more rational and optimal class rates. When the off-balance correction is material, absorbing the off-balance in the complement of credibility term appears to offer significant opportunities to improve both the accuracy and reasonableness of the resulting indications.

Submitted: June 12, 2019 EDT

Accepted: June 16, 2020 EDT

References

Bailey, A. 1945. “A Generalized Theory of Credibility.” Proceedings of the Casualty Actuarial Society 32:13–20.

Boor, J. 1993. “A Stochastic Approach to Trend and Credibility.” Casualty Actuarial Society Forum Special Ratemaking Edition:341–400.

Bühlmann, H., and A. Gisler. 2005. A Course in Credibility Theory and Its Applications. Heidelberg, Germany: Springer-Verlag.

Daley, T. 2009. “Class Ratemaking for Workers Compensation: NCCI’s New Methodology.” Casualty Actuarial Society Forum Winter Edition:48–147.

Dean, C. G. 2005. “Topics in Credibility Theory.” Society of Actuaries Short-Term Actuarial Mathematics Study Note.

Modlin and Werner. 2016. “Implementation.” In Basic Ratemaking, 5th ed., 263–88. Arlington, VA: Casualty Actuarial Society.

Appendices

A. The Best Estimate $T_i$ ’s Should be Proportional to the Complement of Credibility Too

This section will show that the off-balance basis $(1-Z_i)M$ from limited fluctuation credibility is also the best basis under best estimate credibility. The first step is to establish the criteria that define the “best estimate”. This is to be a minimum expected squared error best estimate per the best estimate credibility in Bailey (1945). The parameters used in Bailey credibility were described in Section 5. When the corresponding values of the mean and variance of the $\mu_i$ ’s are added, that will fully specify the inputs needed for a minimum expected squared error analysis.

To begin, one may use a Bayesian analysis to determine the mean and variance of the true mean $\mu_i$ given the observed data values (losses) they generate. What is specified in practice is the distribution of the observed loss data around each $\mu_i,$ rather than the distribution of the $\mu_i$ around the data. The “flat”^[12] uniform diffuse prior on the real line that was discussed in Boor (1993) provides a tool to reverse the relationship. It indicates that when a source distribution generates a value with a certain mean and variance, then the unknown mean of the source distribution has a mean equal to that value. Further, the variance of the possible source means equals the variance of the possible losses that might be generated by the source^[13]. An implicit assumption in those results is that there is no reason “a priori” that any possible value for the unknown mean is more likely than another.

Thus, one may reverse the distribution from random values of the losses in the data to random values of the expected losses that generate them. Using the structure from Section 5, one may infer that the mean of each $\mu_i$ given $L_i$ is itself $L_i,$ since the mean of $L_i$ given $\mu_i$ is $\mu_i.$ Similarly, since $Var[L_i|\mu_i] = \frac{s^2}{e_i},$ the variance of each $\mu_i$ would be $Var[\mu_i|L_i] = \frac{s^2}{e_i}.$ Similarly, $E[M|\mu_i]=M,$ and $Var[M|\mu_i] = Var[\mu_i|M] =\sigma^2.$ Further, since $L_i-\mu_i$ and $\mu_i-M$ involve separate and unrelated probability samples, they are clearly independent. Similarly, the values related to $i$ are independent from each of the values arising from the other classes $j \ne i.$

Those results are combined in the equations below

$\mu_i$ is the mean value of $Z_iL_i+(1-Z_i)M$ and vice versa; also
noting that $L_i-\mu_i$ and $\mu_i-M$ are independent, and $Z_i$ is a constant,

$\begin{align} Var[\mu_i |L_i,M]= Z^2_i\frac{s^2}{e_i}+(1-Z_i)^2\sigma^2 \\ \mbox{ (both for any specified $i)$}.\end{align} \tag{A.1}$

However, the variance may be simplified. Using the Bailey credibility constant

$\begin{align} K &= \frac{\mbox{expected process variance}}{\mbox{variance of the hypothetical means}} \\ &= \frac{s^2}{\sigma^2}, \end{align} \tag{A.2}$

one may revise the expression for the variance to

$Var[\mu_i |L_i,M]= \sigma^2\left[Z_i^2\frac{K}{e_i}+(1-Z_i)^2\right]. \tag{A.3}$

A little algebra reduces that to

$Var[\mu_i |L_i,M]= \sigma^2(1-Z_i). \tag{A.4}$
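The algebra connecting (A.3) and (A.4) can be written out explicitly. This worked step assumes the credibility takes the form $Z_i = e_i/(e_i+K)$ used in column (4) of Table 5, so that $1-Z_i = K/(e_i+K)$:

$\begin{align} Z_i^2\frac{K}{e_i}+(1-Z_i)^2 &= \frac{e_iK}{(e_i+K)^2}+\frac{K^2}{(e_i+K)^2} \\ &= \frac{K(e_i+K)}{(e_i+K)^2} \\ &= \frac{K}{e_i+K} = 1-Z_i. \end{align}$

Multiplying through by $\sigma^2$ then gives the right side of (A.4).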

In other words, each estimator $Z_iL_i+(1-Z_i)M$ follows a probability distribution around the corresponding true value $\mu_i$ with that specific mean and variance. However, one must add a set of additional values $T_i$ to each estimate so that the overall average rate matches the overall average cost $(M)$ in the data. Algebraically, that means

$\frac{\sum_{i=1}^n e_i\left[Z_iL_i+(1-Z_i)M +T_i\right]}{\sum_{i=1}^n e_i}=M. \tag{A.5}$

Further, since the values $L_i$ are independent, the various unknown $\mu_i$ ’s are considered to be independent of one another in the Bailey analysis, as are the $L_i-\mu_i$ ’s. For the final rates $r_i$ being sought, $r_i =Z_iL_i+(1-Z_i)M+T_i.$ Each $Z_iL_i+(1-Z_i)M$ is equal to the mean or expected value of the corresponding $\mu_i$ and each non-random $T_i$ generates movement away from the mean. Thus, the aggregate unweighted mean squared error of the $Z_iL_i+(1-Z_i)M+T_i$ ’s in estimating the corresponding true means $\mu_i$ is

$\sum_{i=1}^n E\left[\left\{ Z_iL_i+(1-Z_i)M+T_i-\mu_i \right\}^2\right]. \tag{A.6}$

However, the simple aggregate error does not fully reflect the pricing situation. It would seem that accuracy predicting the rate for larger classes would be more important than it is for the smaller classes, as they are a bigger part of the business. So, the expected squared errors should be weighted with the $e_i$ ’s. Further, it would be logical to tolerate more prediction error when one is more uncertain about the value of the true $\mu _i.$ Therefore, each expected squared error is divided by the variance of the corresponding $\mu_i$ in all subsequent calculations. That produces the mean squared error function to optimize, using equation (A.4) of

$\begin{align} &S(T_1, ... ,T_n) \\ &\ =\frac{\sum_{i=1}^n \frac{e_iE\left[\left\{ Z_iL_i+(1-Z_i)M+T_i-\mu_i \right\}^2\right]}{\sigma^2(1-Z_i)}}{\sum_{i=1}^ne_i}. \end{align} \tag{A.7}$

As discussed earlier, the values of $T_i$ that may be used are constrained by equation (A.5). The results under that constraint may be obtained using the Lagrange multiplier method discussed in Section 3. Per that method, the first step to satisfying the constraint while minimizing $S$ is to subtract some $\lambda$ times a zero-based^[14] version of the constraint. Then, the final step is to simultaneously find $T_i$ values and a $\lambda$ where all the partial derivative functions (by the $T_i$ variables and $\lambda)$ of $Q$ (below) are zero. The resulting $T_i$ values and $\lambda$ generate the minimum value under the constraint.

Hence, the next step is to find the partial derivatives of

$\begin{align} &Q(T_1, \ T_2, \ ... \ T_n, \ \lambda ) \\ &\ = \frac{\sum_{j=1}^n e_j\frac{E\left[\left ( Z_jL_j+(1-Z_j)M+T_j-\mu_j\right )^2\right]}{\sigma^2(1-Z_j)}}{\sum_{j=1}^ne_j}-\lambda \\ &\ \quad \times \Biggl[\frac{\sum_{j=1}^n e_j\left[Z_jL_j+(1-Z_j)M+T_j\right]}{\sum_{j=1}^ne_j} \\ &\quad \quad \quad -M\Biggr]. \end{align} \tag{A.8}$

Using a representative $T_i$ and setting its partial derivative to zero (and performing some algebra) one may obtain an equation for $T_i$ of

$\begin{align} 0 &=\frac{\partial}{\partial T_i} Q(T_1, \ T_2, \ ... \ T_n, \ \lambda ) \\ &= \left[\frac{e_iT_i}{\sigma^2(1-Z_i)}-\lambda e_i\right]/\sum e_i, \end{align} \tag{A.9}$

$\begin{align} \frac{T_i}{\sigma^2(1-Z_i)}& =\lambda \\ &= \text{constant across all classes } i. \end{align} \tag{A.10}$

But that does not finish the analysis. A little algebra converts that to

$T_i = \lambda \sigma^2(1-Z_i). \tag{A.11}$

Note that $\lambda$ and $\sigma^2$ are constant across the classes. Only the $(1-Z_i)$ term varies from class to class. Consequently, the total difference between the overall mean and the weighted average of the credibility-adjusted loss rates should be split among the various classes in proportion to the complements of credibility. Further, since $\lambda \sigma^2$ is also constant across the classes or subsets (the $i$ ’s), one could combine $1/M,$ $\lambda,$ and $\sigma^2$ in a constant $C=\lambda \sigma^2 /M.$ Then, one may set

$T_i = C(1-Z_i)M. \tag{A.12}$

Thus, the optimum basis for this type of best estimate credibility is equal to the complement of credibility term, just as in limited fluctuation credibility.
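The optimality claim can be checked numerically. Since each $Z_iL_i+(1-Z_i)M$ has mean $\mu_i$ and variance $\sigma^2(1-Z_i)$ about it, the part of $S$ in (A.7) that depends on the $T_i$ ’s is proportional to $\sum e_iT_i^2/(1-Z_i).$ The sketch below, with hypothetical function names and a made-up three-class example, compares the proportional allocation against randomly perturbed allocations that still satisfy the balance constraint. It is a sanity check under those assumptions, not a proof.

```python
import random


def proportional_allocation(exposures, complements, shortfall):
    """T_i = c * (1 - Z_i), with c chosen so that sum(e_i * T_i) equals the shortfall."""
    c = shortfall / sum(e * w for e, w in zip(exposures, complements))
    return [c * w for w in complements]


def weighted_penalty(exposures, complements, corrections):
    """T-dependent part of S in (A.7), up to the constant 1 / (sigma^2 * sum(e_i))."""
    return sum(e * t ** 2 / w
               for e, w, t in zip(exposures, complements, corrections))


def random_feasible_shift(exposures, rng, scale=10.0):
    """Random shifts d_i with sum(e_i * d_i) == 0, so the constraint still holds."""
    shifts = [rng.uniform(-scale, scale) for _ in exposures]
    drift = sum(e * d for e, d in zip(exposures, shifts)) / sum(exposures)
    return [d - drift for d in shifts]


# Hypothetical example: three classes, K from Table 5, an arbitrary shortfall.
exposures = [25, 90, 321]
k = 19.84
complements = [k / (e + k) for e in exposures]   # 1 - Z_i
shortfall = 1000.0

best = proportional_allocation(exposures, complements, shortfall)
best_penalty = weighted_penalty(exposures, complements, best)

rng = random.Random(0)
never_beaten = True
for _ in range(1000):
    shift = random_feasible_shift(exposures, rng)
    trial = [t + d for t, d in zip(best, shift)]
    if weighted_penalty(exposures, complements, trial) < best_penalty - 1e-9:
        never_beaten = False
print("proportional allocation never beaten:", never_beaten)
```

The check works because, for the proportional allocation, the cross term between the $T_i$ ’s and any shift that preserves the constraint vanishes, leaving only a non-negative squared term.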

One additional conclusion may be reached. Sometimes a different amount of correction, effectively either adding some extra amount to the right side of equation (A.8) or subtracting from it, is needed. Following the process through to equation (A.12) it should be apparent that the basis for off-balance correction will not change. So, the basis for correction is scalable.

As a last discussion point, one should be aware that the “optimum value” depends on what type of optimum is being sought and what underlying assumptions are involved. In this case, the $T_i$ ’s are the values that minimize the squared error in estimating the means, subject to the constraint, when the $Z_iL_i+(1-Z_i)M$ ’s are individually believed to be the best estimates of the $\mu_i$ ’s (per the Bailey model).

B. Determining the Variance Structure $(s^2$ and $\sigma^2)$ of the Best Estimate Data

Once the relevant data is provided, the estimation process for $s^2$ and $\sigma^2$ essentially just involves a certain amount of arithmetic. In the past, key data (such as the variance of the losses of a single insured) was difficult to develop. However, given contemporary computing tools, it should be much more straightforward. To provide some background for the reader, a brief description of the calculations follows.

A discussion of the arithmetic is relevant. Given $\sum e_i$ risks in the history that are spread among $n$ classes, the first step is to compute the $L_i$ ’s or raw class means. Then, one must compute the sum of all the risk-by-risk squared differences between each risk’s losses and the mean $L_i$ of the class it belongs to. This produces a sum of $\sum e_i$ squared differences, which may be denoted $\alpha^2.$ The Dean paper indicates that $\alpha^2/([\sum e_i]-n)$ approximates $s^2.$

To estimate the variance of the hypothetical means $\sigma^2,$ one begins with the risks-within-each-class weighted^[15] sum of squared differences between the class means and the overall mean $M,$ or $\sum_i e_i(L_i-M)^2$ (defined to be $\beta^2).$ The Dean paper shows that $\beta^2$ may then be combined with the previously determined estimate $\hat{s^2}$ for the process variance to estimate $\sigma^2$ using $\hat{\sigma^2} = [\beta^2-(n-1)\hat{s^2}]/[\sum_ie_i-\frac{\sum_ie_i^2}{\sum_ie_i}].$ Those formulas are used to estimate the variance structure underlying the data in Table 5.
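The arithmetic described above can be sketched in a few lines. The function name `variance_structure` is hypothetical, $\alpha^2$ is taken as given from row I of Table 5 (the risk-level data needed to compute it are not shown in the paper), and the class means are computed from the rounded exposures and losses displayed in the table, so the results should only be expected to approximate the published values.

```python
EXPOSURES = [25, 30, 36, 43, 52, 62, 75, 90, 107, 129, 155, 186, 223, 267, 321]
LOSSES = [78427, 40687, 65073, 35837, 59918, 72435, 63990, 57059,
          90110, 46934, 47281, 54427, 69726, 64108, 86197]
ALPHA_SQ = 7.379e9  # row I of Table 5, taken as given


def variance_structure(alpha_sq, exposures, class_means):
    """Return (beta^2, s^2 estimate, sigma^2 estimate, K) per the formulas above."""
    n = len(exposures)
    total = sum(exposures)
    overall_mean = sum(e * m for e, m in zip(exposures, class_means)) / total
    # beta^2: exposure-weighted squared differences of class means from M.
    beta_sq = sum(e * (m - overall_mean) ** 2
                  for e, m in zip(exposures, class_means))
    s_sq = alpha_sq / (total - n)                       # process variance
    denominator = total - sum(e * e for e in exposures) / total
    sigma_sq = (beta_sq - (n - 1) * s_sq) / denominator  # variance of hypothetical means
    return beta_sq, s_sq, sigma_sq, s_sq / sigma_sq


class_means = [l / e for l, e in zip(LOSSES, EXPOSURES)]
beta_sq, s_sq, sigma_sq, k = variance_structure(ALPHA_SQ, EXPOSURES, class_means)
print(f"beta^2 = {beta_sq:.4g}, s^2 = {s_sq:,.0f}, "
      f"sigma^2 = {sigma_sq:,.0f}, K = {k:.2f}")
```

Note that the estimate of $\sigma^2$ can turn out negative when $\beta^2$ is small relative to $(n-1)\hat{s^2}$; the sketch does not guard against that case.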

C. What if the Process or Parameter Error Variance Differs From Class to Class?

In some circumstances, the unit process variance $s^2$ or the variance of the hypothetical means $\sigma^2$ may vary from class to class. One may readily revise the calculations in Appendix A to accommodate the differences. Thus, equation (A.11) will still hold and

$\frac{T_i}{(1-Z_i)\sigma_i^2}=\lambda = \mbox{constant across all classes}. \tag{C.1}$

As above, this would simply result in pro-rating the off-balance across the exposure-weighted values of $(1-Z_i)\sigma_i^2$ (or, incorporating the constant $M,$ $(1-Z_i)M\sigma_i^2).$ Further, just like the formula for the constant variance component scenario, this provides an optimum estimate.

In many common situations only the process variances $s_i^2$ vary between classes. In that case, the actual Method 2 basis $(1-Z_i)M$ will still be optimal. Lastly, Method 2 is only marginally revised when the parameter variances differ between classes. Thus, the complement of credibility term either underlies the basis for correction or is the basis in every minimum squared error best estimate situation. This supports the general use of $(1-Z_i)M$ as a basis for allocating off-balances arising from credibility.
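The pro-rating implied by (C.1) can be sketched as follows. The function name `allocate_off_balance` and all of the numbers in the example are hypothetical; the credibilities are passed in directly rather than derived, since their form when the variances differ by class is not restated here.

```python
def allocate_off_balance(shortfall, exposures, credibilities, parameter_variances):
    """Per-exposure corrections T_i proportional to (1 - Z_i) * sigma_i^2.

    The scale is chosen so that sum(e_i * T_i) equals the shortfall.
    """
    weights = [(1.0 - z) * v
               for z, v in zip(credibilities, parameter_variances)]
    scale = shortfall / sum(e * w for e, w in zip(exposures, weights))
    return [scale * w for w in weights]


# Hypothetical three-class example with class-specific parameter variances.
exposures = [25, 90, 321]
credibilities = [0.56, 0.82, 0.94]
parameter_variances = [200000.0, 250000.0, 150000.0]

corrections = allocate_off_balance(
    10000.0, exposures, credibilities, parameter_variances)
print([round(t, 2) for t in corrections])
```

When every class shares the same parameter variance, the variance cancels out of the weights and the allocation reduces to the $(1-Z_i)$ basis of Method 2.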

D. A Digression—The Bühlmann Best Estimate Complement of Credibility

The complement of credibility approach^[16] documented in Bühlmann and Gisler (2005) provides another approach to resolving off-balances arising from credibility. This text shows that when the credibility-weighted average rate $B=\sum Z_iL_i/\sum Z_i$ replaces $M$ in the complement of credibility term, two improvements occur. First, the approximations to the individual class means $\mu_i$ are optimized (they have minimum expected squared error) considering all the data, even the data from other classes. As such, they are more accurate than the Bailey calculations using $M.$ Second, the text shows that the average of the post-credibility rates will automatically equal $M.$ Hence, no off-balance correction will be needed. As long as the “best estimate” and “averaging to the overall mean” criteria completely specify the problem, this approach requires fewer calculations and is more mathematically elegant than the other off-balance methods.

As it turns out, this approach and the approach of Method 2 yield identical results. They both solve a problem of “Given the data and the credibilities associated with it, what is the optimum prediction of the true loss rates when the average rate must match the overall average loss rate in the data?” The two approaches use slightly different frameworks for the estimate, though. Method 2 applies an off-balance correction to what are believed to already be the most accurate rates (considering only each rate in isolation) so that they average to the overall average rate. The Bühlmann approach associates a different value with the complement of credibility, to achieve the best estimates of individual rates when considering all the data. The approach underlying Method 2 essentially was that some correction to the “best estimate” post-credibility rates was necessary to balance to the overall average rate, and the correction was done in a way that minimized the error. The analysis in the Bühlmann and Gisler book indicated that using $B$ with best estimate credibility produced the class-by-class minimum expected squared error optimums, given all the information. The two approaches are close enough, though, that one would expect the results to be similar.

In fact, the two approaches are mathematically identical. To show that the two approaches produce identical rates, one may begin with the fact that replacing the overall mean $M$ with $B$ in the complement of credibility term in the rate formula generates an overall average rate of $M.$ Thus

$\begin{align} M &= \frac {\sum e_i \left[Z_iL_i+(1-Z_i)B\right]}{\sum e_i} \\ &=\frac {\sum e_i \left[Z_iL_i+(1-Z_i)M+T_i\right]}{\sum e_i} = M.\\ \end{align} \tag{D.1}$

Then, since $T_i=C(1-Z_i)M,$ a little basic algebra gives

$\begin{align} &\frac {\sum e_i \left[Z_iL_i+(1-Z_i)B\right]}{\sum e_i} \\ &\ =\frac {\sum e_i \left[Z_iL_i+(1-Z_i)M+C(1-Z_i)M\right]}{\sum e_i} \\ &\frac {B\sum e_i (1-Z_i)}{\sum e_i} =\frac {M(1+C)\sum e_i (1-Z_i)}{\sum e_i} \\ &C =\frac{B}{M} - 1\\\end{align} \tag{D.2}$

So, $\frac{B}{M} - 1$ is the constant $C$ from Method 2. It is then necessary to show that using that value to generate each term $Z_iL_i+(1-Z_i)M+T_i$ will produce rates that exactly match the rates for the various classes under the Bühlmann approach. However, it is clear from another use of basic algebra that

$\begin{align} &Z_iL_i+(1-Z_i)M+T_i \\ &\ = Z_iL_i+(1-Z_i)M+C(1-Z_i)M \\ &\ = Z_iL_i+(1-Z_i)M+\left(\frac{B}{M}-1\right)(1-Z_i)M \\ &\ = Z_iL_i+(1-Z_i)B. \\ \end{align} \tag{D.3}$

Thus, the Method 2 approach produces rates that are identical to those generated by the Bühlmann best estimate credibility-based ratemaking formula.
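The identity can also be checked numerically. The following Python sketch uses made-up exposures and loss costs and assumes credibilities of the form $Z_i=e_i/(e_i+K)$ with a hypothetical constant `k`; the function name and the data are illustrative and are not taken from the paper. It computes $C$ twice, once from the requirement that the Method 2 rates average to $M$ and once as $B/M-1$ from equation (D.2), and then builds both sets of rates.

```python
def compare_method2_and_buhlmann(exposures, loss_costs, k):
    """Build class rates two ways: Method 2 (M plus T_i) and the B-based formula.

    Assumes Z_i = e_i / (e_i + k); k and the inputs are hypothetical.
    """
    z = [e / (e + k) for e in exposures]
    total_e = sum(exposures)
    # Overall exposure-weighted mean M and credibility-weighted mean B.
    m = sum(e * l for e, l in zip(exposures, loss_costs)) / total_e
    b = sum(zi * l for zi, l in zip(z, loss_costs)) / sum(z)

    # C from the balance requirement: the corrected rates must average to M.
    uncorrected = sum(e * (zi * l + (1 - zi) * m)
                      for e, zi, l in zip(exposures, z, loss_costs))
    basis = sum(e * (1 - zi) * m for e, zi in zip(exposures, z))
    c_balance = (m * total_e - uncorrected) / basis
    # C as given by equation (D.2).
    c_d2 = b / m - 1.0

    method2 = [zi * l + (1 - zi) * m + c_balance * (1 - zi) * m
               for zi, l in zip(z, loss_costs)]
    buhlmann = [zi * l + (1 - zi) * b for zi, l in zip(z, loss_costs)]
    average = sum(e * r for e, r in zip(exposures, method2)) / total_e
    return {"M": m, "B": b, "C_balance": c_balance, "C_D2": c_d2,
            "method2": method2, "buhlmann": buhlmann, "average": average}


# Made-up data for three classes.
result = compare_method2_and_buhlmann([100.0, 400.0, 1500.0],
                                      [120.0, 90.0, 100.0], k=500.0)
```

With credibilities of this form, the two values of `C` and the two lists of rates agree up to floating-point rounding. Other credibility formulas are not covered by this check.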

However, there is one important difference. As noted earlier, Method 2 of Section 5 is scalable. So, it provides flexibility in how much correction is used. On the other hand, the Bühlmann complement of credibility strictly and only generates the exact offset needed for the rates to average to the overall mean $M.$ Method 2 is clearly needed when extra offset (test correction) for the effects of capping is needed.
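One way to picture that flexibility is to treat the target average as an input. The sketch below is an illustration under assumptions, not the paper's capping procedure: it ignores the caps themselves, uses hypothetical names (`correction_constant`, `target`) and made-up data, and simply solves for the $C$ that makes the rates $Z_iL_i+(1-Z_i)M(1+C)$ average to any chosen target.

```python
def correction_constant(exposures, loss_costs, z, m, target):
    """Solve for C so that the rates Z*L + (1-Z)*M*(1+C) average to `target`."""
    total_e = sum(exposures)
    # Exposure-weighted total of the rates before any correction.
    uncorrected = sum(e * (zi * l + (1 - zi) * m)
                      for e, zi, l in zip(exposures, z, loss_costs))
    # Exposure-weighted total of the basis (1 - Z_i) * M.
    basis = sum(e * (1 - zi) * m for e, zi in zip(exposures, z))
    return (target * total_e - uncorrected) / basis


def corrected_rates(loss_costs, z, m, c):
    """Apply the correction to the complement of credibility only."""
    return [zi * l + (1 - zi) * m * (1 + c) for zi, l in zip(z, loss_costs)]


# Made-up data; the credibility constant 500 is hypothetical.
exposures = [100.0, 400.0, 1500.0]
loss_costs = [120.0, 90.0, 100.0]
z = [e / (e + 500.0) for e in exposures]
m = sum(e * l for e, l in zip(exposures, loss_costs)) / sum(exposures)

# Hypothetical target two percent above M, standing in for a test correction.
target = 1.02 * m
c = correction_constant(exposures, loss_costs, z, m, target)
rates = corrected_rates(loss_costs, z, m, c)
average = sum(e * r for e, r in zip(exposures, rates)) / sum(exposures)
```

Setting `target` equal to `m` recovers the plain off-balance correction, while any other target changes only the size of $C,$ not the way it is spread across classes.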

Another quibble with this approach is that it requires reliance on somewhat opaque mathematics. So, it may be difficult to convincingly explain to non-actuaries. One must also consider that it only works with best estimate credibility. So the Bühlmann credibility complement term may not be suitable in all ratemaking situations or for all audiences.

E. Method 2 Often Approximates Maximum Likelihood Estimates

In light of the Central Limit Theorem, one may ask what happens when there is so much data that all the distributions are approximately normal. In that vein, it is worth analyzing the situation where the distributions underlying the process variance and parameter variance are normal distributions. The key to handling this case is to review the calculations earlier in this paper^[17]. Per Section 5, the Method 2 correction algorithm generates the minimum expected squared error. That is a type of optimum result. However, if all the distributions are normal, and the constraint is included, this is really a linear problem involving normal distributions. Hence, one would expect that the minimum squared error values are also the mean values, which are also the maximum likelihood values.

As discussed in Appendix D, the quantities $Z_iL_i+(1-Z_i)B$ are minimum expected squared error estimates of the various $\mu_i$’s. Since, in the context of this appendix, they would arise from a normal distribution, the mean of each $Z_iL_i+(1-Z_i)B$ must equal $\mu_i.$ However, using equation (D.3), $Z_iL_i+(1-Z_i)M+T_i$ $=Z_iL_i+(1-Z_i)B$ $=E[\mu_i].$ So, $Z_iL_i+(1-Z_i)M+T_i$ has a mean value of $\mu_i.$ Since the density of a normal distribution is highest at its mean, the quantity $Z_iL_i+(1-Z_i)M+T_i$ would be the maximum likelihood estimate of $\mu_i.$

So, the $T_i$’s would have an additional level of optimality. Considering the Central Limit Theorem, one would expect that for “large enough” volumes of data and a “low enough” severity variance, the associated $T_i$’s would also approximate the maximum likelihood estimates.
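The link between a credibility-weighted estimate and the peak of a normal density can be illustrated for a single class. The Python sketch below is a simplification, not the constrained multi-class problem of this appendix: it assumes a known prior mean, normal process and parameter distributions with known variances, and hypothetical names and values throughout. It locates the peak of the resulting density for $\mu$ by a simple grid search and compares it with the credibility-weighted estimate.

```python
def credibility_estimate(obs_mean, exposure, prior_mean, process_var, param_var):
    """Z*L + (1-Z)*prior_mean with Z = e / (e + process_var / param_var)."""
    k = process_var / param_var
    z = exposure / (exposure + k)
    return z * obs_mean + (1 - z) * prior_mean


def log_density(mu, obs_mean, exposure, prior_mean, process_var, param_var):
    """Log density of mu given the data, up to an additive constant.

    Assumes obs_mean | mu is normal with variance process_var / exposure
    and mu is normal around prior_mean with variance param_var.
    """
    log_lik = -exposure * (obs_mean - mu) ** 2 / (2.0 * process_var)
    log_prior = -(mu - prior_mean) ** 2 / (2.0 * param_var)
    return log_lik + log_prior


def peak_by_grid(lo, hi, steps, **inputs):
    """Return the grid point in [lo, hi] with the highest log density."""
    best_mu, best_val = lo, float("-inf")
    for j in range(steps):
        mu = lo + (hi - lo) * j / (steps - 1)
        val = log_density(mu, **inputs)
        if val > best_val:
            best_mu, best_val = mu, val
    return best_mu


# Made-up inputs for one class.
inputs = dict(obs_mean=120.0, exposure=100.0, prior_mean=100.0,
              process_var=50000.0, param_var=100.0)
estimate = credibility_estimate(**inputs)
peak = peak_by_grid(80.0, 140.0, 60001, **inputs)  # grid spacing of 0.001
```

In this simplified setting the grid peak and the credibility estimate agree to within the grid spacing, which is the single-class version of the statement that the minimum squared error value, the mean, and the most likely value coincide for normal distributions.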

^[1] Of course, “fully credible” does not exist in Bayesian ratemaking, but may be regarded as “having a negligible difference from full credibility”.
^[2] This paper uses the word “basis” interchangeably between the formula that determines the basis, i.e. $(1-Z_i)M,$ and the values for pro-rating the correction, in this case $e_i(1-Z_i)M.$
^[3] That would just be a proxy for changing the credibilities.
^[4] Of course, it is important that the criterion or criteria and assumptions are as reasonable as possible and that as few assumptions as possible are included.
^[5] The bar symbol “$|$” denotes “given the conditions on the right side of the bar” and is used to facilitate a clear understanding of the conditions underlying an expectation or variance.
^[6] Depending on the size of the off-balance.
^[7] As explained in Appendix B, these are roughly related to the process variance and variance of the hypothetical means in the database.
^[8] An interested reader may compute the values using the data in the table.
^[9] Or something similar when the parameter variance is different from class to class.
^[10] Technically, this would require that the exposures underlying the rates capped from above times the weighted average capped amount on those rates be greater in absolute value than the exposures underlying the rates capped from below times the weighted average capped amount on the rates capped from below.
^[11] The reader should note some of the unusual items, such as the test correction adjustment moving some of the formerly capped values away from the caps. One may see that by comparing columns (10) and (17) in Table 6.
^[12] This “prior” distribution is a distribution where every real number is equally likely.
^[13] More generally, the distribution of the source mean given the observed data exactly mirrors the distribution of the observed values given the source mean.
^[14] A rewriting of the constraint as some expression that must equal zero.
^[15] Technically, since one does not divide by the sum of the $e_i$’s, this is a process of extending columns rather than one of weighting loss costs.
^[16] The author thanks the reviewer(s) of this paper for bringing this concept to his attention.
^[17] One may review the calculations in Boor (1993) as well.
