Workers Comp Statistical Significance Better Known As The Benchmark
Most discussions and articles concerning Workers Comp Statistical Significance center on the Experience Modification Factor and Loss Development Factors. Benchmarks also come from tests for statistical significance.
The article that generated this article can be found here. I find it amazing that data scientists recommended the elimination of Correlation due to its inexactness. I have used the test for Correlation often. Most of my correlation tests proved accurate when I looked back at them a few years later.
One of the main reasons to use this powerful statistic occurs when the incoming data is unreliable. Why run a battery of statistical tests when the data is not verifiable or consistent?
To me, the overuse of the word benchmark knows no bounds in insurance. I prefer workers comp statistical significance. Why?
I often am given massive data sets to analyze.
The four main tests I use are:
- Regression and Stepwise Regression
- Coefficient of Correlation and Determination
- Proportional Math (almost magic)
- Loss Triangles – added in to keep the actuaries happy
Let us look at each of these tests in Workers’ Compensation. Excel provides the first three with no problem. Loss triangle software can be purchased from a few vendors on the internet.
Hang in there with me. This article will not cause you too much pain. The Excel formula name appears below the title if you wish to reference what Microsoft says about each of the statistical tests. You must have the Analysis ToolPak add-in installed to use the Regression and Correlation tools under Data Analysis; worksheet functions such as LINEST, TREND, and CORREL work without it.
Regression and Stepwise Regression
For Workers Comp, regression may seem too long-tail (covers too many years) to use in most cases. Estimating trends requires fewer data points than one might think for analysis.
Experience Modification Factors usually cover only a short range of employer data. With the advent of Predictive Analytics, E-Mods sometimes take a backseat to longer-tail data. E-Mods look back a maximum of 4.5 years. Predictive analytics examine losses over a much longer period, such as 20 years or as long as the business has been in existence.
I use the LINEST and TREND functions to forecast Experience Modification Factors with an acceptable level of accuracy. NCCI and the other rating bureaus store long histories of Experience Mods. Workers Comp statistical significance began to step away from Mods over the last 10 – 15 years.
Not to bore anyone, but the formula for linear regression (fit by least squares) is:
y = mx + b
A simple yet powerful formula for forecasting and examining the relationships between two or more variables.
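As a rough sketch of what a least-squares fit of y = mx + b does, the Python below fits a line to a short series and projects it one year ahead, similar in spirit to a simple LINEST/TREND forecast. The function names and the E-Mod history are hypothetical, invented only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # slope
    b = mean_y - m * mean_x    # intercept
    return m, b


def forecast(m, b, x):
    """Evaluate the fitted line at x."""
    return m * x + b


# Hypothetical E-Mod history (not real data)
years = [2016, 2017, 2018, 2019, 2020]
emods = [1.10, 1.06, 1.04, 0.99, 0.96]

m, b = fit_line(years, emods)
next_emod = forecast(m, b, 2021)
print(f"slope per year: {m:.3f}, projected 2021 E-Mod: {next_emod:.3f}")
```

With only five points the projection is a trend line, not a prediction; a straight-line fit assumes the recent direction continues.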
The red line represents a linear trend among data plots. Let us leave the regression discussion with the graph.
Coefficient of Correlation and Determination
Regression and correlation go hand in hand as regression measures the amount of linear correlation. The coefficient of correlation can be a great tool when given multiple sets of supposedly related data to examine for any workers comp statistical significance. I once received three data sets that turned out to be unrelated when I put the Coefficient of Correlation to work on them.
The University of Connecticut performed an excellent job of explaining how to use and graph correlation in Excel here.
A company forwarded two sets of claims payment data last week. One set was temporary total payments on new claims paid for one year. I then received a set of data based on the number of accidents.
By running a Coefficient of Correlation test between the two data sets, I could tell that some inaccuracies existed in one or both sets of the data. The correlation came in at .32, which showed low Workers Comp statistical significance. I was expecting a .80 result or higher.
The Coefficients are:
- 1 means perfect correlation
- .5 – the two variables are probably not related
- 0 – random data or the test was not set up properly
- -1 – perfect negative correlation – can still be a good thing
If you wish to see how the Coefficient of Correlation can cause the gain or loss of billions of dollars, follow the first link in the article to the Claims Journal article.
You can run any two sets of data together to see if they have something in common (a correlation). I have used this powerful statistical test numerous times to test data sets for reliability.
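For readers who want to see the arithmetic behind the test, here is a minimal Python sketch of the Coefficient of Correlation (Pearson's r) and the Coefficient of Determination (r squared). The function name and the monthly figures are hypothetical, made up for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly figures (not real claims data)
accidents = [4, 6, 5, 8, 7, 10]
tt_payments = [12000, 17500, 15000, 24500, 20000, 30500]

r = pearson_r(accidents, tt_payments)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
```

A result far below what the two data sets should show, such as the .32 above, is the signal to question the data before running anything else.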
Proportional Math – Great Workers Comp Statistical Significance Test
Proportions remain my go-to non-statistical statistic when dealing with Experience Modification Factors. Sometimes, I have to run different numbers through a premium audit or E-mod calculation extremely quickly.
Going through all the numbers will not do as the time is short. Proportions are a quick-and-dirty way to go through a series of calculations without spending the time to do each calculation. Premium audit statements and bills can be gone through in a few steps.
In algebra, proportions can be used to solve many common problems about changing numbers. As an example, take a $40 purchase of gasoline when the price rises 35 cents, from $3.50 to $3.85. The proportion for the new cost x would be:
x ⁄ $3.85 = $40 ⁄ $3.50
The solution is simple:
x = ($40 ÷ $3.50) × $3.85 = $44.00, or $4 more when the price is $0.35 higher.
As long as a linear relationship exists between the variables, one can crank out proportions that make sense very rapidly.
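The gasoline example can be written as a tiny Python helper. The function name is hypothetical; it simply solves x / new_rate = known_amount / old_rate for x, and it is only meaningful when the relationship really is linear.

```python
def scale_by_proportion(known_amount, old_rate, new_rate):
    """Solve x / new_rate = known_amount / old_rate for x."""
    return known_amount / old_rate * new_rate


# Gasoline example from the article: $40 at $3.50, price rises to $3.85
new_cost = scale_by_proportion(40.00, 3.50, 3.85)
increase = new_cost - 40.00
print(f"new cost: ${new_cost:.2f}, increase: ${increase:.2f}")
```

The same call can stand in for a quick what-if on a premium figure when a rate changes, as long as nothing else in the calculation moves.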
Loss Triangles – The Actuarial Debate
The final measure of workers compensation statistical significance is loss triangles, sometimes known as loss picks. A few software vendors provide these powerful assessors of the current risk involved with a set of claims.
A great Loss Development Factor article can be found here from Sigma Actuarial Group/Risk66.
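To give a feel for what loss triangle software does under the hood, the Python sketch below computes age-to-age factors and cumulative Loss Development Factors from a small triangle. It assumes a volume-weighted chain-ladder approach with no tail factor, which is one common textbook method and not necessarily what any vendor or actuary would use. The triangle values and function names are hypothetical.

```python
# Hypothetical cumulative paid losses (not real data).
# Rows are accident years; columns are development ages.
triangle = [
    [1000, 1500, 1650],
    [1100, 1700],
    [1200],
]


def age_to_age_factors(tri):
    """Volume-weighted factor from each development age to the next."""
    n_ages = len(tri[0])
    factors = []
    for age in range(n_ages - 1):
        # Only rows that have a value at the next age can contribute
        rows = [row for row in tri if len(row) > age + 1]
        later = sum(row[age + 1] for row in rows)
        earlier = sum(row[age] for row in rows)
        factors.append(later / earlier)
    return factors


def factors_to_latest_age(factors):
    """Cumulative product of factors from each age to the last age (no tail)."""
    ldfs = []
    running = 1.0
    for f in reversed(factors):
        running *= f
        ldfs.append(running)
    return list(reversed(ldfs))


factors = age_to_age_factors(triangle)
ldfs = factors_to_latest_age(factors)
print("age-to-age:", [round(f, 3) for f in factors])
print("cumulative LDFs:", [round(f, 3) for f in ldfs])
```

Multiplying the latest loss for an accident year by its cumulative factor gives a rough developed figure; real work would add a tail factor and actuarial judgment.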
Measurements for workers comp statistical significance usually are not that difficult with the advent of so much software.
©J&L Risk Management Inc Copyright Notice