Banning Data Won’t Keep It Away


Earlier this month, Washington State Insurance Commissioner Mike Kreidler temporarily enacted a rule that bans insurers from using credit information to set insurance rates for auto, homeowners, and renters policies. This rule is part of a growing trend among regulators to prevent practices where marketers, underwriters, claims adjusters, or risk managers use potentially discriminatory data such as race, ethnicity, sex, or, in this case, credit score to inform decisions about policy applications, premium amounts, fraud likelihood, or general sales targeting.

These regulations fall in line with existing guidelines and laws such as the New York Insurance Circular Letter No. 1 (2019), which bans the use of race, creed, national origin, and other data dimensions in underwriting decisions. This is also the general direction of the California Privacy Rights Act, GDPR, and other consumer data privacy laws governing when, how, and whether an individual’s data can be used by companies that are collecting it. 

Credit scores have always been tricky.

Of course, credit score has been a bugaboo metric in financial services for decades. Under the Equal Credit Opportunity Act and similar laws, lenders cannot use protected classes of data such as race, national origin, or sex when computing anything like a “credit score.” The top credit bureaus in the U.S. (Equifax, Experian, and TransUnion) are very careful to exclude protected data dimensions from the scoring algorithms they use to compute the credit scores they provide to lenders, financial institutions, and insurers.

But the predictive power of a credit score for risk analysis, sales likelihood, underwriting, and other decision-making processes is extremely tempting in an industry where margins are tight and the activation of data analytics can be a real differentiator. What the State of Washington doesn’t realize, given the current maturity of the consumer data and analytics landscape, is that “credit score” can be easily computed from other data dimensions using basic AI techniques.

Data transparency matters more than category.

An analytics team can train an algorithm to predict “credit score” based on thousands of other data dimensions, most of which are not protected, many of which are provided by third parties, and few of which have clear usage consent histories. 

One doesn’t even have to call it a credit score; it can be an effect size computation, goodness-of-fit metric, beta coefficient, or a dozen other outputs of statistical modeling. It doesn’t have to be exact, either; after all, “real” credit scores from the credit bureaus are statistical computations too. Banning the use of a credit score is largely a moot effort today because, like nearly every dimension of consumer data, it can be predicted from other dimensions with reasonable accuracy.

In fact, banning the use of credit scores might backfire on Kreidler and achieve the opposite effect, since these analytics teams are not held to the same standards or practices as the credit bureaus are. It might be better to accept a credit score derived transparently, and with a long history of regulatory compliance, than to use a proxy that is computed within many black boxes.

Instead of focusing on data category, regulators should focus on data transparency. This transparency includes cataloguing and communicating how data is being used, manipulated, or transformed. It also includes making sure that this usage has been consented to by consumers, whose data it ultimately is.
