It’s been exactly one year since we launched Merchant Real Industry. It’s the only tool that uses LLMs to automatically determine an SMB’s industry classification (MCC and six-digit NAICS codes) based on website analysis, online reputation, and other proprietary data.
Many customers have used Merchant Real Industry to auto-classify their SMB customers with high accuracy and speed. Customers have specifically told us that Merchant Real Industry delivers a more accurate reflection of a business’s true industry than existing approaches.
We wanted to verify these results, so we ran an experiment comparing the results from Merchant Real Industry against the current gold standard for industry classification: manual risk analysis.
TL;DR: Merchant Real Industry’s results matched 90% of the results from manual analysis, in a fraction of the time (minutes vs. hours). This result is consistent with what our customers have reported.
Read on to learn more about our experiment, and reach out to try Merchant Real Industry today.
We obtained NAICS code data from the Small Business Administration’s Paycheck Protection Program (PPP). PPP data is one of the few public data sources containing details on SMBs, including their NAICS codes, so it was a suitable reference data set for our experiment.
To assess the performance of Merchant Real Industry, we structured our evaluation around two methods: a comparative analysis against PPP data, and an independent QA test by three risk analysts.
Our first step was to compare the NAICS codes generated by Coris with those listed in the PPP data. We recognize that a high match rate with PPP data does not, on its own, indicate accuracy.
We found the following:
The variance across the match rates underlines how difficult it is to precisely classify a business’s industry. Based on our industry experience, businesses tend to be bucketed more accurately in broad NAICS categories (at the 2-digit level) than at more granular levels. This makes intuitive sense: there are 1,012 six-digit NAICS codes, so there is a lot of room for individual discretion over the exact NAICS code a business falls into.
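To make the idea of match rates at different levels of granularity concrete, here is a minimal Python sketch. It assumes each business has one generated code and one reference code, both six-digit strings; the function name and the sample pairs are made up for illustration and are not data from our experiment.

```python
def match_rate_by_level(pairs, levels=(2, 4, 6)):
    """Return the share of code pairs that agree on their first `level` digits.

    `pairs` is a list of (generated_code, reference_code) string tuples.
    """
    if not pairs:
        raise ValueError("need at least one pair of codes")
    rates = {}
    for level in levels:
        # Two codes match at a level when their leading digits are identical.
        matches = sum(
            1 for generated, reference in pairs
            if generated[:level] == reference[:level]
        )
        rates[level] = matches / len(pairs)
    return rates


# Illustrative pairs only: (generated code, reference code).
sample_pairs = [
    ("722511", "722511"),  # identical at every level
    ("722511", "722513"),  # same 2- and 4-digit prefix, different 6-digit code
    ("541511", "541611"),  # same 2-digit prefix only
    ("445110", "722511"),  # different at every level
]

print(match_rate_by_level(sample_pairs))
```

One caveat with a plain prefix comparison: a few NAICS sectors span more than one 2-digit prefix (Retail Trade covers 44 and 45, for example), so a stricter sector-level comparison would need to group those prefixes first.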
As previously mentioned, match rates alone don’t tell you enough about industry classification. There could be inaccuracies or inconsistencies within the control data set (PPP data).
We enlisted three independent analysts to manually classify the businesses in a blind review. Analysts agreed with Coris’s NAICS codes in 90% of the cases, consistent with our ongoing findings with customers.
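For readers who want to run a similar check, the sketch below shows one possible way to turn analyst reviews into an agreement rate. The majority rule used here (a case counts as agreement when more than half of the analysts chose the generated code) is an assumption made for illustration, not a description of how our analysts’ results were scored, and the sample codes are invented.

```python
from collections import Counter


def agreement_rate(cases):
    """Share of cases where a majority of analysts picked the generated code.

    `cases` is a list of (generated_code, [analyst_code, ...]) tuples.
    The majority rule is an illustrative assumption.
    """
    if not cases:
        raise ValueError("need at least one case")
    agreed = 0
    for generated, analyst_codes in cases:
        votes = Counter(analyst_codes)
        # Counter returns 0 for codes no analyst chose.
        if votes[generated] * 2 > len(analyst_codes):
            agreed += 1
    return agreed / len(cases)


# Invented example: four businesses, three analysts each.
sample_cases = [
    ("722511", ["722511", "722511", "722513"]),  # 2 of 3 agree
    ("541511", ["541511", "541511", "541511"]),  # 3 of 3 agree
    ("445110", ["445131", "445110", "722511"]),  # 1 of 3 agrees
    ("811111", ["811111", "811111", "811198"]),  # 2 of 3 agree
]

print(agreement_rate(sample_cases))
```

A stricter variant could require all analysts to agree, or compare codes only at a broader digit level; which rule is appropriate depends on how the classification will be used.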
When we compared classification speed, we noticed that Merchant Real Industry took a few seconds to auto-generate the NAICS code for a business, while a risk analyst took a few minutes. These time savings are significant when analyzing a large SMB data set.
We’re encouraged by the results of this experiment, and see them as a vote of confidence for our vision to build the future of SMB risk intelligence.
We’ve recently improved Merchant Real Industry to provide even greater value to customers.
Now, Merchant Real Industry shares the top three most likely NAICS and MCC codes for each business, accompanied by confidence scores for each. This recognizes the complex nature of industry classification and provides more nuanced insights.
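As an illustration of how ranked candidates with confidence scores might be used downstream, here is a small Python sketch. The `CodeCandidate` type, its field names, the 0-to-1 confidence scale, and the 0.8 threshold are all hypothetical; they are not the Merchant Real Industry API schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeCandidate:
    """Hypothetical shape for one candidate code (not the real API schema)."""
    code: str          # for example, a six-digit NAICS code
    confidence: float  # assumed here to lie between 0 and 1


def pick_code(candidates, min_confidence=0.8):
    """Return the most confident code, or None if it is below the threshold.

    Returning None signals that the business may need a manual look.
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best.code if best.confidence >= min_confidence else None


# Invented candidates for two businesses.
clear_case = [
    CodeCandidate("722511", 0.91),
    CodeCandidate("722513", 0.06),
    CodeCandidate("722515", 0.03),
]
ambiguous_case = [
    CodeCandidate("541511", 0.45),
    CodeCandidate("541512", 0.40),
    CodeCandidate("541519", 0.15),
]

print(pick_code(clear_case))      # accepted automatically
print(pick_code(ambiguous_case))  # falls below the threshold
```

The design choice here is to accept a code automatically only when one candidate clearly dominates, and to leave ambiguous businesses for an analyst to review.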
Automatic, accurate business classification helps anyone who manages SMB risk, from fintechs to software platforms and more, strengthen their risk management practices. Customers can access these attributes through our developer-first APIs or our no-code portal.
In addition to NAICS code classification, Coris already supports industry classification for MCC and is continuing to invest in this area with new enhancements coming soon.
Get in touch if you’d like to use this feature or learn more.