The State of GTM Data Quality
- Published: 29 Apr 2026
- Reading time: 6 min read
- Author: Stride Research
- Category: Benchmarking
Overall accuracy range
40-80%
Depending on provider, industry, and region.
Account impact range
North America
The segments you would expect to perform better, such as North America and enterprises, do perform better, but by less than you might expect.
Employee count quality
22,000
The benchmark workspace is built around thousands of human-rated records drawn from a large stratified sample.
Why Stride started with company firmographics and entity mapping
The origin of Stride was not a data product. Our earliest work focused on building agents that could help with expansion tasks, such as finding customers in a new country.
As we prototyped that concept, we kept returning to the same problem. The agent could be useful, but the underlying company data was surprisingly poor. Revenue values were stale or implausible, employee estimates were inconsistent, entities were confused, and locations often described legal paperwork rather than real operating activity.
The more uncomfortable point was that quality control seemed thin across the wider market. Providers and AI workflows were producing plausible-looking outputs, but very little evidence was available on whether those outputs were good enough to support strategy or account-level decisions.
So we set out to answer a simple question: is any of this data good enough to trust? We built human-expert datasets and cross-referenced them against different providers and LLMs.
Data quality: the uncomfortable truth
Spoiler alert: the data is not very good. Our benchmark snapshot suggests the market is still a long way from decision-grade company data.
We explain the benchmark methodology in more detail in our methodology article. In essence, decision-grade data in a GTM context means the data helps the customer get the right motion, experience, and support. We measure this with the Account Impact benchmark, where scores in this snapshot range from 59% to 83%.
Account Impact benchmark
Account Index measures whether the errors are likely to result in an account being in a different segment and receiving different treatment.
- Stride: 83%
- Clay (Argon): 76%
- GPT 5.4: 73%
- ZoomInfo: 63%
- Clay (Premium Data Providers): 59%
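To make the definition concrete, here is a minimal Python sketch of an account-impact style check. The employee-count band edges, segment names, and function names are hypothetical assumptions chosen for illustration; they are not Stride's actual segmentation or scoring.

```python
from bisect import bisect_right

# Hypothetical employee-count band edges (an assumption, not Stride's real bands).
BAND_EDGES = [50, 250, 1000]
SEGMENTS = ["SMB", "Mid-Market", "Corporate", "Enterprise"]


def segment(employees: float) -> str:
    """Map an employee count to a commercial segment (lower edge inclusive)."""
    return SEGMENTS[bisect_right(BAND_EDGES, employees)]


def same_treatment(provider_value: float, reference_value: float) -> bool:
    """True when the provider value lands in the same segment as the reference."""
    return segment(provider_value) == segment(reference_value)


def account_impact_rate(pairs: list[tuple[float, float]]) -> float:
    """Share of (provider, reference) pairs that keep the account in its segment."""
    if not pairs:
        raise ValueError("need at least one record")
    return sum(same_treatment(p, r) for p, r in pairs) / len(pairs)
```

Under these assumed bands, a provider value of 1,200 employees against a reference of 3,000 still counts as a match, because both fall in the top band, while 40 against 60 does not, because the two values straddle a band edge.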
For company-level strategic decisions, raw accuracy is more important. TAM, planning, and prioritisation need values that are materially correct, not merely good enough for a broad routing band. For that, we use the Accuracy benchmark, where scores in this snapshot range from 43% to 75%.
Accuracy benchmark
Accuracy Index measures whether the provider output is within a 0.5x/2x error threshold of the human-rated reference value.
- Stride: 75%
- Clay (Argon): 67%
- GPT 5.4: 64%
- ZoomInfo: 49%
- Clay (Premium Data Providers): 43%
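The 0.5x/2x threshold can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the benchmark's actual scoring code: the function names are hypothetical, the bounds are treated as inclusive, and missing or non-positive values are counted as misses.

```python
def within_threshold(provider_value: float, reference_value: float,
                     lower: float = 0.5, upper: float = 2.0) -> bool:
    """True when provider/reference falls inside [lower, upper]."""
    if provider_value <= 0 or reference_value <= 0:
        # Assumption: non-positive or missing values count as misses.
        return False
    ratio = provider_value / reference_value
    return lower <= ratio <= upper


def accuracy_rate(pairs: list[tuple[float, float]]) -> float:
    """Share of (provider, reference) pairs inside the error threshold."""
    if not pairs:
        raise ValueError("need at least one record")
    return sum(within_threshold(p, r) for p, r in pairs) / len(pairs)
```

For a reference value of 100, a provider value of 150 passes (1.5x), while 300 (3x) and 10 (0.1x) both fail.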
Where is quality breaking most?
We assumed the quality gaps came mostly from markets outside the major Western economies and from smaller businesses, given provider focus, training data, and data sparsity.
That is true to an extent. North America is the most accurately covered region, but the difference is not dramatic. Even in North America, accuracy remains a long way from what is possible.
We actually think North America should be more difficult over the long term because many EMEA and APAC markets have stronger filing, governance, and transparency infrastructure. Italy and Japan, for example, are markets where we would expect strong coverage and reporting practices across both the private and public sectors, yet they remain weak for many model providers.
Regional benchmark performance
Account Index measures whether the errors are likely to result in an account being in a different segment and receiving different treatment.
Likewise, we see the expected pattern by business size, with enterprise companies having more accurate data. But this effect becomes less pronounced in the Account Impact benchmark. Because enterprise segmentation bands are usually much wider, providers can be materially inaccurate in raw terms while still placing the account into the same commercial category and motion.
Business size benchmark performance
Account Index measures whether the errors are likely to result in an account being in a different segment and receiving different treatment.
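The business-size effect, where wide enterprise bands absorb large raw errors, can be illustrated with a small Python sketch. The bands and the example values are hypothetical assumptions, not benchmark data.

```python
# Hypothetical bands: (segment name, lower bound inclusive, upper bound exclusive).
BANDS = [
    ("SMB", 0, 50),
    ("Mid-Market", 50, 1000),
    ("Enterprise", 1000, float("inf")),
]


def segment(employees: float) -> str:
    """Return the segment whose band contains the employee count."""
    for name, low, high in BANDS:
        if low <= employees < high:
            return name
    raise ValueError("employee count must be non-negative")


def is_accurate(provider: float, reference: float) -> bool:
    """Raw accuracy test: provider within 0.5x to 2x of a positive reference."""
    return 0.5 <= provider / reference <= 2.0


def same_segment(provider: float, reference: float) -> bool:
    """Account-impact test: both values land in the same band."""
    return segment(provider) == segment(reference)


# Enterprise record: 0.3x off in raw terms, yet still routed as Enterprise.
enterprise = (6_000, 20_000)
# Small business record: within 2x, yet it crosses the SMB / Mid-Market edge.
small = (70, 40)

for provider, reference in (enterprise, small):
    print(is_accurate(provider, reference), same_segment(provider, reference))
```

In this sketch the enterprise record fails the raw accuracy test but keeps its segment, while the small business record passes the accuracy test but changes segment, which is why the two benchmarks can diverge by business size.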
We also observe that legacy providers appear more exposed to large errors, including errors above 100x. Our working assumption is that this reflects a dependency on historically scraped, inherited, or weakly refreshed records.
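One way to surface errors of this size is a symmetric fold-error measure, sketched below in Python. The function names and the intermediate bucket edges are hypothetical; only the 2x and 100x levels come from the thresholds mentioned in this article.

```python
def fold_error(provider_value: float, reference_value: float) -> float:
    """Symmetric error factor: 2.0 means off by 2x in either direction."""
    if provider_value <= 0 or reference_value <= 0:
        raise ValueError("values must be positive")
    ratio = provider_value / reference_value
    return max(ratio, 1 / ratio)


def error_bucket(provider_value: float, reference_value: float) -> str:
    """Assign a record to an error-size bucket (intermediate edges are assumed)."""
    fold = fold_error(provider_value, reference_value)
    if fold <= 2:
        return "within 2x"
    if fold <= 10:
        return "2x-10x"
    if fold <= 100:
        return "10x-100x"
    return "above 100x"
```

Counting records per bucket for each provider would show whether a low accuracy score comes from many near misses or from a tail of extreme errors.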
Finally, when cutting the data by industry, we see large gaps in certain sectors. Generally speaking, industry performance is heavily influenced by reporting requirements and regional composition. Healthcare is an area we have invested in heavily at Stride, and we are starting to see early signs of that investment in the benchmark results.
What GTM and Strategy operators should do
If we had one piece of advice for GTM Ops and Strategy Ops teams, it would be to consider quality alongside coverage. In our opinion, the industry has over-optimised for breadth at the expense of reliability. Many companies will be paying that debt for years, especially in a future where agents play a larger role inside companies and will not always apply the same safeguards as a human operator.
Stride is constantly broadening these benchmarks and improving against them. The results in this article are a snapshot, not a ceiling. We have already made material gains beyond these numbers as we expand source coverage, improve entity resolution, and recalibrate agents against known reference values.
If you want to learn more about your own industry, get in touch. We would be glad to see whether we can help.