Why B2B Lookalikes Still Suck

Published: 13 May 2026
Reading time: 7 min

1 Intro

One of the first things we looked at building agents for at Stride was lookalikes.

The idea was obvious: take your best customers, understand what makes them similar, then find more companies that look like them. Package that into a workflow that helps GTM teams expand into new markets, build better account lists, prioritise outbound, and route leads more intelligently.

But in our experience, the outputs were not very good.

That was surprising because in B2C, lookalike models are one of the foundations of ads targeting. At Meta, I worked on teams that built lookalike products generating tens of billions in revenue. Lookalikes were especially powerful for growth and expansion because you could change one variable while keeping others constant. For example, you could take a seed audience in Germany and use it to find similar prospects in the UK.

You would expect the same idea to be even more valuable in B2B. CAC is higher. Buying cycles are slower. Sales teams need more precision. Wasted outreach is expensive. Expanding into a new market is hard. Finding accounts that look like your best customers should be an obvious use case.

Yet in most B2B products, lookalikes are an orphaned feature. They sit somewhere inside an enrichment tool, intent platform, CRM add-on, or sales intelligence product, and most of the time they are not very good.

2 Problem

2.1 B2B entity mapping is broken and inconsistent

In B2C, a lookalike model usually learns from the same entity it later targets. That entity might be a user, device, customer profile, or account inside a closed ecosystem.

The setup is not perfect. There is still noise. But the feedback loop is relatively clean. The model learns from users who converted, sees other users with similar behavioural, demographic, or contextual patterns, and can then test whether those users also convert. The loop between signal, targeting, and outcome is tight.

In B2B, that assumption usually breaks. The “company” in your CRM might be a global parent, a regional subsidiary, a local operating unit, a legal entity, a domain-level approximation, an inferred account created by a data vendor, or a manually created CRM account object.

Its revenue may belong to the parent. Its employee count may belong to the subsidiary. Its industry code may belong to an outdated legal entity. Its website may describe the local operating unit. Its LinkedIn page may be maintained by a regional marketing team. Its CRM record is often a messy combination of all of the above.

The problem is not simply that the model compares apples with oranges. It often compares one entity boundary with another: a global parent in one record against a local operating unit in the next.

Key finding

B2B lookalikes have an entity problem

The issue is not just that B2B company data is noisy. It is that the model often does not know which version of the company it is supposed to learn from or target.

2.2 The B2B data layer is much noisier

The second problem is that a lot of B2B company data is simply not good enough for this job.

A surprising amount of it is scraped, inferred, aggregated, re-sold, mapped, and re-mapped across systems, with LinkedIn often treated as the main source of truth.

In our benchmarks at Stride, most core company attributes are only around 60% accurate with existing providers (see our benchmarks). That means fields such as employee band, revenue band, industry, company type, or entity attachment may be wrong around 40% of the time.

This creates a bigger issue than most teams realise because data errors compound. A lookalike model is not usually making one decision from one field. It is making a similarity judgement across many fields.

If those fields are noisy, correlated, stale, or attached to the wrong entity, the model’s ability to separate genuinely similar companies from superficially similar companies collapses.

When running a 1% lookalike model against a universe of 1 million companies, we currently see a false positive rate of roughly 55%. This is driven by baseline company-attribute and entity-attachment accuracy, which benchmarks at about 60%.
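The compounding effect can be illustrated with a few lines of arithmetic. This is a minimal sketch, not a reconstruction of the benchmark: it assumes every field is correct with the same probability (the roughly 60% quoted above) and that errors are independent across fields. Real errors are often correlated, so the numbers are illustrative only and are not meant to explain the 55% figure. The function name `prob_all_fields_correct` is hypothetical.

```python
def prob_all_fields_correct(per_field_accuracy: float, n_fields: int) -> float:
    """Probability that all n_fields are correct for one company record.

    Assumption (not from the benchmark): errors are independent across fields
    and every field has the same accuracy.
    """
    if not 0.0 <= per_field_accuracy <= 1.0:
        raise ValueError("per_field_accuracy must be between 0 and 1")
    if n_fields < 0:
        raise ValueError("n_fields must be non-negative")
    return per_field_accuracy ** n_fields


# How quickly a fully correct record becomes rare as more fields are used.
for n in (1, 2, 3, 5):
    p = prob_all_fields_correct(0.6, n)
    print(f"{n} field(s): {p:.1%} chance that every field is correct")
```

Under these simplifying assumptions, a similarity judgement that leans on three fields is working from a fully correct record only about one time in five, which is why adding more noisy fields does not by itself make a lookalike model more precise.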

2.3 The problem is not just inaccurate data. It is inconsistent data.

For frontline sales and RevOps teams, a bad employee count is annoying. A bad revenue estimate is annoying. A bad industry classification is annoying.

But inconsistent entity definition is much worse for the business.

Imagine you sell to Acme Payments UK Ltd, a 400-person UK operating company dealing with payment compliance, acquiring, fraud, and cross-border merchant risk. Four different systems might record that same customer in four different ways:

Parent record: One database records the customer as Acme Global Holdings, a 9,000-person multinational.
Regional subsidiary: Another records it as Acme Europe GmbH, a German regional subsidiary.
Mixed entity attributes: Another records it as Acme Payments UK Ltd, but attaches the parent company’s revenue and a generic financial services industry code.
Domain-level CRM account: Another creates a CRM account from the website domain and attaches contacts from three different countries.

Which company is the lookalike model supposed to learn from?
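To make the ambiguity concrete, the four records above can be written down as data and checked for disagreement. This is a rough sketch under stated assumptions: the `CompanyRecord` type, the source labels, the entity-level names, and the `acme.example` domain are all hypothetical, and any attribute the example does not state is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompanyRecord:
    source: str               # hypothetical label for the system holding the record
    name: str
    entity_level: str         # parent, regional_subsidiary, operating_unit, domain_account
    country: Optional[str]    # None where the example does not say
    employees: Optional[int]  # None where the example does not say


# The same customer as seen by four systems (values taken from the example above).
records = [
    CompanyRecord("database_a", "Acme Global Holdings", "parent", None, 9000),
    CompanyRecord("database_b", "Acme Europe GmbH", "regional_subsidiary", "DE", None),
    CompanyRecord("database_c", "Acme Payments UK Ltd", "operating_unit", "UK", None),
    CompanyRecord("crm_domain", "acme.example", "domain_account", None, None),
]


def conflicting_fields(recs):
    """Return each field on which the sources disagree, with the distinct values seen."""
    conflicts = {}
    for field_name in ("name", "entity_level", "country", "employees"):
        values = sorted(
            {getattr(r, field_name) for r in recs if getattr(r, field_name) is not None}
        )
        if len(values) > 1:
            conflicts[field_name] = values
    return conflicts


conflicts = conflicting_fields(records)
for field_name, values in conflicts.items():
    print(f"{field_name}: {values}")
```

In this sketch the sources disagree on name, entity level, and country before any similarity is computed. Employee count does not show up as a conflict only because three of the four records leave it unstated here, which is its own kind of gap.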

3 Solution

3.1 What B2B teams should do instead

There are three better ways to think about B2B lookalikes.

1. Start at the people layer: A person is usually a clearer entity than a company. Title, function, seniority, location, and employer are often easier to reason about than whether a company record refers to the parent, subsidiary, or local operating unit. This maps more closely to the B2C equivalent, especially when there is a high density of roles in your ICP or buying committee.
2. Start with other data sources: Traditional firmographics are useful when accurate, but are still low-density data points. Website content, product pages, hiring patterns, expansion announcements, regulatory context, market presence, and business model language can better explain the problem a company is experiencing. LLMs are useful here because they can extract themes from messy text and identify whether companies describe similar markets, operating constraints, or customer problems.
3. Better firmographics, entity resolution, and explainable scoring: Firmographics need to be benchmarked, source-aware, and attached to the right entity. A useful system should know the ultimate parent, regional subsidiary, CRM entity, source of the revenue estimate, confidence level, and what changed. Recommendations should also be explainable enough for GTM teams to understand which signals are strong, which are weak, and whether the account is safe to send to sales.
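One way to picture explainable scoring across the three layers above is a confidence-weighted average that reports what each signal contributed and which signals were too unreliable to use. This is a minimal sketch under assumptions, not a description of how Stride scores accounts: the `Signal` type, the `explain_score` function, the 0.5 confidence cut-off, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str
    layer: str         # "firmographic", "people", or "context"
    similarity: float  # 0..1, how closely the candidate matches the seed accounts
    confidence: float  # 0..1, how much the underlying data is trusted


def explain_score(signals, min_confidence=0.5):
    """Confidence-weighted similarity with a per-signal breakdown.

    Signals below min_confidence are excluded from the score and listed
    separately so a reviewer can see what was ignored and why.
    """
    usable = [s for s in signals if s.confidence >= min_confidence]
    excluded = [s.name for s in signals if s.confidence < min_confidence]
    total_weight = sum(s.confidence for s in usable)
    if total_weight == 0:
        return {"score": 0.0, "contributions": {}, "excluded_low_confidence": excluded}
    contributions = {
        s.name: s.similarity * s.confidence / total_weight for s in usable
    }
    return {
        "score": sum(contributions.values()),
        "contributions": contributions,
        "excluded_low_confidence": excluded,
    }


# Hypothetical candidate account compared against a seed list.
result = explain_score([
    Signal("employee_band", "firmographic", similarity=1.0, confidence=0.6),
    Signal("buying_committee_titles", "people", similarity=0.8, confidence=0.9),
    Signal("website_themes", "context", similarity=0.7, confidence=0.8),
    Signal("revenue_band", "firmographic", similarity=1.0, confidence=0.3),
])

print(f"score: {result['score']:.2f}")
for name, part in result["contributions"].items():
    print(f"  {name}: {part:.2f}")
print("excluded:", result["excluded_low_confidence"])
```

The design choice worth noting is that a perfect match on a low-confidence field (the revenue band here) adds nothing to the score, and the exclusion is visible to the person reviewing the recommendation instead of being hidden inside a single number.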

3.2 How Stride helps

This is one of the reasons we are building Stride around the data layer first. Stride helps GTM teams benchmark their CRM against a validated company universe, improve the quality of the firmographic layer, resolve entity definitions, and expand from one market to another with more confidence.

In practical terms, that means helping teams answer whether accounts are attached to the right entity, whether they are comparing parents, subsidiaries, or local operating units, which fields are reliable enough to use, and which similarities are based on company attributes, people signals, or contextual themes.

The goal is not to produce a magical AI score. The goal is to create a data layer that makes GTM decisions more reliable for enterprises. Once the data layer is better, lookalikes become much more interesting.

With a stronger data layer, teams can use lookalikes for expansion, vertical discovery, routing, account recommendations, and territory design without asking sales to trust a black-box score.

Example applications

1. Expansion: Expand from Germany to the UK with entity definitions held constant, or find similar accounts in a new vertical using company, people, and contextual signals.
2. Entity-level targeting: Identify local operating entities that resemble your best customers, then route inbound leads and build territories around real market opportunity.
3. Trusted recommendations: Give sales teams account recommendations they can understand, inspect, and act on.