Spotted: Data Enrichment Tools & Best Practices

Almost every team we work with at Hull uses data enrichment.

Data enrichment enables you to associate publicly available, relevant customer data with an existing lead or customer profile. With deeper profiles of people and companies, teams can personalize their messaging, qualify leads, and deliver a better buying and customer experience without laborious form fills and data capture.

In this Spotted post, we’re going to share the tactics and techniques that we’ve noticed being used by data-driven sales and marketing teams. But first, we need to address the modern-day implications of using 3rd party customer data.

What about the privacy and GDPR implications of data enrichment?

Firstly, data enrichment isn’t about accessing private data. Rather, the focus is on packaging up data that is already public and tying it together. Data enrichment providers offer an auditable trail of how the raw data is captured (although the raw data may see some processing).

For instance, Datanyze (a technographic enrichment service) offers a source for each technology they’re tracking. For many tools, this means checking for the presence of a tracking script in a webpage (like gtm.js). Their sources also give a better idea of how data enrichment works, particularly for technologies which aren’t in the page, like Salesforce.

You can see that Slack uses Salesforce from its job posts, or that StackOverflow uses Salesforce from G2Crowd reviews. This is data a good sales rep might be able to find anyway, but data enrichment providers can find it and package it up much more easily.

The next question is the right to control and process that data. GDPR outlines the direction we see privacy and data going — in short, giving a lot more control and visibility to the people whom the data is about.

The first basis for holding data is that someone has given you consent. With data enrichment, this isn’t direct. One method we’ve seen is to use pre-filled forms once you’ve captured an initial set of data. This passes the data through as a “1st party” source, using 3rd party data, similar to how browsers can remember key form fields. Mention found a 54% increase in conversions using this method.

However, under GDPR, consent is not the only reason for holding data about a person. For instance, a security firm may need to keep data about a person. Read through Article 6 of GDPR on Lawfulness of processing (my highlighting):

Processing shall be lawful only if and to the extent that at least one of the following applies:

a. the data subject has given consent to the processing of his or her personal data for one or more specific purposes;

b. processing is necessary for the performance of a contract to which the data subject is party or in order to take steps at the request of the data subject prior to entering into a contract;

c. processing is necessary for compliance with a legal obligation to which the controller is subject;

d. processing is necessary in order to protect the vital interests of the data subject or of another natural person;

e. processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

f. processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.

Article 6f is the point to dig into. The UK’s Information Commissioner, Elizabeth Denham, published a post tackling the “myth” that consent is the only way to comply with GDPR.

But Article 6f lacks specificity about what constitutes “legitimate interests”. If you do choose to use data enrichment, then this must be made clear to your leads and customers, and each individual must be able to control it.

So let’s be clear. Consent is one way to comply with the GDPR, but it’s not the only way. […] The new law provides five other ways of processing data that may be more appropriate than consent. ‘Legitimate interests’ is one of them and we recognise that organisations want more information about it.

Elizabeth Denham, UK Information Commissioner

But this isn’t a license to continue with some of the (sadly) reckless tactics of yesteryear.

Marketers have the terrible habit of ruining everything. I think Velocity Partners are bang on when they say GDPR = Generally Dickish Practice Retirement.

You need to reconcile how GDPR limits the processing and storage of unnecessary data with the power that comes from data enrichment. Data enrichment can bring a lot of data around individuals which isn’t directly relevant to your business needs (a provider like Clearbit has over 100 data points).

As data-driven marketers, it is our responsibility to prune the data we hold and use, whilst also defending our data needs to our legal teams.

With that in mind, here are the best practices we’ve spotted…

Tactical Takeaway #1: Use data enrichment to identify your ideal customer profiles

Data enrichment providers can offer dozens and dozens of data points. For some teams, this can be overwhelming (and in breach of compliance). How do you make use of all this data?

The best teams we see selectively use data enrichment providers based on their ideal customer profile. They use whatever data is needed to identify and engage potential best-fit customers.

Ideal customer profiles are not “woolly” personas. They’re clear, objective definitions shared by all customer-facing teams. Follow our process here to define your ideal customer profile.

Your ideal customer profile should spell out for you the types of data you need. This makes sure you don’t end up sourcing or buying (and then managing or wrangling) data you don’t need. Not only does this keep your data structure leaner and easier to manage, but this also helps with GDPR compliance since you’re not holding personal data without purpose.

Voila! You’ve identified the data you need, and no more.

Tactical Takeaway #2: Identify sources of data enrichment

You need to approach data enrichment providers in the same way you’d go grocery shopping. You have your defined list of needs (from your ICP) and you want to avoid splurging on unnecessary upsells.

In terms of finding data, there are a few sources.

  1. Web scraping
  2. Manual lookup
  3. Enrichment providers

Web scraping

You can do some basic web scraping with XPath in Google Sheets using the =importxml function. You can write simple XPath queries to look up content within a page, like the page title.

Say you wanted to pull a list of all the H2 chapter headings in this post. You could write =importxml("https://www.hull.io/blog/lead-nurturing/", "//h2") into a cell. Learn a little XPath syntax and try it in Google Sheets!

But this quickly falls over at any scale. Makeshift Google Sheets hacks can’t keep up with what’s possible with Python scripts and the like. If you can’t write your own scripts, then a tool like Import.io lets you build your scraping model within a point-and-click interface and then export or sync via API (using Hull and Hull Processor) to all your tools.

Whatever method you choose here, the web is a messy place. There’s likely to be a lot of post-processing needed before your data is usable.
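To give a feel for what a script buys you over a spreadsheet formula, here is a minimal Python sketch of the same "//h2" lookup using only the standard library. The sample HTML is a hypothetical page we made up and inlined so the example runs without a network call; the class and function names are ours too. It also shows a small piece of the post-processing mentioned above (collapsing stray whitespace).

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h2> element, like the XPath query //h2."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._inside_h2 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._inside_h2 = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside_h2:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._inside_h2:
            # Collapse stray whitespace: typical clean-up for scraped text.
            self.headings.append(" ".join("".join(self._buffer).split()))
            self._inside_h2 = False


def extract_h2_headings(html):
    """Return the cleaned text of all H2 headings in an HTML string."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


# Hypothetical sample page, inlined so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
  <h1>Lead nurturing</h1>
  <h2>  Why nurture   leads? </h2>
  <p>Some text</p>
  <h2>Getting <em>started</em></h2>
</body></html>
"""

print(extract_h2_headings(SAMPLE_HTML))
```

In a real script you would fetch the page first and handle errors, redirects, and pages that render their content with JavaScript, which is where most of the mess comes from.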

Manual work and “brute force”

Whether you have “your guy” or another trusted source, sometimes you just need an army of people to fetch data for you. For very common tasks, you want to avoid your (expensive) sales reps spending their time doing manual work that can be outsourced. This can be helpful (although more expensive) for more nuanced data.

For instance, at Hull, we sell to B2B SaaS companies. But a SaaS product might have a free trial, or might only be accessible through a demo and a “formal” sales process. We’d want to talk to each of these differently, discussing product qualified leads with one and different playbooks with the other. It’s hard to get an accurate, complete picture of their business and pricing model without taking a look at their website first.

Mechanical Turk is one option to outsource these sorts of manual tasks. You can also define your tasks with a service like Crowdflower, and have the responses synced to your other tools (or Hull) through their API. This is important — you need to be able to automate the requests and responses of your manual lookups, then tie it together with your other tools.

Data enrichment providers

The simplest and easiest solution is to work with a data enrichment service. Data enrichment tools do three things:

  1. Fetch publicly available data
  2. Organize, cleanse, and format that data in a useful manner
  3. Make the neatly packaged data available around a “key” (an identifier like email, domain name, or IP address).

The best providers can fetch more data, organize it better, and offer enrichment in different forms. Think of this as different ways to associate data and query their database. Things like:

  • Email address to a person profile
  • Domain name to a company profile
  • Domain name to person profiles (prospecting)
  • IP address to company profile (reverse IP address)
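As a rough mental model, each of these lookups is a query against the provider's database using a typed key. The Python sketch below uses a tiny in-memory dictionary as a stand-in for that database; the records, field names, and the enrich function are all hypothetical and do not reflect any real provider's API.

```python
# Hypothetical, in-memory stand-in for a provider's database.
RECORDS = {
    ("email", "jane@example.com"): {"name": "Jane Doe", "job_title": "VP Marketing"},
    ("domain", "example.com"): {"company": "Example Inc", "employees": 120},
    ("ip", "203.0.113.7"): {"domain": "example.com"},
}


def normalize_key(key_type, value):
    """Keys only match reliably once case and whitespace are normalized."""
    return (key_type, value.strip().lower())


def enrich(key_type, value):
    """Return the packaged record for a key, or None when there is no match."""
    return RECORDS.get(normalize_key(key_type, value))


print(enrich("email", " Jane@Example.com "))
print(enrich("domain", "unknown.io"))
```

The point of the sketch is the shape of the interaction: you send a key, you get back a packaged record or nothing, and your own tools need to handle both cases.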

We’ve seen and worked with many data enrichment providers. As a category, we see them as essential to data-driven marketing and building a customer-focused experience. There are five we’d mention in particular which are good at different things (and this is not an exhaustive list). It’s not uncommon for teams to use more than one tool.

FullContact for person-level data. With their APIs, you can get much more data about an individual than from any other provider, including social data, gender, and relationships. Ideal for B2C and person-level data. They also license their data to popular marketing tool vendors like HubSpot and Intercom to “fill in the blanks” in profiles there (so you might be using FullContact already without knowing it!).

Clearbit for professional and business level data. They have a collection of APIs to enrich, prospect & reveal people and companies. With 100+ attributes, they have a depth of data and reliable coverage (particularly in the mid-market where others struggle) that others don’t appear to offer for B2B.

Datanyze for technographics. For companies selling software, understanding the technographic profiles of leads is essential. Datanyze’s data is fresher and (unlike most others) looks up technologies that aren’t just scripts in the page but technologies “out of the page” like CRMs and backend databases.

MadKudu for predictive scoring. Once companies get to a certain volume of leads and have established sales processes, the cost of missing best-fit opportunities and “talking points” with each of those leads grows. Using MadKudu you can enrich your leads with the scores and signals that are most relevant to them.

LeadGenius for custom data enrichment. The idea here is to combine and manage manual lookups with web scraping to package together custom person and company data that’s specific to your business, and that’s unavailable or unreliable from other data sources.

If you are looking for a data enrichment provider, we’d recommend talking with their reps to get some sample data for a set of leads or customers that you define (100 is a good number) and then measure the completeness and accuracy of the data they provide.
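One way to score a provider's sample is sketched below in Python. It assumes you have a small set of records you have verified yourself to compare against; the field names, sample records, and the two functions are hypothetical and only illustrate the arithmetic.

```python
def completeness(records, fields):
    """Share of requested fields the provider actually filled in."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f) not in (None, ""))
    return filled / total


def accuracy(records, truth, fields):
    """Share of filled fields that match values you have verified yourself."""
    checked = correct = 0
    for record in records:
        known = truth.get(record["email"], {})
        for field in fields:
            value = record.get(field)
            if value in (None, "") or field not in known:
                continue  # only score fields that are both filled and verifiable
            checked += 1
            correct += value == known[field]
    return correct / checked if checked else 0.0


# Hypothetical sample returned by a provider, and our own verified values.
FIELDS = ["job_title", "company"]
SAMPLE = [
    {"email": "a@example.com", "job_title": "CMO", "company": "Acme"},
    {"email": "b@example.com", "job_title": None, "company": "Globex"},
]
VERIFIED = {
    "a@example.com": {"job_title": "CMO", "company": "Acme Corp"},
    "b@example.com": {"company": "Globex"},
}

print(f"completeness: {completeness(SAMPLE, FIELDS):.0%}")
print(f"accuracy: {accuracy(SAMPLE, VERIFIED, FIELDS):.0%}")
```

Exact string matching is a deliberately strict choice here ("Acme" and "Acme Corp" count as a mismatch). You may prefer a looser comparison for fields like company names.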

(At Hull, we’re also working with more and more data enrichment providers to co-sell Hull + Enrichment packages. Fewer contracts and a faster sales process. Learn more about data enrichment through Hull and talk to sales.)

Voila! You have identified the sources for your 3rd party data

Tactical Takeaway #3: Outline your data mapping and fallback strategies

It’s not just about identifying and sourcing data. You need to understand how the different sources of data map between tools, and your fallback strategies between them. For instance, a job role (like Marketing, Sales, or Operations) might be captured:

  • 1st party from a lead or customer directly (through form, chat, or a sales call)
  • 2nd party inference (through onsite or in-product behavior and categorization, e.g. someone using templates for marketers)
  • 3rd party data enrichment provider

The best teams have a clear strategy for how these different sources interact and how to prioritize them within a data flow. This means mapping out your data.

The first part of this is understanding how your data will appear in existing tools. Often, you’ll already have some of these fields set up (and tied to other logic like workflows and email templates). You need to map equivalent fields between tools. Here’s an example for name, email, and job title.

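| Field    | Salesforce            | HubSpot               | Intercom         |
|----------|-----------------------|-----------------------|------------------|
| Name     | First name, Last name | First name, Last name | Name             |
| Email    | Email                 | Email                 | Email            |
| Job Role | Job Title             | Title                 | Employment Title |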

You need to map data across all your key tools that own lead or customer profiles, for all the data points that define your ideal customer profile.
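A mapping like the one in the example table can be written down as data, so that one canonical profile is translated into each tool's field names. The Python sketch below is illustrative only: the canonical keys (first_name, job_role, and so on) and the map_profile function are our own assumptions, and the tool field names are simply copied from the example above rather than checked against each vendor's API.

```python
# Canonical key -> field name in each tool, copied from the example table.
FIELD_MAP = {
    "salesforce": {
        "first_name": "First name",
        "last_name": "Last name",
        "email": "Email",
        "job_role": "Job Title",
    },
    "hubspot": {
        "first_name": "First name",
        "last_name": "Last name",
        "email": "Email",
        "job_role": "Title",
    },
    "intercom": {
        "full_name": "Name",  # Intercom holds the name in a single field
        "email": "Email",
        "job_role": "Employment Title",
    },
}


def map_profile(profile, tool):
    """Translate a canonical profile into one tool's field names."""
    source = dict(profile)
    source["full_name"] = " ".join(
        part for part in (profile.get("first_name"), profile.get("last_name")) if part
    )
    # Skip empty values so we never overwrite a tool's data with blanks.
    return {
        target: source[key]
        for key, target in FIELD_MAP[tool].items()
        if source.get(key)
    }


profile = {
    "first_name": "Jane",
    "last_name": "Doe",
    "email": "jane@example.com",
    "job_role": "Marketing",
}
print(map_profile(profile, "intercom"))
```

Keeping the mapping as data rather than code makes it easier to review with the teams who own each tool, and to extend when you add a data point to your ideal customer profile.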

Next, you need to understand the priority of each data source. For instance, you may find data from lead forms more precise and accurate than 3rd party data enrichment, or the other way around.

Your data should write into one global trait from all the possible sources so you just need to map one trait that is populated with the most precise, most accurate answer at any one time.

To do this you can use Hull Processor to write and update hull.traits() using any of your other data. You can then use this single set of reliable traits, built with fallback strategies, to cleanly map the right data to all your tools.
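The fallback logic itself is simple to reason about. Here is a minimal Python sketch of it, written as plain Python rather than actual Hull Processor code. The source names and their priority order are assumptions for illustration; your own order may well be different.

```python
# Assumed priority: what the lead told us, then what we inferred, then 3rd party data.
SOURCE_PRIORITY = ["form", "inferred", "enrichment"]


def resolve_trait(values_by_source, priority=SOURCE_PRIORITY):
    """Return (value, source) from the highest-priority source that has a value."""
    for source in priority:
        value = values_by_source.get(source)
        if value not in (None, ""):
            return value, source
    return None, None


# The form left job role blank, so we fall back to the enrichment provider.
print(resolve_trait({"form": "", "enrichment": "Marketing"}))

# Once the lead tells us directly, their answer wins.
print(resolve_trait({"form": "Sales", "enrichment": "Marketing"}))
```

Returning the winning source alongside the value is a design choice: it lets you audit where each global trait came from, which helps when a sales rep asks why a field looks wrong.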

We notice amongst data-driven teams that this is an iterative process. CDPs like Hull can help by giving you a schemaless (read: nothing’s “locked down”) unified customer profile whilst you iterate on and extend your customer profile data models.

Voila! You now know where your data will come from and how it will map across all your tools.

Tactical Takeaway #3: Universal data enrichment

Often, data enrichment is not distributed equally. Teams often start by enriching tools with plug-and-play integrations like Salesforce. This creates a data silo.

In the ideal world, you need to break down silos and sync your customer data. You want every tool to share a common view of each lead and customer.

This is particularly true with marketing-to-sales handoff. The modern practice we see with CRMs like Salesforce is that contacts are only created after signup or when sales reps get involved, with lead management held in a separate tool or database like a marketing automation platform or CDP. This keeps CRMs much cleaner, more manageable, and cheaper to run.

However, marketing needs to be able to use data enrichment to qualify leads, segment messages, and personalize content. Marketing is also likely to hold many more leads than sales (particularly if only sales-ready leads are in Salesforce). So, you need to be able to enrich your marketing leads too, across all your marketing tools:

  • Email
  • Ads
  • Live chat
  • Web personalization
  • Sales calls
  • Direct mail
  • Analytics

… and everywhere you have a lead or customer profile. One way to do this is with a unified lead profile where you can capture everything about each lead, and everything they’ve ever done, according to every tool and database you have.

Your data enrichment providers and sources should make it easy and accessible to have your data flow to more than one tool. Some providers charge more for integration with different tools (often Salesforce). This is understandable if there’s additional value in how they integrate.

But avoid tools which charge you multiple times to use the same data in different tools with no other value-add — at Hull, we believe in breaking, not making data silos.

By enriching your main lead management system (CRM, marketing automation, or customer data platform) then syncing lead profiles between all your other marketing tools, you can integrate enriched data across your marketing campaigns.

Voila! You now have enriched data to use across all your sales and marketing tools

Tactical Takeaway #4: Selective and sequential data enrichment

It doesn’t always make sense to enrich everyone all the time, particularly if there are multiple layers of enrichment to deliver the desired result. You need full control over who is enriched and when. There are two scenarios we see data-driven teams using these strategies in.

  1. Selective data enrichment
  2. Sequential data enrichment

Selective data enrichment

Data enrichment providers often charge on a credit system based on the number of records returned. It might be the case that you can already disqualify and remove a set of leads based on data you already have from forms or previous data enrichment. No additional data will convince you that this account is a good fit.

For instance, it may not matter if someone’s a marketer if they’re part of a Government agency that you can’t service. Or it may not matter if someone’s a good fit or not if the company or person is in a country that speaks (and needs to be served in) another language.

To control data flows, you need to be able to create segments to sync selective contacts to your enrichment provider.
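In practice, a segment like this is just a filter applied before any credits are spent. The Python sketch below shows the idea using the two disqualifiers mentioned above. The field names, the excluded industry, and the list of supported countries are hypothetical examples, not recommendations.

```python
# Hypothetical disqualifiers, based on data we already hold about each lead.
EXCLUDED_INDUSTRIES = {"government"}
SUPPORTED_COUNTRIES = {"US", "GB", "CA"}


def should_enrich(lead):
    """Decide whether a lead is worth spending enrichment credits on."""
    industry = (lead.get("industry") or "").lower()
    if industry in EXCLUDED_INDUSTRIES:
        return False
    country = lead.get("country")
    # Unknown country: still enrich, since we cannot rule the lead out yet.
    if country is not None and country not in SUPPORTED_COUNTRIES:
        return False
    return True


leads = [
    {"email": "a@example.com", "industry": "Software", "country": "US"},
    {"email": "b@example.gov", "industry": "Government", "country": "US"},
    {"email": "c@example.jp", "industry": "Software", "country": "JP"},
    {"email": "d@example.com"},
]
to_enrich = [lead["email"] for lead in leads if should_enrich(lead)]
print(to_enrich)
```

Note the asymmetry: a lead is only excluded when we have positive evidence against it. Leads with missing data still go through, because enrichment is how you would fill in those gaps.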

Sequential data enrichment

As well as using segmentation for controlling data flows, you can use segments to control a series of data enrichment calls.

Customer data flows in loops, where you request data about a profile and get a response. In some cases, you may want to trigger multiple flows. For example, you might first call Clearbit to get data about the person and company, and then call Datanyze to get matching technologies if they happen to be a good fit.

You can also use sequential enrichment to manage powerful data-driven strategies where the response from one data enrichment loop is the key for the next:

  1. IP address to company domain — “company revealed”
  2. Company domain to company profile — “company enrichment”
  3. Company domain to person profiles — “prospecting”
  4. Email address to person profile — “person enrichment”

This is called the Reveal Loop — a strategy we’ve noticed many scaling B2B SaaS startups adopt.
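The four steps above can be sketched as a chain of lookups where each response supplies the key for the next call. In the Python example below, the dictionaries are hypothetical stand-ins for real provider calls, and the good-fit check is passed in as a function so the chain can stop early and avoid paying for prospecting on poor-fit companies.

```python
# Hypothetical stand-ins for the four provider lookups.
IP_TO_DOMAIN = {"203.0.113.7": "example.com"}
COMPANY_PROFILES = {"example.com": {"name": "Example Inc", "employees": 120}}
PROSPECTS = {"example.com": ["jane@example.com", "raj@example.com"]}
PERSON_PROFILES = {
    "jane@example.com": {"job_title": "VP Marketing"},
    "raj@example.com": {"job_title": "Sales Ops Lead"},
}


def reveal_loop(ip_address, is_good_fit):
    """Chain the lookups, stopping early when a step gives us nothing to go on."""
    domain = IP_TO_DOMAIN.get(ip_address)          # 1. company revealed
    if domain is None:
        return None
    company = COMPANY_PROFILES.get(domain, {})     # 2. company enrichment
    result = {"domain": domain, "company": company, "people": {}}
    if not is_good_fit(company):
        return result                              # skip prospecting for poor fits
    for email in PROSPECTS.get(domain, []):        # 3. prospecting
        result["people"][email] = PERSON_PROFILES.get(email, {})  # 4. person enrichment
    return result


def at_least_50_employees(company):
    return company.get("employees", 0) >= 50


print(reveal_loop("203.0.113.7", at_least_50_employees))
```

With real providers, each step is a separate billable request, so the early exits are where the cost savings come from.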

Voila! You have full control over your data enrichment flows.

Tactical Takeaway #5: Transform data enrichment into action

Most data enrichment providers do a good job at packaging up (usually messy, badly formatted) data into a clean, consistent set. But harking back to the first tactical takeaway…

The best teams we see selectively use data enrichment providers based on their ideal customer profile. They use whatever data is needed to identify and engage potential best-fit customers.

Data-driven teams don’t stop at whatever their enriched data source offers. Once you’ve got all these data points there’s processing that’s needed to turn it into action.

Using data from data enrichment providers, you can compute new sets of data that are relevant to just your business. For instance, at Hull, we want to work out which CRM a lead or customer is using.

But most data enrichment providers only provide an array of technologies through their API. We use Processor to “unpack” this array and write in the key tools each lead uses in certain categories, which we then use to personalize our emails and notifications to sales reps.
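To illustrate the kind of "unpacking" involved, here is a small Python sketch (not actual Processor code) that picks the CRM out of a flat list of technologies. The category list and the shape of the technologies array are assumptions made for the example.

```python
# Hypothetical list of tools we count as a CRM, keyed by lowercase name.
CRM_TOOLS = {
    "salesforce": "Salesforce",
    "hubspot crm": "HubSpot CRM",
    "pipedrive": "Pipedrive",
}


def detect_crm(technologies):
    """Return the first recognized CRM in a list of technology names, else None."""
    for tech in technologies:
        name = CRM_TOOLS.get(tech.strip().lower())
        if name:
            return name
    return None


technologies = ["Google Tag Manager", "salesforce ", "Intercom"]
print(detect_crm(technologies))
```

The result is a single trait you can drop straight into an email template or a notification to a sales rep, instead of an array nobody reads.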

You can also simply create segments to control data flows: who is enriched, who is synced to your CRM, and so forth. These segments can use enriched data to create and update email lists, ad audiences, qualified leads, and more.

Automated lead qualification using data enrichment fixed a 50% drop off in demo request responses for one Hull customer.


Finally, you need to connect up a data flow between all your tools. For instance, at Hull and many similar companies we work with, each demo request that comes in through a form or live chat needs to be enriched with Clearbit and Datanyze data.

If they match certain criteria, they enter a segment of Qualified Leads which is then synced to a CRM and Slack to notify sales reps.
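Put together, the flow looks something like the Python sketch below. The qualification criteria (company size and a detected CRM) and the action names are hypothetical. In a real setup, the actions would be syncs to your CRM and messages to Slack rather than tuples in a list.

```python
def is_qualified(lead):
    """Hypothetical criteria: a big enough company that already uses a CRM."""
    return (lead.get("employees") or 0) >= 50 and lead.get("crm") is not None


def route_demo_requests(leads):
    """Turn enriched demo requests into the actions we want to automate."""
    actions = []
    for lead in leads:
        if is_qualified(lead):
            actions.append(("sync_to_crm", lead["email"]))
            actions.append(("notify_sales", lead["email"]))
    return actions


# Demo requests after enrichment (values are made up for the example).
enriched_leads = [
    {"email": "jane@example.com", "employees": 120, "crm": "Salesforce"},
    {"email": "sam@example.org", "employees": 8, "crm": None},
]
print(route_demo_requests(enriched_leads))
```

Keeping the criteria in one small function makes it easy for sales and marketing to agree on, and later change, what "qualified" means.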

Voila! You can transform your enriched data into automated actions

Best-fit criteria for data enrichment

Though data enrichment has broad usage and appeal, we’ve observed the best results amongst companies who:

Have a clear ideal customer profile (even if it’s not validated)

You need to find the data to define your ideal customer profile, not the other way around.

Some earlier stage companies have used data enrichment to test and experiment with different markets. Enriched customer data records also gave them more basis for quicker analysis (and faster iteration) than without.

Have a clear customer profile data model already

You have a tool set up which owns each key stage of the customer journey (like leads and customers) with a means to upload or sync contacts and contact data between them. This means you can put your enriched profile data to use right away.

Have some contacts and web traffic

If you’ve very few contacts to enrich, your time will be better spent working elsewhere in the funnel. Some data enrichment providers offer a simple one-off upload and batch service if you want to test the waters relatively cheaply.

Results we’ve seen

We’ve observed amongst Hull customers who fit the criteria:

Ed Fry

Prev 'Ed of Growth at Hull, working on all things content, acquisition & conversion. Conference speaker, flight hacker, prev. employee #1 at inbound.org (acq. HubSpot). Now at Behind The Growth
