What Data Should a Candidate Enrichment API Actually Return?

Most candidate enrichment APIs return contact data and work history. Recruiting platform builders need nine layers including verification signals, employer context, and freshness metadata.

Published

May 10, 2026

Written by

Abhilash Chowdhary

Reviewed by

Nithish

The gaps in a candidate enrichment API usually show up after you start building on it. Your outreach email bounces because the API returned a business email for someone who changed jobs. Fraud filtering fails because the response has no profile age or verification fields to filter on. Employer context is missing, so your matching engine treats a candidate at a 10-person seed startup the same as one at a public company.

These are fields that most enrichment APIs do not return consistently. Tool builders often realize they need them only after the HR teams and hiring managers who speak with candidates downstream of sourcing start flagging poor-fit candidates.

After speaking with over 60 teams building recruiting tools, sourcing platforms, and talent marketplaces, we identified nine distinct data layers that a candidate enrichment API should return. This guide covers all nine, with the specific fields in each and why they matter for recruiting platforms.

Contact Data: Personal Email, Business Email, and Phone

Contact data is the data point teams ask about first, and the one that causes the most frustration when it requires multiple API calls to assemble.

Teams consistently describe using one provider to find candidates, a second to get email addresses, and sometimes a third for phone numbers. One builder described running a three-step process across separate databases just to get a personal email, business email, and phone number for a single candidate.

What your enrichment API should return in the contact layer:

Personal email (persists when candidates change jobs, unlike business email)
Business email (verified, with validation status)
Personal mobile number (verified, the channel that actually converts for passive candidates)
Office direct dial (for executive outreach where personal mobile is unavailable)
Email verification status (valid, invalid, or catch-all, so your platform can suppress bounces before sending)

The critical distinction here is personal vs business email. Business emails bounce the moment someone leaves a company, and with B2B data decaying at roughly 2.1% per month according to Marketing Sherpa's research, more than a fifth of your contact records go bad within a year. A people enrichment API that returns both email types with verification status lets your platform route outreach through the channel most likely to land.
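A minimal sketch of that routing decision, assuming a hypothetical response shape in which each email is an object with `address` and `status` keys (these names are illustrative, not any provider's actual schema). It prefers a valid personal email, falls back to a valid business email, then to a mobile number, and treats catch-all addresses as unsafe to send to.

```python
from typing import Optional, Tuple


def pick_outreach_channel(contact: dict) -> Optional[Tuple[str, str]]:
    """Return (channel, value) for the contact most likely to land, or None.

    Field names here are hypothetical; adapt them to your provider's response.
    """
    personal = contact.get("personal_email")
    business = contact.get("business_email")

    # Personal email survives job changes, so try it first.
    if personal and personal.get("status") == "valid":
        return ("personal_email", personal["address"])
    # Only use a business email that the provider verified as valid.
    if business and business.get("status") == "valid":
        return ("business_email", business["address"])
    # Fall back to a mobile number if one is present.
    if contact.get("mobile"):
        return ("mobile", contact["mobile"])
    # Nothing safe to send to: suppress outreach rather than risk a bounce.
    return None
```

Whether catch-all addresses should be suppressed or sent at lower priority is a judgment call that depends on how much bounce risk your sending domain can tolerate.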

Identity Data: Name, Location, Headline, and Profile URLs

Identity fields are foundational for deduplication, candidate matching, and display.

What your enrichment API should return in the identity layer:

Full name (first, last, and any aliases)
Current location (city, state, country, plus geo-coordinates for distance filtering)
Headline and summary (the candidate's self-description)
LinkedIn URL (canonical profile link)
Profile photo URL
Languages spoken

Location data is marginal for some teams but critical for others. For example, a healthcare staffing platform told us they need to match candidates and jobs within a commutable radius, at scale and across regions, without manual filtering. An enrichment API that returns structured location fields (city, state, country) paired with a search API that supports geo-distance filtering enables this without a separate geocoding step.
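To make geo-distance filtering concrete, here is a rough client-side sketch using the haversine formula. It assumes each candidate record carries hypothetical `lat` and `lon` keys; in practice a search API would typically apply this filter server-side.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(candidates: list, job_lat: float, job_lon: float, radius_km: float) -> list:
    """Keep candidates whose coordinates fall inside the commute radius."""
    return [
        c for c in candidates
        if haversine_km(c["lat"], c["lon"], job_lat, job_lon) <= radius_km
    ]
```

Straight-line distance is only a proxy for commute time, so treat the radius as a coarse first-pass filter.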

With more companies moving back to working from the office, we believe this data point will become increasingly important.

Work History: Current and Past Employers, Titles, and Tenure

Work history is the layer where data freshness problems cause the most visible damage. According to Marketing Sherpa's research, B2B data decays at roughly 2.1% per month, which means a database that was accurate in January is only about 84% accurate by September and under 80% accurate within a year.

Platform builders running outreach agents on top of out-of-date work history data send messages referencing roles candidates left months ago. That hurts deliverability rates and domain reputation, and wastes time and money on outreach to the wrong candidates.

What your enrichment API should return in the work history layer:

Current employer (company name, company ID, domain)
Current title
Start date at current role
Past employers (array, each with title, company, start date, end date)
Total years of experience (calculated from employment history)
Industry classification

The key evaluation criterion is whether the API returns dates granular enough to calculate tenure and detect recent job changes. An API that returns "works at Company X" without a start date cannot support job-change detection, tenure-based filtering, or career trajectory scoring. If you are building a candidate search engine, these date fields are what make filters like "changed jobs in the last 90 days" or "5+ years at current company" possible.

An auto-updating candidate database built on top of enrichment data with proper date fields can detect when profiles go out of date and trigger re-enrichment automatically.
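The filters mentioned above reduce to simple date arithmetic once start and end dates are present. The sketch below assumes the dates have already been parsed into `datetime.date` values; the function names and the 90-day and 5-year defaults are illustrative.

```python
from datetime import date
from typing import Optional


def tenure_days(start: date, end: Optional[date], today: date) -> int:
    """Days spent in a role; an open-ended role is measured up to today."""
    return ((end or today) - start).days


def changed_jobs_recently(current_start: date, today: date, window_days: int = 90) -> bool:
    """True if the current role started within the last `window_days` days."""
    return 0 <= (today - current_start).days <= window_days


def long_tenured(current_start: date, today: date, min_years: float = 5) -> bool:
    """True if the candidate has been in the current role at least `min_years`."""
    return tenure_days(current_start, None, today) >= min_years * 365.25
```

None of these checks is possible if the response only says "works at Company X" with no start date.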

Skills Data: Endorsements, Certifications, and Duration

Every competing enrichment provider returns a skills array. None of them connect skills to experience duration or distinguish between self-reported skills and verified certifications.

One recruiting platform builder described eight out of ten hired candidates being let go within their first two months because the skill claims on their profiles did not match their actual capability. The problem is that most enrichment APIs return skills as a flat list of strings ("Python", "React", "AWS") with no information about how long someone has used each skill, how many peers have endorsed them for it, or whether they hold a formal certification.

What your enrichment API should return in the skills layer:

Skills array (skill name + endorsement count from connections)
Certifications (name, issuing organization, issue date, expiration date)
Education (school, degree, field of study, graduation year)

If your enrichment API returns skills with endorsement counts alongside detailed role history with dates, your platform can infer skill duration by cross-referencing the two. A candidate who lists "Python" and has held three Python-heavy engineering roles over seven years is a fundamentally different profile than someone who completed a bootcamp last month. Without endorsement counts or role timelines in the enrichment response, your platform cannot make that distinction. This also affects lookalike candidate search, where finding candidates similar to a top performer requires skill depth and tenure data, not just matching keywords.
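One crude way to approximate skill duration from role history is sketched below. It assumes each role has hypothetical `title`, `description`, `start`, and `end` keys, and it counts a role toward a skill whenever the skill name appears in the role's title or description. That matching rule is an assumption made for illustration, not a method any provider is known to use.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def skill_months(skill: str, roles: list, today: date) -> int:
    """Sum the months of every role whose title or description mentions the skill."""
    total = 0
    for role in roles:
        text = f"{role.get('title', '')} {role.get('description', '')}".lower()
        if skill.lower() in text:
            # An open-ended role (no end date) is measured up to today.
            total += months_between(role["start"], role.get("end") or today)
    return total
```

Substring matching will confuse "Java" with "JavaScript", and overlapping roles are double counted, so a production version would need proper tokenization and interval merging.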

Signals: Job Changes, Open-to-Work, and Profile Activity

Passive candidates do not apply to jobs, so the only way to identify them programmatically is through behavioral signals like recent job changes, profile edits, open-to-work indicators, and content activity.

Most enrichment providers treat signals as a separate product with its own pricing and API surface. For platform builders, this creates an architectural problem. You want signal data attached to the candidate profile, not sitting in a separate system that requires its own integration.

What your enrichment API should return in the signals layer:

Last job change date (when did this person most recently switch roles)
Open-to-work flag (publicly or privately marked)
Recent profile edit timestamp (profile edits correlate with job-seeking behavior)
Post activity (count and recency of content published)
Engagement signals (commenting, reacting, connecting at higher-than-baseline rates)

One team building a candidate-matching engine described wanting a "feed of profile changes" that would surface candidates actively updating their skills, changing headlines, or engaging with content in their target industry. Another platform builder needed "real-time visibility into job-seeking signals" to trigger automated outreach sequences the moment a passive candidate showed intent.

A Watcher API that pushes these signals as webhook events, combined with enrichment that returns the current state of each signal field, gives your platform both the point-in-time view and the ongoing change. For tracking specific events like role transitions, a job change tracking workflow can trigger re-enrichment automatically.
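As a rough illustration of the push-based pattern, the handler below queues a profile for re-enrichment when a relevant event arrives. The event shape (`type`, `profile_id`) and the event names are hypothetical and do not describe any specific provider's webhook payload.

```python
# Hypothetical event types that should trigger a fresh enrichment call.
REENRICH_EVENTS = {"job_change", "open_to_work", "headline_change"}


def handle_watcher_event(event: dict, reenrich_queue: list) -> bool:
    """Queue the profile for re-enrichment; return True if it was newly queued."""
    if event.get("type") not in REENRICH_EVENTS:
        return False  # ignore events that do not change outreach decisions
    profile_id = event.get("profile_id")
    if profile_id is None or profile_id in reenrich_queue:
        return False  # malformed event, or this profile is already queued
    reenrich_queue.append(profile_id)
    return True
```

In a real service the list would be a durable queue and the handler would sit behind an HTTP endpoint that verifies the webhook's signature.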

Verification Signals: Detecting Fake Profiles Before Outreach

Twenty-three of the teams we spoke with raised candidate fraud detection as a requirement for their enrichment provider. Not one found a provider that returns verification metadata.

The problem grows with volume. When your platform sources thousands of candidates automatically, fake or abandoned profiles waste outreach budget and damage sender reputation. One team building a talent-fraud detection layer described needing the professional network profile age and verification badges from their enrichment provider, and finding neither available from their current data source. Another reported that data from their provider included profiles that were "either inaccurately filled in or fake profiles entirely," eroding trust in their sourcing pipeline.

What your enrichment API should return in the verification layer:

Profile creation date (accounts created recently with extensive history are fraud signals)
Verification badge (identity-verified profiles)
Connection count (extremely low counts on "senior" profiles suggest fake accounts)
Recommendation count (hard to fake at volume)
Profile completeness score (sparse profiles with impressive titles are a red flag)
Last activity timestamp (dormant profiles are either abandoned or fabricated)

These fields are largely absent from most data providers, yet the cost of routing outreach toward fake profiles at scale is significant. Wasted enrichment credits, bounced emails, and degraded domain reputation all compound, affecting deliverability across your entire outreach operation.
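If those fields were available, a first-pass filter could be as simple as the rule-based sketch below. Every field name and threshold here is an illustrative assumption; real cutoffs would need tuning against labeled examples.

```python
from datetime import date


def fraud_flags(profile: dict, today: date) -> list:
    """Return a list of heuristic red flags for a profile (empty means none)."""
    flags = []

    # A recently created account that claims a long career is suspicious.
    age_days = (today - profile["created_on"]).days
    if age_days < 180 and profile.get("years_experience", 0) >= 10:
        flags.append("new_account_long_history")

    # Senior titles with very few connections suggest a fabricated account.
    if profile.get("seniority") == "senior" and profile.get("connection_count", 0) < 50:
        flags.append("senior_low_connections")

    # Sparse profiles are a red flag (completeness assumed to be 0.0 to 1.0).
    if profile.get("completeness", 1.0) < 0.3:
        flags.append("sparse_profile")

    # No activity for over a year: abandoned or fabricated.
    last_active = profile.get("last_active_on")
    if last_active and (today - last_active).days > 365:
        flags.append("dormant")

    return flags
```

A platform might suppress outreach for profiles with two or more flags and route single-flag profiles to manual review.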

Technical Data: GitHub, Stack Overflow, and Portfolio Links

For engineering and technical hiring, professional network profiles alone are an insufficient signal. A frontier AI lab told us their internal sourcing data for Rust engineers and researchers was "pretty old and scarce." Solo headhunters sourcing AI and engineering roles specifically asked for GitHub-connected talent search. Multiple platform builders wanted thought-leadership signals from technical content alongside profile data.

What your enrichment API should return in the technical layer:

GitHub URL (profile link)
GitHub metrics (public repos, stars, contribution frequency)
Stack Overflow profile (if available, with reputation score)
Personal website / portfolio URL
Twitter/X handle

Social Posts and Content Activity

Beyond static profile links, returning recent post data provides a real-time signal layer. A candidate who published three articles about distributed systems this month is demonstrably active in the field, regardless of what their static professional network skills list says. A Posts API that returns recent content alongside profile enrichment gives your platform this signal without a separate integration. You can also use a web search API to scan the web for published blogs, open-source projects, or research papers on topics relevant to the role you are hiring for.

Employer Context: Funding Stage, Headcount, and Growth Signals

Recruiters and sourcing platforms evaluate candidates partly based on where they work. A senior engineer leaving a Series-C company with 200% headcount growth is a different profile than one leaving a company that just went through layoffs, yet most enrichment APIs treat the candidate and their employer as entirely separate entities.

Multiple platform builders flagged this gap. One UK headhunting firm building a BigQuery candidate database needed revenue-band filters on candidates' employers to identify target profiles. Teams building executive search longlists particularly need this, because evaluating a VP-level candidate requires knowing whether their employer is a 50-person startup or a 10,000-person enterprise.

What your enrichment API should return in the employer context layer:

Employer funding stage (seed, Series A/B/C, public, bootstrapped)
Employer total raised
Employer headcount (current)
Employer headcount growth (3-month and 6-month percentage change)
Employer industry
Employer founding year

Returning employer context alongside candidate profile data eliminates the need for a separate company enrichment API call for every candidate. Your platform can filter candidates by employer growth stage, identify candidates at companies likely to downsize (negative headcount growth), or prioritize outreach to candidates at well-funded competitors.
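A small sketch of how embedded employer context could drive segmentation, assuming a hypothetical nested `employer` object with `headcount_growth_6m_pct` and `funding_stage` keys. The set of stages treated as "well funded" is an illustrative choice.

```python
# Illustrative definition of "well funded"; adjust to your own targeting.
WELL_FUNDED_STAGES = {"series_b", "series_c", "public"}


def employer_segment(candidate: dict) -> str:
    """Bucket a candidate by employer context carried in the same response."""
    employer = candidate.get("employer") or {}
    growth = employer.get("headcount_growth_6m_pct")

    # Negative headcount growth: the employer may be downsizing.
    if growth is not None and growth < 0:
        return "shrinking_employer"
    if employer.get("funding_stage") in WELL_FUNDED_STAGES:
        return "well_funded_employer"
    return "other"
```

Because the employer fields arrive with the candidate record, this runs without a second lookup per candidate.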

Freshness: When Was This Profile Actually Last Checked?

Every enrichment provider claims to have "up to date" data. None exposes, as a returnable field in its API response, when a specific profile was last actually verified.

This matters because providers with monthly refresh cycles deliver data that is 30 to 90 days old on average. One platform builder described their enrichment data as "always old because the patchwork mechanisms are delayed." At 2% monthly decay, a profile enriched 90 days ago has a meaningful probability of containing at least one out-of-date field, most likely the employer or title.
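To put a number on that probability, the sketch below compounds a fixed monthly decay rate. It assumes the rate applies uniformly and independently each month, which is a simplification.

```python
def fraction_stale(monthly_decay: float, months: float) -> float:
    """Share of records expected to be out of date after `months`,
    assuming a constant, compounding monthly decay rate."""
    return 1 - (1 - monthly_decay) ** months


# Example: 2% monthly decay over three months (about 90 days).
print(round(fraction_stale(0.02, 3), 3))  # 0.059
```

Under these assumptions, roughly 6 in 100 profiles enriched 90 days ago already carry an out-of-date field, before accounting for the age of the provider's own cache.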

What to look for when evaluating freshness:

last_enriched_at (timestamp of when this specific profile was last queried from source)
data_source (which provider or method populated this record)
Real-time enrichment option (can you force a live re-query for a specific profile rather than getting cached data?)

When freshness metadata is available, your platform can make intelligent decisions about re-enrichment. A profile enriched three days ago does not need re-enrichment. A profile last enriched nine months ago, sourced from a batch provider, should be re-enriched before any outreach is sent. Without this metadata, your platform treats all profiles identically regardless of data age. Crustdata's People Enrichment API supports both cached (in-database) and real-time enrichment modes, so your platform can choose per-request whether to accept cached data or force a live re-query.
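The decision described above can be sketched as a simple age check. The `data_source` labels and the freshness budgets are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Illustrative freshness budgets per source type; tune for your own data.
MAX_AGE = {"realtime": timedelta(days=30), "batch": timedelta(days=14)}


def needs_reenrichment(last_enriched_at: datetime, data_source: str, now: datetime) -> bool:
    """True if the record is older than the budget allowed for its source."""
    budget = MAX_AGE.get(data_source, timedelta(days=14))  # unknown source: be strict
    return now - last_enriched_at > budget


now = datetime(2026, 5, 10, tzinfo=timezone.utc)
print(needs_reenrichment(now - timedelta(days=3), "realtime", now))  # False
print(needs_reenrichment(now - timedelta(days=270), "batch", now))   # True
```

Running this check just before outreach, rather than on a fixed schedule, spends re-enrichment credits only on records that are about to be used.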

Candidate Enrichment API Delivery: REST, Bulk, and Webhooks

The nine data layers above are only useful if they arrive in the format your architecture needs. Different build stages and use cases require different delivery mechanisms.

REST API (per-record enrichment): Best for early-stage platforms, real-time enrichment on candidate view or import, and low-to-moderate volume. Returns a single candidate's full profile in one response.

Bulk datasets (flat files or parquet): Best for platforms at scale where per-record API costs become prohibitive. One team building an AI-native ATS replaced their per-call CoreSignal API integration with bulk dataset delivery once they hit volume. Another described reaching "the scale where it probably makes sense to just buy data sets versus on-demand API calls."

Webhook / Watcher delivery (push-based): Best for signal-driven workflows. Instead of polling an API daily to check whether candidates changed jobs, a watcher pushes notifications only when tracked events occur.

Waterfall enrichment (multi-provider fallback): The architecture that platform builders consistently described for achieving 95%+ deliverability. One team described running a primary provider for search, a second provider for contact data, a third for phone numbers, and Crustdata as the real-time verification layer. A real-time and batch enrichment architecture that combines bulk for baseline coverage with API and webhook for freshness on priority records is the pattern we see most teams end up using.
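A waterfall can be sketched as an ordered list of provider callables, each tried until one returns data. The provider functions here are placeholders; each would wrap a real client in practice.

```python
from typing import Callable, Iterable, Optional, Tuple

Provider = Tuple[str, Callable[[str], Optional[dict]]]


def waterfall_enrich(lookup_key: str, providers: Iterable[Provider]) -> Tuple[Optional[str], Optional[dict]]:
    """Try each provider in order; return (provider_name, record) for the first hit."""
    for name, fetch in providers:
        try:
            record = fetch(lookup_key)
        except Exception:
            # A failing provider should not stop the waterfall.
            continue
        if record:
            return name, record
    return None, None  # every provider missed or failed
```

Ordering providers from cheapest to most expensive keeps costs down, while recording which provider answered gives you the `data_source` value needed for freshness decisions.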

What Most Candidate Enrichment API Providers Leave Out

After mapping what 60+ recruiting platform builders need against what providers actually return, specific gaps stand out across the major vendors.

People Data Labs: 1.5B+ people profiles with broad coverage, but profiles are refreshed on a monthly cycle, so records can lag behind real-world changes. Returns business email, but personal email coverage is inconsistent.

CoreSignal: 865M+ employee profiles with real-time API access, GitHub URLs, and 300+ fields per record. Strong raw data for engineering teams, but independent reviews note that profile changes are not pushed via webhooks, so your platform has to poll for updates.

Apollo: 275M contacts with built-in search and outreach features, but multiple platform builders described data quality issues with outdated records. Profiles frequently listed candidates at companies they left months prior, and accuracy issues compound when used for automated outreach at scale.

Bright Data: Raw scraping infrastructure that can return profile data at scale, but multiple teams reported receiving fake or fabricated profiles. Without verification signals in the response, there is no programmatic way to filter them out before your outreach pipeline processes them.

ZoomInfo: 321M profiles with claimed 92% contact accuracy, but platform builders report a high volume of outdated profiles and limited European data coverage. Enterprise-only pricing does not fit usage-based architectures, and the API is not designed for the kind of high-volume programmatic access recruiting platforms need.

Layer	PDL	CoreSignal	Apollo	ZoomInfo	Crustdata
Contact (email + phone)	Partial (business only)	Professional email only	Yes	Yes	Yes (personal + business + phone)
Work history	Yes (monthly refresh)	Yes	Yes (accuracy issues)	Yes	Yes (real-time option)
Skills + certifications	Basic list	Inferred from text	Basic list	Yes	Yes + duration inference
Behavioral signals	No	Workforce signals	Limited	Intent data	Yes (job change, profile activity, posts)
Verification / fraud	No	No	No	No	Yes (profile age, connections, badges)
Technical profiles	No	Yes (GitHub URL)	No	No	Yes (GitHub, portfolio)
Employer context	No	No	No	Partial	Yes (funding, headcount, growth)
Freshness metadata	No	No	No	No	Yes (cached + real-time modes)

Conclusion

A basic candidate enrichment API returns contact data and work history. A recruiting platform also needs layers five through nine described in this guide: behavioral signals, verification, technical profiles, employer context, and freshness metadata. These fields determine whether your sourcing agent sends relevant outreach or wastes budget on outdated, fake, or contextless profiles.

As recruiting platforms become more autonomous through AI agents and agentic workflows, the data layer they are built on determines their maximum capability. An enrichment API that returns all nine layers gives your platform the foundation to build intelligent sourcing, automated matching, and signal-driven outreach without stitching together five separate providers.

If you are building a recruiting platform and evaluating enrichment providers, Crustdata's recruiting solution returns real-time people data with contact, signals, employer context, and freshness metadata through API, bulk, and webhook delivery. Book a demo to see the full field inventory.

Abhilash writes about data-driven automation, enrichment systems, and API-powered intelligence for GTM, recruiting, and investment use cases. He writes for builders who care about accuracy, latency, and reliability with technical guidelines and tips.