How to Enrich a Candidate From Their GitHub Handle With a People API

How to turn a GitHub handle into a structured candidate record with a people API, resolve it to the right person, and pull repos, languages, and recent activity.

Published: Jun 19, 2026

Written by: Abhilash Chowdhary

Reviewed by: Nithish

Read time: 7 minutes

A recruiter we spoke with described a telling habit. They find an engineer on GitHub, then send the same message on GitHub and on the person's professional profile, just to see which one gets read. They are not sure the two accounts even belong to the same human. That guesswork is what slows down GitHub sourcing once you are past finding the candidate in the first place.

This guide is about that step. You have a GitHub handle, and you want to turn it into a structured candidate record that holds the real person behind the account, a confidence score that the match is right, their repositories, the languages they actually write, and what they have shipped recently. We will do it through a people API, with copy-paste code, and we will be honest about where GitHub stops being useful. You can run the same calls on a free Crustdata account, which comes with 100 credits.

Why is GitHub a qualifier rather than a discovery channel

It helps to start with the objection, because it is correct. On the recruiting and engineering forums, the common take on GitHub is that it is overrated as a way to find people. A long cscareerquestions thread on whether employers even check it concludes that by the time a candidate is being looked at this closely, most have already been screened out by other means. A hiring manager on Hacker News puts it more plainly, that most GitHub profiles they see are worthless, mostly trivial contributions or cookie-cutter tutorial code.

So this guide does not treat GitHub as a discovery channel. It treats GitHub as the layer where you verify and enrich a candidate you already have. The handle confirms identity, surfaces the work, and tells you what to write in the first line of an outreach. For the broader question of why strong engineers so often have thin public profiles in the first place, we wrote a separate piece on identifying great candidates from sparse profiles.

How do you match a GitHub handle to the right person

A handle does not always point cleanly to one candidate. Two engineers can share the same name, a strong developer can hide behind a pseudonymous account, and GitHub hands out noreply commit emails by default. A talent-marketplace builder we spoke with named the doubt directly when he asked how you figure out which GitHub belongs to which person, and how you know that is accurate. His plan was to test it by hand, because he had no other way to trust it.

A people API closes that gap by returning the linked person and a confidence score on the match. You pass the GitHub URL, and you get back the Crustdata person it resolves to along with the developer-profile data.

The response carries a crustdata_person_id and a dev_platform_profiles array. Each profile includes a confidence_score from 0 to 1, the account name and bio, follower counts, and the organizations the account belongs to. The same dev_platform_profiles data also comes back inline from the Person Enrich endpoint when you already have a professional profile URL or a work email, so you can fold it into an enrichment call you are making anyway.

Reading the confidence score

The score reflects how well the account ties back to a known person, using signals like the handles the owner declared on the profile, the name, and the linked accounts. Treat anything above roughly 0.9 as a match you can act on, and anything in the middle as a match you should confirm with a second signal before outreach. Not every handle resolves, and a crustdata_person_id of 0 means the account is not yet linked to a person in the graph. That last case is the honest answer to the accuracy question, and it is worth testing on your own list of handles before you rely on it.
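The thresholds above can be sketched as a small triage function. This is a minimal sketch, not Crustdata's own client: it assumes the response is already parsed into a dict carrying the `crustdata_person_id`, `dev_platform_profiles`, and `confidence_score` fields named in this guide. The function name, the 0.9 cutoff constant, and the sample payload are illustrative assumptions.

```python
from typing import Any

# "Roughly 0.9" from the text; an assumption to tune on your own list.
ACT_THRESHOLD = 0.9


def classify_match(response: dict[str, Any], act_threshold: float = ACT_THRESHOLD) -> str:
    """Return 'unresolved', 'act', or 'confirm' for one resolved handle."""
    # A person id of 0 (or a missing id) means the account is not linked yet.
    if not response.get("crustdata_person_id"):
        return "unresolved"
    profiles = response.get("dev_platform_profiles") or []
    if not profiles:
        return "unresolved"
    # When several profiles come back, judge by the best-scoring one.
    best = max(float(p.get("confidence_score") or 0.0) for p in profiles)
    return "act" if best > act_threshold else "confirm"


# Illustrative payload only: made-up values in the shape the text describes.
sample = {
    "crustdata_person_id": 12345,
    "dev_platform_profiles": [{"name": "Example Dev", "confidence_score": 0.95}],
}
print(classify_match(sample))
```

Anything that comes back as `confirm` goes to a second-signal check before outreach, and `unresolved` handles are worth logging so you can measure the resolution rate on your own list.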

What a people API returns from a GitHub handle

Once the handle resolves, each entry in dev_platform_profiles gives you a structured view of the account without any scraping of your own.

| Field | What it tells you |
| --- | --- |
| account_type | u for a user, o for an organization |
| name, bio, company_text | Display name, free-text bio, and declared employer |
| public_repo_count | How many public repositories the account has |
| followers, following | Reach and how actively they follow others |
| org_memberships | Public organization memberships, by login |
| declared_handles | Other accounts the owner linked, like a personal site or X |
| is_hireable | Whether the account opted into hireable status |
| location, confidence_score | Declared location and the match confidence |

The declared_handles array is the useful bridge for identity work, because it links the GitHub account to the other profiles the person chose to associate with it.

From handle to a contactable person

The step that closes the loop is contact. A technical-hiring founder we spoke with framed the whole value in one line, that if you send him an engineer's GitHub, he can send back their personal email, and to him that is trivial. The noreply commit email is no longer the dead end it looks like, because once the handle resolves to a person you enrich that person for a deliverable address.

You now have a name, an employer, a confidence score, and a way to reach them, all keyed off a single GitHub handle.

Pulling repos, languages, and recent activity

The people API gives you the repository count, the organizations, and the verified identity, which a Crustdata rep confirmed on one of these calls. It does not return a per-repository list or a per-language breakdown. For that detail you go to GitHub's own public REST API with the same handle the people API just verified.

Two endpoints do the work. The users repos endpoint returns each repository with its primary language, a fork flag, a pushed_at timestamp, and a star count. The languages endpoint returns the byte breakdown per repository, which is a far better read on what someone writes than the single primary language.
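A minimal sketch of those two calls, using only the Python standard library. The endpoint paths and response fields (`fork`, `language`, `pushed_at`, `stargazers_count`) are GitHub's public REST API; the helper names and the slimmed-down dict shape are assumptions made for illustration.

```python
import json
import urllib.request
from typing import Any, Optional

API = "https://api.github.com"


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request; passing a token raises the hourly rate limit."""
    req = urllib.request.Request(url)
    req.add_header("Accept", "application/vnd.github+json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


def repos_url(handle: str, page: int = 1, per_page: int = 100) -> str:
    """URL for one page of a user's public repositories."""
    return f"{API}/users/{handle}/repos?per_page={per_page}&page={page}"


def languages_url(owner: str, repo: str) -> str:
    """URL for the per-language byte breakdown of one repository."""
    return f"{API}/repos/{owner}/{repo}/languages"


def fetch_json(url: str, token: Optional[str] = None) -> Any:
    """Perform the request and decode the JSON body (network call)."""
    with urllib.request.urlopen(build_request(url, token), timeout=30) as resp:
        return json.load(resp)


def fetch_repos(handle: str, token: Optional[str] = None) -> list[dict[str, Any]]:
    """Page through the repos endpoint until a short page signals the end."""
    repos: list[dict[str, Any]] = []
    page = 1
    while True:
        batch = fetch_json(repos_url(handle, page), token)
        repos.extend(batch)
        if len(batch) < 100:
            return repos
        page += 1


def slim(repo: dict[str, Any]) -> dict[str, Any]:
    """Keep only the fields this guide reasons about."""
    return {
        "name": repo["name"],
        "language": repo.get("language"),
        "fork": repo.get("fork", False),
        "pushed_at": repo.get("pushed_at"),
        "stars": repo.get("stargazers_count", 0),
    }
```

A typical use would be `repos = [slim(r) for r in fetch_repos(handle, token)]`, followed by one `fetch_json(languages_url(handle, r["name"]), token)` call per repository you care about.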

The merged candidate record

You now merge two sources into one record. The people API supplies who the person is, and GitHub supplies what they build.

| Comes from the people API | Comes from GitHub's API |
| --- | --- |
| Verified person and confidence_score | Per-repository list |
| Employer, location, contact | Languages by bytes |
| public_repo_count, org_memberships | pushed_at, stars, fork flag |

That merged object is the structured record an agency builder told us was the main job, matching a thin public profile against the candidate's real tech stack so the agent can see the repositories and languages and reference them in outreach.
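One way to hold the merged record is a small dataclass. This is a sketch under stated assumptions: the `CandidateRecord` class and `merge_record` function are hypothetical names, the people-API side is read from the field names listed earlier in this guide, and contact details are left out because their field name is not given here.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class CandidateRecord:
    handle: str
    # Who the person is (people API side).
    person_id: Optional[int] = None
    confidence_score: Optional[float] = None
    name: Optional[str] = None
    employer: Optional[str] = None
    location: Optional[str] = None
    public_repo_count: Optional[int] = None
    org_memberships: list[str] = field(default_factory=list)
    # What they build (GitHub API side).
    repos: list[dict[str, Any]] = field(default_factory=list)
    languages_by_bytes: dict[str, int] = field(default_factory=dict)


def merge_record(
    handle: str,
    person_response: dict[str, Any],
    repos: list[dict[str, Any]],
    languages_by_bytes: dict[str, int],
) -> CandidateRecord:
    """Combine one people-API response with GitHub data for the same handle."""
    profiles = person_response.get("dev_platform_profiles") or []
    # Take the best-scoring profile; fall back to an empty dict if none exist.
    profile = max(profiles, key=lambda p: p.get("confidence_score") or 0.0, default={})
    return CandidateRecord(
        handle=handle,
        person_id=person_response.get("crustdata_person_id") or None,
        confidence_score=profile.get("confidence_score"),
        name=profile.get("name"),
        employer=profile.get("company_text"),
        location=profile.get("location"),
        public_repo_count=profile.get("public_repo_count"),
        org_memberships=list(profile.get("org_memberships") or []),
        repos=list(repos),
        languages_by_bytes=dict(languages_by_bytes),
    )
```

Calling `asdict(record)` gives a plain dict that is easy to serialize or hand to an outreach agent.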

How do you tell real skill from a profile that just looks active

The contribution graph is the signal people trust least. A busy wall of green can be almost entirely forks, tutorial repositories, and the same standard datasets everyone trains on. The signals that hold up are the ones you can read from the fields you just pulled, so you threshold on them instead of eyeballing a profile.

Original work versus forks

The fork flag separates a repository the person started from one they copied. Drop the forks and what remains is the work they chose to write. A profile that collapses to almost nothing once forks are removed is telling you something useful.
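A minimal sketch of that filter, assuming each repository is a dict with the boolean `fork` field that GitHub's repos endpoint returns. The function names are illustrative.

```python
from typing import Any


def split_forks(repos: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Separate repositories the person started from ones they forked."""
    original = [r for r in repos if not r.get("fork")]
    forks = [r for r in repos if r.get("fork")]
    return original, forks


def original_share(repos: list[dict[str, Any]]) -> float:
    """Fraction of public repositories that are not forks (0.0 if none)."""
    if not repos:
        return 0.0
    original, _ = split_forks(repos)
    return len(original) / len(repos)


# Illustrative input: one original repository among three forks.
repos = [
    {"name": "my-tool", "fork": False},
    {"name": "tutorial-copy", "fork": True},
    {"name": "awesome-list", "fork": True},
    {"name": "course-starter", "fork": True},
]
print(original_share(repos))
```

A low share is a prompt to look closer rather than a verdict, since some people fork heavily in order to contribute upstream.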

Languages by actual bytes

The byte breakdown from the languages endpoint shows the real weight of each language, where the primary-language label often hides it. An account that reads as a Python developer by repository count can turn out to be mostly Rust by bytes, and that changes who you are talking to.
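To make the difference concrete, here is a sketch that sums the per-repository byte maps (the shape GitHub's languages endpoint returns, for example `{"Python": 1200}`) and converts them to shares. The function name and sample numbers are made up for illustration.

```python
from collections import Counter


def language_shares(per_repo_bytes: list[dict[str, int]]) -> dict[str, float]:
    """Sum bytes per language across repositories and return shares, largest first."""
    totals: Counter = Counter()
    for breakdown in per_repo_bytes:
        totals.update(breakdown)  # adds byte counts language by language
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: count / grand_total for lang, count in totals.most_common()}


# Illustrative data: two small Python repositories and one large Rust one.
per_repo = [
    {"Python": 1000},
    {"Python": 2000},
    {"Rust": 9000},
]
shares = language_shares(per_repo)
print(shares)
```

By repository count this account looks like two-thirds Python, but by bytes Rust carries three quarters of the code.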

Recent versus historical activity

The pushed_at timestamp tells you whether the work is live or years old, which matters more now that a single polished repository is easy to fake. What holds up is sustained activity across many months rather than one impressive moment. The agency lead we spoke with wanted exactly this, to know what the person is working on recently rather than what they shipped three jobs ago.
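A rough sketch of a recency check on `pushed_at`. Note that `pushed_at` records only the most recent push to each repository, so this is a coarse proxy for sustained activity rather than a full commit history. The 180-day window and the function names are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def parse_pushed_at(value: str) -> datetime:
    """Parse GitHub's ISO 8601 timestamp, e.g. 2025-01-15T10:00:00Z."""
    # Replace the trailing Z so fromisoformat works on older Python versions.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def recently_pushed(
    repos: list[dict[str, Any]], now: datetime, window_days: int = 180
) -> list[dict[str, Any]]:
    """Repositories whose last push falls inside the window ending at `now`."""
    cutoff = now - timedelta(days=window_days)
    return [
        r for r in repos
        if r.get("pushed_at") and parse_pushed_at(r["pushed_at"]) >= cutoff
    ]


def active_months(repos: list[dict[str, Any]]) -> set[str]:
    """Distinct YYYY-MM months in which some repository was last pushed."""
    return {
        parse_pushed_at(r["pushed_at"]).strftime("%Y-%m")
        for r in repos
        if r.get("pushed_at")
    }
```

In practice you would pass `datetime.now(timezone.utc)` as `now`, and run this on the fork-free list so that a freshly synced fork does not read as recent work.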

How do you enrich a whole list of candidates at once

A UI handles one profile at a time, which is why you move this into code once you have a list to get through. In code you can run every handle through the same resolution in one pass. There are two ways to wire it up, and both reach the same data.

The MCP path

A Claude Code agent with Crustdata's MCP server configured can call the enrichment for you, so you describe the job in plain language and let the agent resolve each handle, pull the GitHub activity, and assemble the records. It is the fastest way to prototype before you commit engineering time, and it suits a recruiter who would rather not write the orchestration.

The direct-API path

If you want full control you call the API yourself and loop your list through the same resolution and activity steps described above. Batch person enrichment accepts many profiles in a single request, so the resolution scales without one call per candidate. One thing to plan for is GitHub's own rate limit on the activity step, which is 60 requests an hour without a token and 5,000 with one, per GitHub's REST API rate-limit docs. For a list in the thousands, use an authenticated token and budget the repository and language calls accordingly.
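The budgeting can be sketched as simple arithmetic. The hourly limits are the ones quoted above; the call model (one repos request per 100 repositories plus one languages request per repository) is an upper-bound assumption, and the batch size passed to `chunked` is a placeholder, since the batch endpoint's actual limit is not stated here.

```python
import math
from typing import Sequence, TypeVar

T = TypeVar("T")

UNAUTH_PER_HOUR = 60    # GitHub REST limit without a token
AUTH_PER_HOUR = 5000    # GitHub REST limit with a token


def github_calls_for(public_repo_count: int, per_page: int = 100) -> int:
    """Upper bound on GitHub calls for one handle: repo pages plus one languages call per repo."""
    pages = max(1, math.ceil(public_repo_count / per_page))
    return pages + public_repo_count


def hours_needed(repo_counts: Sequence[int], authenticated: bool = True) -> float:
    """Hours of rate-limit budget a whole list consumes."""
    limit = AUTH_PER_HOUR if authenticated else UNAUTH_PER_HOUR
    return sum(github_calls_for(n) for n in repo_counts) / limit


def chunked(items: Sequence[T], size: int) -> list[list[T]]:
    """Split a list of handles into batches; `size` is a placeholder value."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


# 100 candidates with 49 public repositories each: 50 calls per handle.
print(hours_needed([49] * 100))
print(hours_needed([49] * 100, authenticated=False))
```

The `public_repo_count` field from the people API lets you estimate this budget before making a single GitHub call, and skipping the languages call for forks brings the real number down.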

What about candidates with an empty, private, or mismatched GitHub

An empty GitHub usually just means the person keeps their best work in private repositories, or never bothered maintaining a public profile, so a thin account should not count against anyone. The skepticism cuts the other way too, and we cover the full picture in the sparse-profile guide.

The case to watch is the confident-looking wrong match. A same-name collision can resolve to a plausible person who turns out to be someone else entirely. When the confidence score sits in the middle, disambiguate with a second signal before you act, such as a declared handle that matches, a shared employer, or activity that lines up with where the candidate actually works. A low score is a reason to stop and check rather than a green light.
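A second-signal check might look like the sketch below. It assumes `declared_handles` is a list of URL or handle strings and that `company_text` is free text such as "@acme"; the exact shapes may differ in the real response. The `known` dict of facts you already hold about the candidate is a hypothetical structure.

```python
from typing import Any


def _norm(value: str) -> str:
    """Normalize a URL or handle for loose comparison."""
    return value.strip().lower().rstrip("/")


def second_signals(profile: dict[str, Any], known: dict[str, Any]) -> list[str]:
    """List which independent signals agree with what you already know."""
    signals = []
    # Signal 1: the account owner declared a URL or handle you already hold.
    declared = {_norm(h) for h in profile.get("declared_handles") or []}
    known_urls = {_norm(u) for u in known.get("urls") or []}
    if declared & known_urls:
        signals.append("declared_handle")
    # Signal 2: the declared employer mentions the employer you expect.
    employer = (known.get("employer") or "").strip().lower()
    company = (profile.get("company_text") or "").strip().lower()
    if employer and employer in company:
        signals.append("shared_employer")
    return signals


# Illustrative values only.
profile = {"declared_handles": ["https://example.dev/"], "company_text": "@Acme Corp"}
known = {"urls": ["https://example.dev"], "employer": "Acme"}
print(second_signals(profile, known))
```

An empty list on a middle-band score is the cue to hold outreach until a human has checked the match.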

Sometimes the handle is the only identifier you will ever get, because the person has no usable professional profile at all. In that case the GitHub account is your anchor, and the rest of their footprint lives across papers, talks, and posts that you reach a different way, which we cover in our guide on enriching candidates with a web search API. Crustdata is the data layer for recruiting teams building their own enrichment, and GitHub is one strong input you merge with the others rather than a verdict on its own.

Conclusion

The loop is short once it runs in code. You start with a GitHub handle, resolve it to a verified person with a confidence score, pull their repositories and languages and recent activity, and end with a structured record you can rank and reach out from. Done across a list, it replaces the manual profile-by-profile review that does not scale and the double-messaging that comes from never being sure who you are talking to.

A few practices carry over to any list. Trust the confidence score, and confirm the middle band with a second signal. Read languages by bytes and drop forks before you judge activity. Treat an empty GitHub as missing data rather than a mark against the candidate. Come and see what the resolved records look like across your own list of handles. Book a demo, or start free with 100 credits at crustdata.com.

Frequently asked questions

Do recruiters even look at GitHub, or is it just an accessory on a resume?

Recruiters rarely discover candidates on GitHub, and the forums are right that it is overrated as a search channel. It earns its place later, as a qualifier on a candidate you already have, where the repositories and languages confirm what the person actually builds and give you something specific to say in outreach.

How do I find someone's GitHub handle and be sure it is actually theirs?

Resolve the handle through a people API that returns the linked person and a confidence score. A score above roughly 0.9 is safe to act on, and a middle score should be confirmed with a second signal like a declared handle or a shared employer before you reach out.

How do I tell real skill from a profile that just looks active?

Read the fields rather than the green squares. Drop forked repositories, weigh languages by bytes rather than the primary-language label, and check pushed_at for recent work. A single polished repository means little now that it is easy to fake, while sustained original activity over time is the signal that holds up.

What about strong engineers with an empty or private GitHub?

Treat it as missing data. Many strong engineers keep their work in private repositories or skip a public profile entirely, so a thin account tells you little on its own. Merge GitHub with other signals, and when the handle is the only identifier you have, build the rest of the record from their wider footprint.

Can I enrich a whole list of candidates programmatically instead of one by one?

Yes, and that is the reason to move off the UI. Run your handles through batch enrichment to resolve each person, then pull GitHub activity per handle with an authenticated token to stay inside the rate limit. For the rubric that turns these signals into a ranked shortlist, see our piece on candidate enrichment APIs.
