How to Build a Portfolio Monitoring System That Watches Between Quarterly Reviews

Portfolio monitoring is a capture problem, not a dashboard problem. Learn how to build a real-time signal layer that tracks exec changes, hiring velocity, and founder posts across your portfolio.

Published: May 10, 2026
Written by: Abhilash Chowdhary
Reviewed by: Chris Pisarski
Read time: 7 minutes

Every quarter, associates at VC and PE firms manually collect data from portfolio companies before the investment committee meeting. Between those meetings, the portfolio moves, and the fund has no way of knowing until the next review. Portfolio monitoring is a capture problem. The dashboard is fine. Nothing is watching the portfolio between meetings, and that gap is where signal gets lost.

This article walks through how to build a real-time signal capture layer underneath whatever dashboard or LP reporting tool your fund already uses. The architecture works for a four-person generalist fund tracking 30 portfolio companies and for an ops team watching 25,000 companies across venture and public equities. If you have a portfolio list and a Slack channel, you can have the first version running today.

Why monitoring companies is a fundamental data capture problem

Most portfolio monitoring platforms focus on collecting financial KPIs from founders and displaying them in a dashboard for LP reporting. That dashboard already exists at most funds, because LP reporting requires a specific layout and quarterly cadence. But a lot can change inside a quarter, and both the fund and its LPs need to hear about those changes well before the next review, not three months after the fact.

A head of research at a late-stage firm described it directly: "We're spending a ton of our time automating workflows and trying to find the right sources for continuous monitoring." The pattern across teams we spoke with was consistent: a portfolio company hires a key exec, raises a down round, or posts a product milestone, and the fund learns about it from the founder weeks later. The dashboard was not the problem. The problem was that nobody, and nothing, was watching the portfolio between meetings.

According to the Data Driven VC Landscape 2025 report, 65% of data-driven VC firms rely on internal tools for the majority of their desk work. The reason is that off-the-shelf platforms solve the display problem (dashboards, LP reports, quarterly formatting) while not entirely solving the data capture problem. When you have 25,000 companies in your funnel and six people on the investment team, manual data capture is not practically scalable. One team monitoring over 8,000 companies reported that data gathering consumed months rather than operating in real time, with spreadsheet-based tracking creating integrity and timeliness issues that compounded each quarter.

Today, that quarterly manual data collection is the capture system, and the build described below replaces it with an automated layer that fires signals the moment something changes. You should not replace the dashboard, because LP reporting already expects a specific layout. What you build underneath it is a live-data capture layer that feeds the dashboard continuously instead of once a quarter.

Six signal layers a portfolio monitoring system needs

The financial KPIs your LP reporting already collects (IRR, MOIC, TVPI, burn rate) are internal metrics that come from the founder. The signals below are external, meaning your fund can track them without asking the founder for anything. These are the six layers that came up most consistently across conversations with teams building portfolio monitoring internally.

People changes. Executive departures, key hires, title changes, and promotions across the portfolio. Every team we spoke with listed this as their top signal. One fund described learning about a VP Engineering departure from a social post two weeks after it happened, while their monitoring platform showed no change at all. Many of these teams originally tracked people changes through custom scrapers, but as one head of technology described, scraping has become "increasingly difficult" and "makes the system really fragile." People search and enrichment APIs replace that fragile scraper layer by resolving profile-level changes through structured API calls.

Hiring velocity by department. A sudden spike in engineering hires often signals a product push or a funding round that has not been announced yet. A drop in sales hiring at a portfolio company can signal a pivot or a cash-flow problem. Several teams we spoke with use department-level hiring data as a revenue proxy when the founder has not shared numbers yet. One fund tracks department-level headcount as a time series, watching when a portfolio company starts a sales department and how the engineering organization grows over each quarter.
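
As a sketch of how that time series becomes a signal, the function below diffs two headcount-by-function snapshots and flags departments whose growth crosses a threshold. The snapshot shape (a flat function-to-count mapping) is an assumption about the enrichment payload, not a documented schema:

# Hypothetical snapshot shape: {"engineering": 42, "sales": 11, ...}
def hiring_velocity_flags(prev: dict, curr: dict,
                          threshold: float = 0.15) -> list[str]:
    flags = []
    for dept, now in curr.items():
        before = prev.get(dept, 0)
        if before == 0 and now > 0:
            flags.append(f"{dept}: new department ({now} people)")
        elif before and (now - before) / before >= threshold:
            growth = (now - before) / before
            flags.append(f"{dept}: +{now - before} ({growth:.0%} in one period)")
    return flags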

Founder and company social posts. This was the most commonly cited unmet need. Founders post about product launches, team changes, fundraising milestones, and cultural shifts on social platforms before they send an update to investors. Tracking these posts across a portfolio gives the fund a continuous narrative feed without waiting for the quarterly email.

Web traffic time series. Monthly website traffic serves as a growth proxy that does not depend on the founder sharing revenue numbers. Several teams we spoke with track web traffic alongside headcount as a paired signal, because traffic trending up while headcount stays flat can indicate efficient growth, while traffic dropping alongside a hiring freeze is a different conversation entirely.
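
A minimal sketch of that pairing logic, assuming you store monthly traffic and headcount as simple per-company time series with at least three entries each:

def paired_growth_signal(traffic: list[int], headcount: list[int]) -> str:
    """Compare the direction of both series over the last three periods."""
    traffic_up = traffic[-1] > traffic[-3]
    hiring_up = headcount[-1] > headcount[-3]
    if traffic_up and not hiring_up:
        return "efficient-growth"  # demand rising on flat headcount
    if traffic_up and hiring_up:
        return "scaling"
    if not traffic_up and not hiring_up:
        return "review"            # traffic falling alongside a hiring freeze
    return "watch"                 # hiring up while traffic is flat or down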

Website content and marketing activity. Whether a portfolio company is actively updating its website, publishing blog posts, and refreshing its messaging is a health signal that surfaces problems early. One growth-stage fund described monitoring portfolio company marketing sites weekly: "If they're not updating anything on their marketing site for days, weeks, months, that gives you a different signal than if they are constantly updating." A portfolio company that goes quiet on its own website is often dealing with an internal problem the fund will hear about at the next quarterly check-in.

Open source and community traction. For portfolio companies with developer-facing products, GitHub repository activity and community growth on Discord or similar platforms provide a real-time product traction signal. One fund tracks day-over-day and week-over-week growth in Discord community size alongside open source repository activity, because these leading indicators move before revenue does.
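
For the open source half of that signal, GitHub's public REST API exposes weekly commit counts directly; the sketch below computes week-over-week change for a hypothetical portfolio repository (Discord member counts require a bot inside the server, so they are omitted here):

import requests

def repo_weekly_commits(owner: str, repo: str) -> tuple[int, int]:
    """Return (last week, prior week) commit counts from GitHub's
    participation stats. GitHub may answer 202 while it computes
    the stats; retry after a short delay in that case."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/stats/participation"
    )
    resp.raise_for_status()
    weekly = resp.json()["all"]  # 52 weekly commit totals, oldest first
    return weekly[-1], weekly[-2]

this_week, last_week = repo_weekly_commits("portfolio-org", "product-repo")
if last_week and (this_week - last_week) / last_week >= 0.5:
    print("Open source activity spiking week over week")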

The four-layer architecture

A portfolio monitoring system has four layers that connect in sequence: a data layer that provides real-time company and people records, an entity resolution layer that maps portfolio identifiers across providers, a watchlist-and-watcher layer that subscribes to changes, and a routing layer that pushes signals into surfaces the partners already use.

Layer 1: Data layer. The data layer provides up-to-date company and people records on demand. When you first set up monitoring, you pull the baseline state for every portfolio company so you have a reference point for future changes. Here is how that looks using the Company Enrichment API:

import requests

CRUSTDATA_API_TOKEN = "your_api_token"
HEADERS = {"Authorization": f"Token {CRUSTDATA_API_TOKEN}"}

portfolio_domains = [
    "companya.com", "companyb.com", "companyc.io"
]

def get_portfolio_baseline(domains: list[str]) -> list[dict]:
    """Pull a baseline snapshot for each portfolio company."""
    results = []
    for domain in domains:
        # One enrichment call per domain, requesting only the fields
        # the monitoring layer will diff against later.
        resp = requests.post(
            "https://api.crustdata.com/screener/company/enrich",
            headers=HEADERS,
            json={"domain": domain, "fields": [
                "company_name", "headcount",
                "headcount_by_function", "total_open_jobs",
                "last_funding_round_type", "founders", "cxos"
            ]}
        )
        if resp.ok:
            results.append(resp.json())
    return results

baseline = get_portfolio_baseline(portfolio_domains)

If your team uses Claude Code with the Crustdata MCP server, the same baseline pull is a single natural-language prompt: "Enrich these 30 portfolio company domains and save the results as a CSV." No Python code required.

Layer 2: Entity resolution. Your portfolio list in the CRM spells company names one way, your data provider spells them another, and your LP reporting tool uses a third variation. The entity resolution layer maps these into canonical identifiers so every downstream system tracks the same company. This is covered in detail in the next section, because it is where most internal builds quietly break.

Layer 3: Watchlist and Watchers. A watchlist is a saved set of portfolio company IDs and key-person IDs. Watchers are webhook subscriptions attached to those IDs that fire when a tracked change occurs. The watcher layer replaces polling with push-based delivery, so your system reacts to changes instead of checking for them on a schedule.

Layer 4: Routing. When a Watcher fires, the webhook payload routes into whatever surfaces the partners already read. For most funds this means three outputs running in parallel: a Slack message in a dedicated portfolio-signals channel, adding a row into the existing dashboard's data (so LP reports pull from a live feed instead of manual entry), and an email digest that batches the previous day's signals into a summary for the weekly updates. The routing layer is the simplest part of the build, because the output format depends entirely on your fund's existing tools. When a signal fires, the webhook handler can also trigger a fresh company enrichment or people enrichment call on the affected record, so the partner reads up-to-date context instead of whatever was cached last quarter.
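
As a sketch of that re-enrichment step, reusing the enrichment endpoint and HEADERS from Layer 1 above; the company_domain payload field is an assumption about what the Watcher event carries:

def refresh_on_signal(event: dict) -> dict | None:
    """Re-enrich the affected company so the alert carries fresh context."""
    domain = event.get("company_domain")  # assumed payload field name
    if not domain:
        return None
    resp = requests.post(
        "https://api.crustdata.com/screener/company/enrich",
        headers=HEADERS,
        json={"domain": domain, "fields": ["headcount", "total_open_jobs"]},
    )
    return resp.json() if resp.ok else None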

Setting up Watchers for the portfolio

Watchers turn the polling model into a push model. Instead of running a daily cron job to check whether anything changed across 200 portfolio companies, you create Watcher subscriptions that deliver a webhook payload only when something does change.

The Watcher API supports several event types relevant to portfolio monitoring. For watching specific companies and people in your portfolio, the most useful are:

  • company-watch-linkedin-posts watches a company's social posts, filtered by domain

  • company-watch-press-mentions fires when a company appears in the news

  • company-watch-linkedin-job-postings tracks new job postings at a company

  • linkedin-person-profile-updates tracks job changes, promotions, and profile updates for specific people

  • linkedin-person-post-updates tracks social posts from specific people

The slugs reference specific platforms because the API resolves data from those sources, but your fund receives clean webhook payloads regardless of where the data originates.

You create a separate Watcher for each event type. Each Watcher takes an event type slug, filters that specify which company or person to watch, and a webhook endpoint where payloads get delivered.

Here is how to create Watchers for company social posts and key-person profile changes across a portfolio:

WATCHER_API = "https://api.crustdata.com/watcher/watches"
WEBHOOK_URL = "https://your-fund.com/webhooks/portfolio"

def create_portfolio_watchers(
    company_domains: list[str],
    person_profile_urls: list[str]
):
    watchers = []

    # Watch each portfolio company's social posts (by domain)
    for domain in company_domains:
        resp = requests.post(
            WATCHER_API,
            headers=HEADERS,
            json={
                "event_type_slug": "company-watch-linkedin-posts",
                "event_filters": [
                    {"filter_type": "COMPANY_DOMAIN",
                     "type": "in",
                     "value": [domain]}
                ],
                "notification_endpoint": WEBHOOK_URL,
                "frequency": 1,
                "expiration_date": "2027-01-01"
            }
        )
        if resp.ok:
            watchers.append(resp.json())

    # Watch key people (founders, C-suite) for job changes
    # and promotions (by profile URL)
    resp = requests.post(
        WATCHER_API,
        headers=HEADERS,
        json={
            "event_type_slug": "linkedin-person-profile-updates",
            "event_filters": [
                {"filter_type": "LINKEDIN_PROFILE_URL",
                 "type": "in",
                 "value": person_profile_urls},
                {"filter_type": "FIELDS_TO_TRACK",
                 "type": "in",
                 "value": ["employer_change", "headline"]}
            ],
            "notification_endpoint": WEBHOOK_URL,
            "frequency": 1,
            "expiration_date": "2027-01-01"
        }
    )
    if resp.ok:
        watchers.append(resp.json())

    return watchers

When a Watcher fires, it sends a webhook payload to your endpoint as a POST request. For person profile updates, the payload includes a changes object showing exactly what changed (new positions, removed positions, title updates). For company post watchers, the payload includes the full post text, reactions, and share URL. Your receiver processes the payload and routes it to whatever surfaces the team already reads:

# A minimal receiver sketch, assuming a Flask app; any web framework works.
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/portfolio", methods=["POST"])
def handle_webhook():
    event = request.get_json()
    route_to_slack(event)         # routing layer, defined by your fund
    write_to_dashboard_db(event)  # routing layer, defined by your fund
    return {"status": "ok"}, 200

Before waiting for real-world events, you can use the simulation endpoint (/watcher/simulation/watches) to send test payloads to your webhook receiver instantly. This means you validate the full pipeline on day one, without waiting for an actual exec departure to confirm things work.
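
A quick way to exercise it, with the caveat that the request body shown here is an assumption to check against the API reference (only the endpoint path comes from the docs above):

resp = requests.post(
    "https://api.crustdata.com/watcher/simulation/watches",
    headers=HEADERS,
    json={
        "event_type_slug": "linkedin-person-profile-updates",
        "notification_endpoint": WEBHOOK_URL,
    },
)
print(resp.status_code)  # your webhook receiver should log a test payload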

Entity resolution: the common bottleneck

Entity resolution is where internal portfolio monitoring tools quietly break down within a quarter. The problem: you pull a company list from your CRM, you pull enriched people records from a real-time API, you join on company name, and your match rate is 64%. The other 36% are not missing. They are the same companies spelled differently across providers.

A data engineer at a multi-stage fund described the core problem: "No good unique identifiers to be able to join these data sets." A three-person fund building their own monitoring system hit the same wall, with no canonical IDs for investors across their deal flow sources. Without canonical IDs, every new data source they added required someone to manually review and match records by hand, and at their volume the backlog grew faster than they could clear it.

The resolver pattern that works has three tiers with a human fallback:

import re

from rapidfuzz import fuzz

def normalize(name: str) -> str:
    """Lowercase, strip punctuation and common legal suffixes before matching."""
    name = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    name = re.sub(r"\b(inc|llc|ltd|corp|co)\b", "", name)
    return " ".join(name.split())

def resolve_company(raw: dict, canonical_db) -> dict:
    # Tier 1: Deterministic match on domain
    if raw.get("domain"):
        hit = canonical_db.lookup_by_domain(raw["domain"])
        if hit:
            return {"match": hit, "confidence": 1.0, "method": "domain"}

    # Tier 2: Deterministic match on profile URL
    if raw.get("profile_url"):
        hit = canonical_db.lookup_by_profile(raw["profile_url"])
        if hit:
            return {"match": hit, "confidence": 1.0, "method": "profile"}

    # Tier 3: Fuzzy match on normalized name (add geography as a
    # tiebreaker at scale), scoring each candidate exactly once
    target = normalize(raw["name"])
    scored = [
        (fuzz.ratio(target, normalize(c["name"])), c)
        for c in canonical_db.search(target)
    ]
    if scored:
        score, best = max(scored, key=lambda pair: pair[0])
        if score >= 92:
            return {"match": best, "confidence": score / 100, "method": "fuzzy"}

    # Below threshold: queue for human review instead of auto-merging
    return {"match": None, "confidence": 0.0, "method": "review_queue"}

Start with the strongest canonical identifier available. For companies, the domain and profile URL are both deterministic. For people, the profile slug is the strongest single key. Where deterministic keys are missing, fuzzy matching on normalized name plus geography works, but anything below a confidence threshold should go to a review queue rather than auto-merging. The Company Identification API resolves a company from a name, website, or profile URL into a canonical ID, which eliminates much of the fuzzy matching for companies already in the database.
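
A usage sketch for a single CRM row, where canonical_db and review_queue are the stand-ins from the resolver above and the id field on the matched record is an assumed shape:

row = {"name": "Company A, Inc.", "domain": None, "profile_url": None}
result = resolve_company(row, canonical_db)
if result["method"] == "review_queue":
    review_queue.append(row)  # a human confirms or rejects the match later
else:
    watchlist_ids.add(result["match"]["id"])  # "id" is an assumed field name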

The cost of skipping entity resolution is that your monitoring system shows "no changes" for companies it cannot match, while the real records are accumulating changes under a slightly different name.

Getting started: four concrete first steps

Step 1: Export your portfolio list with canonical identifiers.

Pull your portfolio company list from whatever system holds it (CRM, spreadsheet, deal management tool) and include at least two identifiers per company: the domain and the social profile URL. These are the keys the entity resolver and the Watcher layer need to match records correctly. If you only have company names, run them through the Company Identification API first to get canonical IDs. The output of this step is a clean CSV with one row per portfolio company, where each row has a company name, domain, profile URL, and (optionally) a canonical company ID from your data provider. Teams that skip this step and start with names only will spend their first week debugging entity resolution mismatches instead of receiving signals.
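
A sketch of that output, pairing each input domain with its enrichment record from the baseline pull earlier; it assumes every enrichment call succeeded, and the linkedin_profile_url response field is an assumption to verify:

import csv

baseline = get_portfolio_baseline(portfolio_domains)
with open("portfolio_baseline.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["company_name", "domain", "profile_url"])
    for domain, record in zip(portfolio_domains, baseline):
        writer.writerow([
            record.get("company_name"),
            domain,
            record.get("linkedin_profile_url"),  # assumed response field
        ])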

Step 2: Create Watchers for every portfolio company and key-person profile.

Start with two Watcher types: company social post tracking for every portfolio company domain, and person profile update tracking for founders and C-suite executives. These cover the signals that teams building portfolio monitoring internally described as the highest value. For people Watchers, you need the profile URLs of the specific founders, C-suite executives, and key hires you want to track at each portfolio company. Use the People Search API filtered by company to pull the leadership list, then create a Watcher subscription for each. You can add press mention and job posting Watchers later without changing the architecture, because each new event type is just an additional subscription on the same webhook endpoint.
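
A hedged sketch of that leadership pull: the endpoint path and filter names below are assumptions modeled on the company filters shown earlier, so confirm them against the People Search API reference before relying on this:

resp = requests.post(
    "https://api.crustdata.com/screener/person/search",  # assumed path
    headers=HEADERS,
    json={"filters": [
        {"filter_type": "CURRENT_COMPANY", "type": "in",
         "value": ["companya.com"]},
        {"filter_type": "CURRENT_TITLE", "type": "in",
         "value": ["CEO", "CTO", "CFO", "Founder"]},
    ]},
)
profiles = resp.json().get("profiles", [])  # assumed response key
leadership_urls = [p.get("linkedin_profile_url") for p in profiles]
create_portfolio_watchers([], leadership_urls)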

Step 3: Point webhook delivery at one Slack channel the partners already read.

Do not build a custom dashboard for the output. Route Watcher events to a Slack channel that already gets read, so the signal shows up where attention already is. Format the Slack message to include the company name, the signal type, and a one-line summary of what changed, so the partner reading it can decide in five seconds whether to dig deeper. If your fund uses email-based workflows, add a digest that batches Watcher events from the previous 24 hours and sends them before the Monday partner meeting as a second route.
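
A minimal sketch of that Slack route using a standard incoming webhook; the event field names are assumptions about the Watcher payload:

SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"

def route_to_slack(event: dict) -> None:
    """Post a one-line summary a partner can triage in five seconds."""
    text = (
        f"*{event.get('company_name', 'Unknown company')}* | "
        f"{event.get('event_type', 'signal')} | "
        f"{event.get('summary', 'see payload for details')}"
    )
    requests.post(SLACK_WEBHOOK, json={"text": text})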

Step 4: Shadow-run alongside your current process for two weeks, then compare.

Keep the manual check-in process running for two weeks while the Watchers deliver in parallel. At the end of two weeks, compare what each caught. The Watchers will surface signals the manual process missed entirely: social posts about product launches mid-week, departures that happened between check-ins, hiring surges that started and plateaued within a single quarter. The comparison gives the team confidence to retire the manual process, because the evidence is specific and timestamped.
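
A sketch of that comparison, assuming both processes log (company, signal type) pairs during the shadow run:

def coverage_gap(watcher_events: list[dict],
                 manual_notes: list[dict]) -> list[dict]:
    """Signals the watchers caught that the manual process missed."""
    noted = {(n["company"], n["signal_type"]) for n in manual_notes}
    return [
        e for e in watcher_events
        if (e["company"], e["signal_type"]) not in noted
    ]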

To start building with the architecture described above, create an account with 100 free credits or book a walkthrough with the team.

Conclusion

The dashboard stays. LP reporting keeps working exactly as it does today. What changes is what feeds it. Instead of associates collecting data manually every quarter, a signal capture layer watches the portfolio continuously and routes changes to the surfaces your team already reads. The same architecture scales from 30 portfolio companies to 25,000 companies across venture and public equities, because the Watcher layer treats every company as a webhook subscription rather than a manual check-in.

This portfolio monitoring build shares the same data and entity resolution layers described in the broader guide to building deal sourcing, founder discovery, and portfolio monitoring tools. If your fund is also building sourcing or founder discovery, the infrastructure you build here feeds directly into those workflows without duplicating the data layer or the resolver.
