How LPs Collect Look-Through Data in Private Equity


By now, most Limited Partners (LPs) recognize that look-through analysis is a powerful tool for both primary and secondary investments. It helps track where capital is deployed, understand value creation, manage concentration risk, and validate NAVs. But there’s a reason many teams still treat it like a “nice to have” instead of a core part of portfolio monitoring: the data is hard to get.

Not because General Partners (GPs) are unwilling to share it—but because the way it’s shared is fragmented, inconsistent, and unstructured. If you're not set up to handle that mess, even a moderately sized portfolio can become a data swamp.

In this post, we’ll break down:

  • Why collecting look-through data is harder than it sounds
  • What workflows LPs are using to do it today
  • How emerging tools—from OCR to private markets AI—compare across speed, accuracy, auditability, and scalability

Why Collecting Look-Through Data Is So Hard

At first glance, look-through reporting sounds straightforward: extract the investment schedule and company details from GP reports, normalize the data, and plug it into your systems. But in practice, LPs quickly run into three recurring sources of complexity:

 

1. The Volume Adds Up—Fast

Even a mid-sized portfolio with 30–50 fund commitments can generate over 200 documents per year. Each report may list dozens of underlying companies, meaning thousands of data points per quarter once you include fields like cost, value, IRR, sector, and geography. Multiply that by four reporting cycles a year, and even basic tracking becomes a heavy operational lift.

And that’s before you account for any downstream work—normalizing taxonomies, resolving entities, or reconciling differences across reports.
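The arithmetic behind that operational lift can be sketched quickly. Every figure below is an illustrative assumption drawn from the ranges above (a 40-fund portfolio, quarterly reporting, roughly 25 companies and 6 fields per report), not real data:

```python
# Back-of-envelope sizing of the look-through data problem.
# All numbers are illustrative assumptions for a mid-sized portfolio.
funds = 40                 # fund commitments (within the 30-50 range)
quarterly_reports = 4      # reports per fund per year
annual_financials = 1      # audited financials per fund per year
companies_per_report = 25  # underlying companies listed per report
fields_per_company = 6     # cost, value, IRR, sector, geography, ...

documents = funds * (quarterly_reports + annual_financials)
data_points_per_quarter = funds * companies_per_report * fields_per_company
data_points_per_year = data_points_per_quarter * quarterly_reports

print(documents)                # 200 documents a year, before ad-hoc notices
print(data_points_per_quarter)  # 6000 fields to capture every quarter
print(data_points_per_year)     # 24000 fields a year
```

Even under these conservative assumptions, a team is tracking tens of thousands of fields a year before any normalization or reconciliation begins.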

 

2. Disparate Sources and Templates Across Strategies and GPs

There’s no industry-standard format for look-through data. Reporting conventions vary widely by asset class, region, and manager. Some GPs provide detailed financials and narrative company write-ups; others stick to a bare-bones investment schedule. Key fields may appear in PDFs, Excel trackers, LPAC decks, or footnotes in financials—and not always in the same location each quarter.

The terminology itself can be inconsistent. “Revenue” might refer to GAAP revenue, bookings, or ARR. “Valuation” might reflect mark-to-market, an appraisal, or a DCF. Add in varying time periods (trailing twelve months, calendar year, fiscal year), and you get a patchwork that’s hard to align—even within a single report.

This variability makes automation difficult, slows down data quality checks, and forces teams to rely on manual interpretation just to get clean inputs.
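As a concrete illustration, taming that terminology usually starts with a reviewed mapping from GP-specific labels onto one canonical taxonomy. The labels and canonical field names below are hypothetical, and a real dictionary would be maintained per GP with an exception queue for anything unmapped:

```python
# Minimal sketch of KPI label normalization: map the many GP-specific
# labels onto one canonical taxonomy before loading data. This mapping
# is illustrative, not exhaustive.
CANONICAL = {
    "revenue": "revenue_gaap",
    "gaap revenue": "revenue_gaap",
    "bookings": "bookings",
    "arr": "revenue_arr",
    "ebitda": "ebitda",
    "adjusted ebitda": "ebitda_adjusted",
    "adj. ebitda": "ebitda_adjusted",
}

def normalize_label(raw: str) -> str:
    """Return a canonical field name, or flag the label for human review."""
    key = raw.strip().lower()
    return CANONICAL.get(key, f"UNMAPPED:{raw.strip()}")

print(normalize_label("Adjusted EBITDA"))  # -> ebitda_adjusted
print(normalize_label("ARR"))              # -> revenue_arr
print(normalize_label("Net Revenue"))      # -> UNMAPPED:Net Revenue
```

The `UNMAPPED:` flag is the important design choice: unknown labels surface for human review instead of being silently coerced into the wrong bucket.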

 

3. Inconsistency Over Time—Even From the Same GP

The problem isn’t just GP-to-GP—it’s quarter-to-quarter. Many managers revise their templates, rename KPIs, reorder columns, or relocate key sections between reports. What’s labeled “EBITDA” one quarter might become “Adjusted EBITDA” the next. Footnotes move. Company-level disclosures get condensed or expanded. 

Company names are another persistent challenge. A single asset might be reported as “Alpha Holdings LLC” in Q1, “Alpha, Inc.” in Q2, and “Alpha Technologies (Delaware)” in Q3. Add in restructurings, co-investments, and secondaries, and tracking exposure over time becomes a serious entity resolution exercise.
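A minimal sketch of what that entity-resolution exercise involves, using only Python's standard library. The suffix list, similarity cutoff, and master entries are all assumptions; production systems layer in identifiers, addresses, and human review of low-confidence matches:

```python
import difflib
import re

# Toy entity resolution: strip parentheticals and legal-suffix noise,
# then fuzzy-match the cleaned name against a security master.
SUFFIXES = re.compile(r"\b(llc|inc|ltd|corp|holdings|technologies)\b\.?")

def canon(name: str) -> str:
    s = re.sub(r"\([^)]*\)", " ", name.lower())  # drop "(Delaware)" etc.
    s = SUFFIXES.sub(" ", s)                     # drop legal-suffix noise
    s = re.sub(r"[^a-z0-9 ]+", " ", s)           # drop punctuation
    return " ".join(s.split())

def resolve(reported: str, master: list[str], cutoff: float = 0.8):
    """Return the best security-master match, or None for human review."""
    hits = difflib.get_close_matches(canon(reported), master, n=1, cutoff=cutoff)
    return hits[0] if hits else None

master = ["alpha", "beta industrial"]  # canonical (already-cleaned) names
for q in ["Alpha Holdings LLC", "Alpha, Inc.", "Alpha Technologies (Delaware)"]:
    print(q, "->", resolve(q, master))  # all three resolve to "alpha"
```

Even this toy version shows why the problem compounds: every new naming convention needs another rule, and anything below the cutoff still needs a human.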

 

How LPs Are Extracting Look-Through Data Today

Faced with messy PDFs, inconsistent formats, and mounting reporting volumes, LPs have adopted a range of approaches to collect and structure look-through data. These methods vary widely in accuracy, scalability, and operational burden—and many LPs are using some combination of them.

Here’s how most institutions are getting the job done today:

1. Manual Data Extraction

Many LPs still rely on manual workflows. Analysts review GP reports by hand, copy-paste data into spreadsheets, and tag fields like company name, sector, geography, and valuation manually. This approach offers full control and high accuracy, especially when reports are inconsistent or ambiguous.

But it’s also slow and resource-intensive. For portfolios with more than a few dozen funds, manual processing quickly becomes a bottleneck—making it hard to keep pace with quarterly reporting cycles or scale up exposure monitoring.

2. OCR Tools

Optical Character Recognition (OCR) software helps convert PDFs into machine-readable text or tables. This can be useful for extracting investment schedules from cleanly formatted reports. However, OCR often breaks when reports include footnotes, non-tabular formats, merged cells, or narrative sections.

It’s also highly sensitive to formatting changes—if a GP updates their template or shifts a column, the OCR logic may fail entirely, requiring manual rework. OCR can speed up raw extraction, but it typically needs significant post-processing before data is usable.
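To make that post-processing burden concrete, here is a sketch that parses rows out of OCR'd text for one hypothetical schedule layout. The column order and the two-or-more-spaces separator are assumptions, which is exactly why logic like this breaks the moment a GP shifts a column:

```python
import re

# Parse rows of a hypothetical "Company  Sector  Cost  Fair Value" layout
# out of raw OCR text. Columns are assumed to be separated by 2+ spaces.
ROW = re.compile(
    r"^(?P<company>[A-Za-z][\w .,&()-]*?)\s{2,}"
    r"(?P<sector>[A-Za-z][\w /-]*?)\s{2,}"
    r"(?P<cost>[\d,]+)\s{2,}(?P<fair_value>[\d,]+)\s*$"
)

def parse_schedule(ocr_text: str):
    rows = []
    for line in ocr_text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append({
                "company": m["company"].strip(),
                "sector": m["sector"].strip(),
                "cost": int(m["cost"].replace(",", "")),
                "fair_value": int(m["fair_value"].replace(",", "")),
            })
    return rows

sample = """Investment Schedule (unaudited)
Alpha Holdings LLC    Software      12,500,000    18,200,000
Beta Industrial       Manufacturing  8,000,000     7,400,000
"""
print(parse_schedule(sample))  # two structured rows; the header is skipped
```

A merged cell, a footnote marker inside a number, or a renamed column silently drops rows from output like this, so every extraction still needs a completeness check downstream.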

3. Public LLMs (e.g., ChatGPT)

Some LP teams are experimenting with general-purpose language models to extract or summarize data from unstructured reports. These tools are fast and flexible, and they interpret context better than OCR—especially for text-heavy documents.

However, they aren’t tuned for private markets workflows. Without auditability, teams must manually verify every field—checking for errors, hallucinations, or misinterpretations—which often negates the time savings that automation was supposed to deliver. In practice, using public LLMs without structure or oversight often trades one bottleneck for another.

4. Private Markets AI Solutions (e.g., Tamarix)

A growing number of LPs are turning to vertical AI platforms specifically trained on GP reporting formats and private markets language. These systems are designed to handle the full range of document types—quarterly reports, LPAC decks, Excel trackers, and financials—and extract structured, validated data at scale.

What sets these solutions apart is their ability to auto-detect key fields, normalize metrics, resolve entities, and deliver clean outputs that integrate directly into LP workflows. Human-in-the-loop review is built in, providing oversight without slowing down the process. For LPs looking to scale look-through without building a 10-person ops team, this is increasingly becoming the go-to model.

 

Which Approach Works Best? Key Criteria for LPs

As more LPs move beyond manual tracking and toward automation, the question isn’t whether to scale look-through analysis—it’s how. Not all tools are created equal. Some can parse a clean investment schedule but break on footnotes or narratives. Others extract fast, but lack auditability or private markets context.

To make an informed decision, LPs need a framework grounded in the realities of private markets data. Below are the key criteria we recommend for evaluating any look-through solution—whether you're building in-house, buying off the shelf, or partnering with a vertical provider like Tamarix.

 

For each category, here's what to evaluate:

1. Multi-Source Document Handling

- Can it ingest the full range of document types LPs receive: PDF reports, Excel trackers, LPAC decks, and financial statements?

- Does it capture fields from footnotes and narrative sections, not just clean tables?

2. Optimization for Private Markets Use Cases

- Does it understand private markets terminology, e.g., distinguishing GAAP revenue from bookings or ARR?

- Is it built around GP reporting conventions rather than generic document extraction?

3. Resilience to Format Drift

- Does it keep working when a GP renames KPIs, reorders columns, or relocates sections between quarters?

- How much manual rework does a template change trigger?

4. Accuracy

- Does it reliably extract correct and complete data?

- Can it handle both tables and unstructured narrative text?

5. Auditability, Review, and Oversight

- Is every data point traceable to its source?

- Can users review how each field was extracted or interpreted?

- Are exception queues and review workflows built in?

- Does it speed up manual data quality checks?

6. Support for Security Master and Time-Series Tracking

- Can it consistently track LPs, funds, commitments, and companies across name variations?

- Can it link information across documents for point-in-time and time-series analysis?

- Does it adapt to any security master list?

7. Speed and Scalability

- How fast can it process large volumes of reports?

- Does performance hold as the portfolio scales?

8. Workflow Integration

- Does it produce structured, schema-aligned outputs (JSON, SQL, Excel)?

- Can data flow into dashboards, attribution models, and monitoring systems?
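As a sketch of what schema-aligned output can look like in practice, each extracted value can carry its own source reference so the audit trail travels with the data. The field names below are assumptions for illustration, not a published standard:

```python
import json
from dataclasses import dataclass, asdict

# Illustrative shape for one schema-aligned look-through record.
# Field names are hypothetical; the key idea is that every value keeps
# a reference back to the document and page it was extracted from.
@dataclass
class LookThroughRecord:
    fund: str
    company: str
    sector: str
    geography: str
    cost: float
    fair_value: float
    as_of: str         # reporting period, ISO date
    source_doc: str    # document the value was extracted from
    source_page: int   # page reference for the audit trail

record = LookThroughRecord(
    fund="Fund III", company="Alpha", sector="Software", geography="US",
    cost=12_500_000, fair_value=18_200_000, as_of="2024-03-31",
    source_doc="fund_iii_q1_2024_report.pdf", source_page=14,
)

print(json.dumps(asdict(record), indent=2))  # ready for a warehouse or dashboard
```

Keeping `source_doc` and `source_page` on every record is what makes criterion 5 above (traceability to source) cheap to satisfy downstream.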

 

Conclusion: Private Equity Look-Through Starts with Infrastructure

Look-through analysis isn’t just about getting visibility—it’s about having the infrastructure to make that visibility actionable at scale. Manual processes break down as portfolios grow. OCR can’t adapt to private markets complexity. Public LLMs offer speed, but without accuracy or auditability, they often create more work than they save.

For LPs managing dozens or even hundreds of fund relationships, reliable exposure data isn’t a nice-to-have—it’s a foundation for effective monitoring, risk management, and decision-making.

That’s where purpose-built platforms like Tamarix come in. Our AI-powered solution is designed specifically for private markets, helping LPs extract, structure, and connect look-through data across asset classes—with full auditability and control.

 

Reach out to see how Tamarix can scale your look-through analysis.

Andrea Carnelli Dompé
CEO and co-founder of Tamarix. Previously Head of Research at Pantheon, one of the largest private markets asset managers globally, with $84b in AuM. PhD in Finance from Imperial College; BA in Economics from Bocconi University.