
Why We Pull Data Instead of Querying It: Ampersand, MCP, and the CRM Enrichment Trap

MCP is elegant for point lookups, but the questions that matter in sales span more data than any context window can hold. Here's why we chose continuous data pull over runtime queries, and why Ampersand powers our integration layer.

Rahul Balakavi, Co-Founder, AmpUp

This is a deep-dive companion to Part 2: The Product Core of our engineering stack series. It covers the full story of how we approach CRM and meeting recorder integration, why we rejected MCP and CRM enrichment approaches, and why we chose Ampersand.

AmpUp is only as good as the data it can access. Our agents coach reps on deals they’re actually working, reference CRM context from live opportunities, and analyze meeting recordings from the tools teams already use. That means integrating with Salesforce, HubSpot, Gong, Chorus, Fireflies, and a growing list of CRM systems and meeting notetakers. The question was never whether to integrate. It was how.

The MCP Temptation

The obvious modern approach is to query data at runtime. MCP (Model Context Protocol) lets an agent call out to external systems on demand: fetch a deal from Salesforce, pull a transcript from Gong, look up a contact in HubSpot. It’s elegant. It’s flexible. And for our use case, it’s fundamentally the wrong architecture.

Here’s why. A sales manager asks: “Which of my reps are consistently losing deals at the negotiation stage, and what patterns do you see in their calls?” To answer this with runtime queries, the agent would need to:

  1. Query the CRM for all closed-lost deals across the team (MCP call to Salesforce/HubSpot)
  2. Filter to deals that stalled or lost at negotiation stage
  3. For each deal, fetch associated meetings (MCP call to Gong/Chorus)
  4. For each meeting, pull the full transcript
  5. Analyze every transcript for patterns

Now do the math. A typical 30-minute sales call produces roughly 5,000–7,000 tokens of transcript. One million tokens, the context window you’re working with, holds approximately 140–150 transcripts. A single rep might have 20–30 meetings per quarter. A team of 10 reps? That’s 200–300 meetings in a single quarter. You’ve already blown past the context window before you’ve even started reasoning. And you’re paying for every token: on the $200/month max tier with most LLM providers, you’d burn through your budget answering a handful of these questions.
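That arithmetic is worth making concrete. A quick sketch using the midpoint of the figures above; the per-transcript token count and team size are assumptions for illustration:

```python
# Back-of-envelope context math for the runtime-query approach.
# Assumptions: ~6,000 tokens per 30-minute transcript (midpoint of the
# 5,000-7,000 range), a 1M-token context window, 10 reps, 25 meetings each.

TOKENS_PER_TRANSCRIPT = 6_000
CONTEXT_WINDOW = 1_000_000
REPS = 10
MEETINGS_PER_REP = 25

transcripts_that_fit = CONTEXT_WINDOW // TOKENS_PER_TRANSCRIPT
transcripts_needed = REPS * MEETINGS_PER_REP
overflow = transcripts_needed * TOKENS_PER_TRANSCRIPT - CONTEXT_WINDOW

print(transcripts_that_fit)  # 166 transcripts fit, before any prompt or reasoning
print(transcripts_needed)    # 250 transcripts for one quarter of one team
print(overflow)              # 500000 tokens over budget before reasoning starts
```

One quarter of one team overflows the window by half a million tokens, and that is before the prompt, the tool definitions, or any room to reason.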

Runtime querying via MCP works beautifully for point lookups: “What stage is the Acme deal in?” It falls apart the moment you need to reason across a body of work. And the questions that actually matter in sales, the ones that drive coaching, pipeline reviews, and strategic decisions, are always questions that span multiple deals, multiple meetings, and multiple quarters.

The “Enrich Your CRM” Temptation

The counterargument is obvious: if runtime queries are too expensive, why not push the intelligence back into the CRM? Build a pipeline that enriches Salesforce with MEDDPICC scores, call sentiment, and competitive signals; write it all into custom fields and objects; then query the CRM at runtime with full context already baked in. The data lives where your team already works. No new tool to adopt.

In theory, this is sound. In practice, it means assembling and operating an entire toolchain just to get intelligence into CRM fields. Consider what a real CRM enrichment pipeline actually requires:

| Capability | Typical Tool | What You’re Managing |
|---|---|---|
| Meeting transcription | Gong, Chorus, Fireflies, Rev AI | API keys, webhook ingestion, transcript normalization across providers |
| Call analytics & scoring | Gong Engage, Chorus Momentum, custom NLP pipeline | Sentiment models, talk-ratio computation, scoring calibration per team |
| Deal signal extraction | Clari, custom LLM pipeline | MEDDPICC/BANT parsing, competitive mention detection, prompt engineering |
| CRM field writes | Workato, Tray.io, custom integration | Field mapping per CRM, write ordering, conflict resolution, rate limiting |
| Workflow automation | Zapier, Make, n8n | Trigger logic, error handling, retry policies, monitoring dashboards |
| Data warehousing | Snowflake, BigQuery, Fivetran | Schema design, sync schedules, query costs, access controls |

That’s six categories of tooling, each with its own vendor contract, its own API surface, its own failure modes, and its own cost. A mid-market team stitching together Gong + Clari + Workato + Snowflake can easily spend $3,000–5,000/month on tooling alone before a single enriched field lands in Salesforce. And each tool needs someone who understands how to configure it, monitor it, and fix it when it breaks. The “enrich your CRM” approach doesn’t eliminate complexity. It distributes it across half a dozen vendors and makes your sales-ops team the integration layer.

I’ve seen this firsthand. At ThoughtSpot, I watched the sales-ops function evolve from one person managing one tool to a team spanning two continents managing five. It didn’t happen because anyone planned it that way. It happened because each new enrichment need, each new data source, each new workflow demanded its own tooling, its own expertise, its own maintenance. What started as “let’s just add MEDDPICC scores to Salesforce” became a full-blown integration engineering operation. The tooling sprawl is gradual, and by the time you notice, you’re three vendors deep with no clean way to simplify.

The Uncomfortable Truth

The moment you start writing enrichment data back into your CRM, you’ve turned your CRM into a database. And databases come with database problems: schema migrations when your scoring model changes, data integrity issues when a write partially fails, stale records when a sync falls behind, version conflicts when two tools update the same field. These are real software engineering problems, the kind that require on-call rotations, runbooks, and incident reviews to manage well. If you’re on a sales team, or running one, that’s not the job you signed up for. Your CRM should be a place you go to sell, not a system you have to operate.

You also need to design custom objects and fields across every CRM you support (Salesforce custom objects are nothing like HubSpot custom properties). You need an orchestration layer that coordinates writes: update the deal record, then the contact record, then the activity timeline, in the right order, handling partial failures and API rate limits at each step. You need monitoring to catch when a write silently fails, when a field mapping breaks after a CRM admin renames something, when a sync falls out of step. If you have a dedicated sales-ops team or a dev team with CRM integration experience, this can work. Most companies don’t. For most teams, the CRM and meeting recorder are systems of record, not platforms they want to extend with custom engineering.

And there’s a deeper strategic question worth asking honestly: in a landscape where 20+ AI-first CRM providers are gaining real traction, are you confident your current CRM is the one you’ll be on three years from now? Building deep custom infrastructure on top of a specific CRM’s data model is a bet that the CRM is permanent. Every custom object, every enrichment pipeline, every field mapping is a switching cost you’re building for yourself. The tighter you couple your intelligence to your CRM’s schema, the harder it becomes to move when something better comes along.

The questions that matter in sales always span more data than any context window can hold. The answer isn’t a bigger window. It’s smarter data, decoupled from any single system.

The AmpUp Approach: Read Everything, Decide What to Write Back

AmpUp takes a different path. We read raw data from your CRM and meeting recorder, pull it into our own intelligence layer, and do the heavy enrichment on our side. Your CRM stays clean. Your meeting recorder stays as-is. The complexity lives in AmpUp, where we’ve built the infrastructure to handle it.

When you ask a question and get an answer, you decide what happens next. Want to update a Salesforce field based on the analysis? AmpUp can write it back. Want to send a Slack notification to the team? Fire off an email summary to your VP? Push a coaching note to the rep’s activity feed? Those are actions you choose, not enrichment that runs automatically in the background whether anyone looks at it or not. The end user decides what information matters and what actions to take. AmpUp handles the tools and integration plumbing.

This means your CRM isn’t cluttered with fields nobody reads. Your switching costs stay low, because the intelligence lives in AmpUp, not in custom Salesforce objects you’ll need to migrate. And your sales-ops team isn’t debugging a pipeline they didn’t build. Just talk to your admin, connect your CRM and meeting recorder on the Tools page, and start asking questions.

Instead of querying raw data at runtime or building permanent CRM infrastructure, we pull data continuously and enrich it on arrival. When a new meeting recording lands in Gong, or a deal stage changes in Salesforce, that data flows into AmpUp through Ampersand and hits our enrichment pipeline. By the time a user asks a question, the hard work is already done.

Here’s what the enrichment pipeline extracts from every meeting:

| Signal Category | What We Extract | Why It Matters |
|---|---|---|
| Deal Qualification | MEDDPICC / BANT framework scoring | Instantly see which deals are well-qualified and which have gaps |
| Conversation Chapters | Discovery, demo, technical deep-dive, pricing, next steps | Jump to the exact part of a call that matters, skip the small talk |
| Behavioral Signals | Talk-to-listen ratio, question rate, filler words, monologue length | Objective coaching metrics, not subjective manager impressions |
| Deal Progression | Competitive mentions, objection patterns, champion identification, risk indicators | Surface pipeline risks before the forecast call, not during it |

When that sales manager asks about negotiation-stage losses, the agent doesn’t need to load 300 raw transcripts. It queries structured signals: deals tagged as lost-at-negotiation, objection patterns already extracted and categorized, behavioral scores already computed. The answer comes back in seconds, draws on months of data, and costs a fraction of what a runtime approach would.
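As a sketch of the difference, here is what that question looks like against pre-extracted signals. The schema and field names (`stage_lost_at`, `objections`) are hypothetical, for illustration only, not AmpUp’s actual data model:

```python
# Hypothetical sketch: answering "negotiation-stage losses" from
# pre-extracted deal signals instead of raw transcripts.
from collections import Counter

deals = [
    {"rep": "alice", "outcome": "lost", "stage_lost_at": "negotiation",
     "objections": ["pricing", "contract terms"]},
    {"rep": "alice", "outcome": "lost", "stage_lost_at": "negotiation",
     "objections": ["pricing"]},
    {"rep": "bob", "outcome": "won", "stage_lost_at": None, "objections": []},
]

# Filter on structured fields -- no transcripts loaded at all.
lost_at_negotiation = [
    d for d in deals
    if d["outcome"] == "lost" and d["stage_lost_at"] == "negotiation"
]

# Aggregate objection patterns that were extracted once, at ingestion.
patterns = Counter(obj for d in lost_at_negotiation for obj in d["objections"])
print(patterns.most_common(1))  # [('pricing', 2)]
```

The agent reasons over a few hundred structured rows like these instead of hundreds of raw transcripts, which is why the answer fits comfortably in context.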

The Economics

With a runtime MCP approach, answering deep sales questions at enterprise scale would burn through provider API limits within days. The max plan on most LLM providers is around $200/month. With AmpUp’s enrichment-first architecture, we can give you answers to unlimited sales questions at half the price, because the expensive analysis (reading and understanding thousands of transcript pages) happens once during ingestion, not every time someone asks a question. The per-query cost drops by orders of magnitude.
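A rough model of why the per-query cost drops, with every dollar figure an assumption for illustration rather than quoted vendor pricing:

```python
# Illustrative cost comparison: enrich-once-at-ingestion vs
# re-read-the-corpus-per-query. All numbers are assumptions.

COST_PER_MTOK = 3.00        # assumed $ per million input tokens
TRANSCRIPT_TOKENS = 6_000   # per 30-minute call
MEETINGS = 250              # one team, one quarter
QUERIES_PER_MONTH = 100

corpus_mtok = MEETINGS * TRANSCRIPT_TOKENS / 1_000_000  # 1.5M tokens

# Runtime approach: every deep question re-reads the corpus.
runtime_cost = QUERIES_PER_MONTH * corpus_mtok * COST_PER_MTOK

# Enrichment-first: the corpus is read once; queries touch only
# compact structured signals (cost here rounds to near zero).
ingest_cost = corpus_mtok * COST_PER_MTOK

print(round(runtime_cost, 2))  # 450.0 per month, re-reading per query
print(round(ingest_cost, 2))   # 4.5 once, at ingestion
```

Even under these conservative assumptions the gap is two orders of magnitude, and it widens as the corpus and query volume grow.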

Why Ampersand

Locked In

Every CRM and notetaker has its own OAuth flow, API quirks, rate limits, and data model. Salesforce’s OAuth dance is a multi-step process with refresh token rotation, org-specific instance URLs, and a permission model that varies by edition. HubSpot’s is simpler but has its own pagination patterns and rate limiting behavior. Gong requires a different auth scheme entirely. Each provider’s API has different conventions for how they represent deals, contacts, activities, and recordings.

Building a direct integration for each provider is a sprint-long project. Maintaining it is worse. When Salesforce deprecates an API version, your integration breaks. When HubSpot changes their rate limit headers, your sync logic needs updating. When a customer uses a CRM you haven’t built for yet, it’s another sprint. We maintained custom OAuth flows for each provider early on, each with its own token refresh logic, error handling, and data mapping. Every provider API update was a fire drill.

Ampersand is a unified integration platform that handles the OAuth dance, token refresh, rate limiting, and data syncing for 100+ SaaS providers. We define what data we need (deals, contacts, activities, recordings) in a manifest, and Ampersand handles the connection, sync, and normalization. Adding a new CRM provider is a configuration change, not a code change.

The Admin Experience: Tools Page

For AmpUp’s customers, connecting their data sources is a self-service operation. Admins navigate to the Tools page, select their CRM (Salesforce, HubSpot) or meeting recorder (Gong, Chorus, Fireflies), and complete the OAuth flow. Ampersand powers this entire experience: the provider selection UI, the authorization handshake, and the ongoing data sync. Once connected, data starts flowing automatically. No engineering tickets, no custom setup calls, no waiting for the next release.

This matters more than it sounds. Every integration you support is a market segment you can sell to. When a prospect asks “do you integrate with [their specific CRM]?”, the answer should never be “that’s on our roadmap for Q3.” With Ampersand, we’ve been able to say “yes” to every CRM question we’ve been asked in the last six months. That’s not an engineering decision. That’s a revenue decision that happens to live in the engineering layer.

For detailed setup instructions and supported providers, see our integration documentation.

Ampersand Integration Architecture

The Architecture: Webhooks to Inngest to Intelligence

When Ampersand detects a data change (a deal stage update in Salesforce, a new recording in Gong, a contact field change in HubSpot), it pushes a webhook to our backend. Those webhooks land as Inngest events, which means they get the same durable execution guarantees as the rest of our pipeline: automatic retries, step-level persistence, and full observability in the Inngest dashboard.

The Inngest handler normalizes the incoming data into our internal schema, writes it to Postgres with proper tenant isolation via RLS, and dispatches downstream enrichment. A new deal update triggers signal extraction: competitive intelligence, risk indicators, MEDDPICC gap analysis. A new recording triggers the transcription and chapter-extraction pipeline. The enriched data feeds directly into the agent context, so when a rep starts a pre-meeting debrief, the agent already knows about the deal stage change that happened in Salesforce an hour ago.

# Simplified Inngest handler for CRM data sync
import inngest

@inngest_client.create_function(
    fn_id="crm-deal-sync",
    trigger=inngest.TriggerEvent(event="ampersand/deal.updated"),
)
async def sync_deal_update(ctx: inngest.Context, step: inngest.Step):
    deal_data = ctx.event.data

    # Normalize from provider-specific format to internal schema
    normalized = await step.run("normalize", lambda: normalize_deal(deal_data))

    # Write to Postgres (RLS handles tenant isolation)
    await step.run("persist", lambda: upsert_deal(normalized))

    # Trigger downstream enrichment (MEDDPICC, signals, risk scoring)
    await step.send_event(
        "dispatch-enrichment",
        inngest.Event(
            name="deal/updated",
            data={"deal_id": normalized.id, "org_id": normalized.org_id},
        ),
    )

What We Evaluated, What We Chose, and the Honest Tradeoff

Ampersand wasn’t the first solution we tried. Before settling on it, we explored two other approaches that looked promising on paper.

Composio caught our attention as a way to simplify API integration for AI agents. It’s well-designed for the MCP/tool-call pattern: your agent needs to read a Salesforce deal or create a HubSpot contact, and Composio handles the auth and API abstraction. But Composio is optimized for runtime tool execution, which is exactly the pattern we’d already decided against. We needed continuous data sync, not on-demand API calls. Composio solves a real problem, just not ours.

Meltano was the other serious contender. It’s an open-source ELT platform built on the Singer tap/target ecosystem, and the model is compelling: a standardized connector for every data source, with a clean separation between extraction and loading. For a data warehousing use case, Meltano is excellent. For our use case, the fit was less clear. We needed tight control over the sync lifecycle (real-time webhooks, not batch ELT runs), a managed OAuth experience for end users (admins connecting their own CRM on a self-service page, not engineers configuring taps), and provider-level features like rate limit management and token refresh that Singer taps handle inconsistently. Meltano is a powerful tool for data teams; we needed an integration platform for a product.

Ampersand hit the intersection we needed: managed OAuth with a self-service end-user experience, real-time webhook-driven sync, and a unified API that abstracted provider differences without hiding them entirely. But what sealed it wasn’t just the feature set. It was the team. Their engineering team is genuinely excellent, the kind of team that responds to a support question with a root cause analysis and a fix in the same thread. When we hit edge cases with Salesforce custom objects or needed webhook delivery guarantees for our Inngest pipeline, their support was fast, technical, and direct. In the early days of adopting any platform, the team behind it matters as much as the product itself. Ampersand’s team earned our confidence repeatedly.

Before Ampersand, we maintained custom OAuth flows for each provider. Each integration had its own token refresh logic, its own error handling, and its own data mapping layer. Every provider API update required a separate investigation and fix. Adding a new provider meant weeks of work. Ampersand collapsed that to a single integration surface. The total implementation for our first provider (Salesforce) took two days. Adding HubSpot took half a day. The ongoing maintenance has been near zero.

You lose some fine-grained control over API calls. If you need very specific API behaviors, custom field mappings that Ampersand doesn’t support, or real-time bidirectional sync with sub-second latency, you’re working around the abstraction. We’ve hit this a few times with advanced Salesforce custom objects that don’t map cleanly to Ampersand’s data model, and the workaround was a thin direct-API layer for those specific objects alongside the Ampersand integration. For 90% of our CRM data needs, the tradeoff is worth it.


← Back to Part 1: The Infrastructure


Deep dive from The Stack We Actually Ship On. Written by Rahul Balakavi, for founders who’ve been there.