Exa Consolidation

2026-05-08  ·  PASS

Assignment

Take over Bear's frontend Exa wiring, build it, then stress-test our entire Exa implementation against the live API docs.

Subset: The build was intern-scoped wiring (API routes, component, feature defs, recipes). The audit was a single-agent deep read against live Exa documentation. Neither required 5+ agents.

Roster

architect (solo: swarm-build deemed overkill for the build phase; full API audit done inline)

What we found

We use about 40% of what Exa offers. We call one endpoint out of six available. On that one endpoint, we pass 8 of 22 parameters. The biggest gap is Websets — Exa's structured data product that returns verified emails, phones, and company data directly instead of making us regex-extract them from page text. This gap was identified in run #002 two weeks ago and has not been fixed. It's the root cause of the 2% email validation rate that kills everything downstream.

Why this matters

Every contact that goes into a Salesfinity dial list passes through Exa. If Exa returns bad data (wrong person, no phone, garbage email), the rep dials a dead number, the meeting never happens, and the deal never starts. The pipeline is only as good as the data that enters it. Right now we pay for Exa search, then pay again for an LLM to summarize the results, then pay again for Clay to find phone numbers. Exa can do all three in one call — we just don't ask it to.

Where we agreed

The build phase was clean. Feature defs corrected (exa.run moved from intern to operator tier so interns can't accidentally burn Exa credits). Recipes API reads YAML files dynamically. Run API supports all 13 canonical templates. VerticalSelector component renders on the deals page. Two new recipes authored. All API-tested and type-checked.

Where we disagreed

No dissent — solo architect run. The open question is execution priority: fix the Websets gap first (highest leverage, directly fixes the 2% email rate) or add `outputSchema` support first (eliminates the double-pay pattern, easier to implement). Architect recommends Websets first because it addresses the data quality problem that makes everything else moot.

What surprised us

  • Exa has a `/answer` endpoint that returns LLM-grounded answers with citations in one call. We've been chaining search → LLM summarize → citation lookup as three separate steps. One endpoint does all three.
  • Exa's `outputSchema` parameter lets you get structured JSON back from a search — matching a schema you define. We've been paying for search + a separate LLM call to extract fields. The search endpoint does extraction natively if you ask.
  • The `instant` search type has sub-150ms latency. Our vertical selector uses `deep` (seconds of latency) for real-time UI lookups where `instant` would be better.
  • Four separate Exa client implementations exist in the codebase. The financial data client has unique features (3-tier domain credibility, company name disambiguation that fixed 18% corpus contamination) that are trapped in a file scheduled for deletion.
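The `outputSchema` and `instant` points above can be sketched as a request builder. This is a minimal sketch, not the canonical client: the `type` and `outputSchema` names come from this report, while the endpoint URL, the `x-api-key` header, the top-level placement of `outputSchema`, the helper name `build_search_request`, and the example schema are assumptions to check against the live Exa docs. The function builds a `urllib` request without sending it.

```python
import json
import urllib.request

# Assumed endpoint; verify against the live Exa docs.
EXA_SEARCH_URL = "https://api.exa.ai/search"


def build_search_request(query, api_key, search_type="instant", output_schema=None):
    """Build (but do not send) a search request.

    `search_type` and `outputSchema` are the names used in this report;
    where `outputSchema` sits in the body is an assumption.
    """
    payload = {"query": query, "type": search_type}
    if output_schema is not None:
        payload["outputSchema"] = output_schema
    return urllib.request.Request(
        EXA_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Header name is assumed; the key travels in a header, not in argv.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


# Hypothetical schema for a contact lookup; field names are illustrative only.
CONTACT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "title": {"type": "string"},
        "company": {"type": "string"},
    },
    "required": ["name", "company"],
}

request = build_search_request("VP of HR at Acme Corp", "YOUR_KEY", "instant", CONTACT_SCHEMA)
body = json.loads(request.data)
print(body["type"], sorted(body["outputSchema"]["properties"]))
```

Sending the request would be a single `urllib.request.urlopen(request, timeout=...)` call. Because the key is passed as a header rather than as a command-line argument, it does not show up in process listings.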

What we'd do differently

  • Pull the provider's actual API docs before building an integration, not after. Mirroring the existing Python client locked in its gaps.
  • Check SKILL.md audit findings for open items before assuming the codebase is the source of truth. The Websets gap has been documented in SKILL.md since run #002.

Currency events

| From → To | Action | Multiplier | Base | Score | Notes |
| --- | --- | --- | --- | --- | --- |
| architect → exa-mastery | Full API surface audit | 3x | 20 | 60 | 5 unused endpoints + 14 unused parameters documented with severity ranking |
| architect → quarterback | Severity-ranked leakage report | 2x | 5 | 10 | Execution order included so fixes can be sequenced |
| architect → bear | Unblocked frontend wiring | 5x | 5 | 25 | All 6 deliverables built and tested before Bear was reassigned |

Cross-system gaps

| Flagger | Affected | Gap | Recommended change |
| --- | --- | --- | --- |
| architect | enrich-to-salesfinity pipeline | Websets API not called anywhere despite complete client existing in `enrich_phase2_phones_emails.py` (falls through to /search regex) [upgrade-signal:gap] | Build canonical Websets client in `engine/lib/exa_client.py` with create/poll/items methods, 15-min poll window |
| architect | run/route.ts + exa_client.py | Template definitions duplicated across Python and TypeScript with no shared source [upgrade-signal:gap] | Extract templates to `engine/lib/exa_templates.yaml` (PLAN-ewing Step 3, still undone) |
| architect | all 4 Exa clients | All use curl subprocess despite SKILL.md saying urllib works. API key visible in `ps aux` | Switch to urllib/fetch, add User-Agent header |
| architect | financial_data/exa_client.py → canonical client | 3-tier domain credibility, company name disambiguation, URL post-filtering trapped in file marked for deletion | Merge into canonical client before deleting |
| architect | exa_client.py + run/route.ts | No `outputSchema` usage: double-pay pattern (search + LLM extract) | Add `outputSchema` parameter to template configs, use for all templates that feed downstream LLM summarization |
| architect | exa_client.py | No `process_run` logging despite SKILL.md mandate | Wrap all calls in `process_run` context manager |
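The create/poll/items recommendation for Websets, with its 15-minute poll window, could take roughly the following shape. This is a sketch under stated assumptions: the transport callables `create`, `get_status`, and `list_items`, the `"completed"` and `"failed"` status strings, and the 10-second interval are hypothetical placeholders, since this report does not specify the Websets request or response format.

```python
import time

POLL_WINDOW_SECONDS = 15 * 60  # the 15-minute window recommended in this report


class WebsetTimeout(Exception):
    """Raised when a webset does not finish inside the poll window."""


def run_webset(create, get_status, list_items, query,
               window=POLL_WINDOW_SECONDS, interval=10.0,
               clock=time.monotonic, sleep=time.sleep):
    """Create a webset, poll until it finishes, then return its items.

    `create`, `get_status`, and `list_items` are hypothetical callables that
    wrap the real HTTP calls; the status strings are assumed, not confirmed.
    """
    webset_id = create(query)
    deadline = clock() + window
    while True:
        status = get_status(webset_id)
        if status == "completed":
            return list_items(webset_id)
        if status == "failed":
            raise RuntimeError(f"webset {webset_id} failed")
        if clock() >= deadline:
            raise WebsetTimeout(f"webset {webset_id} still {status!r} after {window}s")
        sleep(interval)


# Usage with in-memory fakes standing in for the HTTP layer.
statuses = iter(["running", "running", "completed"])
items = run_webset(
    create=lambda q: "ws_1",
    get_status=lambda _id: next(statuses),
    list_items=lambda _id: [{"email": "a@example.com"}],
    query="HR directors",
    sleep=lambda s: None,  # skip real waiting in the demo
)
print(items)
```

Injecting the transport, clock, and sleep functions keeps the polling logic testable without network access and leaves the HTTP layer free to follow whatever the live Websets docs specify.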

Signal Deployment Status

| Signal | Supabase status | Code status | Skill/doc status | Verdict |
| --- | --- | --- | --- | --- |
| Websets integration (2% email fix) | UNDONE: no webset tables or process_run entries | UNDONE: no Websets client in canonical exa_client.py | DONE (skills/exa-mastery/SKILL.md documents the gap, schema, polling model) | OPEN |
| outputSchema for structured extraction | UNDONE: no schema definitions stored | UNDONE: neither Python nor TS clients pass outputSchema | PARTIAL (exa-mastery SKILL.md does not yet document outputSchema usage) | OPEN |
| Template YAML extraction (PLAN-ewing Step 3) | N/A | UNDONE: templates still Python dicts in exa_client.py and duplicated in route.ts | PARTIAL (PLAN-ewing.md documents the plan, not yet executed) | OPEN |
| process_run logging for Exa calls | UNDONE: no exa.* entries in process_runs table | UNDONE: no process_run wrapper in any client | DONE (SKILL.md documents the pattern with code example) | OPEN |
| Financial client features merge | N/A | UNDONE: 3-tier domains, name disambiguation, URL filtering still only in financial_data/exa_client.py | UNDONE: not documented anywhere as canonical | OPEN |
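For the `process_run` logging signal, a wrapper along these lines would leave one `exa.*` entry per call. The real `process_run` signature lives in SKILL.md and is not reproduced in this report, so everything below is a hypothetical stand-in: the in-memory `PROCESS_RUNS` list replaces the `process_runs` table, and the row fields and the `search_with_logging` helper are illustrative.

```python
import time
from contextlib import contextmanager

# Stand-in for the process_runs table; the real sink is not shown in this report.
PROCESS_RUNS = []


@contextmanager
def process_run(name, **metadata):
    """Hypothetical stand-in for the `process_run` wrapper described in SKILL.md.

    Records one row per call with name, status, and duration, and re-raises
    any error after logging it.
    """
    row = {"name": name, "status": "running", **metadata}
    started = time.monotonic()
    try:
        yield row
        row["status"] = "ok"
    except Exception as exc:
        row["status"] = "error"
        row["error"] = repr(exc)
        raise
    finally:
        row["duration_s"] = time.monotonic() - started
        PROCESS_RUNS.append(row)


def search_with_logging(search_fn, query):
    """Wrap any search callable so each call leaves an `exa.search` entry."""
    with process_run("exa.search", query=query) as row:
        results = search_fn(query)
        row["result_count"] = len(results)
        return results
```

Failed calls are logged as well as successful ones, which matters here because the missing entries are what made the Exa gap hard to see in the first place.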

Per-Agent Journals

architect

Run 020 journal — architect

Run: 020 (exa-consolidation) · Date: 2026-05-08 18:00 PT · Phase 1 author

S1 — Finding

We use Exa at roughly 40% of its capability. The implementation calls one endpoint (/search) out of six available (/search, /contents, /findSimilar, /answer, /monitors, Websets). On the one endpoint we do use, we pass 8 of 22 available parameters. The 14 we skip include outputSchema (structured extraction in one call), contents.summary (LLM summaries with schema), userLocation (geographic bias), additionalQueries (diversity), systemPrompt (search planning), subpages (crawl About Us / Team pages), extras.links (outbound link discovery), and stream (SSE for real-time UI). The Websets gap is the root cause of the 2% email validation rate documented in SKILL.md, a known architecture bug since maxswarm run #002 that has not been fixed. Four separate Exa client implementations exist with no shared template registry.
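A per-template version of this coverage check could be automated. The sketch below assumes template configs are nested dicts and that dotted names such as `contents.summary` denote nested keys; it covers only the eight parameters named in this finding, not the full set of 22, and `missing_parameters` and the example template are hypothetical.

```python
# The eight unused parameters named in this finding (not the full list of 22).
NAMED_PARAMETERS = [
    "outputSchema", "contents.summary", "userLocation", "additionalQueries",
    "systemPrompt", "subpages", "extras.links", "stream",
]


def has_path(config, dotted):
    """Return True if the dotted key path exists in a nested dict."""
    node = config
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def missing_parameters(config, names=NAMED_PARAMETERS):
    """List the named parameters a template config does not set."""
    return [name for name in names if not has_path(config, name)]


# Hypothetical template config for illustration.
template = {
    "query": "HR software buyers",
    "numResults": 10,
    "contents": {"summary": {"query": "key contacts"}},
}
print(missing_parameters(template))
```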

S2 — Blind spot

I pulled the Exa API docs via web fetch but the /findSimilar reference page returned 404, so I reconstructed its parameters from our codebase and the exa-mastery skill rather than the canonical source. The Websets API documentation was assembled from multiple pages and the exa-labs/openapi-spec GitHub repo, but the full OpenAPI YAML was too large to fetch — there may be parameters or endpoint behaviors I missed. I also did not measure actual latency or cost of the newer search types (instant, deep-reasoning, deep-lite) against our templates; my recommendations to use them are based on Exa's published specs, not our own measured evidence. The outputSchema recommendation is high-confidence but unverified against our specific query patterns — query complexity may affect schema extraction quality.

S3 — Pattern

This run is a direct continuation of run #002 (HR.com buyers Exa mastery), which first identified the Websets architecture bug and the 2% email rate. Run #002 documented the problem; run #020 stress-tested the entire API surface against our implementation and found the gap is wider than #002 reported: not just Websets, but five unused endpoints in total (Websets among them) and fourteen unused parameters. The pattern of "known gap persists across runs" matches what swarm-upgrade Class C (persistent dissent) is designed to catch. Run #014 (enrich-to-salesfinity design) created the YAML recipe system that this run wired to the frontend. The recipes are the delivery mechanism, but the Exa API calls behind them are still operating at the same capability level as before #002.

S6 — What changed about me

Next time I build an API integration route (like the Exa run endpoint), I will pull the provider's current API docs first and design against the full surface, not mirror the existing Python client — mirroring the existing client locked in its gaps instead of fixing them.

Generated from 020__architect.md — do not edit this HTML directly.