Assignment
Take over Bear's frontend Exa wiring, build it, then stress-test our entire Exa implementation against the live API docs.
Subset: The build was intern-scoped wiring (API routes, component, feature defs, recipes). The audit was a single-agent deep read against live Exa documentation. Neither required 5+ agents.
Roster
What we found
We use about 40% of what Exa offers. We call one endpoint out of six available. On that one endpoint, we pass 8 of 22 parameters. The biggest gap is Websets — Exa's structured data product that returns verified emails, phones, and company data directly instead of making us regex-extract them from page text. This gap was identified in run #002 two weeks ago and has not been fixed. It's the root cause of the 2% email validation rate that kills everything downstream.
Why this matters
Every contact that goes into a Salesfinity dial list passes through Exa. If Exa returns bad data (wrong person, no phone, garbage email), the rep dials a dead number, the meeting never happens, and the deal never starts. The pipeline is only as good as the data that enters it. Right now we pay for Exa search, then pay again for an LLM to summarize the results, then pay again for Clay to find phone numbers. Exa can do all three in one call — we just don't ask it to.
Where we agreed
The build phase was clean. Feature defs corrected (exa.run moved from intern to operator tier so interns can't accidentally burn Exa credits). Recipes API reads YAML files dynamically. Run API supports all 13 canonical templates. VerticalSelector component renders on the deals page. Two new recipes authored. All API-tested and type-checked.
Where we disagreed
No dissent — solo architect run. The open question is execution priority: fix the Websets gap first (highest leverage, directly fixes the 2% email rate) or add `outputSchema` support first (eliminates the double-pay pattern, easier to implement). Architect recommends Websets first because it addresses the data quality problem that makes everything else moot.
What surprised us
- Exa has a `/answer` endpoint that returns LLM-grounded answers with citations in one call. We've been chaining search → LLM summarize → citation lookup as three separate steps. One endpoint does all three.
- Exa's `outputSchema` parameter lets you get structured JSON back from a search — matching a schema you define. We've been paying for search + a separate LLM call to extract fields. The search endpoint does extraction natively if you ask.
- The `instant` search type has sub-150ms latency. Our vertical selector uses `deep` (seconds of latency) for real-time UI lookups where `instant` would be better.
- Four separate Exa client implementations exist in the codebase. The financial data client has unique features (3-tier domain credibility, company name disambiguation that fixed 18% corpus contamination) that are trapped in a file scheduled for deletion.
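
As a rough illustration of the `outputSchema` and `instant` points above, the sketch below builds (but does not send) a single search request that asks for structured fields directly. It is a sketch under assumptions, not the canonical client: the base URL, the `x-api-key` header name, the `User-Agent` value, and the exact shape Exa expects for `outputSchema` are not confirmed by this report and should be checked against the live docs. `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request
from typing import Optional

# Assumed endpoint URL; confirm against Exa's current API reference.
EXA_SEARCH_URL = "https://api.exa.ai/search"


def build_search_request(
    query: str,
    api_key: str,
    output_schema: Optional[dict] = None,
    search_type: str = "instant",
) -> urllib.request.Request:
    """Build a POST request for the search endpoint without sending it."""
    payload = {"query": query, "type": search_type}
    if output_schema is not None:
        # Field name taken from this report; its exact shape is unverified.
        payload["outputSchema"] = output_schema
    return urllib.request.Request(
        EXA_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed header name
            "User-Agent": "exa-client-sketch/0.1",  # hypothetical value
        },
        method="POST",
    )


# Example: ask for two structured fields in one call (schema is illustrative).
contact_schema = {
    "type": "object",
    "properties": {"company": {"type": "string"}, "email": {"type": "string"}},
}
request = build_search_request("HVAC contractors in Ohio", "YOUR_KEY", contact_schema)
```

Sending the request would be `urllib.request.urlopen(request, timeout=30)`. Because the key travels in a header, it does not appear in the process list the way a `curl` argument does.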
What we'd do differently
- Pull the provider's actual API docs before building an integration, not after. Mirroring the existing Python client locked in its gaps.
- Check SKILL.md audit findings for open items before assuming the codebase is the source of truth. The Websets gap has been documented in SKILL.md since run #002.
Currency events
| From → To | Action | Multiplier | Base | Score | Notes |
|---|---|---|---|---|---|
| architect → exa-mastery | Full API surface audit | 3x | 20 | 60 | 5 unused endpoints + 14 unused parameters documented with severity ranking |
| architect → quarterback | Severity-ranked leakage report | 2x | 5 | 10 | Execution order included so fixes can be sequenced |
| architect → bear | Unblocked frontend wiring | 5x | 5 | 25 | All 6 deliverables built and tested before Bear was reassigned |
Cross-system gaps
| Flagger | Affected | Gap | Recommended change |
|---|---|---|---|
| architect | enrich-to-salesfinity pipeline | Websets API not called anywhere despite complete client existing in `enrich_phase2_phones_emails.py` (falls through to /search regex) [upgrade-signal:gap] | Build canonical Websets client in `engine/lib/exa_client.py` with create/poll/items methods, 15-min poll window |
| architect | run/route.ts + exa_client.py | Template definitions duplicated across Python and TypeScript with no shared source [upgrade-signal:gap] | Extract templates to `engine/lib/exa_templates.yaml` (PLAN-ewing Step 3, still undone) |
| architect | all 4 Exa clients | All use curl subprocess despite SKILL.md saying urllib works. API key visible in `ps aux` | Switch to urllib/fetch, add User-Agent header |
| architect | financial_data/exa_client.py → canonical client | 3-tier domain credibility, company name disambiguation, URL post-filtering trapped in file marked for deletion | Merge into canonical client before deleting |
| architect | exa_client.py + run/route.ts | No `outputSchema` usage — double-pay pattern (search + LLM extract) | Add `outputSchema` parameter to template configs, use for all templates that feed downstream LLM summarization |
| architect | exa_client.py | No `process_run` logging despite SKILL.md mandate | Wrap all calls in `process_run` context manager |
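
The create/poll/items client recommended in the first row depends mostly on a bounded polling loop. The sketch below shows one way to enforce the 15-minute window. It makes no real API calls: `wait_for_webset`, `get_status`, and the `"idle"` completion status are assumptions for illustration, and the status callable stands in for whatever the Websets client's poll method returns.

```python
import time
from typing import Callable

POLL_WINDOW_SECONDS = 15 * 60  # the 15-minute poll window recommended above


def wait_for_webset(
    get_status: Callable[[], str],
    done_status: str = "idle",  # assumed terminal status; verify against the Websets docs
    poll_interval: float = 10.0,
    window: float = POLL_WINDOW_SECONDS,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the webset reports done_status or the window closes.

    Returns True if the webset finished inside the window, False on timeout.
    """
    deadline = clock() + window
    while True:
        if get_status() == done_status:
            return True
        # Stop if the next sleep would carry us past the deadline.
        if clock() + poll_interval > deadline:
            return False
        sleep(poll_interval)
```

Injecting `clock` and `sleep` keeps the loop testable without waiting 15 real minutes. On `True` the caller would fetch items; on `False` it would record the timeout rather than fall through to regex extraction silently.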
Signal Deployment Status
| Signal | Supabase status | Code status | Skill/doc status | Verdict |
|---|---|---|---|---|
| Websets integration (2% email fix) | UNDONE — no webset tables or process_run entries | UNDONE — no Websets client in canonical exa_client.py | DONE (skills/exa-mastery/SKILL.md documents the gap, schema, polling model) | OPEN |
| outputSchema for structured extraction | UNDONE — no schema definitions stored | UNDONE — neither Python nor TS clients pass outputSchema | PARTIAL (exa-mastery SKILL.md does not yet document outputSchema usage) | OPEN |
| Template YAML extraction (PLAN-ewing Step 3) | N/A | UNDONE — templates still Python dicts in exa_client.py and duplicated in route.ts | PARTIAL (PLAN-ewing.md documents the plan, not yet executed) | OPEN |
| process_run logging for Exa calls | UNDONE — no exa.* entries in process_runs table | UNDONE — no process_run wrapper in any client | DONE (SKILL.md documents the pattern with code example) | OPEN |
| Financial client features merge | N/A | UNDONE — 3-tier domains, name disambiguation, URL filtering still only in financial_data/exa_client.py | UNDONE — not documented anywhere as canonical | OPEN |
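
For the `process_run` logging row, the real pattern lives in SKILL.md and is not reproduced in this report. The sketch below is a hypothetical stand-in that only shows the general shape of wrapping an Exa call in a context manager so that every call leaves a record whether it succeeds or fails. The name `process_run_sketch`, the record fields, and the `sink` callback are all assumptions.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def process_run_sketch(name: str, sink: Callable[[dict], None]) -> Iterator[dict]:
    """Record one run: name, status, duration, and any fields the caller adds."""
    record = {"name": name, "status": "running"}
    started = time.monotonic()
    try:
        yield record  # the caller may attach extra fields, e.g. result counts
        record["status"] = "ok"
    except Exception as exc:
        record["status"] = "error"
        record["error"] = repr(exc)
        raise  # logging must not swallow the failure
    finally:
        record["duration_s"] = time.monotonic() - started
        sink(record)  # e.g. insert into the process_runs table


# Usage: a list stands in for the process_runs table.
rows: list = []
with process_run_sketch("exa.search", rows.append) as run:
    run["result_count"] = 3  # placeholder for a real Exa call
```

Using an `exa.*` name follows the naming implied by the Supabase status column above. A real implementation would write to the table instead of a list.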
Per-Agent Journals
architect
Run 020 journal — architect
Run: 020 (exa-consolidation) · Date: 2026-05-08 18:00 PT · Phase 1 author
S1 — Finding
We use Exa at roughly 40% of its capability. The implementation calls one endpoint (/search) out of six available (/search, /contents, /findSimilar, /answer, /monitors, Websets). On the one endpoint we do use, we pass 8 of 22 available parameters. The 14 unused parameters include outputSchema (structured extraction in one call), contents.summary (LLM summaries with schema), userLocation (geographic bias), additionalQueries (diversity), systemPrompt (search planning), subpages (crawl About Us / Team pages), extras.links (outbound link discovery), and stream (SSE for real-time UI). The Websets gap is the root cause of the 2% email validation rate documented in SKILL.md, a known architecture bug since maxswarm run #002 that has not been fixed. Four separate Exa client implementations exist with no shared template registry.
S2 — Blind spot
I pulled the Exa API docs via web fetch but the /findSimilar reference page returned 404, so I reconstructed its parameters from our codebase and the exa-mastery skill rather than the canonical source. The Websets API documentation was assembled from multiple pages and the exa-labs/openapi-spec GitHub repo, but the full OpenAPI YAML was too large to fetch — there may be parameters or endpoint behaviors I missed. I also did not measure actual latency or cost of the newer search types (instant, deep-reasoning, deep-lite) against our templates; my recommendations to use them are based on Exa's published specs, not our own measured evidence. The outputSchema recommendation is high-confidence but unverified against our specific query patterns — query complexity may affect schema extraction quality.
S3 — Pattern
This run is a direct continuation of run #002 (HR.com buyers Exa mastery), which first identified the Websets architecture bug and the 2% email rate. Run #002 documented the problem; run #020 stress-tested the entire API surface against our implementation and found the gap is wider than #002 reported: not just Websets, but five unused endpoints in total (Websets plus four others) and fourteen unused parameters. The pattern of "known gap persists across runs" matches what swarm-upgrade Class C (persistent dissent) is designed to catch. Run #014 (enrich-to-salesfinity design) created the YAML recipe system that this run wired to the frontend. The recipes are the delivery mechanism, but the Exa API calls behind them are still operating at the same capability level as pre-#002.
S6 — What changed about me
Next time I build an API integration route (like the Exa run endpoint), I will pull the provider's current API docs first and design against the full surface, not mirror the existing Python client — mirroring the existing client locked in its gaps instead of fixing them.
Generated from 020__architect.md — do not edit this HTML directly.