025 - Exa Three-Phase Pipeline Certification

S1 - Finding

Audited 16 active files touching Exa endpoints. The landscaping dry run exposed Websets entity conflation (DJ Landscaping AZ vs DJ's Landscape Management MI) and showed that Websets forces entity.type "person" regardless of input. This led to the three-phase pipeline: /search (companies) -> /findSimilar (expand) -> /websets (people + contacts). Cost comparison for the same query: /search returns 10 companies at $0.01, while Websets returns 10 LinkedIn profiles at $1.00.
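
A minimal sketch of how the three phases could be chained, assuming each Exa surface is wrapped in a plain callable. The names Company, Contact, run_pipeline, and the three callables are hypothetical and are not taken from exa_client.py. The domain-based de-duplication is likewise an assumption added for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional
from urllib.parse import urlparse


@dataclass
class Company:
    name: str
    url: str


@dataclass
class Contact:
    company_url: str
    full_name: str
    email: Optional[str] = None


# Hypothetical wrappers, one per Exa surface. Real calls would go here.
SearchFn = Callable[[str], Iterable[Company]]        # Phase 1: /search
SimilarFn = Callable[[str], Iterable[Company]]       # Phase 2: /findSimilar
ContactsFn = Callable[[Company], Iterable[Contact]]  # Phase 3: /websets


def domain_of(url: str) -> str:
    """Reduce a URL to its host (minus a leading 'www.') for de-duplication.

    Assumes the URL carries a scheme such as https://.
    """
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def run_pipeline(
    query: str,
    search: SearchFn,
    find_similar: SimilarFn,
    find_contacts: ContactsFn,
) -> List[Contact]:
    companies: Dict[str, Company] = {}

    # Phase 1: broad company discovery.
    for company in search(query):
        companies.setdefault(domain_of(company.url), company)

    # Phase 2: expand from each Phase 1 seed, keeping the first hit per domain.
    for seed in list(companies.values()):
        for company in find_similar(seed.url):
            companies.setdefault(domain_of(company.url), company)

    # Phase 3: people and contacts, requested only for known companies.
    contacts: List[Contact] = []
    for company in companies.values():
        contacts.extend(find_contacts(company))
    return contacts
```

Passing the surfaces in as callables keeps the sketch testable without network access. Each phase can be stubbed, and the people-oriented surface is reached only after the company list is fixed.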

S2 - Blind spot

Python pipeline scripts (buyside_hunt_v1, pool_hunt_v2) still use the old pattern: /search + regex extraction. They work but skip Phase 2 (findSimilar) and Phase 3 (Websets for contacts). Regex owner-name extraction yields a 2% email validation rate, whereas Websets returns structured contact fields.
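
To illustrate the difference between the two extraction styles, here is a small sketch. The patterns the scripts actually use are not shown in this note, so the regex, the sample strings, and the owner_name field below are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical pattern: a role keyword followed by two capitalized words.
OWNER_RE = re.compile(
    r"(?i:owner|founder|president)[:,\s]+([A-Z][a-z]+ [A-Z][a-z]+)"
)


def extract_owner_regex(page_text: str) -> Optional[str]:
    """Free-text extraction: returns whatever two capitalized words follow."""
    match = OWNER_RE.search(page_text)
    return match.group(1) if match else None


def extract_owner_structured(record: dict) -> Optional[str]:
    """Structured extraction: reads a named field (field name is assumed)."""
    return record.get("owner_name")


print(extract_owner_regex("Owner: Dan Jones has run the business since 1998."))
# Dan Jones
print(extract_owner_regex("Founder Serving Phoenix since 2001"))
# Serving Phoenix  (a false positive: not a person's name)
print(extract_owner_regex("Family owned and operated."))
# None
print(extract_owner_structured({"owner_name": "Dan Jones"}))
# Dan Jones
```

The free-text pattern cannot tell a name from any other pair of capitalized words, while the structured lookup either has the field or returns nothing.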

S3 - Pattern cited

"Surface-purpose mismatch." Each Exa surface has a natural strength. Websets excels at structured person extraction. /search excels at broad company discovery. Using the wrong surface costs 10x more for worse results.

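One way to make the surface choice explicit in code is a small lookup from goal to surface, sketched below. The Goal enum and both helper functions are hypothetical; only the three surface names come from this note.

```python
from enum import Enum


class Goal(Enum):
    COMPANY_DISCOVERY = "company_discovery"
    COMPANY_EXPANSION = "company_expansion"
    PERSON_CONTACTS = "person_contacts"


# Mapping of each goal to the surface this note recommends for it.
SURFACE_FOR_GOAL = {
    Goal.COMPANY_DISCOVERY: "/search",
    Goal.COMPANY_EXPANSION: "/findSimilar",
    Goal.PERSON_CONTACTS: "/websets",
}


def pick_surface(goal: Goal) -> str:
    """Return the surface suited to the given goal."""
    return SURFACE_FOR_GOAL[goal]


def check_surface(goal: Goal, surface: str) -> None:
    """Raise if a caller is about to use a mismatched surface."""
    expected = pick_surface(goal)
    if surface != expected:
        raise ValueError(
            f"surface-purpose mismatch: {goal.value} should use "
            f"{expected}, got {surface}"
        )


check_surface(Goal.COMPANY_DISCOVERY, "/search")  # passes silently
```

A guard like check_surface could sit in front of each request so that a company-discovery query sent to /websets fails fast instead of returning person entities.
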
S6 - What changed

1. Wire findSimilar into buyside_hunt and pool_hunt Phase 2
2. Add Websets Phase 3 to buyside_hunt for structured contact enrichment
3. Sync sell_side_discovery template from TS to Python exa_client.py

Generated from 025__exa-pipeline-certification.md — do not edit this HTML directly.