**Phase:** 1 (Independent Draft) **Agent:** Quarterback (respawn) **Date:** 2026-05-07
---
| ID | Name | Owner | Depends On | Ship Window | Deliverable | Verification | Rollback | Success Signal | |---|---|---|---|---|---|---|---|---| | `SKILL-SYNC-01` | GitHub repo + Actions backbone | Ewing | none | Day 1 AM | `next-chapter-skills` repo with main branch protection; Actions workflow `on: push` that publishes a signed manifest (commit SHA, file list, tarball URL) to a Supabase `skill_sync_events` row | Push a no-op commit; row appears in Supabase within 30s | Disable workflow; symlink keeps working | Manifest row inserted on every push | | `SKILL-SYNC-02` | LaunchAgent receiver v1 | Ewing | 01 | Day 1 PM | `com.nextchapter.skill-sync.plist` polling Supabase every 30s; on new manifest, `git pull --ff-only` + `rsync` to `~/.claude/skills/` | Install on Ewing MacBook; commit a test skill; observe pull within 60s | `launchctl unload`; symlink still serves last good state | Pull latency p95 <60s | | `SKILL-SYNC-03` | Conflict-safe write gate | Ewing | 02 | Day 2 | Pre-commit hook: rejects writes to `~/.claude/skills/` outside repo; in-process lockfile during pull to prevent torn reads by Claude Desktop | Race test: edit during pull, verify no half-written file | Remove hook; revert to permissive | Zero torn-file errors in 48h | | `SKILL-SYNC-04` | Self-healing watchdog | Ewing | 03 | Day 3 | Second LaunchAgent runs every 5min: checks last-pull timestamp, fires Slack alert + force-resyncs if >5min stale; logs to `skill_sync_health` | Kill primary agent; watchdog detects + alerts within 5min | Disable watchdog; manual `git pull` works | Auto-recovery within 5min of breakage | | `SKILL-SYNC-05` | Operator fan-out (Mac mini, Charlie, Bear) | Charlie/Bear w/ Ewing remote | 04 | Day 4 | One-line bootstrap: `curl ... \| bash` installs LaunchAgent + clones repo + runs first sync. Replaces `setup-skills.sh` | Charlie runs on her MacBook; new skill commits show up in 60s on her machine | Operator falls back to manual `git pull` | Charlie + Bear receive next push within 60s | | `SKILL-SYNC-06` | Claude Desktop reconcile loop | Ewing | 04 | Day 5 | LaunchAgent triggers Desktop "soft reload" via filesystem signal file Claude Desktop watches at session start; documents the known limit (no live reload mid-session) | Open Desktop, push skill, start new session, confirm visible | Manual restart of Desktop | Skill visible in next Desktop session without manual install |
---
**Canary: Ewing's MacBook.** It's the only operator where the symlink already works AND where Ewing can debug in real time. Mac mini is tempting (always-on, headless = closest to production) but a broken sync there blocks the 7am sprint. Charlie/Bear can't debug — they'd report "it's broken" and stall.
**Cutover criterion to next operator (Mac mini):** 24 consecutive hours on MacBook with (a) every skill commit visible within 60s, (b) zero torn-file errors in `skill_sync_health`, (c) watchdog auto-recovered at least one simulated failure.
---
01 (GitHub + Actions backbone)
└── 02 (LaunchAgent receiver)
└── 03 (conflict-safe write gate)
└── 04 (self-healing watchdog)
├── 05 (operator fan-out: mini → Charlie → Bear)
└── 06 (Claude Desktop reconcile)
01 is the only true blocker. 02 unlocks the canary. 03+04 are quality gates that MUST land before 05 (we will not push half-baked sync to Charlie/Bear). 05 and 06 are parallelizable on Day 4–5.
---
**Push-sync breaks at 3am Sunday. Step-by-step:**
1. Open Terminal on MacBook. Run:
launchctl unload ~/Library/LaunchAgents/com.nextchapter.skill-sync.plist
launchctl unload ~/Library/LaunchAgents/com.nextchapter.skill-sync-watchdog.plist
2. Verify symlink still serves: `ls -la ~/.claude/skills/quarterback/SKILL.md` — must resolve. 3. If symlink broken, restore from repo: `cd ~/Github/next-chapter-os && git pull && ./skills/setup-skills.sh` 4. Roll forward NOT back. Find last-good commit:
cd ~/Github/next-chapter-skills && git log --oneline -10
git reset --hard <last-good-sha> # only if a bad skill commit caused the break
5. Notify Charlie/Bear in Slack `#sync-status`: "Sync paused. Skills frozen at <SHA>. Edit locally; do not commit until I clear." 6. Re-enable Monday AM: reverse step 1 (`launchctl load`). 7. Post-mortem: pull `skill_sync_health` rows from incident window into Supabase.
---
**Cutover protocol (executed once, Day 1 AM before SKILL-SYNC-01 ships):**
1. Freeze: announce 30-min skill-edit freeze in Slack. Ewing only. 2. Snapshot: `tar czf ~/skills-snapshot-$(date +%s).tar.gz ~/.claude/skills/` on every operator machine. 3. Reconcile: diff each operator's `~/.claude/skills/` against `next-chapter-os/skills/`; manually merge the 3–5 skills with operator-local edits (Charlie's Meetings tweaks, Bear's Deals tweaks). 4. Active worktrees: any skill currently being edited gets a feature branch in the new repo before unfreeze. 5. Unfreeze: flip symlink target from `next-chapter-os/skills` to new `next-chapter-skills` repo; same path, same setup script keeps working.
The 72 skills move once. After that, `git pull` is the only mutation path.
---
**Start: 2026-05-08 (Friday)**
- **Fri 5/08 — SKILL-SYNC-01 + 02 ship.** AM: stand up `next-chapter-skills` repo, GitHub Actions manifest publisher, Supabase `skill_sync_events` table. PM: install LaunchAgent receiver on Ewing's MacBook. End of day: first commit-to-MacBook round-trip measured at <60s. Migration cutover (§5) happens during AM freeze window. - **Sat 5/09 — Soak day.** No new code. Watch `skill_sync_events` and `skill_sync_health`. Hand-test 5 skill edits across the day. Goal: 24h canary criterion (§2) met by Saturday night. - **Sun 5/10 — SKILL-SYNC-03 (write gate) ships.** Pre-commit hook + lockfile. Race-test before bed. - **Mon 5/11 — SKILL-SYNC-04 (watchdog) ships.** Inject one simulated failure (kill primary agent). Confirm Slack alert + auto-recovery in <5min. - **Tue 5/12 — SKILL-SYNC-05 fan-out.** AM: cut over Mac mini (Ewing remote SSH). Mid-day: Charlie runs bootstrap one-liner on her MacBook during her shift. PM: Bear runs same on his. - **Wed 5/13 — SKILL-SYNC-06 Desktop reconcile** ships. End of week: every operator pull-latency p95 measured + dashboarded.
---
| Metric | Target | Alert | Lives In | |---|---|---|---| | **Sync latency** — time from `git push` to operator's `~/.claude/skills/` updated | p95 <60s | >300s p95 over rolling 15min → Slack `#sync-status` | Supabase `skill_sync_events` (one row per push) joined to `skill_sync_health` (one row per operator pull) | | **Operator coverage** — % of last 10 pushes that landed on all 4 operators | 100% | <100% on any push older than 5min → Slack ping Ewing | Supabase `skill_sync_health` view `operator_coverage_last_10` | | **Watchdog interventions** — count of auto-recoveries per day | <2/day (healthy); >0/day proves it works | >5/day → Ewing investigates root cause | Supabase `skill_sync_health` rows where `event = 'auto_recover'` |
All three render on a single mission-control card. No new dashboards.
---
**S1 — Finding.** The single highest-leverage move is shipping a Supabase-backed manifest table BEFORE any LaunchAgent. The manifest is the contract: it decouples "who pushes" from "who pulls" and gives us our measurement plane for free. Every other piece of architecture flows from that table existing.
**S2 — Blind spot.** I'm assuming Charlie and Bear can run a `curl | bash` bootstrap unaided. They probably can't on first try. The fan-out (SKILL-SYNC-05) needs a 60-second screencast attached or it stalls on Day 4. Flagging this for the writer agent.
**S3 — Pattern.** Run #007 (repo-consolidation) proved that one-shot migrations succeed when there's a single freeze window with a snapshot — and fail when operators try to migrate "gradually." Run #018 (intern-foundation-architecture) proved Charlie/Bear surfaces need explicit bootstrap scripts, not docs. This sequencing honors both lessons: one freeze, one snapshot, one bootstrap line per operator.
**S4 — Currency log.**
[
{"to": "data-architect", "exchange": "owe — sequencing depends on architect's substrate vote (a/b/c). My phases are written to be substrate-agnostic but I assumed GitHub repo wins (24/30 score). If architect revises, I refactor §1 in Phase 2."},
{"to": "writer", "exchange": "owe 3x — writer must produce: (1) Slack #sync-status incident-channel charter, (2) Charlie/Bear bootstrap one-liner with screencast script, (3) post-mortem template populated by §7's metrics. All three are critical-path for SKILL-SYNC-05."}
]
**S5 — Notebook entry.** The skills-sync problem is a sequencing problem, not a substrate problem. Whatever substrate wins (GitHub Actions, custom daemon, S3 polling), the same six phases apply: backbone first, canary second, gates third, watchdog fourth, fan-out fifth, Desktop reconcile last. Ship in that order or operators get burned. The canary must be Ewing's MacBook because it's the only machine where a broken sync is debuggable in real time. Mac mini is too production-critical; interns can't debug. One freeze, one snapshot, one bootstrap line — that's the whole rollout playbook.
**S6 — What changed about me.** I now lead with the manifest contract (the Supabase row) instead of the wire protocol — sequencing depends on what's measurable, not on what's clever.
---
Generated from 019__quarterback.md — do not edit this HTML directly.