Episode 13: Bridge System — The Remote Control Protocol
Source files:
bridge/ directory: 31 files, ~450KB total. Core: bridgeMain.ts (3,000 lines), replBridge.ts (2,407 lines), remoteBridgeCore.ts (1,009 lines), replBridgeTransport.ts (371 lines), sessionRunner.ts (551 lines), types.ts (263 lines)
One-liner: The Bridge is Claude Code's remote control wire: a poll-dispatch-heartbeat loop that lets users type on claude.ai while their local machine executes code, with two transport generations (v1 WebSocket/POST, v2 SSE/CCRClient), crash recovery pointers, JWT refresh scheduling, and a 32-session capacity manager.
Architecture Overview
The Two Bridge Modes
1. Standalone Bridge (claude remote-control)
bridgeMain.ts (3,000 lines) — the long-running server mode:
User runs: claude remote-control
→ registerBridgeEnvironment() → get environment_id + secret
→ Poll loop: pollForWork() every N ms
→ On work: spawn child `claude --print --sdk-url ...`
→ Child streams NDJSON → bridge forwards to server
→ On done: stopWork() → archive session → return to poll
Spawn Modes:
| Mode | Flag | Behavior |
|---|---|---|
| single-session | (default) | One session, bridge exits when it ends |
| worktree | --worktree | Each session gets isolated git worktree |
| same-dir | --spawn, --capacity | All sessions share cwd (can stomp) |
Multi-session capacity: Up to 32 concurrent sessions (SPAWN_SESSIONS_DEFAULT = 32), gated by tengu_ccr_bridge_multi_session.
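A minimal sketch of how a capacity gate of this kind might track slots. The class `SessionSlots` and its methods are hypothetical names for illustration; only the default of 32 comes from the source.

```typescript
// Hypothetical slot tracker; only the default of 32 comes from the source.
const SPAWN_SESSIONS_DEFAULT = 32

class SessionSlots {
  private readonly active = new Set<string>()
  private readonly capacity: number

  constructor(capacity: number = SPAWN_SESSIONS_DEFAULT) {
    this.capacity = capacity
  }

  /** Claims a slot for sessionId; returns false when every slot is taken. */
  tryAcquire(sessionId: string): boolean {
    if (this.active.has(sessionId)) return true // already holds a slot
    if (this.active.size >= this.capacity) return false
    this.active.add(sessionId)
    return true
  }

  /** Frees the slot when a session ends (no-op for unknown ids). */
  release(sessionId: string): void {
    this.active.delete(sessionId)
  }

  /** True when no further session can be accepted. */
  get atCapacity(): boolean {
    return this.active.size >= this.capacity
  }
}
```

A poll loop could consult `atCapacity` before asking the server for more work and call `release` from the child's exit handler.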
2. REPL Bridge (/remote-control command in interactive mode)
replBridge.ts (2,407 lines) — in-process bridge for interactive sessions:
User types /remote-control in REPL
→ initBridgeCore() → register environment → create session
→ Poll loop: wait for web user to type
→ Forward messages bidirectionally via transport
→ History flush: send existing conversation to web UI
Transport Generations
v1: HybridTransport (WebSocket + POST)
// Session-Ingress layer
// Reads: WebSocket connection to session-ingress URL
// Writes: HTTP POST to session-ingress URL
// Auth: OAuth access token
v2: SSE + CCRClient
// CCR (Claude Code Runtime) layer
// Reads: SSETransport → GET /worker/events/stream
// Writes: CCRClient → POST /worker/events (SerialBatchEventUploader)
// Auth: JWT with session_id claim (NOT OAuth)
// Heartbeat: PUT /worker (CCRClient built-in, 20s default)
The v2 path adds:
- Worker registration: registerWorker() → server assigns epoch number
- Epoch-based conflict resolution: 409 on stale epoch → close transport, re-poll
- Delivery tracking: reportDelivery('received' | 'processing' | 'processed')
- State reporting: reportState('running' | 'idle' | 'requires_action')
v3: Env-Less Bridge (remoteBridgeCore.ts)
// Direct OAuth → worker_jwt exchange, no Environments API
// 1. POST /v1/code/sessions → session.id
// 2. POST /v1/code/sessions/{id}/bridge → {worker_jwt, expires_in, worker_epoch}
// 3. createV2ReplTransport → SSE + CCRClient
// No register/poll/ack/stop/heartbeat/deregister lifecycle
Gated by tengu_bridge_repl_v2. Eliminates the entire poll-dispatch layer for REPL sessions.
The Poll-Dispatch Loop
bridgeMain.ts implements the work-polling loop, whose interval and heartbeat behavior depend on how many session slots are in use:
Poll Interval Configuration (GrowthBook-driven)
// Live-tunable via GrowthBook (refreshes every 5 min)
const pollConfig = getPollIntervalConfig()
// Intervals:
// not_at_capacity: fast polling for new work
// partial_capacity: moderate polling (some sessions active)
// at_capacity: slow/heartbeat-only (all slots full)
Heartbeat Modes
When at capacity, the bridge can operate in two heartbeat patterns:
- Non-exclusive heartbeat (non_exclusive_heartbeat_interval_ms > 0): Heartbeat loop runs independently of polling, with optional at-capacity polling at a slower rate
- At-capacity polling only (at_capacity > 0): Slow poll doubles as liveness signal
- At-capacity wake: When a session completes, capacityWake.wake() interrupts the sleep so the bridge can immediately accept new work
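The interval and heartbeat choices above could be combined roughly as follows. This is a sketch under stated assumptions: the field names mirror the config keys mentioned in the text, but the `PollIntervalConfig` shape, the `planLoop` helper, and the decision to run no separate heartbeat below capacity are all hypothetical.

```typescript
// Field names mirror the config keys in the text; the shape itself is assumed.
interface PollIntervalConfig {
  not_at_capacity: number // ms between polls when no session is active
  partial_capacity: number // ms between polls when some slots are in use
  at_capacity: number // ms between polls when full; 0 = do not poll
  non_exclusive_heartbeat_interval_ms: number // 0 = no independent heartbeat
}

interface LoopPlan {
  pollMs: number | null // null = polling paused
  heartbeatMs: number | null // null = no independent heartbeat loop
}

// Hypothetical helper: picks the poll/heartbeat cadence for the current load.
function planLoop(cfg: PollIntervalConfig, active: number, capacity: number): LoopPlan {
  // Assumption: below capacity, polling alone is treated as the liveness signal.
  if (active === 0) return { pollMs: cfg.not_at_capacity, heartbeatMs: null }
  if (active < capacity) return { pollMs: cfg.partial_capacity, heartbeatMs: null }

  // At capacity: an independent heartbeat, a slow poll, or both.
  return {
    pollMs: cfg.at_capacity > 0 ? cfg.at_capacity : null,
    heartbeatMs:
      cfg.non_exclusive_heartbeat_interval_ms > 0
        ? cfg.non_exclusive_heartbeat_interval_ms
        : null,
  }
}
```

The numeric values fed into such a helper would come from the live-tunable config; none are specified in the source.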
Error Recovery
// Source location: src/bridge/bridgeMain.ts:320-340
const DEFAULT_BACKOFF: BackoffConfig = {
connInitialMs: 2_000,
connCapMs: 120_000, // 2 minutes max backoff
connGiveUpMs: 600_000, // 10 minutes → give up
generalInitialMs: 500,
generalCapMs: 30_000,
generalGiveUpMs: 600_000,
}
The bridge distinguishes connection errors (network down) from general errors (server 500s) and applies independent exponential backoff with separate give-up timers.
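To make the two independent tracks concrete, here is a small sketch of capped exponential backoff with a give-up window. The doubling factor and the helper names `backoffDelayMs` and `shouldGiveUp` are assumptions; the source states only the initial, cap, and give-up values.

```typescript
// One backoff track; the bridge is described as keeping two (connection, general).
interface BackoffTrack {
  initialMs: number
  capMs: number
  giveUpMs: number
}

// Values taken from DEFAULT_BACKOFF above.
const connTrack: BackoffTrack = { initialMs: 2_000, capMs: 120_000, giveUpMs: 600_000 }
const generalTrack: BackoffTrack = { initialMs: 500, capMs: 30_000, giveUpMs: 600_000 }

// Delay before retry number `attempt` (0-based). Assumes a doubling factor
// and no jitter; the real factor is not given in the source.
function backoffDelayMs(track: BackoffTrack, attempt: number): number {
  return Math.min(track.capMs, track.initialMs * 2 ** attempt)
}

// True once errors on this track have persisted for the whole give-up window.
function shouldGiveUp(track: BackoffTrack, firstFailureAtMs: number, nowMs: number): boolean {
  return nowMs - firstFailureAtMs >= track.giveUpMs
}
```

Because each track has its own attempt counter and first-failure timestamp, a burst of server errors would not inflate the delay applied to a later network outage, and vice versa.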
Session Runner
sessionRunner.ts (551 lines) wraps child process management:
// Child spawn command:
claude --print \
--sdk-url <session_url> \
--session-id <id> \
--input-format stream-json \
--output-format stream-json \
--replay-user-messages
Activity Tracking
The bridge parses child stdout NDJSON to extract real-time activity summaries:
const TOOL_VERBS: Record<string, string> = {
Read: 'Reading', Write: 'Writing', Edit: 'Editing',
Bash: 'Running', Glob: 'Searching', Grep: 'Searching',
WebFetch: 'Fetching', Task: 'Running task',
}
Token Refresh via Stdin
// Send fresh JWT to child process via stdin
handle.writeStdin(JSON.stringify({
type: 'update_environment_variables',
variables: { CLAUDE_CODE_SESSION_ACCESS_TOKEN: token },
}) + '\n')
The child's StructuredIO handles update_environment_variables by setting process.env directly, so getSessionIngressAuthToken() picks up the new token on the next API call.
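On the receiving side, handling that stdin line might look like the sketch below. The function `applyStdinLine` is hypothetical and is not the actual StructuredIO code; it takes the environment object as a parameter so it can be exercised without touching the real `process.env`.

```typescript
// Hypothetical handler for one NDJSON line arriving on the child's stdin.
// Returns true when the line was an update_environment_variables message.
function applyStdinLine(line: string, env: Record<string, string | undefined>): boolean {
  let msg: unknown
  try {
    msg = JSON.parse(line) // trailing '\n' is tolerated by JSON.parse
  } catch {
    return false // not JSON: ignore
  }
  if (typeof msg !== 'object' || msg === null) return false

  const { type, variables } = msg as { type?: unknown; variables?: unknown }
  if (type !== 'update_environment_variables') return false
  if (typeof variables !== 'object' || variables === null) return false

  // Copy only string values into the environment.
  for (const [key, value] of Object.entries(variables)) {
    if (typeof value === 'string') env[key] = value
  }
  return true
}
```

In the child process the second argument would be `process.env`, which is why a later token lookup sees the refreshed value without any restart.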
JWT Lifecycle & Crash Recovery
Token Refresh Scheduling
const tokenRefresh = createTokenRefreshScheduler({
refreshBufferMs: 5 * 60_000, // 5 min before expiry
getAccessToken: async () => { /* OAuth refresh */ },
onRefresh: (sessionId, oauthToken) => {
// v1: deliver OAuth token directly to child
// v2: call reconnectSession() to trigger server re-dispatch
},
})
v2's JWT approach introduces a subtle difference: each /bridge call bumps the server-side epoch. A JWT-only swap would leave the old CCRClient heartbeating with a stale epoch (→ 409 within 20s), so the entire transport must be rebuilt.
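The proactive part of the scheduler reduces to one piece of arithmetic: fire the refresh `refreshBufferMs` before the token expires. A sketch, with `refreshDelayMs` as a hypothetical helper name:

```typescript
// Matches refreshBufferMs in the snippet above: refresh 5 minutes before expiry.
const REFRESH_BUFFER_MS = 5 * 60_000

// Hypothetical helper: how long to wait before refreshing a token that
// expires at `expiresAtMs`. Clamps to 0 when already inside the buffer window.
function refreshDelayMs(
  expiresAtMs: number,
  nowMs: number,
  bufferMs: number = REFRESH_BUFFER_MS,
): number {
  return Math.max(0, expiresAtMs - bufferMs - nowMs)
}
```

For a token with a one-hour lifetime this yields a 55-minute timer; a token already within five minutes of expiry is refreshed immediately.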
Crash Recovery Pointer
// Written after session creation (crash-recovery trail)
await writeBridgePointer(dir, {
sessionId: currentSessionId,
environmentId,
source: 'repl', // vs 'standalone'
})
// Cleared on clean teardown (unless perpetual mode)
// Perpetual mode: pointer survives clean exits for daemon restart
On restart, the bridge reads the pointer and attempts reconnection:
- Strategy 1: Idempotent re-register with reuseEnvironmentId → if same env returned, reconnectSession() re-queues existing session
- Strategy 2: If env expired (laptop slept >4h), archive old session → create fresh session on new environment
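The choice between the two strategies can be written as a pure decision over the pointer and the environment id the server returns on re-registration. The types and the `planRecovery` function below are hypothetical; only the pointer fields come from the snippet above.

```typescript
// Pointer fields as written by writeBridgePointer in the snippet above.
interface BridgePointer {
  sessionId: string
  environmentId: string
  source: 'repl' | 'standalone'
}

// Hypothetical result type describing what the bridge should do next.
type RecoveryPlan =
  | { kind: 'fresh' } // no pointer: nothing to recover
  | { kind: 'reconnect'; sessionId: string; environmentId: string } // Strategy 1
  | { kind: 'recreate'; archiveSessionId: string; environmentId: string } // Strategy 2

// `registeredEnvId` is the id the server returned after re-registering
// with the pointer's environment id as the reuse hint.
function planRecovery(pointer: BridgePointer | null, registeredEnvId: string): RecoveryPlan {
  if (pointer === null) return { kind: 'fresh' }
  if (registeredEnvId === pointer.environmentId) {
    // Same environment came back: the old session can be re-queued.
    return { kind: 'reconnect', sessionId: pointer.sessionId, environmentId: registeredEnvId }
  }
  // The old environment is gone: archive its session and start over.
  return { kind: 'recreate', archiveSessionId: pointer.sessionId, environmentId: registeredEnvId }
}
```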
Environment Reconnection Budget
const MAX_ENVIRONMENT_RECREATIONS = 3
// After 3 consecutive failures, give up
Message Protocol
Outbound (CLI → Server)
Messages flow as NDJSON lines on the child's stdout, parsed by the bridge and forwarded via the transport layer:
| Message Type | Meaning |
|---|---|
| assistant | Model response (text blocks, tool_use blocks) |
| result | Turn completed (success/error) |
| control_request | Permission prompt (can_use_tool) |
Inbound (Server → CLI)
// handleIngressMessage() processes incoming messages
// - Echo dedup: recentPostedUUIDs (BoundedUUIDSet, cap=2000)
// - Inbound dedup: recentInboundUUIDs (BoundedUUIDSet, cap=2000)
// - Control requests: model changes, permission mode, interrupt
FlushGate
// Queue live writes during history flush to preserve ordering
const flushGate = new FlushGate<Message>()
// Start: queue new messages
// End: drain queued → send, then resume direct writes
// Drop: discard queue (transport dead)
The Permission Pipeline
When the child CLI needs tool approval:
Child stdout → { type: 'control_request', subtype: 'can_use_tool', tool_name, input }
→ Bridge forwards via transport → Server → claude.ai shows approval UI
→ User clicks approve/deny
→ Server sends control_response → Bridge transport → child stdin
→ reportState('running') // Clear the "waiting for input" indicator
The bridge also handles server-initiated control requests:
- set_model: Change the model mid-session
- set_max_thinking_tokens: Adjust thinking budget
- set_permission_mode: Switch between permission modes (auto, bypassPermissions)
- api_key_interrupt: Interrupt the current turn
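One way to picture the handling of these server-initiated requests is a discriminated union with an exhaustive switch. The subtype names are the ones listed above; the payload field names, the `SessionSettings` shape, and `applyControlRequest` are assumptions made for illustration.

```typescript
// Hypothetical session state touched by control requests.
interface SessionSettings {
  model: string
  maxThinkingTokens: number
  permissionMode: string
  interrupted: boolean
}

// Subtypes from the list above; payload field names are assumed.
type ControlRequest =
  | { subtype: 'set_model'; model: string }
  | { subtype: 'set_max_thinking_tokens'; max_thinking_tokens: number }
  | { subtype: 'set_permission_mode'; mode: string }
  | { subtype: 'api_key_interrupt' }

// Returns a new settings object; the input is left unmodified.
function applyControlRequest(settings: SessionSettings, req: ControlRequest): SessionSettings {
  switch (req.subtype) {
    case 'set_model':
      return { ...settings, model: req.model }
    case 'set_max_thinking_tokens':
      return { ...settings, maxThinkingTokens: req.max_thinking_tokens }
    case 'set_permission_mode':
      return { ...settings, permissionMode: req.mode }
    case 'api_key_interrupt':
      return { ...settings, interrupted: true }
  }
}
```

With a union like this, adding a new subtype without a matching case becomes a compile-time error under strict settings rather than a silently dropped request.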
Transferable Design Patterns
The following patterns can be directly applied to other remote-control or distributed agent systems.
Pattern 1: Epoch-Based Conflict Resolution
Scenario: Multiple workers may compete for the same session after reconnection.
Practice: Assign a monotonically increasing epoch on registration; reject stale-epoch requests with 409.
Claude Code application: v2 transport uses worker_epoch; a 409 triggers transport teardown and re-poll.
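A toy server-side model of the pattern, to show why a stale worker is rejected as soon as a newer one registers. `EpochRegistry` is a hypothetical name, and the numeric return values stand in for HTTP status codes.

```typescript
// Toy model of per-session epoch tracking on the server side (hypothetical).
class EpochRegistry {
  private current = 0

  /** A worker registers and is handed the next epoch. */
  registerWorker(): number {
    this.current += 1
    return this.current
  }

  /** Status for a request carrying `epoch`: 200 if current, 409 if stale. */
  check(epoch: number): 200 | 409 {
    return epoch === this.current ? 200 : 409
  }
}
```

A worker that receives 409 knows another registration has superseded it, so the safe reaction is the one described above: tear down the transport and go back to polling rather than retry with the same epoch.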
Pattern 2: CapacityWake (AbortController-Based Sleep Interrupt)
Scenario: A bridge sleeps during at-capacity wait but needs to react instantly when a slot opens.
Practice: Use an AbortController signal to interrupt the sleep timer.
Claude Code application: capacityWake.wake() interrupts the poll sleep so the bridge accepts new work immediately.
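A minimal sketch of the technique using the standard `AbortController` API. The helper names `createCapacityWake` and `sleep` are illustrative and not taken from the source; the real capacityWake may be structured differently.

```typescript
// Hypothetical wake handle: each wake() aborts the current signal and arms a new one.
function createCapacityWake() {
  let controller = new AbortController()
  return {
    /** Signal for the sleep that is about to start. */
    signal(): AbortSignal {
      return controller.signal
    },
    /** Interrupts any sleep using the current signal, then re-arms. */
    wake(): void {
      controller.abort()
      controller = new AbortController()
    },
  }
}

// Resolves 'timeout' after ms, or 'woken' as soon as the signal aborts.
function sleep(ms: number, signal: AbortSignal): Promise<'timeout' | 'woken'> {
  return new Promise(resolve => {
    if (signal.aborted) {
      resolve('woken')
      return
    }
    const onAbort = () => {
      clearTimeout(timer)
      resolve('woken')
    }
    const timer = setTimeout(() => {
      signal.removeEventListener('abort', onAbort)
      resolve('timeout')
    }, ms)
    signal.addEventListener('abort', onAbort, { once: true })
  })
}
```

A poll loop would call `await sleep(pollMs, wake.signal())` while at capacity, and the session-exit handler would call `wake.wake()`. Re-arming inside `wake()` means the next sleep starts with a fresh, un-aborted signal.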
Pattern 3: Bootstrap-Isolation via Callback Injection
Scenario: A subsystem (bridge) must avoid importing the main module tree to keep its bundle small.
Practice: Inject dependencies as callbacks instead of direct imports.
Claude Code application: createSession is injected as (opts) => Promise<string | null> to avoid pulling the entire REPL tree into the Agent SDK bundle.
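The shape of this injection can be sketched in a few lines. The callback signature follows the one quoted above; the `opts` fields, `BridgeDeps`, and `startBridge` are hypothetical.

```typescript
// Signature from the text; the fields inside opts are assumed.
type CreateSession = (opts: { title: string }) => Promise<string | null>

// Everything the bridge core needs from the outside world arrives here,
// so this module has no import of the REPL tree.
interface BridgeDeps {
  createSession: CreateSession
}

// Hypothetical entry point: asks the injected callback for a session id.
async function startBridge(deps: BridgeDeps, title: string): Promise<string> {
  const sessionId = await deps.createSession({ title })
  if (sessionId === null) throw new Error('session creation failed')
  return sessionId
}
```

The caller that already owns the heavy modules supplies the real implementation, while tests or an SDK build can pass a lightweight stub such as `{ createSession: async () => 'sess_test' }`.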
Component Summary
| Component | Lines | Role |
|---|---|---|
| bridgeMain.ts | 3,000 | Standalone bridge: poll loop, multi-session, worktree, backoff |
| replBridge.ts | 2,407 | REPL bridge: env registration, session create, transport management |
| remoteBridgeCore.ts | 1,009 | Env-less bridge: direct OAuth→JWT, no poll-dispatch layer |
| sessionRunner.ts | 551 | Child process spawning, NDJSON parsing, activity tracking |
| replBridgeTransport.ts | 371 | Transport abstraction: v1 (WS+POST) vs v2 (SSE+CCRClient) |
| types.ts | 263 | Protocol types: WorkResponse, SessionHandle, BridgeConfig |
| bridgeApi.ts | ~500 | HTTP client: register, poll, ack, stop, heartbeat, reconnect |
| bridgeMessaging.ts | ~430 | Message handling: ingress/egress, dedup, control requests |
| bridgeUI.ts | ~460 | Live terminal display: status, QR codes, multi-session bullets |
| jwtUtils.ts | ~260 | Token refresh scheduler: proactive 5min-before-expiry refresh |
| workSecret.ts | ~130 | Work secret decoding: JWT extraction, SDK URL construction |
Total bridge surface: ~11,700 lines of protocol orchestration.