gadget/docs/socket-protocol.md
2026-05-11 20:27:24 -04:00

Gadget Code Socket Protocol Reference

This document serves as a "Cheat Sheet" for AI agents and developers working on the Gadget Code real-time messaging system.

1. Components & Connections

Component         Role                       Protocol           Auth Method
gadget-code:web   Hub / Router / Server      Socket.IO Server   N/A
gadget-code:ide   Frontend Control Surface   Socket.IO Client   JWT Token
gadget-drone      Worker / AWL Runner        Socket.IO Client   Drone Registration ID

2. Event Map Overview

Defined in packages/api/src/messages/socket.ts.

IDE -> Web (Client to Server)

  • requestSessionLock: Request an exclusive lock on a drone for a project session.
  • requestWorkspaceMode: Request a mode change (Idle, User, Agent).
  • submitPrompt: Submit a user prompt for agent processing.

Drone -> Web (Client to Server)

  • thinking: Stream reasoning/thought process text.
  • response: Stream natural language response text.
  • toolCall: Emit a specific tool execution event with result.
  • workOrderComplete: Signal that a prompt processing turn is finished.
  • requestCrashRecovery: Inbound from drone on restart if it finds a stalled work order.
  • requestTermination: Acknowledgment from drone that termination request was received.

Web -> Drone (Server to Client)

  • processWorkOrder: Command to start processing a specific prompt/turn.
  • crashRecoveryResponse: Command to discard or retry a stalled work order.
  • requestTermination: Command to immediately terminate the drone process.

Web -> IDE (Server to Client)

  • sessionUpdated: Notify the IDE that a chat session property has changed (e.g. auto-generated name).

3. Core Sequences & Routing

3.1 Prompt Submission Flow

  1. IDE emits submitPrompt(content).
  2. Web (CodeSession.ts):
    • Creates a ChatTurn document (status: processing).
    • Increments the chat session's stats.turnCount.
    • Finds the target DroneSession.
    • Caches the updated session and signals the IDE to enter Processing state.
    • Emits processWorkOrder to the Drone.
    • On the first prompt (while the name is still the default), calls the AI API to auto-generate a session name.
    • Emits sessionUpdated({ name }) to IDE if the name changed.
  3. Drone (gadget-drone.ts):
    • Writes a local .gadget/work-order.json cache (for crash recovery).
    • Calls AgentService.process().
    • Emits streaming events back to Web.
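The web-side steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual CodeSession.ts code: the `Outbound` record, the `outbox` array, the `generateName` callback (a synchronous stand-in for the AI API call), and the `DEFAULT_SESSION_NAME` value are all assumptions made for the example.

```typescript
// Minimal in-memory sketch of the web-side submitPrompt handling.
// Only the event names come from the protocol; everything else is hypothetical.
type TurnStatus = "processing" | "finished" | "error";

interface ChatTurn { id: string; prompt: string; status: TurnStatus }
interface ChatSession { id: string; name: string; stats: { turnCount: number } }
interface Outbound { to: "ide" | "drone"; event: string; payload: unknown }

// Assumed default name; the real default is not documented here.
const DEFAULT_SESSION_NAME = "New Session";

function handleSubmitPrompt(
  session: ChatSession,
  prompt: string,
  generateName: (prompt: string) => string, // stand-in for the AI API call
  outbox: Outbound[],                        // stand-in for socket emits
): ChatTurn {
  // Create the turn in "processing" state and bump the session counter.
  const turn: ChatTurn = {
    id: `turn-${session.stats.turnCount + 1}`,
    prompt,
    status: "processing",
  };
  session.stats.turnCount += 1;

  // Hand the turn to the drone.
  outbox.push({ to: "drone", event: "processWorkOrder", payload: turn });

  // Auto-name the session only while it still carries the default name.
  if (session.name === DEFAULT_SESSION_NAME) {
    const name = generateName(prompt);
    if (name !== session.name) {
      session.name = name;
      outbox.push({ to: "ide", event: "sessionUpdated", payload: { name } });
    }
  }
  return turn;
}
```

In this sketch a second prompt on the same session produces only a `processWorkOrder` entry, because the name no longer matches the default.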

3.2 Result Streaming Flow

  1. Drone emits thinking(content), response(content), or toolCall(callId, name, params, response).
  2. Web (DroneSession.ts):
    • Locates the associated CodeSession via SocketService.getCodeSessionByChatSessionId().
    • Updates the ChatTurn document in MongoDB incrementally.
    • Forwards the event to the IDE.
  3. IDE: Updates the UI in real-time.
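As a rough illustration of the locate, update, forward sequence above, the sketch below keeps everything in memory. The `StreamRouter` class, its `linkIde` method, and the `TurnRecord` shape are hypothetical stand-ins for DroneSession, the SocketService lookup, and the ChatTurn document in MongoDB.

```typescript
// Hypothetical in-memory stand-in for the DroneSession streaming path.
type StreamEvent = "thinking" | "response";
interface TurnRecord { thinking: string; response: string }
type Deliver = (event: StreamEvent, text: string) => void;

class StreamRouter {
  // chatSessionId -> callback that delivers an event to the IDE socket
  private ideByChatSession = new Map<string, Deliver>();
  // turnId -> accumulated text (stands in for the ChatTurn document)
  readonly turns = new Map<string, TurnRecord>();

  linkIde(chatSessionId: string, deliver: Deliver): void {
    this.ideByChatSession.set(chatSessionId, deliver);
  }

  onStream(chatSessionId: string, turnId: string, event: StreamEvent, text: string): void {
    // 1. Locate the return path to the IDE; throw when there is none.
    const deliver = this.ideByChatSession.get(chatSessionId);
    if (!deliver) throw new Error(`no CodeSession for chat session ${chatSessionId}`);

    // 2. Update the stored turn incrementally.
    const turn = this.turns.get(turnId) ?? { thinking: "", response: "" };
    turn[event] += text;
    this.turns.set(turnId, turn);

    // 3. Forward the chunk to the IDE.
    deliver(event, text);
  }
}
```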

3.3 Work Order Completion

  1. Drone emits workOrderComplete(turnId, success, message).
  2. Web (DroneSession.ts):
    • Sets ChatTurn status to finished or error.
    • Forwards event to IDE.
    • Clears currentTurnId from the drone session.

3.4 Drone Termination Flow

  1. User clicks "Terminate" button in Drone Manager UI.
  2. IDE calls POST /api/v1/drone/registration/:id/terminate.
  3. Web (DroneService.ts):
    • Checks if drone is already offline → returns error if so.
    • Looks up DroneSession via SocketService.getDroneSession().
    • If drone not connected → marks as offline immediately, returns success.
    • Emits requestTermination to drone socket with callback.
    • Starts 10-second timeout.
  4. Web (DroneSession.ts):
    • Receives requestTermination event.
    • Logs the termination request.
    • Forwards requestTermination to drone socket (passthrough).
  5. Drone (gadget-drone.ts):
    • Receives requestTermination from platform.
    • Calls callback with success: true.
    • Sends SIGINT to self, triggering graceful shutdown.
    • Updates status to Offline during shutdown.
  6. Web (DroneService.ts):
    • If the drone accepts termination, polls the DB every 500 ms until the status is Offline.
    • When the drone goes offline, resolves with success.
    • If the 10-second timeout expires, forces the status to Offline and resolves with success.
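The decision points in steps 3 and 6 can be sketched as two small functions. This is an illustration under stated assumptions, not the DroneService.ts code: the `"Online"` status value, the `Precheck` and `WaitResult` types, and both function names are invented here, and the wait is simulated against a fake clock so the example runs synchronously (real code would sleep between polls).

```typescript
// "Offline" comes from the flow above; "Online" is an assumed counterpart.
type DroneStatus = "Online" | "Offline";

type Precheck =
  | { kind: "reject"; reason: string }
  | { kind: "markOffline" }
  | { kind: "emitRequestTermination" };

// Step 3: decide what to do before contacting the drone.
function precheckTermination(status: DroneStatus, socketConnected: boolean): Precheck {
  if (status === "Offline") return { kind: "reject", reason: "drone is already offline" };
  if (!socketConnected) return { kind: "markOffline" };
  return { kind: "emitRequestTermination" };
}

interface WaitResult { forced: boolean; waitedMs: number }

// Step 6: poll every pollMs until the drone is Offline or the timeout expires.
// statusAt(elapsedMs) stands in for a DB read at that point in time.
function simulateOfflineWait(
  statusAt: (elapsedMs: number) => DroneStatus,
  pollMs: number = 500,
  timeoutMs: number = 10000,
): WaitResult {
  for (let elapsed = 0; elapsed < timeoutMs; elapsed += pollMs) {
    if (statusAt(elapsed) === "Offline") return { forced: false, waitedMs: elapsed };
  }
  // Timeout: the caller forces the status to Offline and still reports success.
  return { forced: true, waitedMs: timeoutMs };
}
```

Both outcomes of the wait resolve as success in the flow above; the `forced` flag only records whether the Offline status had to be written by the server.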

4. Message Signatures (TS Reference)

IDE -> Web

type RequestSessionLockMessage = (
  registration: IDroneRegistration,
  project: IProject,
  chatSession: IChatSession,
  cb: (success: boolean, chatSessionId: string) => void
) => void;

type SubmitPromptMessage = (prompt: string) => void;

Web -> Drone

type ProcessWorkOrderMessage = (
  registration: IDroneRegistration,
  project: IProject,
  chatSession: IChatSession,
  turn: IChatTurn,
  cb: (success: boolean, message?: string) => void
) => void;
type RequestTerminationMessage = (
  cb: (success: boolean) => void
) => void;

Web -> IDE

type SessionUpdatedMessage = (
  updates: Partial<IChatSession>
) => void;

Drone -> Web (Streaming)

type ThinkingMessage = (content: string) => void;
type ResponseMessage = (content: string) => void;
type ToolCallMessage = (
  callId: string,
  name: string,
  params: string,   // JSON.stringify
  response: string  // JSON.stringify
) => void;
type WorkOrderCompleteMessage = (
  workOrderId: string,
  success: boolean,
  message?: string
) => void;
type RequestTerminationMessage = (
  cb: (success: boolean) => void
) => void;

5. Session Implementation Guide (Web Server)

The web server (gadget-code:web) implements two wrapper classes in src/lib/:

CodeSession.ts

Manages the IDE socket.

  • Logic: Maps User ID -> Socket ID.
  • Routing: When an IDE sends a message, CodeSession finds the selected drone's DroneSession and forwards the command.

DroneSession.ts

Manages the Drone socket.

  • Logic: Maps Drone Registration ID -> Socket ID.
  • Routing: When a drone streams, DroneSession looks up the chatSessionId in the SocketService index to find the return path to the IDE.
  • Session Lookup: SocketService maintains a droneRegistrationIndex Map that maps registration._id -> DroneSession for efficient lookup by registration ID.

Session Indexing Architecture

The SocketService maintains multiple indexes for efficient session lookup:

  1. droneSessions: Map<socket.id, DroneSession> - Primary storage by socket ID
  2. droneRegistrationIndex: Map<registration._id, DroneSession> - Lookup by drone registration
  3. codeSessions: Map<socket.id, CodeSession> - Primary storage by socket ID
  4. codeSessionUserIndex: Map<user._id, CodeSession> - Lookup by user ID
  5. chatSessionIndex: Map<chatSessionId, CodeSession> - Reverse lookup from chat session to IDE

All indexes are kept in sync during connection and disconnection.
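One way to keep the five maps consistent is to route every add and remove through a single class, as in the sketch below. The map names mirror the list above; the `SessionIndex` class, the `CodeSessionLike` and `DroneSessionLike` shapes, and the method signatures are assumptions for illustration and may not match the real SocketService.

```typescript
// Hypothetical, simplified session shapes.
interface CodeSessionLike { socketId: string; userId: string }
interface DroneSessionLike { socketId: string; registrationId: string }

class SessionIndex {
  private droneSessions = new Map<string, DroneSessionLike>();
  private droneRegistrationIndex = new Map<string, DroneSessionLike>();
  private codeSessions = new Map<string, CodeSessionLike>();
  private codeSessionUserIndex = new Map<string, CodeSessionLike>();
  private chatSessionIndex = new Map<string, CodeSessionLike>();

  addDrone(session: DroneSessionLike): void {
    this.droneSessions.set(session.socketId, session);
    this.droneRegistrationIndex.set(session.registrationId, session);
  }

  addCode(session: CodeSessionLike): void {
    this.codeSessions.set(session.socketId, session);
    this.codeSessionUserIndex.set(session.userId, session);
  }

  linkChatSession(chatSessionId: string, socketId: string): void {
    const session = this.codeSessions.get(socketId);
    if (!session) throw new Error(`unknown IDE socket ${socketId}`);
    this.chatSessionIndex.set(chatSessionId, session);
  }

  // On disconnect, remove the session from every index that points at it.
  removeCode(socketId: string): void {
    const session = this.codeSessions.get(socketId);
    if (!session) return;
    this.codeSessions.delete(socketId);
    if (this.codeSessionUserIndex.get(session.userId) === session) {
      this.codeSessionUserIndex.delete(session.userId);
    }
    this.chatSessionIndex.forEach((value, chatSessionId) => {
      if (value === session) this.chatSessionIndex.delete(chatSessionId);
    });
  }

  removeDrone(socketId: string): void {
    const session = this.droneSessions.get(socketId);
    if (!session) return;
    this.droneSessions.delete(socketId);
    if (this.droneRegistrationIndex.get(session.registrationId) === session) {
      this.droneRegistrationIndex.delete(session.registrationId);
    }
  }

  getDroneSession(registrationId: string): DroneSessionLike | undefined {
    return this.droneRegistrationIndex.get(registrationId);
  }

  getCodeSessionByChatSessionId(chatSessionId: string): CodeSessionLike {
    const session = this.chatSessionIndex.get(chatSessionId);
    if (!session) throw new Error(`no CodeSession for chat session ${chatSessionId}`);
    return session;
  }
}
```

The identity checks in the remove methods guard against a newer session for the same user or registration being evicted when an older socket disconnects late.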


6. Workspace Crash Recovery

  1. Drone starts -> checks for .gadget/work-order.json.
  2. If found, emits requestCrashRecovery({ workspaceId, turnId, chatSessionId }).
  3. Web (DroneSession.ts):
    • Checks DB for ChatTurn status.
    • If turn is already finished, responds with { action: "discard" }.
    • If turn is processing, responds with { action: "retry" } and schedules a new processWorkOrder after a delay.
  4. Drone: Deletes local cache upon receiving any crashRecoveryResponse.
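The web-side decision in step 3 reduces to a small function. The sketch below is illustrative only: the function name and types are invented, and the handling of a missing turn or an `"error"` turn is an assumption, since only the finished and processing cases are documented above.

```typescript
type TurnStatus = "processing" | "finished" | "error";
type CrashRecoveryResponse = { action: "discard" } | { action: "retry" };

// status is undefined when no ChatTurn document is found for the turn ID.
function decideCrashRecovery(status: TurnStatus | undefined): CrashRecoveryResponse {
  // Documented: a turn still marked as processing is retried.
  if (status === "processing") return { action: "retry" };
  // Documented: a finished turn is discarded.
  // Assumption: missing turns and "error" turns are discarded too.
  return { action: "discard" };
}
```

For the retry case, the flow above also schedules a new processWorkOrder after a delay; that scheduling is left out of the sketch.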

7. Extending the Protocol

To add a new message:

  1. Add the message type to packages/api/src/messages/ide.ts, drone.ts, or web.ts.
  2. Register it in ClientToServerEvents or ServerToClientEvents in packages/api/src/messages/socket.ts.
  3. Re-export from packages/api/src/index.ts.
  4. Implement the sender (emit) in the Client (ide or drone) or Server (CodeSession/DroneSession).
  5. Implement the handler in the corresponding class or frontend component.
  6. Implement the forward-path routing if needed.
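As a worked example of steps 1, 2, 4, and 5, the sketch below adds a made-up `droneHeartbeat` message. The message itself, `registerDroneHandlers`, `sendHeartbeat`, and the `handlers` object are hypothetical; the `handlers` object stands in for Socket.IO's `socket.on` and `socket.emit` so the example runs without a server.

```typescript
// Step 1: declare the message type (hypothetical message, e.g. in drone.ts).
type DroneHeartbeatMessage = (sentAt: number) => void;
type ThinkingMessage = (content: string) => void;

// Step 2: register it in the client-to-server event map (socket.ts).
interface ClientToServerEvents {
  thinking: ThinkingMessage;
  droneHeartbeat: DroneHeartbeatMessage; // new entry
}

// Stand-in for the handlers a server-side session would attach with socket.on.
const handlers: Partial<ClientToServerEvents> = {};
const lastHeartbeatByDrone = new Map<string, number>();

// Step 5: implement the handler (for a drone message, in DroneSession).
function registerDroneHandlers(registrationId: string): void {
  handlers.droneHeartbeat = (sentAt: number) => {
    lastHeartbeatByDrone.set(registrationId, sentAt);
  };
}

// Step 4: implement the sender. With a real socket this would be
// socket.emit("droneHeartbeat", sentAt).
function sendHeartbeat(sentAt: number): boolean {
  const handler = handlers.droneHeartbeat;
  if (!handler) return false; // nobody is listening yet
  handler(sentAt);
  return true;
}
```

Because the event map is an interface, a misspelled event name or a wrong argument type in either the handler or the sender is caught at compile time.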

8. Reconnection & Message Queuing

8.1 Problem Statement

When the browser refreshes during work order processing:

  1. Old CodeSession disconnects, but DroneSession continues routing to it
  2. Drone emits events but they go to a disconnected socket
  3. New CodeSession connects but isn't linked to the active chat session
  4. Messages are lost; IDE never receives streaming updates

8.2 Solution Architecture

Three-phase approach:

  1. Redis Message Queue (src/lib/message-queue.ts)

    • Messages enqueued when routing fails (disconnected socket)
    • FIFO ordering with RPUSH/LPOP
    • 30-minute TTL (1800 seconds)
    • Max 1000 messages (drop oldest)
    • Aggregates adjacent thinking/response messages during drain
  2. Redis Tab Lock (src/lib/tab-lock.ts)

    • Prevents concurrent tab access to same chat session
    • 1-minute timeout (requires heartbeat renewal)
    • Includes socket ID and user ID for validation
    • Auto-cleanup of stale locks
  3. Auto-Reconnection (CodeSession.checkAndReestablishActiveSession())

    • On connect, checks for active processing turn in DB
    • If found, attempts to acquire tab lock
    • On success, re-establishes chat session index
    • Drains queued messages from Redis
    • Aggregates and delivers messages to client

8.3 Message Queue Flow

Drone emits thinking() → DroneSession.onThinking()
  ↓
SocketService.getCodeSessionByChatSessionId() throws (disconnected)
  ↓
MessageQueue.enqueue(chatSessionId, { type: 'thinking', args: [...] })
  ↓
[30 minutes later] Queue expires automatically
  OR
[On reconnect] MessageQueue.drain() → aggregateMessages() → deliver
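The enqueue, drain, and aggregate steps can be sketched without Redis as follows. This is an in-memory approximation: the `InMemoryMessageQueue` class and `QueuedMessage` shape are invented for the example, the 30-minute TTL is omitted, and merging adjacent messages by concatenating their text is an assumption about what aggregation means here.

```typescript
type StreamType = "thinking" | "response" | "toolCall";
interface QueuedMessage { type: StreamType; args: string[] }

const MAX_QUEUE_LENGTH = 1000;

// In-memory stand-in for the Redis-backed queue (no TTL handling).
class InMemoryMessageQueue {
  private queues = new Map<string, QueuedMessage[]>();

  enqueue(chatSessionId: string, message: QueuedMessage): void {
    const queue = this.queues.get(chatSessionId) ?? [];
    queue.push(message); // append at the tail, like RPUSH
    while (queue.length > MAX_QUEUE_LENGTH) queue.shift(); // drop oldest
    this.queues.set(chatSessionId, queue);
  }

  // Returns all queued messages in FIFO order and clears the queue.
  drain(chatSessionId: string): QueuedMessage[] {
    const queue = this.queues.get(chatSessionId) ?? [];
    this.queues.delete(chatSessionId);
    return queue;
  }
}

// Merge runs of adjacent thinking (or adjacent response) messages into one.
function aggregateMessages(messages: QueuedMessage[]): QueuedMessage[] {
  const out: QueuedMessage[] = [];
  for (const msg of messages) {
    const prev = out.length > 0 ? out[out.length - 1] : undefined;
    const mergeable = msg.type === "thinking" || msg.type === "response";
    if (prev && mergeable && prev.type === msg.type) {
      prev.args = [prev.args.join("") + msg.args.join("")];
    } else {
      out.push({ type: msg.type, args: msg.args.slice() });
    }
  }
  return out;
}
```

Tool calls are never merged in this sketch, so they also act as boundaries between runs of text.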

8.4 Tab Lock Flow

IDE connects → CodeSession.register()
  ↓
checkAndReestablishActiveSession()
  ↓
Find active chat session with processing turn
  ↓
TabLock.acquire(chatSessionId, userId, socketId)
  ↓
Success: Register chat session, drain queue, emit status
  OR
Failure: Emit 'tabLockDenied' → IDE navigates away
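The acquire and heartbeat behavior can be approximated in memory as below. This is a sketch, not the tab-lock.ts implementation: the `InMemoryTabLock` class, its method names other than `acquire`, and the injected `now` clock are assumptions, and a Redis-backed lock would rely on key expiry rather than a stored timestamp.

```typescript
interface LockRecord { userId: string; socketId: string; expiresAt: number }

const LOCK_TTL_MS = 60000; // 1-minute timeout, renewed by heartbeat

class InMemoryTabLock {
  private locks = new Map<string, LockRecord>();

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(private now: () => number = () => Date.now()) {}

  // Returns false when a different socket holds a lock that has not expired.
  acquire(chatSessionId: string, userId: string, socketId: string): boolean {
    const current = this.locks.get(chatSessionId);
    if (current && current.expiresAt > this.now() && current.socketId !== socketId) {
      return false;
    }
    this.locks.set(chatSessionId, { userId, socketId, expiresAt: this.now() + LOCK_TTL_MS });
    return true;
  }

  // Extends the lock, but only for the socket that currently holds it.
  heartbeat(chatSessionId: string, socketId: string): boolean {
    const current = this.locks.get(chatSessionId);
    if (!current || current.socketId !== socketId || current.expiresAt <= this.now()) {
      return false;
    }
    current.expiresAt = this.now() + LOCK_TTL_MS;
    return true;
  }

  release(chatSessionId: string, socketId: string): void {
    const current = this.locks.get(chatSessionId);
    if (current && current.socketId === socketId) this.locks.delete(chatSessionId);
  }
}
```

A `false` result from `acquire` corresponds to the failure branch above, where the server would emit `tabLockDenied` to the second tab.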

8.5 Frontend Reconciliation

The frontend handles reconnection gracefully:

  1. Load history first - Fetch chat session and turns from DB
  2. Connect socket - Establish the Socket.IO connection
  3. Backend auto-reconnects - If processing turn found, backend re-establishes
  4. Receive queued messages - Aggregated messages delivered in order
  5. Handle duplicates - Frontend merges with existing history

8.6 Single Tab Enforcement

Only one tab can control a chat session at a time:

  • First tab acquires Redis lock
  • Subsequent tabs receive tabLockDenied event
  • UI shows "Chat session open in another browser tab"
  • User must navigate away or close the duplicate tab

8.7 Status Indicators

The status bar shows connection state:

  • Connected (green ●) - Socket connected, receiving messages
  • Connecting (yellow ●) - Attempting to connect
  • Error (red ●) - Connection failed
  • Disconnected (gray ●) - No active connection

Status messages inform the user:

  • "Connecting..." - Initial connection
  • "Reconnecting to active session..." - Auto-reconnect in progress
  • "Reconnected" - Successfully reconnected
  • "Chat session is open in another browser tab" - Tab lock denied