From 6591da3496afdbc782264dcc05a233a12052bc76 Mon Sep 17 00:00:00 2001 From: Rob Colbert Date: Wed, 29 Apr 2026 16:09:06 -0400 Subject: [PATCH] socket-protocol kickoff docs --- .opencode/plans/foundation-todo.md | 260 +++++++++ docs/architecture-stats.md | 900 +++++++++++++++++++++++++++++ 2 files changed, 1160 insertions(+) create mode 100644 .opencode/plans/foundation-todo.md create mode 100644 docs/architecture-stats.md diff --git a/.opencode/plans/foundation-todo.md b/.opencode/plans/foundation-todo.md new file mode 100644 index 0000000..c1a3c3f --- /dev/null +++ b/.opencode/plans/foundation-todo.md @@ -0,0 +1,260 @@ +# Foundation Cleanup TODO + +**Date:** April 29, 2026 +**Goal:** Correct known issues, standardize APIs, implement message handlers, and prepare solid foundation with unit tests for Chat Session UI implementation. + +--- + +## Phase 1: Fix Type Errors & Interface Conflicts + +### 1.1 Resolve Duplicate `DroneStatus` Enum +- **File:** `gadget-drone/src/services/platform.ts` +- **Action:** Remove local `DroneStatus` enum, import from `@gadget/api` +- **Status:** ⬜ Pending + +### 1.2 Resolve `IAiProvider` Interface Conflict +- **Files:** + - `packages/api/src/interfaces/ai-provider.ts` (Mongoose document) + - `packages/ai/src/api.ts` (runtime config) +- **Action:** Create mapper in `gadget-drone/src/services/ai.ts` to convert DB model → runtime config +- **Status:** ⬜ Pending + +### 1.3 Fix `ToolCallMessage` Signature +- **File:** `packages/api/src/messages/drone.ts:26-30` +- **Issue:** Missing `callId` parameter required by `IChatToolCall` +- **Action:** Add `callId: string` as first parameter +- **Status:** ⬜ Pending + +### 1.4 Fix `ChatTurnStats` Schema Mismatch +- **File:** `gadget-code/src/models/chat-turn.ts:70-76` +- **Issue:** Schema uses `thinkingTokens`, interface uses `thinkingTokenCount` +- **Action:** Standardize on `thinkingTokenCount` in schema +- **Status:** ⬜ Pending + +### 1.5 Fix `ChatToolCallSchema` Missing `callId` +- **File:** 
`gadget-code/src/models/chat-turn.ts:31-36` +- **Issue:** Schema doesn't include required `callId` field +- **Action:** Add `callId: { type: String, required: true }` to schema +- **Status:** ⬜ Pending + +--- + +## Phase 2: Implement Prompt Submission Flow + +### 2.1 Implement `CodeSession.onSubmitPrompt()` +- **File:** `gadget-code/src/lib/code-session.ts:58-60` +- **Action:** + - Create `ChatTurn` document with status `Processing` + - Build work order from `ChatSession`, `Project`, `IAiProvider`, prompt + - Find target drone's `DroneSession` + - Emit `processWorkOrder` to drone + - Update `ChatTurn` with drone acknowledgment +- **Missing:** Track `selectedDrone`, `chatSession`, `project` in `CodeSession` +- **Status:** ⬜ Pending + +### 2.2 Add Drone Selection to `CodeSession` +- **File:** `gadget-code/src/lib/code-session.ts` +- **Action:** Add properties and methods to track selected drone, chat session, project +- **Status:** ⬜ Pending + +--- + +## Phase 3: Implement Event Routing (Drone→IDE) + +### 3.1 Add DroneSession Event Handlers +- **File:** `gadget-code/src/lib/drone-session.ts:21-23` +- **Action:** Register handlers for: + - `thinking` + - `response` + - `toolCall` + - `workOrderComplete` +- **Status:** ⬜ Pending + +### 3.2 Implement Routing Logic +- **File:** `gadget-code/src/lib/drone-session.ts` +- **Action:** Implement handlers that: + - Find corresponding `CodeSession` by `chatSessionId` + - Forward event to IDE socket + - Update `ChatTurn` document with new data +- **Status:** ⬜ Pending + +### 3.3 Add `getCodeSessionByChatSessionId()` to `SocketService` +- **File:** `gadget-code/src/services/socket.ts` +- **Action:** Maintain reverse index: `chatSessionId → CodeSession` +- **Status:** ⬜ Pending + +--- + +## Phase 4: Emit Events from AWL + +### 4.1 Pass Socket into `AgentService.process()` +- **File:** `gadget-drone/src/gadget-drone.ts:229` +- **Action:** Pass `this.socket` reference to `AgentService.process()` +- **Status:** ⬜ Pending + +### 4.2 
Add Event Emissions to AWL Loop +- **File:** `gadget-drone/src/services/agent.ts:70-98` +- **Action:** Emit streaming events: + - `thinking` when reasoning content arrives + - `response` when text content streams + - `toolCall` after each tool execution + - `workOrderComplete` when loop exits +- **Status:** ⬜ Pending + +### 4.3 Implement Workspace Mode Transitions +- **File:** `gadget-drone/src/services/agent.ts` +- **Action:** + - Emit `requestWorkspaceMode(agent)` before starting AWL + - Wait for acknowledgment + - Emit `requestWorkspaceMode(idle)` when complete +- **Status:** ⬜ Pending + +--- + +## Phase 5: Workspace Persistence (Crash Recovery) + +### 5.1 Create `.gadget/` Directory Structure +- **File:** `gadget-drone/src/gadget-drone.ts` +- **Action:** Create `WorkspaceService` to manage: + - `.gadget/workspace.json` (persistent identity) + - `.gadget/work-order.json` (active work order cache) + - `.gadget/logs/` directory +- **Status:** ⬜ Pending + +### 5.2 Implement Workspace Validation on Startup +- **File:** `gadget-drone/src/gadget-drone.ts:57-93` +- **Action:** Add `validateWorkspace()` method called before registration +- **Status:** ⬜ Pending + +### 5.3 Write Work Order Cache During Processing +- **File:** `gadget-drone/src/gadget-drone.ts:209-229` +- **Action:** Write cache BEFORE processing, remove AFTER completion +- **Status:** ⬜ Pending + +### 5.4 Update Drone Registration to Include `workspaceId` +- **Files:** + - `packages/api/src/interfaces/drone-registration.ts` + - `gadget-drone/src/services/platform.ts` +- **Action:** Add `workspaceId: string` to registration +- **Status:** ⬜ Pending + +### 5.5 Add `workspaceId` to `IChatSession` +- **File:** `packages/api/src/interfaces/chat-session.ts` +- **Action:** Add field for routing retries to correct workspace +- **Status:** ⬜ Pending + +### 5.6 Implement Crash Recovery Handler +- **Files:** + - `gadget-drone/src/gadget-drone.ts` (emit `requestCrashRecovery`) + - 
`gadget-code/src/lib/drone-session.ts` (handle `crashRecoveryResponse`) +- **Status:** ⬜ Pending + +--- + +## Phase 6: Error Handling & Concurrency + +### 6.1 Add Error Propagation from Drone +- **File:** `gadget-drone/src/gadget-drone.ts:209-229` +- **Action:** Wrap `AgentService.process()` in try/catch, emit error event on failure +- **Status:** ⬜ Pending + +### 6.2 Add Concurrency Control +- **File:** `gadget-drone/src/gadget-drone.ts` +- **Action:** Check `DroneStatus.Busy` before accepting work, reject extras +- **Status:** ⬜ Pending + +### 6.3 Add Timeout & Heartbeat Mechanism +- **File:** `gadget-code/src/lib/drone-session.ts` +- **Action:** Prevent IDE hangs on drone crash +- **Status:** ⬜ Pending + +--- + +## Phase 7: Unit Tests + +### 7.1 Socket Message Handler Tests +- **Location:** `gadget-code/tests/socket-handlers.test.ts` +- **Tests:** + - `CodeSession.onSubmitPrompt()` creates ChatTurn + - `DroneSession` routes events to IDE + - `SocketService` tracks sessions correctly +- **Status:** ⬜ Pending + +### 7.2 Drone Message Handler Tests +- **Location:** `gadget-drone/tests/message-handlers.test.ts` +- **Tests:** + - `onRequestSessionLock` validates registration + - `onProcessWorkOrder` accepts and processes + - Workspace mode transitions work correctly +- **Status:** ⬜ Pending + +### 7.3 Agent Service Tests +- **Location:** `gadget-drone/tests/agent-service.test.ts` +- **Tests:** + - AWL loop emits `thinking`, `response`, `toolCall` events + - Tool calls are executed and responses captured + - `workOrderComplete` emitted on finish +- **Status:** ⬜ Pending + +### 7.4 Workspace Persistence Tests +- **Location:** `gadget-drone/tests/workspace.test.ts` +- **Tests:** + - Workspace validation creates `.gadget/` directory + - Work order cache written/removed correctly + - Crash recovery flow works end-to-end +- **Status:** ⬜ Pending + +### 7.5 Type Mapper Tests +- **Location:** `gadget-drone/tests/type-mappers.test.ts` +- **Tests:** + - `IAiProvider` DB → 
runtime conversion + - `ToolCallMessage` → `IChatToolCall` conversion +- **Status:** ⬜ Pending + +--- + +## Phase 8: Documentation Cleanup + +### 8.1 Remove Bull Queue References +- **Files:** + - `gadget-drone/docs/agentic-workflow-loop.md` + - `gadget-drone/AGENTS.md` +- **Action:** Remove all Bull queue references, document Socket.IO-only approach +- **Status:** ⬜ Pending + +### 8.2 Update AWL Interface Documentation +- **Files:** + - `gadget-code/docs/agentic-workflow-loop.md` + - `gadget-drone/docs/agentic-workflow-loop.md` +- **Action:** Delete interface definitions, reference `@gadget/api` only +- **Status:** ⬜ Pending + +### 8.3 Update foundation-todo.md +- **Action:** Mark completed items, update as work progresses +- **Status:** ⬜ In Progress + +--- + +## Acceptance Criteria + +By end of this turn: + +- [ ] All TypeScript compilation errors resolved +- [ ] Message handlers implemented for all socket events +- [ ] End-to-end prompt submission flow works (IDE→Web→Drone→Web→IDE) +- [ ] Streaming events (`thinking`, `response`, `toolCall`) routed correctly +- [ ] Workspace persistence implemented for crash recovery +- [ ] Unit tests pass for all implemented functionality +- [ ] Documentation cleaned up and consistent +- [ ] System ready for Chat Session UI implementation + +--- + +## Notes + +- Remain on feature branch throughout implementation +- Commit frequently to git with descriptive messages +- Ask for clarification when encountering ambiguous requirements +- Follow existing code conventions and patterns +- All code must include unit tests +- Keep this document up to date as work progresses diff --git a/docs/architecture-stats.md b/docs/architecture-stats.md new file mode 100644 index 0000000..563769e --- /dev/null +++ b/docs/architecture-stats.md @@ -0,0 +1,900 @@ +# Gadget Code Architecture Review + +**Date:** April 29, 2026 +**Scope:** Socket.IO Communication System for Agentic Workflow Loop + +## Executive Summary + +The Gadget Code architecture is 
**80% complete** with solid foundations, but has critical gaps preventing end-to-end prompt processing. The Socket.IO infrastructure is properly structured, but message handlers lack implementation, data models have inconsistencies, and the agentic workflow loop cannot yet execute. + +**Primary Blocker:** A prompt submitted from the IDE cannot reach the drone's AgentService for processing, and results cannot flow back to persist in ChatTurn documents. + +### Architectural Decision: Socket.IO Only (No Bull Queue) + +**Decision:** Bull queue will **not** be used. All message routing uses Socket.IO with directed delivery. + +**Rationale:** +- Better performance for real-time agentic workflows +- Eliminates Redis dependency for end users +- Simpler deployment model + +**Recovery Strategy:** Workspace persistence via `.gadget/` directory (see Section 7: Workspace Persistence Architecture). + +--- + +## 1. Architecture Soundness + +### ✅ What's Working Well + +1. **Socket.IO Server Setup** (`gadget-code/src/services/socket.ts`) + - Proper authentication middleware distinguishing Code (IDE) vs Drone sessions + - Session management via `CodeSession` and `DroneSession` classes + - Clean separation of concerns with session types + +2. **Event Interface Definitions** (`packages/api/src/messages/*.ts`) + - `ClientToServerEvents` and `ServerToClientEvents` properly typed + - Message signatures match between IDE↔Web↔Drone + - Callback-based request/response pattern is sound + +3. **Data Model Foundation** (`packages/api/src/interfaces/*.ts`) + - `IChatTurn`, `IChatSession`, `IChatToolCall` capture AWL state + - `WorkspaceMode` enum correctly models mutual exclusion + - Socket routing architecture is correct + +### ❌ Critical Design Issues + +#### Issue 1: Duplicate `DroneStatus` Enum + +**Location:** `packages/api/src/interfaces/drone-registration.ts` vs `gadget-drone/src/services/platform.ts` + +Both files define `DroneStatus` with identical values. 
The drone imports from its local copy, but `@gadget/api` exports a different type. This causes type mismatches when passing registrations between packages. + +**Fix:** Remove `DroneStatus` from `gadget-drone/src/services/platform.ts` and import from `@gadget/api`. + +#### Issue 2: Conflicting `IAiProvider` Interfaces + +**Location:** +- `packages/api/src/interfaces/ai-provider.ts` defines `IAiProvider extends Document` with `apiType: "ollama" | "openai"` and `models: IAiModel[]` +- `packages/ai/src/api.ts` defines `IAiProvider` with `sdk: "ollama" | "openai"` and no Mongoose dependencies + +**Impact:** `gadget-drone/src/services/agent.ts:74` fails TypeScript compilation: +```typescript +const api = this.getApi(provider); // Error: ObjectId | IAiProvider not assignable +``` + +The `IChatTurn.provider` field is typed as `IAiProvider | Types.ObjectId` (from `@gadget/api`), but `@gadget/ai` expects a different shape. + +**Fix:** +1. Keep `@gadget/api` as the Mongoose document interface (database layer) +2. Keep `@gadget/ai` as the runtime config interface (AI SDK layer) +3. Add a mapper in `gadget-drone/src/services/ai.ts` that converts `IAiProvider | ObjectId` → `IAiProvider` before calling `createAiApi()` + +#### Issue 3: Missing `callId` in Tool Call Message + +**Location:** `packages/api/src/messages/drone.ts:27-30` + +```typescript +export type ToolCallMessage = ( + name: string, + params: string, + response: string, +) => void; +``` + +But `IChatToolCall` in `packages/api/src/interfaces/chat-turn.ts` requires `callId: string`. The socket message doesn't match the persistence model. + +**Fix:** Add `callId: string` as first parameter to `ToolCallMessage`. + +--- + +## 2. 
Completeness Analysis + +### 2.1 Socket.IO Message Flow + +| Message | IDE→Web | Web→Drone | Drone→Web | Web→IDE | Status | +|---------|---------|-----------|-----------|---------|--------| +| `requestSessionLock` | ✅ Sent | ✅ Routed | ✅ Received | ❌ Not implemented | Partial | +| `requestWorkspaceMode` | ✅ Sent | ✅ Routed | ✅ Received | ❌ Not implemented | Partial | +| `submitPrompt` | ✅ Sent | ❌ Not handled | ❌ Not sent | N/A | **Broken** | +| `processWorkOrder` | N/A | ✅ Sent | ✅ Received | N/A | ✅ Complete | +| `thinking` | N/A | ❌ Not routed | ✅ Sent | ❌ Not emitted | **Broken** | +| `response` | N/A | ❌ Not routed | ✅ Sent | ❌ Not emitted | **Broken** | +| `toolCall` | N/A | ❌ Not routed | ✅ Sent | ❌ Not emitted | **Broken** | + +**Assessment:** The forward path (IDE→Drone) is blocked at `submitPrompt`. The return path (Drone→IDE) has no routing logic. + +### 2.2 gadget-code:web Implementation Gaps + +#### Missing: `submitPrompt` Handler + +**File:** `gadget-code/src/lib/code-session.ts:58-60` + +```typescript +async onSubmitPrompt(content: string): Promise<void> { + this.log.debug("prompt received", { content }); +} +``` + +**Required Implementation:** +1. Create `ChatTurn` document with status `Processing` +2. Build work order from `ChatSession`, `Project`, `IAiProvider`, and prompt +3. Find target drone's `DroneSession` via `SocketService.getDroneSession()` +4. Emit `processWorkOrder` to drone with full context +5. Drone acknowledges → update `ChatTurn` with drone job ID + +#### Missing: Drone→IDE Event Routing + +**File:** `gadget-code/src/lib/drone-session.ts` (no message handlers registered) + +**Required Implementation:** + +```typescript +// In DroneSession.register() +this.socket.on("thinking", this.onThinking.bind(this)); +this.socket.on("response", this.onResponse.bind(this)); +this.socket.on("toolCall", this.onToolCall.bind(this)); +this.socket.on("workOrderComplete", this.onWorkOrderComplete.bind(this)); +``` + +Each handler must: +1. 
Find the corresponding `CodeSession` by `chatSessionId` +2. Forward the event to the IDE socket +3. Update the `ChatTurn` document with new data + +#### Missing: ChatTurn Persistence Updates + +**File:** `gadget-code/src/models/chat-turn.ts` exists but is not being updated during AWL execution. + +**Required:** Create a `TurnUpdateService` that: +- Listens for streaming events (`thinking`, `response`, `toolCall`) +- Applies incremental updates to the active `ChatTurn` +- Handles token counting and duration tracking + +### 2.3 gadget-drone Implementation Gaps + +#### Missing: Work Order Acknowledgment Flow + +**File:** `gadget-drone/src/gadget-drone.ts:209-229` + +```typescript +async onProcessWorkOrder(...) { + const order: IAgentWorkOrder = { ... }; + cb(true); // accepts immediately + AgentService.process(order); // fires without waiting +} +``` + +**Issue:** No error handling if `AgentService.process()` throws. No status update back to web service if processing fails. + +**Fix:** Wrap in try/catch, emit error event to web service on failure. + +#### Missing: Socket Event Emissions + +**File:** `gadget-drone/src/services/agent.ts:70-98` + +The AWL loop has comments `/* emit turn-tool-call socket message */` but no actual `socket.emit()` calls. + +**Required:** Pass socket reference into `AgentService.process()` and emit: +- `thinking` when reasoning content arrives +- `response` when text content streams +- `toolCall` after each tool execution +- `workOrderComplete` when loop exits + +#### Missing: Workspace Mode Management + +**File:** `gadget-drone/src/gadget-drone.ts:168-188` + +`onRequestSessionLock` sets `workspaceMode = User` but never transitions to `Agent` mode before processing. The AWL should: +1. Emit `requestWorkspaceMode(agent)` before starting +2. Wait for acknowledgment +3. Run the loop +4. 
Emit `requestWorkspaceMode(idle)` when complete + +### 2.4 Data Model Inconsistencies + +#### ChatTurn Schema Mismatch + +**File:** `gadget-code/src/models/chat-turn.ts:70-76` + +Schema defines `stats.thinkingTokens` but interface `IChatTurnStats` in `packages/api/src/interfaces/chat-turn.ts:24` uses `thinkingTokenCount`. + +**Fix:** Standardize on `thinkingTokenCount` in both places. + +#### Missing User Reference in Context Messages + +**File:** `gadget-drone/src/services/agent.ts:101-120` + +```typescript +buildSessionContext(workOrder: IAgentWorkOrder): IContextChatMessage[] { + const user: IUser = workOrder.turn.session.user as IUser; + // ... + messages.push({ + // ... + user: { + _id: user._id.toHexString(), // Breaks if session.user is ObjectId + username: user.email, + displayName: user.displayName, + }, + }); +} +``` + +**Issue:** `workOrder.turn.session` is typed as `IChatSession | Types.ObjectId`. If it's an ObjectId, accessing `.user` fails. + +**Fix:** Populate `session.user` before creating work order, or fetch user separately in drone. + +--- + +## 3. Conflicts and Redundancies + +### 3.1 Documentation Conflicts + +**Work Order Interface Discrepancy:** + +- `gadget-code/docs/agentic-workflow-loop.md:21-52` defines `IWorkOrder` with `provider.apiKey` and `context: IChatMessage[]` +- `gadget-drone/docs/agentic-workflow-loop.md:15-62` defines `IWorkOrder` with `provider.sdk` and `chatSession.context` +- Actual implementation in `gadget-drone/src/services/agent.ts:24-28` uses `IAgentWorkOrder` with `turn: IChatTurn` and `context: IChatTurn[]` + +**Resolution:** Delete both markdown docs' interface definitions. Reference `@gadget/api` interfaces only. Update docs to match `IAgentWorkOrder`. 
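
As a rough illustration of the shape the docs should converge on, the sketch below models `IAgentWorkOrder` with the two fields named above (`turn` and `context`) and adds a guard for the unpopulated-reference problem described in section 2.4. Everything else is an assumption made for illustration: `ObjectIdLike` stands in for Mongoose's `Types.ObjectId`, the `*Like` interfaces are trimmed-down stand-ins for the `@gadget/api` types, and `requirePopulatedUser` is a hypothetical helper rather than existing code.

```typescript
// Stand-in for mongoose's Types.ObjectId so the sketch runs without dependencies.
class ObjectIdLike {
  readonly hex: string;
  constructor(hex: string) {
    this.hex = hex;
  }
  toHexString(): string {
    return this.hex;
  }
}

// Simplified, hypothetical versions of the @gadget/api interfaces.
interface UserLike {
  _id: ObjectIdLike;
  email: string;
  displayName: string;
}

interface ChatSessionLike {
  _id: ObjectIdLike;
  user: UserLike | ObjectIdLike; // populated document or bare reference
}

interface ChatTurnLike {
  _id: ObjectIdLike;
  session: ChatSessionLike | ObjectIdLike; // populated document or bare reference
}

// Only `turn` and `context` are taken from the review; the rest is assumed.
interface AgentWorkOrderLike {
  turn: ChatTurnLike;
  context: ChatTurnLike[];
}

// Fails fast when the web service dispatched a work order without populating
// turn.session or session.user, instead of crashing later inside the AWL.
function requirePopulatedUser(order: AgentWorkOrderLike): UserLike {
  const session = order.turn.session;
  if (session instanceof ObjectIdLike) {
    throw new Error("turn.session must be populated before dispatch");
  }
  const user = session.user;
  if (user instanceof ObjectIdLike) {
    throw new Error("session.user must be populated before dispatch");
  }
  return user;
}
```

Calling a guard like this at the top of `buildSessionContext()` would turn the compile-time complaint about `ObjectId | IChatSession` into an explicit runtime contract, which may be easier to document than the current union types.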
+ +### 3.2 Bull Queue vs Socket.IO (Resolved) + +**Documentation states:** +- `gadget-drone/docs/agentic-workflow-loop.md:10-12`: "Each Gadget Drone registered by the User implements a named Bull job queue" +- `gadget-drone/AGENTS.md`: "Queue: Bull queue named `gadget-drone`, job type `prompt`" + +**Reality:** `gadget-drone/src/gadget-drone.ts` uses **Socket.IO** for work order delivery, not Bull. There's no Bull queue setup in the drone. + +**Decision:** ✅ **Option A (Socket.IO only)** — Bull references are legacy and must be removed from all documentation. + +**Recovery from Drone Crash:** Handled via workspace persistence in `.gadget/` directory (see Section 7). When a drone restarts: +1. It validates/creates `.gadget/workspace.json` with workspace UUID +2. Web service reads workspace state to route retry to same directory +3. Agent can resume from last persisted ChatTurn state + +### 3.3 Redundant Service Layers + +**Observation:** `gadget-code/src/services/api-client.ts` exists alongside direct Mongoose model usage. + +**Check:** `gadget-code/src/controllers/api/v1/drone.ts` likely duplicates `DroneService` methods. + +**Action:** Audit API controllers — if they just proxy service methods, remove and call services directly from Socket handlers. + +--- + +## 4. 
Implementation Roadmap + +### Phase 1: Fix Type Errors (1-2 hours) + +**Task 1.1:** Resolve `IAiProvider` conflict +```typescript +// In gadget-drone/src/services/agent.ts +import { Types } from "mongoose"; +import { IAiProvider as AiProviderConfig } from "@gadget/ai"; +import { IAiProvider as DbAiProvider } from "@gadget/api"; + +// Add mapper +function mapDbProviderToConfig(provider: DbAiProvider | Types.ObjectId): AiProviderConfig { + if (provider instanceof Types.ObjectId) { + throw new Error("Provider must be populated"); + } + return { + _id: provider._id.toHexString(), + name: provider.name, + sdk: provider.apiType, // note: apiType → sdk + baseUrl: provider.baseUrl, + apiKey: provider.apiKey, + }; +} +``` + +**Task 1.2:** Fix `DroneStatus` duplication +```bash +# Delete from gadget-drone/src/services/platform.ts +# Import from @gadget/api instead +``` + +**Task 1.3:** Fix `ChatTurnStats` field names +```bash +# Align schema and interface on thinkingTokenCount +``` + +### Phase 2: Implement Prompt Submission (3-4 hours) + +**Task 2.1:** Implement `CodeSession.onSubmitPrompt()` +```typescript +async onSubmitPrompt(content: string): Promise<void> { + const turn = new ChatTurn({ + createdAt: new Date(), + user: this.user._id, + session: this.chatSession._id, + project: this.project?._id, + provider: this.chatSession.provider, // must populate + llm: this.chatSession.selectedModel, + mode: this.chatSession.mode, + status: ChatTurnStatus.Processing, + prompts: { user: content }, + toolCalls: [], + stats: { /* zeros */ } + }); + await turn.save(); + + const droneSession = SocketService.getDroneSession(this.selectedDrone); + droneSession.socket.emit( + "processWorkOrder", + this.selectedDrone, // the drone registration tracked per Task 2.2 + this.project, + this.chatSession, + turn, + (success: boolean) => { + if (success) { + turn.status = ChatTurnStatus.Processing; + turn.save(); + } + } + ); +} +``` + +**Task 2.2:** Add drone selection to `CodeSession` +- Track `selectedDrone: IDroneRegistration` +- Track `chatSession: IChatSession` +- Track `project: IProject` 
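
One way to read Task 2.2 is as a small piece of state with a single guard that `onSubmitPrompt()` can call before it builds a work order. The sketch below is a minimal version under stated assumptions: `DroneRef`, `ChatSessionRef`, and `ProjectRef` are hypothetical, trimmed-down stand-ins for `IDroneRegistration`, `IChatSession`, and `IProject`, and `CodeSessionState` with its method names is illustrative rather than the actual `CodeSession` API.

```typescript
// Hypothetical, simplified stand-ins for the @gadget/api interfaces.
interface DroneRef {
  _id: string;
  hostname: string;
}

interface ChatSessionRef {
  _id: string;
  name: string;
}

interface ProjectRef {
  _id: string;
  slug: string;
}

// Everything a prompt submission needs, with no nullable fields.
interface PromptContext {
  drone: DroneRef;
  chatSession: ChatSessionRef;
  project: ProjectRef;
}

class CodeSessionState {
  private selectedDrone: DroneRef | null = null;
  private chatSession: ChatSessionRef | null = null;
  private project: ProjectRef | null = null;

  selectDrone(drone: DroneRef): void {
    this.selectedDrone = drone;
  }

  openChatSession(session: ChatSessionRef, project: ProjectRef): void {
    this.chatSession = session;
    this.project = project;
  }

  // Returns the full context or throws with the list of missing pieces,
  // so a prompt is never dispatched against a half-configured session.
  requirePromptContext(): PromptContext {
    const drone = this.selectedDrone;
    const chatSession = this.chatSession;
    const project = this.project;

    if (!drone || !chatSession || !project) {
      const missing: string[] = [];
      if (!drone) missing.push("selectedDrone");
      if (!chatSession) missing.push("chatSession");
      if (!project) missing.push("project");
      throw new Error(`cannot submit prompt; missing: ${missing.join(", ")}`);
    }
    return { drone, chatSession, project };
  }
}
```

Keeping the null checks in one place means the handler body can work with non-nullable values, which also avoids the optional chaining on `this.project?._id` seen in Task 2.1.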
+ +### Phase 3: Implement Event Routing (3-4 hours) + +**Task 3.1:** Add DroneSession event handlers +```typescript +// In DroneSession.register() +this.socket.on("thinking", (content: string) => this.onThinking(content)); +this.socket.on("response", (content: string) => this.onResponse(content)); +this.socket.on("toolCall", (name, params, response) => + this.onToolCall(name, params, response)); +this.socket.on("workOrderComplete", (turnId, success, message) => + this.onWorkOrderComplete(turnId, success, message)); +``` + +**Task 3.2:** Implement routing logic +```typescript +async onThinking(content: string): Promise<void> { + const codeSession = SocketService.getCodeSessionByChatSessionId( + this.chatSessionId + ); + codeSession.socket.emit("thinking", content); + + // Update ChatTurn + await ChatTurn.findByIdAndUpdate(this.currentTurnId, { + thinking: content + }); +} +``` + +**Task 3.3:** Add `getCodeSessionByChatSessionId()` to `SocketService` +- Maintain reverse index: `chatSessionId → CodeSession` + +### Phase 4: Emit Events from AWL (2-3 hours) + +**Task 4.1:** Pass socket into `AgentService.process()` +```typescript +// In gadget-drone/src/gadget-drone.ts +await AgentService.process(order, this.socket); +``` + +**Task 4.2:** Add emissions to AWL loop +```typescript +// In AgentService.process() +for await (const chunk of response.stream) { + if (chunk.type === "thinking") { + socket.emit("thinking", chunk.content); + } else if (chunk.type === "response") { + socket.emit("response", chunk.content); + } +} + +for (const toolCall of response.toolCalls) { + const result = await executeTool(toolCall); + socket.emit("toolCall", toolCall.name, toolCall.arguments, result); +} + +socket.emit("workOrderComplete", turn._id, true); +``` + +### Phase 5: Workspace Persistence (4-6 hours) ⚠️ **CRITICAL PATH** + +**Task 5.1:** Create `.gadget/` directory structure on drone startup +```typescript +// In gadget-drone/src/gadget-drone.ts, before registration +async 
validateWorkspace(): Promise<void> { + const gadgetDir = path.join(process.cwd(), '.gadget'); + const workspaceFile = path.join(gadgetDir, 'workspace.json'); + + if (!fs.existsSync(gadgetDir)) { + await fs.promises.mkdir(gadgetDir, { recursive: true }); + } + + let workspaceData: WorkspaceData; + if (fs.existsSync(workspaceFile)) { + // Validate existing workspace + workspaceData = JSON.parse(await fs.promises.readFile(workspaceFile, 'utf-8')); + this.log.info('validated existing workspace', { + workspaceId: workspaceData.workspaceId + }); + } else { + // Create new workspace (fields match WorkspaceData in Section 7.3) + workspaceData = { + workspaceId: crypto.randomUUID(), + createdAt: new Date().toISOString(), + hostname: os.hostname(), // requires the node:os module + workspaceDir: process.cwd(), + projects: [], + chatSession: null, + lockedProject: null, + registration: null, // filled in after platform registration + }; + await fs.promises.writeFile(workspaceFile, JSON.stringify(workspaceData, null, 2)); + this.log.info('created new workspace', { + workspaceId: workspaceData.workspaceId + }); + } + + this.gadgetDir = gadgetDir; // reused by the work order cache and crash recovery + this.workspaceData = workspaceData; +} +``` + +**Task 5.2:** Write work order cache during processing +```typescript +// In onProcessWorkOrder() +async onProcessWorkOrder(...) 
{ + const workOrderFile = path.join(this.gadgetDir, 'work-order.json'); + + // Write cache BEFORE processing + await fs.promises.writeFile(workOrderFile, JSON.stringify({ + turnId: turn._id.toHexString(), + chatSessionId: chatSession._id.toHexString(), + projectId: project._id.toHexString(), + receivedAt: new Date().toISOString(), + }, null, 2)); + + try { + await AgentService.process(order, this.socket); + } finally { + // Remove cache AFTER completion + await fs.promises.unlink(workOrderFile); + } +} +``` + +**Task 5.3:** Update drone registration to include workspaceId +```typescript +// In PlatformService.register() +interface IDroneDefinition { + hostname: string; + workspaceDir: string; + workspaceId: string; // NEW: persistent workspace identifier +} +``` + +**Task 5.4:** Web service stores workspaceId with ChatSession +```typescript +// In packages/api/src/interfaces/chat-session.ts +export interface IChatSession extends Document { + // ... existing fields ... + workspaceId: string; // NEW: route retries to correct workspace +} +``` + +### Phase 6: End-to-End Test (2 hours) + +**Test Scenario:** +1. Start drone: `pnpm --filter gadget-drone dev` +2. Start web: `pnpm --filter gadget-code dev:backend` +3. Start IDE: `pnpm --filter gadget-code dev:frontend` +4. Login, create project, select drone +5. Submit prompt: "Create a hello world function" +6. Verify: + - ChatTurn created in MongoDB + - Drone receives `processWorkOrder` + - IDE receives `thinking`/`response` events + - ChatTurn updated with results + +**Test Drone Recovery:** +1. Kill drone mid-turn (Ctrl+C) +2. Verify `.gadget/work-order.json` exists with turn data +3. Restart drone in same directory +4. Verify drone reports workspaceId to web service +5. Web service can route retry to same workspace + +--- + +## 5. Risk Assessment + +### High Risk + +1. 
**No Streaming in @gadget/ai** + - `AiApi.chat()` returns `Promise`, not async iterable + - Cannot stream tokens in real-time without refactoring + - **Mitigation:** Add `streamCallback` parameter (already exists in signature) but implement it in Ollama/OpenAI clients + +2. **No Error Propagation** + - If drone crashes mid-turn, IDE hangs forever + - **Mitigation:** Add timeout + heartbeat mechanism + +3. **No Workspace Persistence Layer** ⚠️ **CRITICAL** + - Drone restart loses all context: which workspace, which projects, which chat session + - Cannot retry work orders without knowing original workspace directory + - **Mitigation:** Implement `.gadget/` directory persistence (see Section 7) + +### Medium Risk + +1. **Session State Not Persisted** + - `CodeSession` and `DroneSession` are in-memory + - Server restart loses all active sessions + - **Mitigation:** Store session state in Redis + +2. **No Concurrency Control** + - Multiple prompts can queue for same drone + - Drone processes one at a time but doesn't reject extras + - **Mitigation:** Check `DroneStatus.Busy` before accepting work + +### Low Risk + +1. **TypeScript Strict Mode Violations** + - Several `any` and missing null checks + - Build passes but runtime errors possible + - **Mitigation:** Enable `noUncheckedIndexedAccess` in drone + +--- + +## 6. Recommended Next Steps + +1. **Fix TypeScript errors** in `gadget-drone/src/services/agent.ts` (Phase 1) +2. **Implement `submitPrompt` handler** (Phase 2, Task 2.1) +3. **Add basic event routing** (Phase 3, minimal viable path) +4. **Test end-to-end** with stubbed tool calls +5. 
**Iterate** on streaming, error handling, and persistence + +--- + +## Appendix A: File Inventory + +### Core Socket Implementation +- `gadget-code/src/services/socket.ts` — Socket.IO server setup ✅ +- `gadget-code/src/lib/socket-session.ts` — Base session class ✅ +- `gadget-code/src/lib/code-session.ts` — IDE session (partial) +- `gadget-code/src/lib/drone-session.ts` — Drone session (minimal) +- `gadget-drone/src/gadget-drone.ts` — Drone client ✅ + +### Data Models +- `packages/api/src/interfaces/*.ts` — TypeScript interfaces ✅ +- `gadget-code/src/models/*.ts` — Mongoose schemas ✅ +- `gadget-drone/src/models/` — None (drone is stateless) + +### Message Definitions +- `packages/api/src/messages/socket.ts` — Event map ✅ +- `packages/api/src/messages/ide.ts` — IDE→Web messages ✅ +- `packages/api/src/messages/drone.ts` — Drone messages (incomplete) + +### AI Integration +- `packages/ai/src/api.ts` — AI interface ✅ +- `packages/ai/src/ollama.ts` — Ollama client ✅ +- `packages/ai/src/openai.ts` — OpenAI client ✅ +- `gadget-drone/src/services/ai.ts` — AI service wrapper ✅ +- `gadget-drone/src/services/agent.ts` — AWL implementation (partial) + +--- + +## Appendix B: Build Status + +| Package | Build Status | Notes | +|---------|-------------|-------| +| `@gadget/api` | ✅ Passes | Type definitions only | +| `@gadget/ai` | ✅ Passes | AI SDK abstraction | +| `gadget-code` | ✅ Passes | Web server builds | +| `gadget-drone` | ❌ Fails | Type errors in `agent.ts:74,102` | + +**Blocking Errors:** +``` +src/services/agent.ts(74,9): Argument of type 'ObjectId | IAiProvider' + is not assignable to parameter of type 'IAiProvider'. +src/services/agent.ts(102,48): Property 'user' does not exist on type + 'ObjectId | IChatSession'. +``` + +--- + +## 7. Workspace Persistence Architecture + +### 7.1 Design Goals + +1. **No External Dependencies:** End users should not need to run Redis, MongoDB, or other infrastructure just to run `gadget-drone` +2. 
**Crash Recovery:** When a drone crashes mid-work-order, it must be able to resume in the same workspace with the same project state +3. **Workspace Identity:** Each workspace directory needs a persistent, unique identifier that survives drone restarts +4. **State Visibility:** Both the drone and web service must be able to inspect workspace state at any time + +### 7.2 Directory Structure + +``` +/ +├── .gadget/ +│ ├── workspace.json # Persistent workspace identity & state +│ ├── work-order.json # Active work order cache (deleted when complete) +│ └── logs/ +│ └── drone.log # Drone execution logs +├── / # Project directories managed by this workspace +├── / +└── ... +``` + +### 7.3 File Specifications + +#### `.gadget/workspace.json` + +**Created:** When drone starts in a directory (new or existing workspace) +**Updated:** When chat session lock acquired/released, projects added/removed +**Deleted:** Never (only if user manually deletes workspace) + +```typescript +interface WorkspaceData { + workspaceId: string; // UUID v4, immutable once created + createdAt: string; // ISO 8601 timestamp + hostname: string; // Machine hostname where drone runs + workspaceDir: string; // Absolute path to workspace directory + + // Active session state (null when idle) + chatSession: { + _id: string; // MongoDB ChatSession._id + name: string; // Session name for display + lockedAt: string; // ISO 8601 timestamp + } | null; + + // Project currently being worked on (null when idle) + lockedProject: { + _id: string; // MongoDB Project._id + slug: string; // Project slug (directory name) + gitUrl: string; // Remote git URL + lockedAt: string; // ISO 8601 timestamp + } | null; + + // All projects cloned into this workspace + projects: Array<{ + _id: string; + slug: string; + gitUrl: string; + clonedAt: string; + lastSyncAt: string; + }>; + + // Drone registration (updated each startup) + registration: { + _id: string; // MongoDB DroneRegistration._id + status: string; // Current drone 
status + registeredAt: string; // ISO 8601 timestamp + } | null; +} +``` + +**Example:** +```json +{ + "workspaceId": "550e8400-e29b-41d4-a716-446655440000", + "createdAt": "2026-04-29T19:30:00.000Z", + "hostname": "rob-dev-machine", + "workspaceDir": "/home/rob/projects/my-gadget-workspace", + "chatSession": { + "_id": "65f8a9b2c3d4e5f6a7b8c9d0", + "name": "Fix authentication bug", + "lockedAt": "2026-04-29T20:15:00.000Z" + }, + "lockedProject": { + "_id": "65f8a9b2c3d4e5f6a7b8c9d1", + "slug": "auth-service", + "gitUrl": "https://github.com/user/auth-service.git", + "lockedAt": "2026-04-29T20:15:00.000Z" + }, + "projects": [ + { + "_id": "65f8a9b2c3d4e5f6a7b8c9d1", + "slug": "auth-service", + "gitUrl": "https://github.com/user/auth-service.git", + "clonedAt": "2026-04-29T19:30:00.000Z", + "lastSyncAt": "2026-04-29T20:15:00.000Z" + } + ], + "registration": { + "_id": "65f8a9b2c3d4e5f6a7b8c9d2", + "status": "busy", + "registeredAt": "2026-04-29T19:30:00.000Z" + } +} +``` + +#### `.gadget/work-order.json` + +**Created:** When a `processWorkOrder` message is received +**Updated:** Not updated (immutable cache) +**Deleted:** When work order completes (success or error) + +```typescript +interface WorkOrderCache { + turnId: string; // ChatTurn._id for persistence updates + chatSessionId: string; // For routing events back to IDE + projectId: string; // For file operations + workOrderId: string; // Unique ID for this work order instance + receivedAt: string; // ISO 8601 timestamp + prompt: string; // User's prompt (for retry context) + status: 'processing' | 'completed' | 'error'; + error?: string; // Error message if status === 'error' +} +``` + +**Purpose:** If the drone crashes while this file exists, the web service knows: +- Which ChatTurn was being processed +- Which workspace to route the retry to +- What prompt needs to be re-processed + +### 7.4 Drone Startup Sequence + +```typescript +// Pseudocode for gadget-drone.ts startup +async start(): Promise<void> { + // Step 1:
Validate/create workspace (BEFORE anything else) + await this.validateWorkspace(); + + // Step 2: Get user credentials + const credentials = await this.getUserCredentials(); + + // Step 3: Register with platform (includes workspaceId) + this.registration = await PlatformService.register( + credentials.email, + credentials.password, + process.cwd(), + this.workspaceData.workspaceId, // NEW parameter + ); + + // Step 4: Update workspace.json with registration + this.workspaceData.registration = { + _id: this.registration._id.toHexString(), + status: 'starting', + registeredAt: new Date().toISOString(), + }; + await this.writeWorkspaceData(); + + // Step 5: Connect Socket.IO + await this.connectSocket(); + + // Step 6: Check for incomplete work order (crash recovery) + await this.checkCrashRecovery(); + + // Step 7: Mark as available + await PlatformService.setStatus(DroneStatus.Available); + this.workspaceData.registration!.status = 'available'; + await this.writeWorkspaceData(); +} + +async checkCrashRecovery(): Promise<void> { + const workOrderFile = path.join(this.gadgetDir, 'work-order.json'); + + if (fs.existsSync(workOrderFile)) { + const cache = JSON.parse(await fs.promises.readFile(workOrderFile, 'utf-8')); + + this.log.warn('incomplete work order found - crash recovery needed', { + turnId: cache.turnId, + prompt: cache.prompt, + }); + + // Notify web service that this workspace has pending recovery + this.socket.emit('requestCrashRecovery', { + workspaceId: this.workspaceData.workspaceId, + turnId: cache.turnId, + chatSessionId: cache.chatSessionId, + }); + + // DO NOT delete work-order.json yet - wait for web service instruction + } +} +``` + +### 7.5 Web Service: Crash Recovery Flow + +When the web service receives `requestCrashRecovery`: + +1. **Fetch ChatTurn** by `turnId` +2.
**Check Turn Status:** + - If `status === 'finished'`: Acknowledge, tell the drone to delete the cache (turn completed before crash notification) + - If `status === 'processing'`: Queue retry for this workspace +3. **Route Retry:** When retrying, filter drones by `workspaceId` to ensure the same workspace handles it +4. **Acknowledge:** Tell the drone it can delete `work-order.json` + +```typescript +// In gadget-code/src/lib/drone-session.ts +async onRequestCrashRecovery(data: { + workspaceId: string; + turnId: string; + chatSessionId: string; +}): Promise<void> { + const turn = await ChatTurn.findById(data.turnId); + + if (!turn) { + this.socket.emit('crashRecoveryResponse', { + turnId: data.turnId, + action: 'discard', // Turn doesn't exist, delete cache + }); + return; + } + + if (turn.status === ChatTurnStatus.Finished) { + this.socket.emit('crashRecoveryResponse', { + turnId: data.turnId, + action: 'discard', // Already done, delete cache + }); + return; + } + + // Turn is still processing - mark for retry + turn.status = ChatTurnStatus.Error; + turn.response = 'Drone crashed during processing - retrying'; + await turn.save(); + + this.socket.emit('crashRecoveryResponse', { + turnId: data.turnId, + action: 'retry', + retryDelay: 5000, // Wait 5 seconds before retry + }); + + // Schedule retry (will route to same workspaceId) + setTimeout(() => { + this.retryWorkOrder(turn); + }, 5000); +} +``` + +### 7.6 Workspace-Aware Drone Selection + +When selecting a drone for a work order: + +```typescript +// In gadget-code/src/lib/code-session.ts +async onSubmitPrompt(content: string): Promise<void> { + // ... create ChatTurn ...
+ + // Prefer drone in same workspace (for continuity) + let targetDrone: DroneSession | undefined; + + if (this.chatSession.workspaceId) { + // Try to find drone in same workspace + targetDrone = SocketService.getDroneSessionByWorkspaceId( + this.chatSession.workspaceId + ); + + if (!targetDrone) { + this.log.warn('workspace drone unavailable, selecting alternative'); + // Fall through to any available drone + } + } + + if (!targetDrone) { + // Select any available drone for this user + targetDrone = SocketService.getAvailableDroneForUser(this.user); + } + + if (!targetDrone) { + // No drone is online for this user; nothing to dispatch + this.log.warn('no drone available for work order'); + return; + } + + // Include workspaceId in work order for persistence + targetDrone.socket.emit('processWorkOrder', { + // ... existing fields ... + workspaceId: this.chatSession.workspaceId, + }); +} +``` + +### 7.7 Implementation Checklist + +- [ ] Create `WorkspaceService` in `gadget-drone/src/services/workspace.ts` +- [ ] Implement `validateWorkspace()` and `writeWorkspaceData()` +- [ ] Update `PlatformService.register()` to accept `workspaceId` +- [ ] Add `workspaceId` field to `IDroneRegistration` interface and model +- [ ] Add `workspaceId` field to `IChatSession` interface and model +- [ ] Implement `work-order.json` cache write/remove in `onProcessWorkOrder()` +- [ ] Implement `requestCrashRecovery` socket handler in web service +- [ ] Implement `crashRecoveryResponse` socket handler in drone +- [ ] Add workspace-aware drone selection in `CodeSession.onSubmitPrompt()` +- [ ] Remove all Bull queue references from documentation + +--- + +**Document Status:** Complete +**Next Review:** After Phase 2 implementation
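
The first two checklist items in 7.7 (`WorkspaceService` and `validateWorkspace()`) could start from something like the sketch below. It is not the project's implementation: it assumes the `WorkspaceData` shape from 7.3 (nested objects are typed loosely here), uses synchronous `fs` calls for brevity, and the names `WorkspaceDataSketch` and `loadOrCreateWorkspace` are hypothetical.

```typescript
import { randomUUID } from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed subset of WorkspaceData (7.3); nested objects typed loosely for brevity.
interface WorkspaceDataSketch {
  workspaceId: string;
  createdAt: string;
  hostname: string;
  workspaceDir: string;
  chatSession: Record<string, string> | null;
  lockedProject: Record<string, string> | null;
  projects: Array<Record<string, string>>;
  registration: Record<string, string> | null;
}

// Hypothetical helper: reuse the stored identity if present, otherwise create it.
function loadOrCreateWorkspace(workspaceDir: string): WorkspaceDataSketch {
  const gadgetDir = path.join(workspaceDir, ".gadget");
  const file = path.join(gadgetDir, "workspace.json");

  if (fs.existsSync(file)) {
    // Existing workspace: the stored workspaceId is returned unchanged.
    return JSON.parse(fs.readFileSync(file, "utf-8")) as WorkspaceDataSketch;
  }

  // New workspace: create .gadget/ and write an idle-state workspace.json.
  fs.mkdirSync(gadgetDir, { recursive: true });
  const data: WorkspaceDataSketch = {
    workspaceId: randomUUID(), // UUID v4
    createdAt: new Date().toISOString(),
    hostname: os.hostname(),
    workspaceDir: path.resolve(workspaceDir),
    chatSession: null,
    lockedProject: null,
    projects: [],
    registration: null,
  };
  fs.writeFileSync(file, JSON.stringify(data, null, 2), "utf-8");
  return data;
}
```

Calling `loadOrCreateWorkspace(process.cwd())` on every drone startup would return the same `workspaceId` each time, which is the property the crash recovery flow in 7.5 relies on when it routes a retry back to the same workspace.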