Gadget Code Architecture Review
Date: April 29, 2026
Scope: Socket.IO Communication System for Agentic Workflow Loop
Executive Summary
The Gadget Code architecture is 80% complete with solid foundations, but has critical gaps preventing end-to-end prompt processing. The Socket.IO infrastructure is properly structured, but message handlers lack implementation, data models have inconsistencies, and the agentic workflow loop cannot yet execute.
Primary Blocker: A prompt submitted from the IDE cannot reach the drone's AgentService for processing, and results cannot flow back to persist in ChatTurn documents.
Architectural Decision: Socket.IO Only (No Bull Queue)
Decision: Bull queue will not be used. All message routing uses Socket.IO with directed delivery.
Rationale:
- Better performance for real-time agentic workflows
- Eliminates Redis dependency for end users
- Simpler deployment model
Recovery Strategy: Workspace persistence via .gadget/ directory (see Section 7: Workspace Persistence Architecture).
1. Architecture Soundness
✅ What's Working Well
- Socket.IO Server Setup (gadget-code/src/services/socket.ts)
  - Proper authentication middleware distinguishing Code (IDE) vs Drone sessions
  - Session management via CodeSession and DroneSession classes
  - Clean separation of concerns with session types
- Event Interface Definitions (packages/api/src/messages/*.ts)
  - ClientToServerEvents and ServerToClientEvents properly typed
  - Message signatures match between IDE↔Web↔Drone
  - Callback-based request/response pattern is sound
- Data Model Foundation (packages/api/src/interfaces/*.ts)
  - IChatTurn, IChatSession, IChatToolCall capture AWL state
  - WorkspaceMode enum correctly models mutual exclusion
  - Socket routing architecture is correct
❌ Critical Design Issues
Issue 1: Duplicate DroneStatus Enum
Location: packages/api/src/interfaces/drone-registration.ts vs gadget-drone/src/services/platform.ts
Both files define DroneStatus with identical values. The drone imports its local copy, while @gadget/api exports its own declaration. TypeScript treats enums from separate declarations as distinct types even when their members match, so passing registrations between packages causes type mismatches.
Fix: Remove DroneStatus from gadget-drone/src/services/platform.ts and import from @gadget/api.
Issue 2: Conflicting IAiProvider Interfaces
Location:
- packages/api/src/interfaces/ai-provider.ts defines IAiProvider extends Document with apiType: "ollama" | "openai" and models: IAiModel[]
- packages/ai/src/api.ts defines IAiProvider with sdk: "ollama" | "openai" and no Mongoose dependencies
Impact: gadget-drone/src/services/agent.ts:74 fails TypeScript compilation:
const api = this.getApi(provider); // Error: ObjectId | IAiProvider not assignable
The IChatTurn.provider field is typed as IAiProvider | Types.ObjectId (from @gadget/api), but @gadget/ai expects a different shape.
Fix:
- Keep @gadget/api as the Mongoose document interface (database layer)
- Keep @gadget/ai as the runtime config interface (AI SDK layer)
- Add a mapper in gadget-drone/src/services/ai.ts that converts IAiProvider | ObjectId → IAiProvider before calling createAiApi()
Issue 3: Missing callId in Tool Call Message
Location: packages/api/src/messages/drone.ts:27-30
export type ToolCallMessage = (
name: string,
params: string,
response: string,
) => void;
But IChatToolCall in packages/api/src/interfaces/chat-turn.ts requires callId: string. The socket message doesn't match the persistence model.
Fix: Add callId: string as first parameter to ToolCallMessage.
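The revised signature could look like the sketch below. Only callId is confirmed as a field of IChatToolCall; the other field names in ChatToolCallRecord simply mirror the message parameters and are assumptions, and createToolCallCollector is a hypothetical helper used to show that the message now maps directly onto a persistence record.

```typescript
// Revised message signature: callId comes first so it can be matched to persistence.
type ToolCallMessage = (
  callId: string,
  name: string,
  params: string,
  response: string,
) => void;

// Assumed minimal shape of IChatToolCall; only callId is confirmed by the review.
interface ChatToolCallRecord {
  callId: string;
  name: string;
  params: string;
  response: string;
}

// Hypothetical helper: collects incoming messages as persistence-ready records.
function createToolCallCollector(): {
  handler: ToolCallMessage;
  records: ChatToolCallRecord[];
} {
  const records: ChatToolCallRecord[] = [];
  const handler: ToolCallMessage = (callId, name, params, response) => {
    records.push({ callId, name, params, response });
  };
  return { handler, records };
}
```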
2. Completeness Analysis
2.1 Socket.IO Message Flow
| Message | IDE→Web | Web→Drone | Drone→Web | Web→IDE | Status |
|---|---|---|---|---|---|
| requestSessionLock | ✅ Sent | ✅ Routed | ✅ Received | ❌ Not implemented | Partial |
| requestWorkspaceMode | ✅ Sent | ✅ Routed | ✅ Received | ❌ Not implemented | Partial |
| submitPrompt | ✅ Sent | ❌ Not handled | ❌ Not sent | N/A | Broken |
| processWorkOrder | N/A | ✅ Sent | ✅ Received | N/A | ✅ Complete |
| thinking | N/A | ❌ Not routed | ✅ Sent | ❌ Not emitted | Broken |
| response | N/A | ❌ Not routed | ✅ Sent | ❌ Not emitted | Broken |
| toolCall | N/A | ❌ Not routed | ✅ Sent | ❌ Not emitted | Broken |
Assessment: The forward path (IDE→Drone) is blocked at submitPrompt. The return path (Drone→IDE) has no routing logic.
2.2 gadget-code:web Implementation Gaps
Missing: submitPrompt Handler
File: gadget-code/src/lib/code-session.ts:58-60
async onSubmitPrompt(content: string): Promise<void> {
this.log.debug("prompt received", { content });
}
Required Implementation:
- Create ChatTurn document with status Processing
- Build work order from ChatSession, Project, IAiProvider, and prompt
- Find target drone's DroneSession via SocketService.getDroneSession()
- Emit processWorkOrder to drone with full context
- Drone acknowledges → update ChatTurn with drone job ID
Missing: Drone→IDE Event Routing
File: gadget-code/src/lib/drone-session.ts — no message handlers registered
Required Implementation:
// In DroneSession.register()
this.socket.on("thinking", this.onThinking.bind(this));
this.socket.on("response", this.onResponse.bind(this));
this.socket.on("toolCall", this.onToolCall.bind(this));
this.socket.on("workOrderComplete", this.onWorkOrderComplete.bind(this));
Each handler must:
- Find the corresponding CodeSession by chatSessionId
- Forward the event to the IDE socket
- Update the ChatTurn document with new data
Missing: ChatTurn Persistence Updates
File: gadget-code/src/models/chat-turn.ts exists but is not being updated during AWL execution.
Required: Create a TurnUpdateService that:
- Listens for streaming events (thinking, response, toolCall)
- Applies incremental updates to the active ChatTurn
- Handles token counting and duration tracking
2.3 gadget-drone Implementation Gaps
Missing: Work Order Acknowledgment Flow
File: gadget-drone/src/gadget-drone.ts:209-229
async onProcessWorkOrder(...) {
const order: IAgentWorkOrder = { ... };
cb(true); // accepts immediately
AgentService.process(order); // fires without waiting
}
Issue: AgentService.process() is fired without being awaited, so nothing handles the error if it throws, and no status update reaches the web service when processing fails.
Fix: Wrap in try/catch, emit error event to web service on failure.
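A minimal sketch of that fix follows, using stand-in types so it runs without Socket.IO. The event name workOrderError is hypothetical (the review only asks for an error event), and the run parameter stands in for AgentService.process().

```typescript
// Minimal stand-ins so the sketch runs without Socket.IO.
interface Emitter {
  emit(event: string, ...args: unknown[]): void;
}
interface WorkOrder {
  turnId: string;
}

async function handleWorkOrder(
  order: WorkOrder,
  socket: Emitter,
  ack: (accepted: boolean) => void,
  run: (order: WorkOrder) => Promise<void>, // stand-in for AgentService.process()
): Promise<void> {
  ack(true); // accept immediately, as the current handler does
  try {
    await run(order);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    // "workOrderError" is a hypothetical event name, not part of the current event map.
    socket.emit("workOrderError", order.turnId, message);
  }
}
```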
Missing: Socket Event Emissions
File: gadget-drone/src/services/agent.ts:70-98
The AWL loop has comments /* emit turn-tool-call socket message */ but no actual socket.emit() calls.
Required: Pass socket reference into AgentService.process() and emit:
- thinking when reasoning content arrives
- response when text content streams
- toolCall after each tool execution
- workOrderComplete when loop exits
Missing: Workspace Mode Management
File: gadget-drone/src/gadget-drone.ts:168-188
onRequestSessionLock sets workspaceMode = User but never transitions to Agent mode before processing. The AWL should:
- Emit requestWorkspaceMode(agent) before starting
- Wait for acknowledgment
- Run the loop
- Emit requestWorkspaceMode(idle) when complete
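The sequence above can be sketched as a small wrapper. The string values of WorkspaceMode and the promise-returning requestMode callback are assumptions standing in for the real enum and the acknowledged socket emit; the try/finally ensures the workspace returns to idle even when the loop throws.

```typescript
// Assumed string values; the real WorkspaceMode enum lives in @gadget/api.
type WorkspaceMode = "idle" | "user" | "agent";

// Stand-in for an acknowledged requestWorkspaceMode emit.
type RequestMode = (mode: WorkspaceMode) => Promise<boolean>;

async function runInAgentMode<T>(
  requestMode: RequestMode,
  loop: () => Promise<T>,
): Promise<T> {
  const granted = await requestMode("agent");
  if (!granted) throw new Error("agent mode was not acknowledged");
  try {
    return await loop();
  } finally {
    // Runs on success and on failure, so the workspace is never left in agent mode.
    await requestMode("idle");
  }
}
```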
2.4 Data Model Inconsistencies
ChatTurn Schema Mismatch
File: gadget-code/src/models/chat-turn.ts:70-76
Schema defines stats.thinkingTokens but interface IChatTurnStats in packages/api/src/interfaces/chat-turn.ts:24 uses thinkingTokenCount.
Fix: Standardize on thinkingTokenCount in both places.
Missing User Reference in Context Messages
File: gadget-drone/src/services/agent.ts:101-120
buildSessionContext(workOrder: IAgentWorkOrder): IContextChatMessage[] {
const user: IUser = workOrder.turn.session.user as IUser;
// ...
messages.push({
// ...
user: {
_id: user._id.toHexString(), // Breaks if session.user is ObjectId
username: user.email,
displayName: user.displayName,
},
});
}
Issue: workOrder.turn.session is typed as IChatSession | Types.ObjectId. If it's an ObjectId, accessing .user fails.
Fix: Populate session.user before creating work order, or fetch user separately in drone.
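Whichever option is chosen, the drone can fail fast with a clear message instead of a runtime property error. The sketch below uses a stand-in ObjectIdLike class in place of mongoose's Types.ObjectId and trimmed-down User and Session shapes; isPopulated and requireSessionUser are hypothetical helpers.

```typescript
// Stand-in for mongoose's Types.ObjectId so the sketch has no dependencies.
class ObjectIdLike {
  readonly hex: string;
  constructor(hex: string) {
    this.hex = hex;
  }
  toHexString(): string {
    return this.hex;
  }
}

// Trimmed-down shapes; the real interfaces are IUser and IChatSession.
interface User {
  _id: ObjectIdLike;
  email: string;
  displayName: string;
}
interface Session {
  _id: ObjectIdLike;
  user: User | ObjectIdLike;
}

// Type guard: true when the reference holds a populated document, not a bare ID.
function isPopulated<T extends object>(ref: T | ObjectIdLike): ref is T {
  return !(ref instanceof ObjectIdLike);
}

function requireSessionUser(session: Session | ObjectIdLike): User {
  if (!isPopulated<Session>(session)) {
    throw new Error("session must be populated before building context");
  }
  const user = session.user;
  if (!isPopulated<User>(user)) {
    throw new Error("session.user must be populated before building context");
  }
  return user;
}
```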
3. Conflicts and Redundancies
3.1 Documentation Conflicts
Work Order Interface Discrepancy:
- gadget-code/docs/agentic-workflow-loop.md:21-52 defines IWorkOrder with provider.apiKey and context: IChatMessage[]
- gadget-drone/docs/agentic-workflow-loop.md:15-62 defines IWorkOrder with provider.sdk and chatSession.context
- Actual implementation in gadget-drone/src/services/agent.ts:24-28 uses IAgentWorkOrder with turn: IChatTurn and context: IChatTurn[]
Resolution: Delete both markdown docs' interface definitions. Reference @gadget/api interfaces only. Update docs to match IAgentWorkOrder.
3.2 Bull Queue vs Socket.IO (Resolved)
Documentation states:
- gadget-drone/docs/agentic-workflow-loop.md:10-12: "Each Gadget Drone registered by the User implements a named Bull job queue"
- gadget-drone/AGENTS.md: "Queue: Bull queue named gadget-drone, job type prompt"
Reality: gadget-drone/src/gadget-drone.ts uses Socket.IO for work order delivery, not Bull. There's no Bull queue setup in the drone.
Decision: ✅ Option A (Socket.IO only) — Bull references are legacy and must be removed from all documentation.
Recovery from Drone Crash: Handled via workspace persistence in .gadget/ directory (see Section 7). When a drone restarts:
- It validates/creates .gadget/workspace.json with workspace UUID
- Web service reads workspace state to route retry to same directory
- Agent can resume from last persisted ChatTurn state
3.3 Redundant Service Layers
Observation: gadget-code/src/services/api-client.ts exists alongside direct Mongoose model usage.
Check: gadget-code/src/controllers/api/v1/drone.ts likely duplicates DroneService methods.
Action: Audit API controllers — if they just proxy service methods, remove and call services directly from Socket handlers.
4. Implementation Roadmap
Phase 1: Fix Type Errors (1-2 hours)
Task 1.1: Resolve IAiProvider conflict
// In gadget-drone/src/services/agent.ts
import { Types } from "mongoose";
import { IAiProvider as AiProviderConfig } from "@gadget/ai";
import { IAiProvider as DbAiProvider } from "@gadget/api";
// Mapper: convert the populated database document into the runtime config
function mapDbProviderToConfig(provider: DbAiProvider | Types.ObjectId): AiProviderConfig {
if (provider instanceof Types.ObjectId) {
throw new Error("Provider must be populated");
}
return {
_id: provider._id.toHexString(),
name: provider.name,
sdk: provider.apiType, // note: apiType → sdk
baseUrl: provider.baseUrl,
apiKey: provider.apiKey,
};
}
Task 1.2: Fix DroneStatus duplication
// Delete DroneStatus from gadget-drone/src/services/platform.ts
// and import it from @gadget/api instead
Task 1.3: Fix ChatTurnStats field names
// Align schema and interface on thinkingTokenCount
Phase 2: Implement Prompt Submission (3-4 hours)
Task 2.1: Implement CodeSession.onSubmitPrompt()
async onSubmitPrompt(content: string): Promise<void> {
const turn = new ChatTurn({
createdAt: new Date(),
user: this.user._id,
session: this.chatSession._id,
project: this.project?._id,
provider: this.chatSession.provider, // must populate
llm: this.chatSession.selectedModel,
mode: this.chatSession.mode,
status: ChatTurnStatus.Processing,
prompts: { user: content },
toolCalls: [],
stats: { /* zeros */ }
});
await turn.save();
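const droneSession = SocketService.getDroneSession(this.selectedDrone);
if (!droneSession) {
// No connected drone: fail the turn instead of leaving it in Processing
turn.status = ChatTurnStatus.Error;
await turn.save();
return;
}
droneSession.socket.emit(
"processWorkOrder",
this.selectedDrone, // drone registration tracked on the session (Task 2.2)
this.project,
this.chatSession,
turn,
async (success: boolean) => {
// The turn is already Processing; only a rejection needs a status change
if (!success) {
turn.status = ChatTurnStatus.Error;
await turn.save();
}
}
);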
}
Task 2.2: Add drone selection to CodeSession
- Track selectedDrone: IDroneRegistration
- Track chatSession: IChatSession
- Track project: IProject
Phase 3: Implement Event Routing (3-4 hours)
Task 3.1: Add DroneSession event handlers
// In DroneSession.register()
this.socket.on("thinking", (content: string) => this.onThinking(content));
this.socket.on("response", (content: string) => this.onResponse(content));
// callId comes first, per the Issue 3 fix to ToolCallMessage
this.socket.on("toolCall", (callId, name, params, response) =>
this.onToolCall(callId, name, params, response));
this.socket.on("workOrderComplete", (turnId, success, message) =>
this.onWorkOrderComplete(turnId, success, message));
Task 3.2: Implement routing logic
async onThinking(content: string): Promise<void> {
const codeSession = SocketService.getCodeSessionByChatSessionId(
this.chatSessionId
);
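// Forward only while the IDE session is still connected
codeSession?.socket.emit("thinking", content);
// Update ChatTurn (this overwrites the field; append instead if content arrives as deltas)
await ChatTurn.findByIdAndUpdate(this.currentTurnId, {
thinking: content
});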
}
Task 3.3: Add getCodeSessionByChatSessionId() to SocketService
- Maintain reverse index: chatSessionId → CodeSession
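A reverse index of this kind could be as simple as the sketch below. CodeSessionLike is a stand-in for the real CodeSession class, and bind and unbind are hypothetical method names; the key point is that the index is updated whenever a session attaches to or detaches from a chat session, so lookups never return a stale entry.

```typescript
// Stand-in for the real CodeSession class.
interface CodeSessionLike {
  socketId: string;
  chatSessionId: string | null;
}

class CodeSessionIndex {
  private readonly byChatSessionId = new Map<string, CodeSessionLike>();

  // Attach a session to a chat session, replacing any previous binding it had.
  bind(session: CodeSessionLike, chatSessionId: string): void {
    this.unbind(session);
    session.chatSessionId = chatSessionId;
    this.byChatSessionId.set(chatSessionId, session);
  }

  // Detach on disconnect or when the IDE switches chat sessions.
  unbind(session: CodeSessionLike): void {
    const current = session.chatSessionId;
    if (current !== null && this.byChatSessionId.get(current) === session) {
      this.byChatSessionId.delete(current);
    }
    session.chatSessionId = null;
  }

  getCodeSessionByChatSessionId(chatSessionId: string): CodeSessionLike | undefined {
    return this.byChatSessionId.get(chatSessionId);
  }
}
```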
Phase 4: Emit Events from AWL (2-3 hours)
Task 4.1: Pass socket into AgentService.process()
// In gadget-drone/src/gadget-drone.ts
await AgentService.process(order, this.socket);
Task 4.2: Add emissions to AWL loop
// In AgentService.process()
for await (const chunk of response.stream) {
if (chunk.type === "thinking") {
socket.emit("thinking", chunk.content);
} else if (chunk.type === "response") {
socket.emit("response", chunk.content);
}
}
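for (const toolCall of response.toolCalls) {
const result = await executeTool(toolCall);
// callId comes first, per the Issue 3 fix (assumes each tool call exposes a callId)
socket.emit("toolCall", toolCall.callId, toolCall.name, toolCall.arguments, result);
}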
socket.emit("workOrderComplete", turn._id, true);
Phase 5: Workspace Persistence (4-6 hours) ⚠️ CRITICAL PATH
Task 5.1: Create .gadget/ directory structure on drone startup
// In gadget-drone/src/gadget-drone.ts, before registration
async validateWorkspace(): Promise<void> {
const gadgetDir = path.join(process.cwd(), '.gadget');
const workspaceFile = path.join(gadgetDir, 'workspace.json');
if (!fs.existsSync(gadgetDir)) {
await fs.promises.mkdir(gadgetDir, { recursive: true });
}
let workspaceData: WorkspaceData;
if (fs.existsSync(workspaceFile)) {
// Validate existing workspace
workspaceData = JSON.parse(await fs.promises.readFile(workspaceFile, 'utf-8'));
this.log.info('validated existing workspace', {
workspaceId: workspaceData.workspaceId
});
} else {
// Create new workspace
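workspaceData = {
workspaceId: crypto.randomUUID(),
createdAt: new Date().toISOString(),
hostname: os.hostname(), // node:os; required by WorkspaceData (Section 7.3)
workspaceDir: process.cwd(),
projects: [],
chatSession: null,
lockedProject: null,
registration: null, // set after PlatformService.register()
};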
await fs.promises.writeFile(workspaceFile, JSON.stringify(workspaceData, null, 2));
this.log.info('created new workspace', {
workspaceId: workspaceData.workspaceId
});
}
this.workspaceData = workspaceData;
}
Task 5.2: Write work order cache during processing
// In onProcessWorkOrder()
async onProcessWorkOrder(...) {
const workOrderFile = path.join(this.gadgetDir, 'work-order.json');
// Write cache BEFORE processing
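await fs.promises.writeFile(workOrderFile, JSON.stringify({
turnId: turn._id.toHexString(),
chatSessionId: chatSession._id.toHexString(),
projectId: project._id.toHexString(),
workOrderId: crypto.randomUUID(), // assumption: any unique ID per work order instance
receivedAt: new Date().toISOString(),
prompt: turn.prompts.user, // read back by checkCrashRecovery()
status: 'processing',
}, null, 2));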
try {
await AgentService.process(order, this.socket);
} finally {
// Remove cache AFTER completion
await fs.promises.unlink(workOrderFile);
}
}
Task 5.3: Update drone registration to include workspaceId
// In PlatformService.register()
interface IDroneDefinition {
hostname: string;
workspaceDir: string;
workspaceId: string; // NEW: persistent workspace identifier
}
Task 5.4: Web service stores workspaceId with ChatSession
// In packages/api/src/interfaces/chat-session.ts
export interface IChatSession extends Document {
// ... existing fields ...
workspaceId: string; // NEW: route retries to correct workspace
}
Phase 6: End-to-End Test (2 hours)
Test Scenario:
- Start drone: pnpm --filter gadget-drone dev
- Start web: pnpm --filter gadget-code dev:backend
- Start IDE: pnpm --filter gadget-code dev:frontend
- Login, create project, select drone
- Submit prompt: "Create a hello world function"
- Verify:
  - ChatTurn created in MongoDB
  - Drone receives processWorkOrder
  - IDE receives thinking/response events
  - ChatTurn updated with results
Test Drone Recovery:
- Kill drone mid-turn (Ctrl+C)
- Verify .gadget/work-order.json exists with turn data
- Restart drone in same directory
- Verify drone reports workspaceId to web service
- Web service can route retry to same workspace
5. Risk Assessment
High Risk
- No Streaming in @gadget/ai
  - AiApi.chat() returns Promise<IAiChatResponse>, not async iterable
  - Cannot stream tokens in real-time without refactoring
  - Mitigation: Implement the streamCallback parameter (already present in the signature) in the Ollama/OpenAI clients
- No Error Propagation
  - If drone crashes mid-turn, IDE hangs forever
  - Mitigation: Add timeout + heartbeat mechanism
- No Workspace Persistence Layer ⚠️ CRITICAL
  - Drone restart loses all context: which workspace, which projects, which chat session
  - Cannot retry work orders without knowing original workspace directory
  - Mitigation: Implement .gadget/ directory persistence (see Section 7)
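For the timeout + heartbeat mitigation, one option is a watchdog on the web service that records the last event seen for each active turn and reports turns that have gone quiet. The TurnWatchdog class and its method names are hypothetical, and timestamps are passed in explicitly so the logic stays independent of timers and sockets.

```typescript
// Hypothetical watchdog: tracks the last sign of life per active turn.
class TurnWatchdog {
  private readonly lastSeen = new Map<string, number>();
  private readonly timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Call on every heartbeat or streaming event received from the drone.
  beat(turnId: string, nowMs: number): void {
    this.lastSeen.set(turnId, nowMs);
  }

  // Call when workOrderComplete arrives so the turn is no longer watched.
  complete(turnId: string): void {
    this.lastSeen.delete(turnId);
  }

  // Returns turns whose last sign of life is older than the timeout.
  expired(nowMs: number): string[] {
    const stale: string[] = [];
    this.lastSeen.forEach((seenAt, turnId) => {
      if (nowMs - seenAt > this.timeoutMs) stale.push(turnId);
    });
    return stale;
  }
}
```

A periodic check could then mark each expired turn as failed and notify the IDE, so the user sees an error instead of an indefinite spinner.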
Medium Risk
- Session State Not Persisted
  - CodeSession and DroneSession are in-memory
  - Server restart loses all active sessions
  - Mitigation: Store session state in Redis
- No Concurrency Control
  - Multiple prompts can queue for same drone
  - Drone processes one at a time but doesn't reject extras
  - Mitigation: Check DroneStatus.Busy before accepting work
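The busy check can be sketched as a small gate the drone consults before acknowledging a work order. WorkOrderGate is a hypothetical name, and the string statuses stand in for the DroneStatus enum (the values busy and available match the workspace.json example in Section 7.3).

```typescript
// String stand-ins for DroneStatus.Available and DroneStatus.Busy.
type DroneStatusValue = "available" | "busy";

class WorkOrderGate {
  private status: DroneStatusValue = "available";

  get current(): DroneStatusValue {
    return this.status;
  }

  // Returns false when a work order is already running, so the caller can cb(false).
  tryAccept(): boolean {
    if (this.status === "busy") return false;
    this.status = "busy";
    return true;
  }

  // Call from a finally block once the work order completes or fails.
  release(): void {
    this.status = "available";
  }
}
```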
Low Risk
- TypeScript Strict Mode Violations
  - Several uses of any and missing null checks
  - Build passes but runtime errors possible
  - Mitigation: Enable noUncheckedIndexedAccess in drone
6. Recommended Next Steps
- Fix TypeScript errors in gadget-drone/src/services/agent.ts (Phase 1)
- Implement submitPrompt handler (Phase 2, Task 2.1)
- Add basic event routing (Phase 3, minimal viable path)
- Test end-to-end with stubbed tool calls
- Iterate on streaming, error handling, and persistence
Appendix A: File Inventory
Core Socket Implementation
- gadget-code/src/services/socket.ts: Socket.IO server setup ✅
- gadget-code/src/lib/socket-session.ts: Base session class ✅
- gadget-code/src/lib/code-session.ts: IDE session (partial)
- gadget-code/src/lib/drone-session.ts: Drone session (minimal)
- gadget-drone/src/gadget-drone.ts: Drone client ✅
Data Models
- packages/api/src/interfaces/*.ts: TypeScript interfaces ✅
- gadget-code/src/models/*.ts: Mongoose schemas ✅
- gadget-drone/src/models/: None (drone is stateless)
Message Definitions
- packages/api/src/messages/socket.ts: Event map ✅
- packages/api/src/messages/ide.ts: IDE→Web messages ✅
- packages/api/src/messages/drone.ts: Drone messages (incomplete)
AI Integration
- packages/ai/src/api.ts: AI interface ✅
- packages/ai/src/ollama.ts: Ollama client ✅
- packages/ai/src/openai.ts: OpenAI client ✅
- gadget-drone/src/services/ai.ts: AI service wrapper ✅
- gadget-drone/src/services/agent.ts: AWL implementation (partial)
Appendix B: Build Status
| Package | Build Status | Notes |
|---|---|---|
| @gadget/api | ✅ Passes | Type definitions only |
| @gadget/ai | ✅ Passes | AI SDK abstraction |
| gadget-code | ✅ Passes | Web server builds |
| gadget-drone | ❌ Fails | Type errors in agent.ts:74,102 |
Blocking Errors:
src/services/agent.ts(74,9): Argument of type 'ObjectId | IAiProvider'
is not assignable to parameter of type 'IAiProvider'.
src/services/agent.ts(102,48): Property 'user' does not exist on type
'ObjectId | IChatSession'.
7. Workspace Persistence Architecture
7.1 Design Goals
- No External Dependencies: End users should not need to run Redis, MongoDB, or other infrastructure just to run gadget-drone
- Crash Recovery: When a drone crashes mid-work-order, it must be able to resume in the same workspace with the same project state
- Workspace Identity: Each workspace directory needs a persistent, unique identifier that survives drone restarts
- State Visibility: Both the drone and web service must be able to inspect workspace state at any time
7.2 Directory Structure
<workspace-directory>/
├── .gadget/
│ ├── workspace.json # Persistent workspace identity & state
│ ├── work-order.json # Active work order cache (deleted when complete)
│ └── logs/
│ └── drone.log # Drone execution logs
├── <project-slug-1>/ # Project directories managed by this workspace
├── <project-slug-2>/
└── ...
7.3 File Specifications
.gadget/workspace.json
Created: When drone starts in a directory (new or existing workspace)
Updated: When chat session lock acquired/released, projects added/removed
Deleted: Never (only if user manually deletes workspace)
interface WorkspaceData {
workspaceId: string; // UUID v4, immutable once created
createdAt: string; // ISO 8601 timestamp
hostname: string; // Machine hostname where drone runs
workspaceDir: string; // Absolute path to workspace directory
// Active session state (null when idle)
chatSession: {
_id: string; // MongoDB ChatSession._id
name: string; // Session name for display
lockedAt: string; // ISO 8601 timestamp
} | null;
// Project currently being worked on (null when idle)
lockedProject: {
_id: string; // MongoDB Project._id
slug: string; // Project slug (directory name)
gitUrl: string; // Remote git URL
lockedAt: string; // ISO 8601 timestamp
} | null;
// All projects cloned into this workspace
projects: Array<{
_id: string;
slug: string;
gitUrl: string;
clonedAt: string;
lastSyncAt: string;
}>;
// Drone registration (updated each startup)
registration: {
_id: string; // MongoDB DroneRegistration._id
status: string; // Current drone status
registeredAt: string; // ISO 8601 timestamp
} | null;
}
Example:
{
"workspaceId": "550e8400-e29b-41d4-a716-446655440000",
"createdAt": "2026-04-29T19:30:00.000Z",
"hostname": "rob-dev-machine",
"workspaceDir": "/home/rob/projects/my-gadget-workspace",
"chatSession": {
"_id": "65f8a9b2c3d4e5f6a7b8c9d0",
"name": "Fix authentication bug",
"lockedAt": "2026-04-29T20:15:00.000Z"
},
"lockedProject": {
"_id": "65f8a9b2c3d4e5f6a7b8c9d1",
"slug": "auth-service",
"gitUrl": "https://github.com/user/auth-service.git",
"lockedAt": "2026-04-29T20:15:00.000Z"
},
"projects": [
{
"_id": "65f8a9b2c3d4e5f6a7b8c9d1",
"slug": "auth-service",
"gitUrl": "https://github.com/user/auth-service.git",
"clonedAt": "2026-04-29T19:30:00.000Z",
"lastSyncAt": "2026-04-29T20:15:00.000Z"
}
],
"registration": {
"_id": "65f8a9b2c3d4e5f6a7b8c9d2",
"status": "busy",
"registeredAt": "2026-04-29T19:30:00.000Z"
}
}
.gadget/work-order.json
Created: When processWorkOrder message received
Updated: Not updated (immutable cache)
Deleted: When work order completes (success or error)
interface WorkOrderCache {
turnId: string; // ChatTurn._id for persistence updates
chatSessionId: string; // For routing events back to IDE
projectId: string; // For file operations
workOrderId: string; // Unique ID for this work order instance
receivedAt: string; // ISO 8601 timestamp
prompt: string; // User's prompt (for retry context)
status: 'processing' | 'completed' | 'error';
error?: string; // Error message if status === 'error'
}
Purpose: If drone crashes while this file exists, the web service knows:
- Which ChatTurn was being processed
- Which workspace to route the retry to
- What prompt needs to be re-processed
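Because a crash can leave the cache file truncated or partially written, the drone may want to validate it before trusting its contents. The sketch below repeats the WorkOrderCache interface so it is self-contained; parseWorkOrderCache is a hypothetical helper that returns null for anything that does not match the specification above.

```typescript
interface WorkOrderCache {
  turnId: string;
  chatSessionId: string;
  projectId: string;
  workOrderId: string;
  receivedAt: string;
  prompt: string;
  status: "processing" | "completed" | "error";
  error?: string;
}

// Hypothetical helper: returns the parsed cache, or null if the file is unusable.
function parseWorkOrderCache(json: string): WorkOrderCache | null {
  let value: unknown;
  try {
    value = JSON.parse(json);
  } catch {
    return null; // truncated or corrupt file
  }
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;

  const requiredStrings = [
    "turnId",
    "chatSessionId",
    "projectId",
    "workOrderId",
    "receivedAt",
    "prompt",
  ];
  if (!requiredStrings.every((key) => typeof record[key] === "string")) return null;

  const status = record["status"];
  if (status !== "processing" && status !== "completed" && status !== "error") return null;

  const error = record["error"];
  if (error !== undefined && typeof error !== "string") return null;

  return record as unknown as WorkOrderCache;
}
```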
7.4 Drone Startup Sequence
// Pseudocode for gadget-drone.ts startup
async start(): Promise<void> {
// Step 1: Validate/create workspace (BEFORE anything else)
await this.validateWorkspace();
// Step 2: Get user credentials
const credentials = await this.getUserCredentials();
// Step 3: Register with platform (includes workspaceId)
this.registration = await PlatformService.register(
credentials.email,
credentials.password,
process.cwd(),
this.workspaceData.workspaceId, // NEW parameter
);
// Step 4: Update workspace.json with registration
this.workspaceData.registration = {
_id: this.registration._id.toHexString(),
status: 'starting',
registeredAt: new Date().toISOString(),
};
await this.writeWorkspaceData();
// Step 5: Connect Socket.IO
await this.connectSocket();
// Step 6: Check for incomplete work order (crash recovery)
await this.checkCrashRecovery();
// Step 7: Mark as available
await PlatformService.setStatus(DroneStatus.Available);
this.workspaceData.registration!.status = 'available';
await this.writeWorkspaceData();
}
async checkCrashRecovery(): Promise<void> {
const workOrderFile = path.join(this.gadgetDir, 'work-order.json');
if (fs.existsSync(workOrderFile)) {
const cache = JSON.parse(await fs.promises.readFile(workOrderFile, 'utf-8'));
this.log.warn('incomplete work order found - crash recovery needed', {
turnId: cache.turnId,
prompt: cache.prompt,
});
// Notify web service that this workspace has pending recovery
this.socket.emit('requestCrashRecovery', {
workspaceId: this.workspaceData.workspaceId,
turnId: cache.turnId,
chatSessionId: cache.chatSessionId,
});
// DO NOT delete work-order.json yet - wait for web service instruction
}
}
7.5 Web Service: Crash Recovery Flow
When web service receives requestCrashRecovery:
- Fetch ChatTurn by turnId
- Check Turn Status:
  - If status === 'finished': Acknowledge, tell drone to delete cache (turn completed before crash notification)
  - If status === 'processing': Queue retry for this workspace
- Route Retry: When retrying, filter drones by workspaceId to ensure same workspace handles it
- Acknowledge: Tell drone it can delete work-order.json
// In gadget-code/src/lib/drone-session.ts
async onRequestCrashRecovery(data: {
workspaceId: string;
turnId: string;
chatSessionId: string;
}): Promise<void> {
const turn = await ChatTurn.findById(data.turnId);
if (!turn) {
this.socket.emit('crashRecoveryResponse', {
turnId: data.turnId,
action: 'discard', // Turn doesn't exist, delete cache
});
return;
}
if (turn.status === ChatTurnStatus.Finished) {
this.socket.emit('crashRecoveryResponse', {
turnId: data.turnId,
action: 'discard', // Already done, delete cache
});
return;
}
// Turn is still processing - mark for retry
turn.status = ChatTurnStatus.Error;
turn.response = 'Drone crashed during processing - retrying';
await turn.save();
this.socket.emit('crashRecoveryResponse', {
turnId: data.turnId,
action: 'retry',
retryDelay: 5000, // Wait 5 seconds before retry
});
// Schedule retry (will route to same workspaceId)
setTimeout(() => {
this.retryWorkOrder(turn);
}, 5000);
}
7.6 Workspace-Aware Drone Selection
When selecting a drone for a work order:
// In gadget-code/src/lib/code-session.ts
async onSubmitPrompt(content: string): Promise<void> {
// ... create ChatTurn ...
// Prefer drone in same workspace (for continuity)
let targetDrone: DroneSession | undefined;
if (this.chatSession.workspaceId) {
// Try to find drone in same workspace
targetDrone = SocketService.getDroneSessionByWorkspaceId(
this.chatSession.workspaceId
);
if (!targetDrone) {
this.log.warn('workspace drone unavailable, selecting alternative');
// Fall through to any available drone
}
}
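if (!targetDrone) {
// Select any available drone for this user
targetDrone = SocketService.getAvailableDroneForUser(this.user);
}
if (!targetDrone) {
// Nothing to route to: fail loudly instead of dereferencing undefined
throw new Error('no drone available for this user');
}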
// Include workspaceId in work order for persistence
targetDrone.socket.emit('processWorkOrder', {
// ... existing fields ...
workspaceId: this.chatSession.workspaceId,
});
}
7.7 Implementation Checklist
- Create WorkspaceService in gadget-drone/src/services/workspace.ts
- Implement validateWorkspace() and writeWorkspaceData()
- Update PlatformService.register() to accept workspaceId
- Add workspaceId field to IDroneRegistration interface and model
- Add workspaceId field to IChatSession interface and model
- Implement work-order.json cache write/remove in onProcessWorkOrder()
- Implement requestCrashRecovery socket handler in web service (the drone emits it, per Section 7.4)
- Implement crashRecoveryResponse socket handler in drone (the web service emits it, per Section 7.5)
- Add workspace-aware drone selection in CodeSession.onSubmitPrompt()
- Remove all Bull queue references from documentation
Document Status: Complete
Next Review: After Phase 2 implementation