Reasoning Effort
Status: ✅ IMPLEMENTED
Last Updated: May 8, 2026
Overview
Reasoning effort controls how much an AI model "thinks" before responding. Models with reasoning capabilities (like DeepSeek-R1, QwQ, OpenAI o1/o3) can produce internal chain-of-thought tokens before generating their final answer. The reasoning effort setting lets users balance between speed and thoroughness.
User Setting
The reasoning effort is configured per chat session via a dropdown in the Session sidebar:
| Value | Effect |
|---|---|
| Off | No thinking output. Model responds immediately. |
| Low | Minimal thinking. Faster responses, less depth. |
| Medium | Balanced thinking. Default reasoning depth. |
| High | Maximum thinking. Slower but more thorough. |
The dropdown is disabled when the selected model does not have hasThinking: true in its capabilities.
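The gating rule above can be expressed as a one-line check. This sketch assumes a minimal model shape with a `capabilities.hasThinking` flag; the helper name is illustrative, not from the codebase.

```typescript
// Sketch: the Reasoning dropdown is enabled only when the selected
// model explicitly advertises thinking support. Absent or partial
// capability data disables the control.
function isReasoningDropdownEnabled(model: {
  capabilities?: { hasThinking?: boolean };
}): boolean {
  return model.capabilities?.hasThinking === true;
}
```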
Data Flow
User selects "High" in Reasoning dropdown
→ PUT /api/v1/chat-sessions/:id { reasoningEffort: "high" }
→ Stored in MongoDB ChatSession.reasoningEffort
→ When creating a turn:
ChatTurn.reasoningEffort = ChatSession.reasoningEffort (snapshotted)
→ Drone receives work order with populated turn
→ agent.ts reads turn.reasoningEffort, maps "off" → false
→ Passes to AiService.chat() as params.reasoning
→ Provider SDK receives the appropriate parameter
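The mapping step in the flow above can be isolated as a pure function. The types mirror the definitions later in this document; the helper name `toProviderReasoning` is illustrative.

```typescript
// Maps the user-facing session setting to the value the provider
// layer expects: "off" becomes false, other levels pass through.
type ReasoningEffort = "off" | "low" | "medium" | "high";
type ProviderReasoning = boolean | "low" | "medium" | "high";

function toProviderReasoning(
  effort: ReasoningEffort | undefined,
): ProviderReasoning {
  const value = effort ?? "off"; // default when the turn carries no snapshot
  return value === "off" ? false : value;
}
```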
Provider Mapping
Each AI provider uses a different parameter name for reasoning effort. The @gadget/ai abstraction handles the translation:
| Provider | Parameter | Values |
|---|---|---|
| Ollama | think | false, "low", "medium", "high" |
| OpenAI | reasoning_effort | "low", "medium", "high" |
Mapping Logic (in gadget-drone/src/services/agent.ts)
const reasoningEffort = turn.reasoningEffort || "off";
const reasoning: boolean | "low" | "medium" | "high" =
reasoningEffort === "off" ? false : reasoningEffort;
"off"→false(disables thinking entirely)"low"→"low"(minimal thinking)"medium"→"medium"(balanced)"high"→"high"(maximum thinking)
Ollama Implementation (packages/ai/src/ollama.ts)
const response = await this.client.chat({
model: model.modelId,
messages,
stream: true,
think: model.params.reasoning, // boolean | "low" | "medium" | "high"
});
When think is false, the Ollama SDK disables thinking. When set to a string level, the model allocates corresponding effort.
OpenAI Implementation (packages/ai/src/openai.ts)
const response = await this.client.chat.completions.create({
model: model.modelId,
messages,
tools,
stream: true,
...(typeof model.params.reasoning === "string"
? { reasoning_effort: model.params.reasoning }
: {}),
});
The reasoning_effort parameter is only passed when the value is a string ("low", "medium", "high"). When false, the parameter is omitted — standard non-reasoning models would reject it.
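The conditional-spread pattern used above can be demonstrated in isolation: spreading an empty object adds no key, so reasoning_effort is genuinely absent for non-reasoning requests rather than present with an undefined value. `buildParams` and the model id here are placeholders for illustration.

```typescript
// Demonstrates the omit-vs-include behavior of the conditional spread:
// a false reasoning value produces a request body with no
// reasoning_effort key at all.
function buildParams(reasoning: boolean | "low" | "medium" | "high") {
  return {
    model: "example-model", // placeholder id, not from the doc
    ...(typeof reasoning === "string" ? { reasoning_effort: reasoning } : {}),
  };
}
```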
Streaming Thinking Content
When reasoning effort is enabled and the model produces thinking tokens, they are streamed back in real-time:
- Provider SDK emits thinking tokens in stream chunks
- Provider implementation (ollama.ts / openai.ts) maps them to IAiStreamChunk with type: 'thinking'
- Drone forwards via Socket.IO as thinking(content) events
- Frontend renders thinking content in distinct muted blocks
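The chunk flow above can be sketched with a minimal discriminated union. The two-variant IAiStreamChunk shape and the `routeChunk` forwarder are assumptions for illustration; the real interface may carry more variants and fields.

```typescript
// Minimal sketch of the stream chunk union: thinking tokens and
// regular content travel through the same pipeline, distinguished
// only by the type tag.
type IAiStreamChunk =
  | { type: "thinking"; data: string }
  | { type: "content"; data: string };

// Hypothetical drone-side forwarder: each chunk becomes one Socket.IO
// event named after its type, carrying the text as payload.
function routeChunk(
  chunk: IAiStreamChunk,
  emit: (event: "thinking" | "content", data: string) => void,
): void {
  emit(chunk.type, chunk.data);
}
```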
Thinking Chunk Handling
Ollama:
if (chunk.message.thinking) {
await streamCallback({
type: 'thinking',
data: chunk.message.thinking,
});
}
OpenAI:
if ('reasoning' in delta && delta.reasoning) {
await streamCallback({
type: 'thinking',
data: delta.reasoning as string,
});
}
Type Definitions
ReasoningEffort (in packages/api/src/interfaces/chat-session.ts)
export type ReasoningEffort = "off" | "low" | "medium" | "high";
IAiModelConfig.params.reasoning (in packages/ai/src/api.ts)
params: {
reasoning: boolean | "high" | "medium" | "low";
// ...
}
Note: The IAiModelConfig type uses boolean | "high" | "medium" | "low" (no "off"). The "off" value from the user-facing setting is mapped to false before reaching the AI provider layer.
Mongoose Schema
ChatSession (gadget-code/src/models/chat-session.ts):
reasoningEffort: {
type: String,
enum: ["off", "low", "medium", "high"],
default: "off",
}
ChatTurn (gadget-code/src/models/chat-turn.ts):
reasoningEffort: {
type: String,
enum: ["off", "low", "medium", "high"],
default: "off",
}
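The snapshot step from the Data Flow section follows directly from these schemas: the turn copies the session's current setting at creation time, so later edits to the session do not change in-flight turns. `createTurn` below is an illustrative sketch; the field name and the "off" default mirror the schemas above.

```typescript
// Sketch: snapshotting the session-level setting onto a new turn.
type ReasoningEffort = "off" | "low" | "medium" | "high";

function createTurn(session: {
  reasoningEffort?: ReasoningEffort;
}): { reasoningEffort: ReasoningEffort } {
  // Mirrors the Mongoose default: absent values fall back to "off".
  return { reasoningEffort: session.reasoningEffort ?? "off" };
}
```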
Model Capability Detection
The hasThinking capability is detected during model probing:
- Ollama: checks if the model capabilities array includes "reasoning"
- OpenAI: checks if the model features include "reasoning_effort", with fallback detection by model ID (o1, o3, reasoning)
The frontend uses this capability flag to enable/disable the Reasoning dropdown.
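The probing rules above can be sketched as a single predicate. The probe input shape (capabilities / features arrays) is an assumption about what the SDK responses provide, and the regex is one possible reading of the fallback-by-id rule; `detectHasThinking` is illustrative, not the actual implementation.

```typescript
// Sketch of hasThinking detection: Ollama relies on its capabilities
// array, OpenAI on a feature flag with a model-id fallback.
function detectHasThinking(
  provider: "ollama" | "openai",
  info: { capabilities?: string[]; features?: string[]; modelId: string },
): boolean {
  if (provider === "ollama") {
    return info.capabilities?.includes("reasoning") ?? false;
  }
  if (info.features?.includes("reasoning_effort")) return true;
  // Fallback by model id: o1/o3 families or an explicit "reasoning" tag.
  return /\b(o1|o3)\b|reasoning/i.test(info.modelId);
}
```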
Related Documentation
- Streaming Responses — How thinking tokens are streamed to the IDE
- Socket Protocol — Socket.IO event definitions
- Architecture — Overall system architecture