Rob Colbert 07a760c7b5 feat: add numPredict, numCtx, maxCompletionTokens to model config pipeline
Fixes premature AI API response truncation by propagating inference
parameters through the entire probe → storage → runtime → API call chain.

Root cause: Ollama defaults num_predict to 128 tokens and num_ctx to
4096, silently truncating output and context. We never overrode these.

Changes:
- IAiModelSettings: add numPredict, maxCompletionTokens fields
- IDroneModelConfig: moved from gadget-drone to @gadget/api (shared),
  expanded with numPredict, numCtx, maxCompletionTokens params
- IAiModelConfig.params: add numPredict, numCtx, maxCompletionTokens
- IAiModelProbeResult.settings: add numPredict, maxCompletionTokens
- AiModelSettingsSchema (Mongoose): add numPredict, maxCompletionTokens
- Ollama extractSettings(): extract num_predict from model parameters
- Ollama generate()/chat(): pass options: { num_ctx, num_predict }
- OpenAI all three create() calls: add max_completion_tokens
- web-cli.ts onProviderProbe(): compute numPredict (-1 for Ollama)
  and maxCompletionTokens (contextWindow for OpenAI) during probe
- agent.ts main + subagent loops: read model settings from provider
  cached models, build IDroneModelConfig with stored params
- ai.ts: remove local IDroneModelConfig, import from @gadget/api
- chat-session.ts: add new params to title generation call
- Tests: update all fixtures with new params, all 19 tests pass

Defaults when model settings unavailable:
- numPredict: -1 (Ollama unlimited - generate until natural stop)
- numCtx: 131072 (128k - covers most modern models)
- maxCompletionTokens: 16384 (16k - reasonable OpenAI default)
2026-05-11 13:50:19 -04:00

@gadget/ai

Gadget Code's AI API abstraction layer. Provides a single internal API contract for calling AI providers (Ollama, OpenAI) without consumer code knowing which provider is configured.

Principles

  1. One interface, all providers. Consumer code calls createAiApi() once and holds the resulting AiApi. It never checks provider.sdk again.
  2. All AI SDK knowledge is contained here. No consumer imports ollama or openai SDKs directly.
  3. Responses are normalized. All provider responses are translated to Gadget Code's internal interface types before returning.

Usage

import { createAiApi } from "@gadget/ai";

const provider = {
  _id: "local-ollama",
  name: "Local Ollama",
  sdk: "ollama",
  baseUrl: "http://localhost:11434",
  apiKey: "",
};

const modelConfig = {
  provider,
  modelId: "llama3.2",
  params: {
    reasoning: false,
    temperature: 0.8,
    topP: 0.9,
    topK: 40,
  },
};

const ai = createAiApi(provider); // optionally pass an IAiLogger as the second argument

const result = await ai.generate(modelConfig, {
  prompt: "Explain what this code does",
  systemPrompt: "You are a code reviewer.",
});
console.log(result.response);
console.log(result.stats.duration.text); // formatted, e.g. "00:00:02"

API

Factory

createAiApi(provider, logger?) — Returns an AiApi instance for the given provider. logger is optional and defaults to a no-op logger. Pass your own logger to receive debug output.
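As an illustration, a minimal logger that satisfies the IAiLogger shape (the method names come from the Interfaces section below; the single-string-argument signatures are an assumption — check the exported interface):

```typescript
// Minimal IAiLogger-shaped object. Method names (debug/info/warn/error)
// match the Interfaces section; taking a single message string is assumed.
const lines: string[] = [];

const logger = {
  debug: (msg: string) => lines.push(`[debug] ${msg}`),
  info: (msg: string) => lines.push(`[info] ${msg}`),
  warn: (msg: string) => lines.push(`[warn] ${msg}`),
  error: (msg: string) => lines.push(`[error] ${msg}`),
};

// const ai = createAiApi(provider, logger); // debug output now flows here

logger.debug("probe complete");
console.log(lines[0]); // "[debug] probe complete"
```

Because the default is a no-op logger, an object like this is only needed when you want to see the library's internal debug output.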

AiApi

Abstract base class. Currently implemented:

  • OllamaAiApi — Ollama provider
  • OpenAiApi — OpenAI provider (stubbed)

ai.generate(model, options, streamCallback?)

Single-prompt generation. Returns IAiGenerateResponse.

ai.chat(model, options, streamCallback?)

Chat with conversation history. Pass options.context for multi-turn conversations. Returns IAiChatResponse.
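A sketch of a multi-turn call. The `{ role, content }` message shape and the `context` option name are assumed from this section — verify against IAiChatOptions. A local stub stands in for the real AiApi so the snippet is self-contained:

```typescript
// The { role, content } message shape is assumed; check IAiChatOptions.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const ai = {
  // Stand-in for OllamaAiApi.chat(); the real method calls the provider
  // and returns a normalized IAiChatResponse.
  async chat(_model: unknown, options: { context: ChatMessage[] }) {
    return { response: `echo of ${options.context.length} messages` };
  },
};

const context: ChatMessage[] = [
  { role: "user", content: "What does parseConfig() do?" },
  { role: "assistant", content: "It loads and validates the config file." },
  { role: "user", content: "Is it safe to call twice?" },
];

ai.chat({ modelId: "llama3.2" }, { context }).then((reply) => {
  console.log(reply.response); // "echo of 3 messages"
});
```

With the real AiApi, appending each assistant reply back onto `context` before the next call is what carries the conversation forward.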

Interfaces

All interfaces are exported for use by consumers:

  • IAiProvider — AI provider configuration
  • IAiModelConfig — Model + runtime parameters
  • IAiGenerateOptions / IAiGenerateResponse
  • IAiChatOptions / IAiChatResponse — includes tool_calls for function-calling models
  • IAiInferenceStats — token counts and duration (both raw seconds number and formatted text string)
  • IAiLogger — injectable logger interface (debug, info, warn, error)

Providers

Ollama

Configured via IAiProvider with sdk: "ollama". Uses the ollama npm package. Handles streaming responses and normalizes Ollama-specific response fields (thinking tokens, token counts, duration).

OpenAI

Configured via IAiProvider with sdk: "openai". Stubbed — chat() and generate() throw "Not yet implemented". Implement by wiring the openai npm package following the same pattern as OllamaAiApi.

Duration Formatting

The library uses numeral to provide a consistent formatted duration string (stats.duration.text) in hh:mm:ss format. The raw duration in seconds is also returned in stats.duration.seconds for consumers that need the number (Ollama's nanosecond durations are normalized to seconds first).
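For reference, a dependency-free sketch of hh:mm:ss formatting equivalent to what a numeral "00:00:00"-style format produces — this is an illustration of the output shape, not the library's actual implementation:

```typescript
// Format a duration in seconds as hh:mm:ss, matching the shape of
// stats.duration.text. Standalone sketch; the library itself uses numeral.
function formatDuration(totalSeconds: number): string {
  const s = Math.floor(totalSeconds);
  const hh = String(Math.floor(s / 3600)).padStart(2, "0");
  const mm = String(Math.floor((s % 3600) / 60)).padStart(2, "0");
  const ss = String(s % 60).padStart(2, "0");
  return `${hh}:${mm}:${ss}`;
}

console.log(formatDuration(2.4));  // "00:00:02"
console.log(formatDuration(3725)); // "01:02:05"
```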

Adding a New Provider

  1. Create packages/ai/src/<provider>.ts — extend AiApi, implement all abstract methods
  2. Update packages/ai/src/index.ts — add the new class to the createAiApi factory switch
  3. Update this README

No consumer code changes required.
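Steps 1 and 2 can be sketched as follows. All names here are stand-ins: the real AiApi base class and createAiApi factory live inside the package, and "example" is a hypothetical provider, not one the library ships:

```typescript
// Stand-in declarations so the sketch is self-contained; the real AiApi
// abstract class and createAiApi factory live in packages/ai/src.
interface Provider { sdk: string; baseUrl: string }

abstract class AiApi {
  constructor(protected provider: Provider) {}
  abstract generate(model: unknown, options: unknown): Promise<unknown>;
  abstract chat(model: unknown, options: unknown): Promise<unknown>;
}

// Step 1: packages/ai/src/example.ts — extend AiApi and implement all
// abstract methods (hypothetical "example" provider).
class ExampleAiApi extends AiApi {
  async generate(): Promise<unknown> {
    return { response: "stub" }; // wire the provider's SDK here
  }
  async chat(): Promise<unknown> {
    return { response: "stub" };
  }
}

// Step 2: packages/ai/src/index.ts — add the class to the factory switch.
function createAiApi(provider: Provider): AiApi {
  switch (provider.sdk) {
    case "example":
      return new ExampleAiApi(provider);
    default:
      throw new Error(`Unknown AI SDK: ${provider.sdk}`);
  }
}

const ai = createAiApi({ sdk: "example", baseUrl: "http://localhost" });
console.log(ai instanceof ExampleAiApi); // true
```

Because consumers only ever hold the AiApi returned by the factory, a new case in this switch is the whole integration surface from their point of view.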