Commit Graph

12 Commits

Author SHA1 Message Date
Rob Colbert
07a760c7b5 feat: add numPredict, numCtx, maxCompletionTokens to model config pipeline
Fixes premature AI API response truncation by propagating inference
parameters through the entire probe → storage → runtime → API call chain.

Root cause: Ollama defaults num_predict to 128 tokens and num_ctx to
4096, silently truncating output and context. We never overrode these.

Changes:
- IAiModelSettings: add numPredict, maxCompletionTokens fields
- IDroneModelConfig: moved from gadget-drone to @gadget/api (shared),
  expanded with numPredict, numCtx, maxCompletionTokens params
- IAiModelConfig.params: add numPredict, numCtx, maxCompletionTokens
- IAiModelProbeResult.settings: add numPredict, maxCompletionTokens
- AiModelSettingsSchema (Mongoose): add numPredict, maxCompletionTokens
- Ollama extractSettings(): extract num_predict from model parameters
- Ollama generate()/chat(): pass options: { num_ctx, num_predict }
- OpenAI all three create() calls: add max_completion_tokens
- web-cli.ts onProviderProbe(): compute numPredict (-1 for Ollama)
  and maxCompletionTokens (contextWindow for OpenAI) during probe
- agent.ts main + subagent loops: read model settings from provider
  cached models, build IDroneModelConfig with stored params
- ai.ts: remove local IDroneModelConfig, import from @gadget/api
- chat-session.ts: add new params to title generation call
- Tests: update all fixtures with new params, all 19 tests pass
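
A minimal sketch of how the stored params could map onto provider request fields. The `DroneModelParams` interface and both helper functions are hypothetical; only the field names (`numPredict`, `numCtx`, `maxCompletionTokens`) and the wire names (`num_ctx`, `num_predict`, `max_completion_tokens`) come from the list above.

```typescript
// Hypothetical shape: field names follow the commit message, everything else is assumed.
interface DroneModelParams {
  numPredict?: number;
  numCtx?: number;
  maxCompletionTokens?: number;
}

// Wire-format fragments for each provider.
interface OllamaOptions {
  num_ctx?: number;
  num_predict?: number;
}

interface OpenAiLimits {
  max_completion_tokens?: number;
}

// Build the `options` object sent with an Ollama generate/chat request.
// Unset params are omitted so the server-side default still applies.
function toOllamaOptions(params: DroneModelParams): OllamaOptions {
  const options: OllamaOptions = {};
  if (params.numCtx !== undefined) options.num_ctx = params.numCtx;
  if (params.numPredict !== undefined) options.num_predict = params.numPredict;
  return options;
}

// Build the completion-limit fragment merged into an OpenAI create() request body.
function toOpenAiLimits(params: DroneModelParams): OpenAiLimits {
  const limits: OpenAiLimits = {};
  if (params.maxCompletionTokens !== undefined) {
    limits.max_completion_tokens = params.maxCompletionTokens;
  }
  return limits;
}
```

Omitting unset fields, rather than sending `undefined` or `0`, is a design choice of this sketch; the actual code may always send explicit values.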

Defaults when model settings unavailable:
- numPredict: -1 (Ollama unlimited - generate until natural stop)
- numCtx: 131072 (128k - covers most modern models)
- maxCompletionTokens: 16384 (16k - reasonable OpenAI default)
2026-05-11 13:50:19 -04:00
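
The fallback behaviour described in the commit above ("Defaults when model settings unavailable") might look roughly like the sketch below. The `ResolvedParams` type, `DEFAULT_PARAMS` constant, and `resolveParams` function are hypothetical names for illustration; only the three default values are taken from the commit message.

```typescript
// Hypothetical type: the three inference params named in the commit message.
interface ResolvedParams {
  numPredict: number;
  numCtx: number;
  maxCompletionTokens: number;
}

// Default values as stated in the commit message.
const DEFAULT_PARAMS: ResolvedParams = {
  numPredict: -1,             // Ollama: generate until natural stop
  numCtx: 131072,             // 128k context
  maxCompletionTokens: 16384, // 16k completion limit for OpenAI
};

// Merge stored model settings (possibly missing or partial) over the defaults.
// `??` is used so that only null/undefined fall back; a stored -1 is preserved.
function resolveParams(stored?: Partial<ResolvedParams>): ResolvedParams {
  return {
    numPredict: stored?.numPredict ?? DEFAULT_PARAMS.numPredict,
    numCtx: stored?.numCtx ?? DEFAULT_PARAMS.numCtx,
    maxCompletionTokens:
      stored?.maxCompletionTokens ?? DEFAULT_PARAMS.maxCompletionTokens,
  };
}

// Usage: a model whose probe only recorded a context size.
const params = resolveParams({ numCtx: 8192 });
console.log(params);
```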
Rob Colbert
73c5345879 Re-build Agentic Workflow Loop
The ridiculousness of trying to maintain the previous agent's work got
out of hand, so we had this one re-build it - and got a better result.
2026-05-09 21:04:18 -04:00
Rob Colbert
cf06163a03 checkpoint that I plan to delete
GPT 5.5 is performing badly and breaking things. This will likely all
get dropped.
2026-05-09 14:52:59 -04:00
Rob Colbert
eb37a22771 pre-task cleanup (reduce log spam) 2026-05-08 17:27:04 -04:00
Rob Colbert
11bdd5e3b0 make reasoning effort configurable; remove sign up concept
- Implemented reasoning effort setting in SESSION panel of Chat Session
View
- Removed all ability to "sign up" for an account
2026-05-08 11:40:30 -04:00
Rob Colbert
e0df415237 streaming response fixes (Ollama) 2026-05-08 02:02:17 -04:00
Rob Colbert
61ba0e4412 streaming responses (see ./docs/streaming-responses.md) 2026-05-07 21:36:01 -04:00
Rob Colbert
3e31d4d501 agent, tools, toolbox, tool loop, AI environment 2026-05-07 00:10:57 -04:00
Rob Colbert
f8dbb2e08a agent tool and toolbox
- created AiTool and AiToolbox for representing tools in the API
- add googleapis dependency
- integrate Google Search tool as first agent tool
- created IAiEnvironment to communicate AI environment vars around the
platform
2026-05-06 22:58:03 -04:00
Rob Colbert
50b9618d4e project manager and chat session progress 2026-05-01 08:13:22 -04:00
Rob Colbert
f1b5a560a3 documentation updates; AI classes renamed
We now have AiApi, OllamaAiApi, and OpenAiApi. Updated the documentation,
which was originally generated by the agent, to provide a bit more
high-level clarity.
2026-04-28 11:49:21 -04:00
Rob Colbert
1edc3a85b8 created by merging gadget-code and gadget-drone 2026-04-28 09:20:37 -04:00