
Controlling agent cost

Three knobs keep bills sane: pick models per task, cap context, and set a monthly budget.

Why bills explode

"I just asked it to organize a few docs and it cost $20." That happens because the agent calls the model once per tool-call round, not once per task. A complex task may run 50+ rounds, each shipping the current context (tens of thousands of tokens) back to the model. At Sonnet's $3 per million input tokens, 50 rounds × 30K tokens of context is 1.5M input tokens, or $4.50 for a single task.

The fix is not "use the agent less" — it is "configure routing properly": cheap models for simple work, strong models for hard problems, concurrency caps so you don't blow the budget in one shot.
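
The arithmetic above can be reproduced with a small sketch. It assumes every round resends the full context and ignores output tokens and caching; the function name and the 10× cheaper price are illustrative, not real Kition settings or a quoted price list.

```python
def input_cost_usd(rounds: int, context_tokens: int, price_per_million: float) -> float:
    """Estimate input-token cost when every round resends the full context."""
    total_input_tokens = rounds * context_tokens
    return total_input_tokens / 1_000_000 * price_per_million


# The example from the text: 50 rounds, 30K tokens of context, $3 per million input tokens.
strong_model = input_cost_usd(rounds=50, context_tokens=30_000, price_per_million=3.0)
print(f"${strong_model:.2f}")  # $4.50

# Same task on a model assumed to be 10x cheaper (hypothetical price).
cheap_model = input_cost_usd(rounds=50, context_tokens=30_000, price_per_million=0.30)
print(f"${cheap_model:.2f}")  # $0.45
```

The round count and context size multiply, so trimming either one, or routing the rounds to a cheaper model, cuts the bill proportionally.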

Three core habits

  • Route simple tasks to haiku / mini models (roughly 10× cheaper)
  • Keep AI fields on manual refresh, not auto
  • Set a monthly cap in Settings → Billing

Model routing (Pro)

Pro unlocks mounting multiple providers at once and routing per task type. For example: code edits → sonnet, summaries → haiku, deep research → opus.

{
  "routing": {
    "default": "anthropic:claude-sonnet-4-5",
    "rules": [
      {
        "when": { "tool": ["Read", "grep", "glob"] },
        "use": "anthropic:claude-haiku-4"
      },
      {
        "when": { "subagent": "research-summarizer" },
        "use": "openai:gpt-4o-mini"
      },
      {
        "when": { "context_tokens_gt": 80000 },
        "use": "anthropic:claude-opus-4"
      }
    ]
  }
}
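
To reason about which model a request lands on, here is a minimal sketch of how such rules might be evaluated. It is not Kition's actual implementation: it assumes rules are checked top to bottom, the first match wins, and all conditions inside one `when` must hold. The request fields (`tool`, `subagent`, `context_tokens`) are hypothetical names.

```python
from typing import Any

ROUTING = {
    "default": "anthropic:claude-sonnet-4-5",
    "rules": [
        {"when": {"tool": ["Read", "grep", "glob"]}, "use": "anthropic:claude-haiku-4"},
        {"when": {"subagent": "research-summarizer"}, "use": "openai:gpt-4o-mini"},
        {"when": {"context_tokens_gt": 80000}, "use": "anthropic:claude-opus-4"},
    ],
}


def condition_matches(when: dict[str, Any], request: dict[str, Any]) -> bool:
    """Return True if every condition in `when` holds for the request (assumed semantics)."""
    for key, expected in when.items():
        if key == "tool":
            if request.get("tool") not in expected:
                return False
        elif key == "subagent":
            if request.get("subagent") != expected:
                return False
        elif key == "context_tokens_gt":
            if request.get("context_tokens", 0) <= expected:
                return False
        else:
            return False  # unknown condition: treat the rule as non-matching
    return True


def pick_model(routing: dict[str, Any], request: dict[str, Any]) -> str:
    """Return the model of the first matching rule, else the default."""
    for rule in routing["rules"]:
        if condition_matches(rule["when"], request):
            return rule["use"]
    return routing["default"]


print(pick_model(ROUTING, {"tool": "grep", "context_tokens": 5000}))
print(pick_model(ROUTING, {"tool": "Edit", "context_tokens": 120000}))
```

Under the first-match assumption, rule order matters: a `grep` call on a 120K-token context would still go to the haiku model because the tool rule sits above the context rule.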

Hard budget gates

  • Settings → Billing → Monthly cap: hit it and the agent halts
  • Per-session cap: stop automatically when a chat uses up its budget
  • Per-tool-call cap: reject a call whose output exceeds X tokens
  • Pair with a Stop hook to log token usage to a table for monthly review
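
The three caps above behave like layered checks. The sketch below illustrates that idea only: the class, field names, and thresholds are hypothetical, and the real gates are configured in Settings rather than in code.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a cap is reached or a tool output is too large."""


@dataclass
class BudgetGate:
    monthly_cap_usd: float
    session_cap_usd: float
    max_tool_output_tokens: int
    month_spent_usd: float = 0.0
    session_spent_usd: float = 0.0

    def check_tool_output(self, output_tokens: int) -> None:
        """Per-tool-call cap: reject an oversized tool output."""
        if output_tokens > self.max_tool_output_tokens:
            raise BudgetExceeded(f"tool output of {output_tokens} tokens exceeds the cap")

    def record_spend(self, cost_usd: float) -> None:
        """Add a cost to both counters, then halt if either cap is reached."""
        self.month_spent_usd += cost_usd
        self.session_spent_usd += cost_usd
        if self.month_spent_usd >= self.monthly_cap_usd:
            raise BudgetExceeded("monthly cap reached")
        if self.session_spent_usd >= self.session_cap_usd:
            raise BudgetExceeded("session cap reached")

    def new_session(self) -> None:
        """Starting a new chat resets only the session counter."""
        self.session_spent_usd = 0.0


gate = BudgetGate(monthly_cap_usd=20.0, session_cap_usd=5.0, max_tool_output_tokens=10_000)
gate.record_spend(4.5)      # under both caps
try:
    gate.record_spend(1.0)  # session total is now 5.5, over the 5.0 session cap
except BudgetExceeded as err:
    print(err)              # session cap reached
```

The session cap stops a single runaway chat early, while the monthly cap is the backstop across all chats.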

Shrinking context size

The longer a chat lives, the bigger the context, the higher the cost. Three moves help immediately: use @ to reference precisely instead of dumping a folder, start a New Chat periodically, and offload long work to subagents so the parent stays slim.
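
A rough way to see why starting a New Chat helps: if the context grows by a fixed amount each turn and is resent in full, total input tokens grow much faster than the turn count. The sketch below uses illustrative numbers (5K starting context, 2K growth per turn), not measurements.

```python
from typing import Optional


def cumulative_input_tokens(
    turns: int,
    base_tokens: int,
    growth_per_turn: int,
    reset_every: Optional[int] = None,
) -> int:
    """Sum the context sent on each turn; optionally start a fresh chat every N turns."""
    total = 0
    context = base_tokens
    for turn in range(turns):
        if reset_every and turn > 0 and turn % reset_every == 0:
            context = base_tokens  # a New Chat drops the accumulated history
        total += context
        context += growth_per_turn
    return total


one_long_chat = cumulative_input_tokens(turns=40, base_tokens=5_000, growth_per_turn=2_000)
fresh_every_10 = cumulative_input_tokens(
    turns=40, base_tokens=5_000, growth_per_turn=2_000, reset_every=10
)
print(one_long_chat)   # 1760000
print(fresh_every_10)  # 560000
```

With these assumptions, four chats of 10 turns send about a third of the input tokens that one 40-turn chat does.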

Keep .kition/agent.md under 2KB — it loads into the system prompt every turn. Put project knowledge into searchable docs rather than stuffing it here.
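
A small check like the following can run before a commit to catch a bloated file. It is a sketch that assumes "2KB" means 2048 bytes and that the file lives at `.kition/agent.md` relative to the current directory; the function name is illustrative.

```python
from pathlib import Path

MAX_BYTES = 2 * 1024  # assumption: "2KB" is read as 2048 bytes


def agent_md_within_limit(path: Path, max_bytes: int = MAX_BYTES) -> bool:
    """Return True if the file on disk is smaller than max_bytes."""
    return path.stat().st_size < max_bytes


if __name__ == "__main__":
    agent_md = Path(".kition/agent.md")
    if agent_md.exists():
        size = agent_md.stat().st_size
        status = "OK" if agent_md_within_limit(agent_md) else "too large"
        print(f"{agent_md}: {size} bytes ({status})")
    else:
        print(f"{agent_md} not found")
```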
