How I Run OpenClaw Like a Business
Most people optimize prompts. I optimize systems.
When I ran OpenClaw on a Raspberry Pi with a premium model always on, the assistant felt magical and my bill felt reckless. I could hit roughly $25/day without trying. That is not a model problem. That is an operating model problem.
So I changed the objective:
Get the highest quality outcome for the lowest total token spend.
Not quality per token. Quality per dollar.
Why this framing matters
If you only chase the strongest model, you overpay for routine work. If you only chase cheap tokens, quality collapses when reasoning gets hard. The move is to route work by task difficulty and business value.
Think of it like riding an old motorcycle on long mountain rides. The machine rewards riders who understand maintenance, constraints, and terrain. OpenClaw is similar. If you understand context, memory, and routing, you get durability and performance.
What I learned about OpenClaw cost mechanics
1) Context engineering beats prompt heroics
OpenClaw injects workspace bootstrap files into the context window on every turn, including AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, and optional memory files.
That means these files are recurring input tokens, not just one-time setup.
Practical implication: keeping these files concise directly improves your unit economics, because every extra character is paid for again on every turn.
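To make that concrete, here is a back-of-envelope model of what recurring bootstrap files cost per day. The file sizes, turn count, and per-token price below are all illustrative placeholders, not OpenClaw defaults, and it leans on the rough 4-characters-per-token heuristic:

```python
# Rough model of recurring bootstrap-file cost.
# All sizes and prices below are made-up examples.

BOOTSTRAP_CHARS = {
    "AGENTS.md": 6_000,
    "SOUL.md": 3_000,
    "TOOLS.md": 4_000,
    "IDENTITY.md": 1_500,
    "USER.md": 2_000,
    "HEARTBEAT.md": 1_000,
}

def recurring_input_tokens(chars_per_file: dict) -> int:
    """Tokens re-injected on every turn, using ~4 chars per token."""
    return sum(chars_per_file.values()) // 4

def daily_bootstrap_cost(turns_per_day: int, usd_per_1m_input: float) -> float:
    """Dollars per day spent just re-sending bootstrap context."""
    tokens = recurring_input_tokens(BOOTSTRAP_CHARS)
    return turns_per_day * tokens * usd_per_1m_input / 1_000_000

# e.g. 200 turns/day at a hypothetical $3 per 1M input tokens
print(daily_bootstrap_cost(200, 3.0))
```

Trimming a few thousand characters from these files is not cosmetic; it compounds across every turn of every day.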
2) Token physics are simple enough to operate with
A practical English rule of thumb is:
- 1 token is roughly 4 characters
- 1 token is roughly 0.75 words
Not exact, but operationally useful for planning context budgets.
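The two rules of thumb above are easy to turn into quick estimator functions. These are heuristics only; a real tokenizer will give different counts per model:

```python
# Quick token estimates from the two rules of thumb above.
# Heuristics only; actual tokenizer counts vary by model.

def tokens_from_chars(text: str) -> int:
    """~4 characters per token."""
    return max(1, len(text) // 4)

def tokens_from_words(text: str) -> int:
    """~0.75 words per token."""
    return max(1, round(len(text.split()) / 0.75))

sample = "Route work by task difficulty and business value."
print(tokens_from_chars(sample), tokens_from_words(sample))
```

When the two estimates disagree, I plan context budgets with the larger one; overestimating is the safe direction.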
3) Heartbeats are a real spend lever
OpenClaw heartbeats run every 30 minutes by default. If your use case does not need high-frequency proactive behavior, widening this interval can cut passive token burn.
I run a longer interval for tighter cost control.
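A quick calculation shows why cadence matters. The tokens-per-beat figure below is an assumed placeholder, not a measured OpenClaw number:

```python
# Back-of-envelope heartbeat burn. The tokens_per_beat value is an
# illustrative assumption, not an OpenClaw internal.

def heartbeats_per_day(interval_minutes: int) -> int:
    return (24 * 60) // interval_minutes

def daily_heartbeat_tokens(interval_minutes: int, tokens_per_beat: int) -> int:
    return heartbeats_per_day(interval_minutes) * tokens_per_beat

# Default 30-minute cadence vs a widened 2-hour cadence:
print(daily_heartbeat_tokens(30, 5_000))   # 48 beats/day
print(daily_heartbeat_tokens(120, 5_000))  # 12 beats/day
```

Widening from 30 minutes to 2 hours cuts the passive burn to a quarter before you touch anything else.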
The strategy that worked for me
I moved from "best model always" to tiered model routing.
- Tier 1 (high stakes reasoning): premium models for architecture, difficult synthesis, critical decisions.
OpenAI Codex and Anthropic Claude (Sonnet/Opus) via OAuth login (OAuth essentially lets you run OpenClaw on your existing monthly subscription rather than per-token API billing)
- Tier 2 (daily writing and summaries): mid-cost models with strong quality consistency
A practical order for me: StepFun 3.5 → MiniMax M2.5 → Kimi K2.5, based on reliability and price-performance.
- Tier 3 (routine transforms and lightweight ops): low-cost fast models
For trivial tasks, GPT mini and Gemini Flash class models are excellent value for summarizing, cleanup, and lightweight drafting.
Note: I also keep a fallback chain per task so I can degrade cost before I degrade output quality.
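The tiering plus fallback chain can be sketched in a few lines. The model names and the keyword-based classifier below are placeholders for illustration, not OpenClaw configuration:

```python
# A minimal sketch of tiered routing with per-tier fallback chains.
# Model names and classify() rules are illustrative placeholders.

ROUTES = {
    "hard_reasoning": ["claude-opus", "codex"],                        # Tier 1
    "daily_writing": ["stepfun-3.5", "minimax-m2.5", "kimi-k2.5"],     # Tier 2
    "routine": ["gpt-mini", "gemini-flash"],                           # Tier 3
}

def classify(task: str) -> str:
    """Toy classifier: keyword match. A real one would score difficulty."""
    t = task.lower()
    if any(k in t for k in ("architecture", "design", "decide")):
        return "hard_reasoning"
    if any(k in t for k in ("write", "summarize", "draft")):
        return "daily_writing"
    return "routine"

def pick_model(task: str, unavailable: tuple = ()) -> str:
    """Walk the tier's fallback chain: degrade cost before quality."""
    for model in ROUTES[classify(task)]:
        if model not in unavailable:
            return model
    raise RuntimeError("no model available for this tier")

print(pick_model("summarize today's notes"))
print(pick_model("summarize today's notes", unavailable=("stepfun-3.5",)))
```

The key design choice is that fallback stays inside the tier: a failed Tier 2 call falls to the next Tier 2 model, so cost degrades before quality does.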
A simple operating playbook
- Trim recurring context first: Keep bootstrap files sharp, current, and short.
- Classify tasks before execution: Hard reasoning, medium synthesis, lightweight transforms.
- Assign a default model per class: Add one fallback model in each class.
- Track input vs output spend separately: Verbose prompts and verbose completions leak cost in different ways.
- Tune heartbeat cadence intentionally: Proactive loops are useful, but they are not free.
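The fourth point above is the easiest to automate. Here is a minimal tracker that keeps input and output spend separate; the per-1M prices are made-up examples:

```python
# Tracking input vs output spend separately.
# Prices are illustrative examples, not real model rates.

from dataclasses import dataclass

@dataclass
class SpendTracker:
    usd_per_1m_input: float
    usd_per_1m_output: float
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, inp: int, out: int) -> None:
        """Log one model call's token usage."""
        self.input_tokens += inp
        self.output_tokens += out

    def breakdown(self) -> dict:
        """Spend split by direction, in dollars."""
        return {
            "input_usd": self.input_tokens * self.usd_per_1m_input / 1e6,
            "output_usd": self.output_tokens * self.usd_per_1m_output / 1e6,
        }

t = SpendTracker(usd_per_1m_input=3.0, usd_per_1m_output=15.0)
t.record(inp=120_000, out=8_000)  # verbose prompt, short completion
print(t.breakdown())
```

A skewed breakdown tells you where the leak is: heavy input spend points at bloated recurring context, heavy output spend points at verbose completions.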
Bottom line
The advantage is not "who has the biggest model". The advantage is operational discipline: context hygiene, model routing, and cost-aware defaults.