OpenClaw's phantom token problem: why your agent burns money while idle
By Linas Valiukas · April 8, 2026
You closed the laptop. You went to bed. Your OpenClaw agent had no tasks scheduled, no messages to process, nothing to do. You woke up to a $47 API bill.
This isn't hypothetical. It's one of the most upvoted posts on r/openclaw this month. And the comments are full of people saying "same."
When Anthropic's subscription still covered OpenClaw, none of this mattered. The meter wasn't running. Now that everyone's on pay-as-you-go pricing, these background processes have a dollar sign attached. And most people don't know they're running.
The safeguard compaction bug
This one's a confirmed bug, not a design choice. OpenClaw's default compaction.mode is set to "safeguard", which triggers an LLM API call roughly every 30 minutes on all active sessions. The idea is to compress long conversations so they don't get out of hand.
The problem: it fires on idle sessions too. Sessions with zero conversation messages. The code checks whether there's anything worth compacting after making the API call, not before. So the call gets made, billed, and then discarded.
Each unnecessary call burns about 33,600 prompt tokens. At one call every 30 minutes, that's 48 wasted API calls per day, per instance, or roughly 1.6 million tokens. On Flash-tier pricing, that's $1.26/day. On Sonnet, more. On a system that's doing literally nothing, that's about $38/month in pure waste.
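The arithmetic behind those figures can be checked in a few lines. This is a back-of-the-envelope sketch: the token count, call interval, and $1.26/day come from the paragraph above, while the per-million-token price is derived from them rather than quoted from any price list.

```python
# Figures from the article; the per-token price is back-calculated, not quoted.
CALL_INTERVAL_MINUTES = 30
TOKENS_PER_CALL = 33_600
DAILY_COST_USD = 1.26

calls_per_day = 24 * 60 // CALL_INTERVAL_MINUTES
tokens_per_day = calls_per_day * TOKENS_PER_CALL

# Price per million input tokens implied by the article's daily cost.
implied_price_per_million = DAILY_COST_USD / (tokens_per_day / 1_000_000)
monthly_cost = DAILY_COST_USD * 30

print(calls_per_day)                        # 48
print(tokens_per_day)                       # 1612800
print(round(implied_price_per_million, 2))  # 0.78
print(round(monthly_cost, 2))               # 37.8
```

To estimate the same waste on a pricier model, swap in that model's input price per million tokens and multiply by the daily token count.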
GitHub issue #34935 documents this in detail. The fix is straightforward - check for empty sessions before calling the API. It hasn't shipped yet.
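As a rough illustration of that fix, the sketch below moves the emptiness check in front of the billed call. It is not OpenClaw's code: `Session`, `compact_if_needed`, and the `summarize` callback are hypothetical names standing in for whatever the real compaction path uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Session:
    # Hypothetical stand-in for an OpenClaw session; not the real data model.
    messages: List[str] = field(default_factory=list)


def compact_if_needed(session: Session, summarize: Callable[[List[str]], str]) -> bool:
    """Compact a session, skipping the billed call when there is nothing to compact.

    `summarize` represents the LLM API call. Returns True if a call was made.
    """
    if not session.messages:
        # The guard runs BEFORE the API call, so idle sessions cost nothing.
        return False
    summary = summarize(session.messages)  # the only billed step
    session.messages = [summary]
    return True
```

The only change from the buggy ordering described above is where the guard sits: the check costs nothing, so it belongs before the call rather than after it.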
Every message triggers 4-5 API calls (you see one)
Send your agent a message. You get one reply. Behind the scenes, OpenClaw made four or five separate API calls: one for the actual response, one to generate a conversation title, one for tags, one for follow-up question suggestions, and one for autocomplete.
These auxiliary calls each carry context. They're not free. Disabling them drops your per-message API calls from 4-5 down to 1. That's a 60-80% reduction in per-message cost that most people never realize is available.
It gets worse with integrations. One user running a BlueBubbles connection hit 8 million tokens in a single hour. Chat integrations that relay messages can trigger these background calls in cascades.
Cron jobs that never forget
Since version 2026.2.17, cron jobs reuse existing sessions instead of starting fresh ones. That sounds efficient. It isn't.
Every cron run appends its output to the same transcript. The next run pays prompt tokens for all the irrelevant output from previous runs. A job that checks your email every 30 minutes accumulates an enormous context window over the course of a day. By evening, each run is re-sending thousands of lines of old email summaries just to check if there's anything new.
One Reddit user running 30 cron jobs on an e-commerce setup described it perfectly: "You're paying for an ever-growing context window that's 80% stale tool outputs from 3 hours ago." Their sessions were growing to 500KB-1MB - that's 200,000+ tokens resent on every single API call.
The cron jobs keep the session "active" too, so it never hits idle timeout. The context just grows and grows until you manually reset it or the model's context window caps out.
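To see why session reuse gets expensive so quickly, here is a small simulation. The token sizes are made-up illustrative values, not measurements from OpenClaw; the point is the shape of the growth, where each run re-sends everything earlier runs appended.

```python
def total_prompt_tokens(runs: int, base_tokens: int, output_tokens: int, reuse_session: bool) -> int:
    """Sum the prompt tokens billed over `runs` cron executions.

    base_tokens:   fixed prompt size of a single run (instructions, tools)
    output_tokens: transcript that each run appends to the session
    """
    total = 0
    transcript = 0  # tokens accumulated from earlier runs
    for _ in range(runs):
        total += base_tokens + transcript
        if reuse_session:
            transcript += output_tokens  # the next run re-sends this run's output
    return total


# Illustrative numbers only: one job every 30 minutes for a day (48 runs).
fresh = total_prompt_tokens(48, base_tokens=2_000, output_tokens=1_500, reuse_session=False)
reused = total_prompt_tokens(48, base_tokens=2_000, output_tokens=1_500, reuse_session=True)
print(fresh)   # 96000
print(reused)  # 1788000
```

With a fresh session per run the cost grows linearly with the number of runs. With a reused session it grows quadratically, which is why the bill looks fine in the morning and alarming by evening.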
Memory dreaming at 3 AM
OpenClaw has a feature called "dreaming" - a three-phase background process (Light Sleep, REM Sleep, Deep Sleep) that consolidates short-term memory into long-term storage. It runs as a managed cron job, typically at 3 AM.
When enabled, each phase makes multiple LLM round-trips processing a 7-day lookback window of conversations. Your agent is literally thinking about its memories while you sleep. On a per-token billing model, that costs real money.
Dreaming is opt-in and disabled by default, which is good. But if you turned it on back when subscriptions covered everything, it's still running. Check your cron configuration.
The workspace bootstrap tax
Every API call - not just your messages, but heartbeats, cron runs, compaction checks, everything - injects a set of workspace bootstrap files. AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md. Together they total roughly 35,600 tokens.
That overhead gets multiplied by every background process. If your agent makes 100 API calls a day between heartbeats, cron jobs, compaction, and auxiliary calls, you're sending roughly 3.5 million tokens of workspace boilerplate alone. At Sonnet pricing ($3/M input), that's about $10.50/day in bootstrap overhead.
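The same calculation as a reusable helper, in case you want to plug in your own call volume. The function name is hypothetical; the inputs are the article's figures. The unrounded result comes out slightly above the article's numbers, which round the token total down to 3.5 million.

```python
def bootstrap_cost_per_day(calls_per_day: int, bootstrap_tokens: int, price_per_million_usd: float) -> float:
    """Daily cost of re-sending workspace bootstrap files on every API call."""
    tokens_per_day = calls_per_day * bootstrap_tokens
    return tokens_per_day / 1_000_000 * price_per_million_usd


# Article figures: 100 calls/day, ~35,600 bootstrap tokens, $3 per million input tokens.
daily = bootstrap_cost_per_day(100, 35_600, 3.00)
print(round(daily, 2))  # 10.68
```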
A well-cached system would handle this cheaply. But OpenClaw's heartbeat interval (default 30 minutes) exceeds Anthropic's default 5-minute prompt cache TTL, so each heartbeat pays full price for tokens the cache has already evicted.
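The interaction between heartbeat interval and cache lifetime reduces to a single comparison. The sketch below assumes a 5-minute default cache TTL and an optional 1-hour TTL on the Anthropic side; treat both values as assumptions to verify against current documentation, and note the function name is hypothetical.

```python
def heartbeat_hits_cache(interval_minutes: float, cache_ttl_minutes: float) -> bool:
    """A cached prompt prefix is only reusable if the next call arrives before the TTL lapses."""
    return interval_minutes < cache_ttl_minutes


# Assumed TTL values; check them against the provider's current docs.
print(heartbeat_hits_cache(30, 5))   # False: default heartbeat vs. 5-minute TTL
print(heartbeat_hits_cache(30, 60))  # True: same heartbeat vs. a 1-hour TTL
print(heartbeat_hits_cache(4, 5))    # True: a shorter heartbeat keeps the cache warm
```

A shorter heartbeat keeps the cache warm but makes more calls, so the cheaper configuration depends on how cached reads are priced relative to full-price input.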
You can't see what's eating your budget
Maybe the most frustrating part: OpenClaw has no first-class mechanism to show which background process is consuming your tokens. You can see total usage on Anthropic's dashboard, but there's no breakdown by heartbeat vs. cron vs. compaction vs. dreaming vs. actual conversations.
Issue #14377 has been requesting per-job usage logging for months. It's still open. Without it, you're debugging your API bill by trial and error - disabling things one at a time and watching whether the bill drops.
One user on r/openclaw put it bluntly: "I asked my agent why it keeps burning tokens and it tells me that it isn't... but it is."
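Until per-job logging exists upstream, a thin wrapper of your own can approximate it. The sketch below is a hypothetical ledger, not an OpenClaw feature: it assumes you can intercept each API call, label it with its source, and read the input and output token counts that the provider reports with each response.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class UsageLedger:
    """Accumulates token counts per source label (hypothetical; not an OpenClaw API)."""

    def __init__(self) -> None:
        self._totals: Dict[str, Dict[str, int]] = defaultdict(
            lambda: {"calls": 0, "input": 0, "output": 0}
        )

    def record(self, source: str, input_tokens: int, output_tokens: int) -> None:
        """Add one API call's usage under a label such as 'heartbeat' or 'cron'."""
        entry = self._totals[source]
        entry["calls"] += 1
        entry["input"] += input_tokens
        entry["output"] += output_tokens

    def report(self) -> List[Tuple[str, Dict[str, int]]]:
        """Return (source, totals) pairs, biggest input-token consumer first."""
        return sorted(self._totals.items(), key=lambda kv: kv[1]["input"], reverse=True)


ledger = UsageLedger()
ledger.record("heartbeat", 35_600, 40)
ledger.record("heartbeat", 35_600, 38)
ledger.record("conversation", 12_000, 900)
for source, totals in ledger.report():
    print(source, totals)
```

Sorting by input tokens puts the biggest offender at the top, which replaces the disable-one-thing-and-wait approach with a direct answer.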
The real numbers
Here's what people are actually losing to phantom token burn, pulled from Reddit and GitHub over the past two weeks:
| What happened | Cost | Cause |
|---|---|---|
| Agent "doing nothing" overnight | $47 | Retry loop on failed email triage |
| Heartbeat only, no user interaction | $5-7/day | Opus heartbeats with full context |
| Idle system, default settings | $38/month | Safeguard compaction bug |
| Email check every 5 minutes | $50/day | Misconfigured heartbeat interval |
| "Simple scheduled tasks" overnight | $200 in 12 hours | 5.7M tokens from loop bug |
| 53M tokens in one week, personal assistant | ~$265 | 23KB MEMORY.md loaded on every turn |
The community consensus, backed by multiple optimization guides and GitHub issues: 60-90% of typical OpenClaw token spend has nothing to do with actual work you requested.
How to stop the bleeding
If you're self-hosting and want to keep your bill under control, here's what actually works:
- Disable auxiliary API calls. Turn off title generation, tag generation, follow-up suggestions, and autocomplete. Cuts per-message calls from 4-5 to 1.
- Use a cheap model for heartbeats. Route heartbeat checks to Haiku or a free model through OpenRouter. Opus for heartbeats is, as one Redditor put it, "lighting money on fire."
- Force daily session resets. Set a short idle timeout (10-15 minutes) and schedule a forced session clear at a fixed time. This prevents cron jobs from accumulating unbounded context.
- Trim your workspace files. Keep MEMORY.md under a few KB. Archive old content. Every line gets injected into every API call, including heartbeats.
- Set requests_limit on cron jobs. A hard cap per session ensures a stuck loop can't run forever. This is the difference between a $5 mistake and a $200 one.
- Check if dreaming is enabled. If you turned it on during the subscription era, it's still running. Disable it or route it to a cheap model.
- Set an API budget cap. Anthropic's console lets you set spending limits. Do this before anything else. A runaway agent with no cap is how you get the $3,600 story.
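The idea behind a hard request cap from the list above can be sketched generically. This is not how OpenClaw implements requests_limit; `run_with_cap`, `RequestLimitExceeded`, and the `step` callback are hypothetical names that illustrate the guard: count every request and stop once the cap is reached, however convinced the loop is that it should retry.

```python
from typing import Callable


class RequestLimitExceeded(RuntimeError):
    """Raised when a job uses up its request budget without finishing."""


def run_with_cap(step: Callable[[], bool], requests_limit: int) -> int:
    """Call `step` until it returns True (done) or the cap is reached.

    Each call to `step` stands for one billed API request.
    Returns the number of requests made; raises if the cap is hit first.
    """
    for made in range(1, requests_limit + 1):
        if step():
            return made
    raise RequestLimitExceeded(f"stopped after {requests_limit} requests")
```

A stuck retry loop wrapped this way fails loudly after a bounded number of requests instead of running all night.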
That's a lot of knobs to tune. And you'll need to keep tuning them - OpenClaw ships 13 releases a month, and any update can change default behavior.
Or don't manage it yourself
The phantom token problem exists because OpenClaw's defaults optimize for capability, not cost. That made sense when subscriptions absorbed the bill. It doesn't anymore.
A managed service handles all of this: heartbeat intervals tuned to hit cache windows, context pruning that doesn't let sessions balloon, model routing that sends background tasks to cheap models, session lifecycle management that prevents unbounded growth. You don't wake up to surprise bills because someone else already solved these problems.
TryOpenClaw.ai runs your OpenClaw agent for
Founder of TryOpenClaw.ai. Software engineer writing about OpenClaw, self-hosting trade-offs, and what non-technical users actually need from an AI assistant.