OpenClaw's phantom token problem: why your agent burns money while idle

By Linas Valiukas · April 8, 2026

You closed the laptop. You went to bed. Your OpenClaw agent had no tasks scheduled, no messages to process, nothing to do. You woke up to a $47 API bill.

This isn't hypothetical. It's one of the most upvoted posts on r/openclaw this month. And the comments are full of people saying "same."

When Anthropic's subscription still covered OpenClaw, none of this mattered. The meter wasn't running. Now that everyone's on pay-as-you-go pricing, these background processes have a dollar sign attached. And most people don't know they're running.

The safeguard compaction bug

This one's a confirmed bug, not a design choice. OpenClaw's default compaction.mode is set to "safeguard", which triggers an LLM API call roughly every 30 minutes on all active sessions. The idea is to compress long conversations so they don't get out of hand.

The problem: it fires on idle sessions too. Sessions with zero conversation messages. The code checks whether there's anything worth compacting after making the API call, not before. So the call gets made, billed, and then discarded.

Each unnecessary call burns about 33,600 prompt tokens. At one call every 30 minutes, that's 48 wasted API calls per day, per instance, or roughly 1.6 million prompt tokens. On Flash-tier pricing, that's $1.26/day. On Sonnet, more. On a system that's doing literally nothing, that's about $38/month in pure waste.

GitHub issue #34935 documents this in detail. The fix is straightforward - check for empty sessions before calling the API. It hasn't shipped yet.
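
The ordering problem is easy to picture in a few lines of Python. This is a minimal sketch, not OpenClaw's code: `Session`, `compact_if_needed`, and the `summarize` callback are hypothetical names standing in for whatever the real implementation uses. The only point is that the emptiness check runs before the billable call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Session:
    """Hypothetical stand-in for an agent session."""
    messages: List[str] = field(default_factory=list)


def compact_if_needed(session: Session,
                      summarize: Callable[[List[str]], str]) -> Optional[str]:
    """Return a summary, or None if there is nothing to compact.

    `summarize` represents the billable LLM call. The guard runs first,
    so an idle session never reaches it.
    """
    if not session.messages:
        return None  # nothing to compact: skip the API call entirely
    return summarize(session.messages)


# Usage: count how often the (fake) billable call is made.
calls = 0


def fake_summarize(messages: List[str]) -> str:
    global calls
    calls += 1
    return f"summary of {len(messages)} messages"


idle_result = compact_if_needed(Session(), fake_summarize)
busy_result = compact_if_needed(Session(["hi", "check my inbox"]), fake_summarize)
```

With the guard in place, the idle session costs nothing and only the session with real messages triggers a call.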

Every message triggers 4-5 API calls (you see one)

Send your agent a message. You get one reply. Behind the scenes, OpenClaw made four or five separate API calls: one for the actual response, one to generate a conversation title, one for tags, one for follow-up question suggestions, and one for autocomplete.

These auxiliary calls each carry context. They're not free. Disabling them drops your per-message API calls from 4-5 down to 1. That's a 60-80% reduction in per-message cost that most people never realize is available.
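
Where a 60-80% range could come from is easier to see with a small calculation. The sketch below is illustrative only: the article doesn't say how expensive each auxiliary call is relative to the main response, so `aux_cost_ratio` is an assumption, and `cost_reduction` is a made-up helper name.

```python
def cost_reduction(aux_calls: int, aux_cost_ratio: float) -> float:
    """Fraction of per-message cost saved by disabling auxiliary calls.

    aux_calls: number of auxiliary calls per message (title, tags, etc.).
    aux_cost_ratio: assumed cost of one auxiliary call relative to the
        main response call (1.0 means equally expensive).
    """
    aux_total = aux_calls * aux_cost_ratio
    return aux_total / (1.0 + aux_total)


# Assumption: four auxiliary calls, each as expensive as the main call.
high = cost_reduction(aux_calls=4, aux_cost_ratio=1.0)   # 4 / 5
# Assumption: three auxiliary calls, each half as expensive.
low = cost_reduction(aux_calls=3, aux_cost_ratio=0.5)    # 1.5 / 2.5
print(f"{low:.0%} to {high:.0%}")
```

Under those assumptions the saving runs from 60% to 80%. The more context each auxiliary call carries, the closer you get to the top of the range.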

It gets worse with integrations. One user running a BlueBubbles connection hit 8 million tokens in a single hour. Chat integrations that relay messages can trigger these background calls in cascades.

Cron jobs that never forget

Since version 2026.2.17, cron jobs reuse existing sessions instead of starting fresh ones. That sounds efficient. It isn't.

Every cron run appends its output to the same transcript. The next run pays prompt tokens for all the irrelevant output from previous runs. A job that checks your email every 30 minutes accumulates an enormous context window over the course of a day. By evening, each run is re-sending thousands of lines of old email summaries just to check if there's anything new.

One Reddit user running 30 cron jobs on an e-commerce setup described it perfectly: "You're paying for an ever-growing context window that's 80% stale tool outputs from 3 hours ago." Their sessions were growing to 500KB-1MB - that's 200,000+ tokens resent on every single API call.

The cron jobs keep the session "active" too, so it never hits idle timeout. The context just grows and grows until you manually reset it or the model's context window caps out.
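
To see how fast a reused session gets expensive, here's a rough model in Python. The numbers are assumptions chosen for illustration (a 2,000-token base prompt and 1,500 tokens of output kept per run), and `daily_prompt_tokens` is a hypothetical helper, not anything OpenClaw exposes.

```python
def daily_prompt_tokens(runs: int, base_tokens: int, output_tokens: int,
                        reuse_session: bool) -> int:
    """Total prompt tokens across `runs` cron executions.

    With a reused session, run i re-sends the output of all i earlier runs.
    With a fresh session, every run sends only the base prompt.
    """
    total = 0
    for i in range(runs):
        carried = i * output_tokens if reuse_session else 0
        total += base_tokens + carried
    return total


# Assumed figures: a job every 30 minutes (48 runs/day), a 2,000-token
# base prompt, and 1,500 tokens of output appended per run.
reused = daily_prompt_tokens(48, 2_000, 1_500, reuse_session=True)
fresh = daily_prompt_tokens(48, 2_000, 1_500, reuse_session=False)
print(reused, fresh, round(reused / fresh, 1))  # 1788000 96000 18.6
```

Because each run re-sends everything before it, total cost grows with the square of the number of runs. In this toy case the reused session costs about 18 times more per day than starting fresh each time.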

Memory dreaming at 3 AM

OpenClaw has a feature called "dreaming" - a three-phase background process (Light Sleep, REM Sleep, Deep Sleep) that consolidates short-term memory into long-term storage. It runs as a managed cron job, typically at 3 AM.

When enabled, each phase makes multiple LLM round-trips processing a 7-day lookback window of conversations. Your agent is literally thinking about its memories while you sleep. On a per-token billing model, that costs real money.

Dreaming is opt-in and disabled by default, which is good. But if you turned it on back when subscriptions covered everything, it's still running. Check your cron configuration.

The workspace bootstrap tax

Every API call - not just your messages, but heartbeats, cron runs, compaction checks, everything - injects a set of workspace bootstrap files. AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md. Together they total roughly 35,600 tokens.

That overhead gets multiplied by every background process. If your agent makes 100 API calls a day between heartbeats, cron jobs, compaction, and auxiliary calls, you're sending about 3.5 million tokens of workspace boilerplate alone. At Sonnet pricing ($3/M input), that's roughly $10.50/day in bootstrap overhead.
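
The arithmetic is simple enough to rerun with your own call counts. This sketch uses the token total and Sonnet input price quoted above and assumes no cache hits; `bootstrap_overhead` is just an illustrative helper name.

```python
BOOTSTRAP_TOKENS = 35_600        # approximate total quoted in the article
SONNET_INPUT_PER_MILLION = 3.0   # USD per million input tokens, as quoted


def bootstrap_overhead(calls_per_day: int,
                       tokens_per_call: int = BOOTSTRAP_TOKENS,
                       price_per_million: float = SONNET_INPUT_PER_MILLION):
    """Return (tokens per day, USD per day) spent on bootstrap files alone.

    Assumes every call pays full input price, i.e. no prompt cache hits.
    """
    tokens = calls_per_day * tokens_per_call
    return tokens, tokens / 1_000_000 * price_per_million


tokens, dollars = bootstrap_overhead(100)
print(f"{tokens:,} tokens/day, ${dollars:.2f}/day")
```

Unrounded, 100 calls a day comes to 3,560,000 tokens and $10.68, slightly above the rounded figures in the text.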

A well-cached system would handle this cheaply. But OpenClaw's heartbeat interval (default 30 minutes) often exceeds Anthropic's prompt cache TTL (5 minutes by default), so each heartbeat pays full price for tokens the cache has already evicted.

You can't see what's eating your budget

Maybe the most frustrating part: OpenClaw has no first-class mechanism to show which background process is consuming your tokens. You can see total usage on Anthropic's dashboard, but there's no breakdown by heartbeat vs. cron vs. compaction vs. dreaming vs. actual conversations.

Issue #14377 has been requesting per-job usage logging for months. It's still open. Without it, you're debugging your API bill by trial and error - disabling things one at a time and watching whether the bill drops.

One user on r/openclaw put it bluntly: "I asked my agent why it keeps burning tokens and it tells me that it isn't... but it is."
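
Until something like that ships, one workaround is to keep your own ledger. The sketch below is hypothetical and not an OpenClaw feature: it assumes you can wrap or proxy your agent's API calls, label each one with the process that made it, and feed in the token usage counts that the API returns with each response. `UsageLedger` and the source labels are made-up names.

```python
from collections import defaultdict
from typing import Dict


class UsageLedger:
    """Hypothetical per-source token ledger (not an OpenClaw feature)."""

    def __init__(self) -> None:
        self._totals: Dict[str, Dict[str, int]] = defaultdict(
            lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0}
        )

    def record(self, source: str, input_tokens: int, output_tokens: int) -> None:
        """Attribute one API call's usage to a source such as 'cron'."""
        entry = self._totals[source]
        entry["calls"] += 1
        entry["input_tokens"] += input_tokens
        entry["output_tokens"] += output_tokens

    def report(self) -> Dict[str, Dict[str, int]]:
        """Return totals sorted by input tokens, largest first."""
        ordered = sorted(self._totals.items(),
                         key=lambda item: item[1]["input_tokens"],
                         reverse=True)
        return dict(ordered)


# Example with invented numbers: two heartbeats outweigh one conversation.
ledger = UsageLedger()
ledger.record("conversation", input_tokens=40_000, output_tokens=800)
ledger.record("heartbeat", input_tokens=35_600, output_tokens=50)
ledger.record("heartbeat", input_tokens=35_600, output_tokens=50)
for source, totals in ledger.report().items():
    print(source, totals)
```

Even a crude breakdown like this replaces trial and error with a ranked list of what to disable first.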

The real numbers

Here's what people are actually losing to phantom token burn, pulled from Reddit and GitHub over the past two weeks:

| What happened | Cost | Cause |
|---|---|---|
| Agent "doing nothing" overnight | $47 | Retry loop on failed email triage |
| Heartbeat only, no user interaction | $5-7/day | Opus heartbeats with full context |
| Idle system, default settings | $38/month | Safeguard compaction bug |
| Email check every 5 minutes | $50/day | Misconfigured heartbeat interval |
| "Simple scheduled tasks" overnight | $200 in 12 hours | 5.7M tokens from loop bug |
| 53M tokens in one week, personal assistant | ~$265 | 23KB MEMORY.md loaded on every turn |

The community consensus, backed by multiple optimization guides and GitHub issues: 60-90% of typical OpenClaw token spend has nothing to do with actual work you requested.

How to stop the bleeding

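If you're self-hosting and want to keep your bill under control, here's what actually works:

- Keep safeguard compaction from firing on idle sessions until the fix for issue #34935 ships.
- Disable the auxiliary title, tag, follow-up, and autocomplete calls.
- Reset long-running cron sessions so each run stops paying for stale output.
- Check your cron configuration and turn dreaming off unless you rely on it.
- Trim the workspace bootstrap files, since every call re-sends them.
- Tune the heartbeat interval to land inside the prompt cache window.
- Route heartbeats and other background tasks to a cheaper model.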

That's a lot of knobs to tune. And you'll need to keep tuning them - OpenClaw ships 13 releases a month, and any update can change default behavior.

Or don't manage it yourself

The phantom token problem exists because OpenClaw's defaults optimize for capability, not cost. That made sense when subscriptions absorbed the bill. It doesn't anymore.

A managed service handles all of this: heartbeat intervals tuned to hit cache windows, context pruning that doesn't let sessions balloon, model routing that sends background tasks to cheap models, session lifecycle management that prevents unbounded growth. You don't wake up to surprise bills because someone else already solved these problems.

TryOpenClaw.ai runs your OpenClaw agent for $39/month. Fixed price. No phantom tokens, no compaction bugs, no 3 AM dreaming charges. Your agent just works.

Linas Valiukas

Founder of TryOpenClaw.ai. Software engineer writing about OpenClaw, self-hosting trade-offs, and what non-technical users actually need from an AI assistant.
