Can OpenClaw really answer phone calls now?

Yes. OpenClaw 2026.4.24 ships a Voice Call plugin with Twilio realtime audio, Gemini Live as a backend voice provider, and an openclaw_agent_consult tool that lets the live call hand off to the full OpenClaw agent for tool-backed answers. You install the plugin, run voicecall setup, and bind a Twilio number.

How do I test it without making a real phone call?

Run voicecall smoke. It's a dry-run by default and walks the entire stack - Twilio credentials, webhook reachability, audio codec compatibility, Gemini Live token mint, openclaw_agent_consult bridge - without placing a real call. You only get billed Twilio charges if you pass --live to actually dial.

What's the openclaw_agent_consult tool?

It's a realtime tool exposed inside live voice sessions. The voice loop on the call runs lightweight, but when the caller asks something that needs your full agent's tools or memory, the realtime model calls openclaw_agent_consult and the full OpenClaw agent answers asynchronously. The voice model speaks the answer back to the caller. It's how phone calls get access to your IMAP, calendar, CRM, and skill catalog without burning the realtime budget on every turn.

Is this safe to point at a business phone number?

Not on the defaults. The voice plugin can call any tool your agent can call. Before you bind a Twilio number to a public phone number, set agents.defaults.contextInjection: 'never' for the voice agent, restrict the tool list to read-only operations (no send-email, no calendar-write, no payment skills), and run voicecall smoke with the live tool list to confirm. Most setups also gate the agent_consult handoff behind a configurable allowlist of caller IDs.

What does this actually cost to run?

Three line items: Twilio (around $0.0085/min for inbound US voice, plus $1.15/month for a US local number), Gemini Live realtime audio (priced per audio minute, currently roughly $0.50 per minute of bidirectional speech), and your normal OpenClaw model spend on whichever provider answers the agent_consult handoffs. A typical small-business missed-call assistant taking 30 calls a month at 90 seconds each lands around $25-35/month in pure usage.

OpenClaw can now answer your phone (Voice Call setup with Twilio, the safe way)

By Linas Valiukas · April 27, 2026

Your OpenClaw agent can answer your phone. Not "OpenClaw triggers a Twilio webhook that triggers an n8n flow that triggers a TTS." Actually answer the phone. As of v2026.4.24 there's a bundled Voice Call plugin, a Gemini Live realtime backend, and a voicecall setup command. You bind a Twilio number, dial it from your phone, and the agent picks up.

This is the kind of feature that sounds like a demo and turns out to be one config edit away from running in your kitchen. It also sounds like a feature you absolutely should not point at a public phone number on the defaults. Both are true. Here's how to set it up and how to keep it from doing something embarrassing.

What ships in 2026.4.24

The release bundles four pieces that together make voice calls work end to end:

A Voice Call plugin. Bundled. Ships with voicecall setup and a voicecall smoke command that's a dry run by default - you have to pass --live to actually place a test call.
Twilio realtime transport. The plugin treats Twilio as a first-class voice transport, alongside Chrome WebRTC for browser-side calls. Paired-node Chrome support is in too, for the people running BlackHole/SoX audio bridges.
A Gemini Live backend voice provider. Bidirectional audio plus function-call support. It's the audio brain of the call: turning speech into intent, taking text back, speaking it. It's not your full OpenClaw agent. It's a fast, narrow voice model with a hand on a phone.
An openclaw_agent_consult tool. Exposed inside the live voice session. When the caller asks something the voice model can't answer on its own, it calls this tool and the full OpenClaw agent steps in - reads memory, runs skills, hits the CRM - and the voice model relays the answer.

The same plumbing powers the new Google Meet plugin. Different transport (Meet instead of phone), same realtime voice loop, same agent-consult handoff. If you set up Voice Call you've basically set up Meet too.

The setup, in order

Get a Twilio number with voice enabled. A US local number runs $1.15/month. International is more. Buy it in the Twilio console, note the SID.
Get a Google AI Studio API key with Gemini Live access. The free tier covers test calls. Move the key to a paid billing tier once you outgrow the free-tier rate limits.
Make sure your OpenClaw gateway is reachable from Twilio. Twilio webhooks need to reach your gateway over HTTPS. If you're self-hosting at home, that means a proper reverse proxy or a Cloudflare tunnel. Localhost won't work.
Run voicecall setup. The command walks you through pasting in the Twilio account SID, auth token, phone number, and the Gemini Live key. It writes them into the secrets store, configures the webhook URL on the Twilio number, and registers the voice plugin with the gateway.
```
openclaw voicecall setup
```
Dry-run with voicecall smoke. This is the bit you actually want to run before you let anyone dial in.
```
openclaw voicecall smoke
```
Walks the whole stack: Twilio creds valid, webhook reachable, audio codecs negotiated, Gemini Live token mint succeeds, openclaw_agent_consult bridge resolves. Default mode reports problems and exits without dialing. Pass --live only when you've fixed everything the dry run flags.
Dial your number. If smoke is green, calling the Twilio number should now connect to a Gemini Live voice loop. The voice model says hello, you talk, it answers. First few exchanges will feel canned - that's the realtime model working without your full agent's context. Ask it something specific to your data and watch latency jump as openclaw_agent_consult kicks in.
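
Before running the setup, it can save a round trip to sanity-check the webhook URL you plan to give Twilio. The sketch below is a hypothetical pre-check, not part of the voicecall plugin: the function name and the rules it applies are assumptions, and it only inspects the URL string. It cannot prove the gateway is actually reachable, which is what the smoke command is for.

```python
import ipaddress
from urllib.parse import urlparse


def webhook_url_problems(url: str) -> list[str]:
    """Return reasons this URL cannot work as a public HTTPS webhook (hypothetical helper)."""
    problems = []
    parsed = urlparse(url)
    host = parsed.hostname or ""

    if parsed.scheme != "https":
        problems.append("scheme must be https")
    if not host:
        problems.append("missing host")
        return problems
    if host == "localhost" or host.endswith(".local"):
        problems.append("host is only resolvable on the local machine or LAN")
        return problems

    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: nothing more to learn without making a real request.
        return problems

    if ip.is_loopback or ip.is_private or ip.is_link_local:
        problems.append("IP address is not publicly routable")
    return problems


if __name__ == "__main__":
    for candidate in ("http://localhost:8080/voice", "https://gateway.example.com/voice"):
        print(candidate, webhook_url_problems(candidate))
```

An empty list means the URL passes these string-level checks; the dry run still has to confirm Twilio can reach it.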

What the agent_consult handoff actually does

The thing that makes this useful instead of a toy is the bridge between the voice model and your full agent. Worth understanding because the latency and cost profile of voice calls flows from it.

Gemini Live is fast and cheap per minute, but it doesn't have your IMAP password, your CRM token, your memory files, or your skills. So it runs the conversational layer - turn-taking, intent detection, tone - and when the caller asks something specific, it calls openclaw_agent_consult like a tool. That tool call goes to your normal OpenClaw agent through the gateway, which does the actual work: search the calendar, look up the customer in HubSpot, check whether the package shipped. The result comes back as text, the voice model reads it out.

Two consequences:

Latency varies a lot. Voice-only turns are ~300ms. Turns that bounce through agent_consult into a slow tool (CRM API, IMAP search, skill execution) can hit 4-8 seconds. The voice model handles this with filler ("Let me check that for you") but the seam is real.
Cost scales with consult depth as well as call length. A 10-minute call where the caller chats about the weather costs Gemini Live audio minutes plus nothing else. A 2-minute call that triggers seven CRM lookups costs its own audio minutes plus seven full agent runs.
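
The filler behaviour behind that latency seam can be sketched with plain asyncio. This is an illustration of the pattern, not the plugin's code: consult_with_filler, its parameters, and the 12-second default deadline are assumptions, and consult stands in for whatever bridges to openclaw_agent_consult.

```python
import asyncio


async def consult_with_filler(consult, speak, question,
                              filler_after=2.0, deadline=12.0):
    """Run a consult, speak filler if it is slow, give up at the deadline.

    consult: async callable returning answer text (hypothetical stand-in
             for the openclaw_agent_consult bridge).
    speak:   callable that queues text for the voice model to say.
    """
    task = asyncio.ensure_future(consult(question))

    # Fast path: the consult finishes before the caller notices a gap.
    done, _ = await asyncio.wait({task}, timeout=filler_after)
    if not done:
        speak("Let me check that for you.")
        done, _ = await asyncio.wait({task}, timeout=deadline - filler_after)

    if not done:
        # Still running at the deadline: stop waiting and fall back.
        task.cancel()
        speak("I couldn't look that up just now. Someone will get back to you.")
        return None

    answer = task.result()
    speak(answer)
    return answer


if __name__ == "__main__":
    async def slow_consult(question):
        await asyncio.sleep(0.05)  # pretend this is a CRM lookup
        return "Your order shipped yesterday."

    spoken = []
    asyncio.run(consult_with_filler(slow_consult, spoken.append,
                                    "Did my package ship?",
                                    filler_after=0.01, deadline=1.0))
    print(spoken)
```

Keeping the deadline below any upstream timeout means the caller hears a fallback line instead of a dropped call.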

Three settings to change before you let anyone call in

The defaults are a starting point. They are not safe defaults. Before you bind this to a public number:

Disable bootstrap context injection on the voice agent. The voice agent shouldn't see your full CLAUDE.md / MEMORY.md on every turn. 2026.4.24 added agents.defaults.contextInjection precisely for this:

```
[agents.voicecall]
contextInjection = "never"
allowed_tools = ["calendar.read", "crm.lookup", "openclaw_agent_consult"]
```

Lock the tool list to read-only. The voice agent gets to read your calendar, look up customers, search docs. It does not get email.send, calendar.write, payment skills, or anything that mutates state. If a real action needs to happen, the voice agent's job is to take a message; the followup happens after the call, with you in the loop.
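
One way to keep that rule from drifting is to audit the tool list in a script before each deploy. The sketch below is hypothetical: audit_tool_list and the suffix list are assumptions made for illustration, and the tool names are the ones used in this article, not a confirmed catalog.

```python
# Tools the voice agent may use on a call (names taken from this article).
READ_ONLY_TOOLS = {"calendar.read", "crm.lookup", "openclaw_agent_consult"}

# Second guard: anything that looks like it mutates state is rejected,
# even if someone adds it to the allowlist by mistake. Assumed naming scheme.
MUTATING_SUFFIXES = (".send", ".write", ".create", ".update", ".delete", ".charge")


def audit_tool_list(tools):
    """Split tools into (allowed, rejected) for the voice agent."""
    allowed, rejected = [], []
    for name in tools:
        if name in READ_ONLY_TOOLS and not name.endswith(MUTATING_SUFFIXES):
            allowed.append(name)
        else:
            rejected.append(name)
    return allowed, rejected


if __name__ == "__main__":
    configured = ["calendar.read", "email.send", "calendar.write", "crm.lookup"]
    allowed, rejected = audit_tool_list(configured)
    print("allowed:", allowed)
    print("rejected:", rejected)
```

The allowlist does the real work; the suffix check only exists to catch a mutating tool that slips into the allowlist later.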

Gate agent_consult by caller ID until you trust it. The Twilio webhook payload includes the caller's number. The voice plugin lets you set an allowlist:

```
[plugins.voicecall.policy]
agent_consult.allowlist = ["+15555550123", "+15555550199"]
agent_consult.outside_allowlist = "decline"
```

Calls from outside the allowlist still get answered. They just can't reach the full agent. Useful while you're tuning prompts and figuring out which questions you actually want the agent fielding.
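
The decision the policy makes can be sketched in a few lines. This is an assumed model of the behaviour, not the plugin's implementation: consult_policy is a hypothetical name, and the only external detail relied on is that Twilio's voice webhook passes the caller's number in the From form field in E.164 format.

```python
import re

# E.164: a plus sign, then up to 15 digits, first digit non-zero.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def consult_policy(form, allowlist):
    """Return "allow" or "decline" for the agent_consult handoff (hypothetical helper).

    form:      dict of webhook form fields; only "From" is read.
    allowlist: set of E.164 numbers permitted to reach the full agent.
    """
    caller = (form.get("From") or "").strip()
    if not E164.match(caller):
        # Missing or malformed caller ID never reaches the full agent.
        return "decline"
    return "allow" if caller in allowlist else "decline"


if __name__ == "__main__":
    allow = {"+15555550123", "+15555550199"}
    print(consult_policy({"From": "+15555550123"}, allow))
    print(consult_policy({"From": "+15555550000"}, allow))
```

Either way the call is still answered; the return value only decides whether the voice model may hand off.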

The actual use case: missed-call replies

The shape of small-business adoption that's going to win this year isn't "an AI receptionist that handles everything." It's "an AI that picks up when you can't, takes a structured message, and texts you the highlights as soon as the call ends." The pattern most r/openclaw users are landing on:

Caller dials the business number. Unanswered calls forward to your Twilio number after 4 rings.
Voice agent picks up. Greets, asks who's calling and what they need. Reads back what it heard.
If the caller is asking something the agent can answer (hours, location, simple FAQ), openclaw_agent_consult resolves it from the knowledge base.
If the caller needs you specifically, the agent says you'll be in touch, captures the callback number, and ends the call.
Post-call, the OpenClaw agent runs a write-up skill: structured summary, sentiment, urgency, the caller's number, a draft text reply. Pings you on WhatsApp or Telegram.
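
The post-call write-up in the last step is easier to reason about as a fixed record. The sketch below is one possible shape, assuming the five fields named above; the class name, field names, and message layout are hypothetical, not the skill's actual output format.

```python
from dataclasses import dataclass


@dataclass
class CallWriteUp:
    """Structured result of a post-call write-up (hypothetical shape)."""
    caller_number: str
    summary: str
    sentiment: str    # e.g. "neutral", "frustrated"
    urgency: str      # e.g. "low", "high"
    draft_reply: str  # text a human approves before it is sent

    def to_message(self) -> str:
        """Render the write-up as a short chat message for the owner."""
        return "\n".join([
            f"Missed call from {self.caller_number} ({self.urgency} urgency, {self.sentiment})",
            self.summary,
            f"Draft reply: {self.draft_reply}",
        ])


if __name__ == "__main__":
    note = CallWriteUp(
        caller_number="+15555550123",
        summary="Asked whether Saturday delivery is possible for an existing order.",
        sentiment="neutral",
        urgency="high",
        draft_reply="Hi, thanks for calling. Saturday delivery works; I'll confirm the slot today.",
    )
    print(note.to_message())
```

The draft reply stays a draft: nothing in this record sends anything on its own.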

This works because every action that touches the outside world (sending texts, scheduling things, charging cards) happens after the call, with a human approving. The voice agent's only job during the call is being a polite, competent listener. The hard work runs through the same small-business automation patterns people are already using through messaging.

What it costs

Line item	Unit price	Monthly cost (30 calls × 90 s = 45 min)
Twilio voice (US inbound)	$0.0085/min	$0.38
Twilio number rental	$1.15/month flat	$1.15
Gemini Live realtime audio	~$0.50/min	$22.50
Agent consults (~3 per call, Sonnet)	~$0.015 each	$1.35
Post-call write-up skill	~$0.02 each	$0.60
Total		~$26/month

Gemini Live is the line item that scales fastest. If you route calls to a cheaper realtime provider when one becomes available, that table changes a lot. If you're on a Pro Anthropic subscription with the CLI workaround in place, the agent_consult line item drops toward zero until you hit the session ceiling.
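
To rerun the estimate for your own call volume, the arithmetic is small enough to script. The rates below are the approximate figures quoted in this article, not live pricing, and the function and constant names are made up for the example.

```python
# Approximate rates quoted in this article (assumptions, not live pricing).
TWILIO_PER_MIN = 0.0085       # US inbound voice
NUMBER_RENTAL = 1.15          # flat, per month
GEMINI_LIVE_PER_MIN = 0.50    # realtime audio
CONSULT_EACH = 0.015          # one agent_consult run
WRITE_UP_EACH = 0.02          # one post-call write-up


def monthly_cost(calls, avg_seconds, consults_per_call):
    """Return a per-line-item cost breakdown in USD, plus the total."""
    minutes = calls * avg_seconds / 60
    items = {
        "twilio_voice": minutes * TWILIO_PER_MIN,
        "number_rental": NUMBER_RENTAL,
        "gemini_live": minutes * GEMINI_LIVE_PER_MIN,
        "agent_consults": calls * consults_per_call * CONSULT_EACH,
        "write_ups": calls * WRITE_UP_EACH,
    }
    items["total"] = sum(items.values())
    return items


if __name__ == "__main__":
    for name, cost in monthly_cost(calls=30, avg_seconds=90, consults_per_call=3).items():
        print(f"{name:15s} ${cost:6.2f}")
```

With 30 calls at 90 seconds and 3 consults each, the total comes to about $25.98. Doubling to 60 calls gives about $50.82, since everything except the number rental scales with volume.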

Things that will go wrong

Twilio webhook timeouts. If your gateway takes more than 15 seconds to respond, Twilio drops the call. Slow agent_consults will take you over the line. Set browser.actionTimeoutMs appropriately and pre-warm the consult model on plugin start.
Audio codec mismatches. Twilio Media Streams deliver 8 kHz μ-law; the Gemini Live API takes 16-bit PCM input at 16 kHz, so audio has to be transcoded and resampled in between. The smoke test catches this. Don't skip it.
The voice model improvising your business. Without the contextInjection lockdown, Gemini Live will happily invent your business hours, claim you do services you don't, quote prices you didn't authorize. Lock the tool list and write a tight system prompt that explicitly says "if you don't know, say you'll get the human in touch."
International calls. Twilio's E.164 handling needs a country prefix. The default smoke test only validates US numbers.
Dead air during consults. If agent_consult takes more than 2-3 seconds and no filler is configured, the call goes silent. The voice plugin has a filler_phrases setting. Use it.
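
On the codec point: whatever format the voice backend wants, transcoding from Twilio's μ-law starts by expanding each 8-bit sample to 16-bit linear PCM. The sketch below shows that step using the standard G.711 μ-law expansion. The function names are made up for the example, resampling from 8 kHz is left out, and nothing here reflects how the plugin itself does it.

```python
import base64


def ulaw_byte_to_pcm16(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    byte = ~byte & 0xFF            # mu-law bytes are stored bit-inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    # 0x84 (132) is the bias added during encoding; remove it after scaling.
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def decode_ulaw_payload(b64_payload: str) -> list[int]:
    """Decode a base64 mu-law payload into a list of PCM16 samples."""
    return [ulaw_byte_to_pcm16(b) for b in base64.b64decode(b64_payload)]


if __name__ == "__main__":
    # 0xFF is mu-law silence; 0x80 and 0x00 are the positive and negative extremes.
    payload = base64.b64encode(bytes([0xFF, 0x80, 0x00])).decode()
    print(decode_ulaw_payload(payload))
```

The samples still sit at 8 kHz after this step, so a resampler is needed before handing them to a backend that expects a higher rate.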

The Google Meet variant

The same plumbing answers a different transport: Google Meet. Bundled in 2026.4.24 too. googlemeet doctor --oauth walks the personal Google auth, recover_current_tab picks up an already-open Meet without opening a duplicate, and the artifact/attendance exports pull conference records, recordings, transcripts, and smart notes out as markdown.

What this means: the agent that picks up your phone calls can also drop into your Meet, take notes, and post a summary to your CRM. The same agent_consult handoff applies. The same contextInjection lockdown applies, and matters even more in a meeting where it'll happily start summarizing things you didn't say.

Or skip the Twilio account, the Gemini key, and the smoke test

On TryOpenClaw.ai, voice answering is a toggle. We provision the phone number, run the voice plugin against pooled Gemini Live capacity, ship safe defaults (read-only tools, contextInjection off, caller-ID gating on), and route post-call follow-ups through your messaging app of choice. You don't see the Twilio config, you don't see the smoke test, and you don't have to debug a 15-second webhook timeout at 9pm on a Friday.

Flat $39/month. Your phone gets answered.

Linas Valiukas

Founder of TryOpenClaw.ai. Software engineer writing about OpenClaw, self-hosting trade-offs, and what non-technical users actually need from an AI assistant.
