โ† All posts
๐Ÿฆž OpenClaw2026-04-18ยท12 min

I Hosted OpenClaw in Production and Hit 10 Walls. Here's Every Fix.

I spent 2 weeks building a hosted OpenClaw platform. Not just running the one-liner โ€” actually deploying it as a multi-tenant service where users create agents on demand. Docker, Azure OpenAI, Chromium, SearXNG โ€” the whole stack.

I hit every possible wall. Here's the list, in order, with the actual fixes that worked.

Wall 1: The 8-Minute Cold Start

First deploy โ€” the agent took 8 minutes to start. The openclaw onboard command runs at container start, downloading models and configuring the provider. Users would click "Create Agent" and stare at a spinner for 8 minutes.

Fix: Pre-bake openclaw onboard into the Docker image during build. The onboard runs once during docker build, not at container start. Cold start dropped from 8 minutes to ~90 seconds.

Wall 2: Out-of-Memory Disconnects

Users would be mid-conversation and get disconnected (WebSocket code 1006). Checked the logs โ€” the container was OOMing. OpenClaw with GPT-4.1 and browser tools peaks at ~760MB RAM. I had it set to 1GB.

Fix: 2GB RAM limit, 4GB swap. No more OOM kills.

Wall 3: The Device Pairing Nightmare

OpenClaw has a security feature where browser connections need to be "paired" โ€” you approve the device from the CLI. Great for personal use. Awful for a SaaS where users connect from random browsers.

I tried writing a script that auto-approves devices. Didn't work โ€” openclaw devices approve approves the CLI's own device, not the browser.

Fix: dangerouslyDisableDeviceAuth: true in config. Skips device pairing entirely. Combined with password auth so there's still access control.

Wall 4: The CRLF Disaster

I develop on Windows. The Dockerfile used a heredoc (COPY --chmod=755 <<'ENTRYPOINT'...) for the startup script. Built fine locally. Deployed to the Linux server. Container crashes with exit code 127 โ€” "command not found."

Spent an hour debugging. The problem: the heredoc captured Windows \r\n line endings. The shebang #!/bin/sh\r\n has an invisible \r that makes Linux unable to find /bin/sh\r. The file command showed "ASCII text executable, with CRLF line terminators" โ€” but only if you think to check.

Fix: Moved the script to a separate .sh file, converted to LF, and COPY it instead of using heredocs.

Wall 5: web_search Wasn't Available

Installed SearXNG in a Docker container. Set the SEARXNG_BASE_URL env var. Agent still only had web_fetch, not web_search. Spent hours trying different config keys โ€” tools.search.provider, search.provider, tools.webSearch โ€” all rejected by config validation.

The actual config path (found by reading the docs three times):

  • โ—plugins.entries.searxng.config.webSearch.baseUrl
  • โ—tools.web.search.provider: searxng
  • โ—The SEARXNG_BASE_URL env var

You need all three. Missing any one = silent fallback to no web search. No error. No warning. The agent just doesn't have the tool.

Wall 6: Browser SSRF Policy

Installed Chromium. Set browser.enabled: true. Agent tried to visit example.com. Error: "Navigation blocked: strict browser SSRF policy requires an IP-literal URL."

OpenClaw blocks hostname-based browser navigation by default as a security measure. Reasonable for personal use. Breaking for a SaaS where the whole point is browsing the web.

Fix: Two config keys together:

  • โ—browser.ssrfPolicy.hostnameAllowlist: ["*"]
  • โ—browser.ssrfPolicy.dangerouslyAllowPrivateNetwork: true

Also needed Playwright installed (not just Chromium) โ€” for the navigate command.

Wall 7: Chromium Installed But Browser Still Didn't Work

Had Chromium in the image. SSRF policy fixed. Agent said "CDP websocket not reachable after start." Turns out OpenClaw uses Playwright's bundled headless shell, not the system Chromium.

Fix: node /app/node_modules/playwright-core/cli.js install chromium during Docker build. This installs the Playwright-specific Chromium headless shell alongside the system Chromium.

Wall 8: Reddit Blocks Everything

Built a whole Reddit monitoring use case. Agent tries to browse Reddit โ€” 403. Tries web_fetch โ€” 403. Tries the .json endpoints โ€” 403. Reddit blocks all automated access from server IPs.

Even web_search finds Reddit posts via Google/Bing, but the posts indexed by search engines are weeks/months old. Useless for "what happened today" monitoring.

Fix: No fix. Gave up on Reddit monitoring. Replaced it with a news brief that searches actual news sites (which don't block server requests). Sometimes the right fix is admitting the approach doesn't work.

Wall 9: The Agent Makes Up URLs

Asked the agent to research a topic. Got back beautiful-looking output with 10 article links. Clicked the first one โ€” 404. Second โ€” doesn't exist. Third โ€” leads to the wrong article. About 40% of URLs were hallucinated.

Fix: Two things:

  1. 1.A system prompt rule: "You MUST call web_search before writing ANY content. Every URL must come from a web_search result. If a URL is example.com or made up, you have FAILED the task."
  2. 2.Structured output formats โ€” markdown tables with column headers force the agent to actually search and fill cells instead of generating plausible-sounding text.

This dropped hallucination from ~40% to under 5%.

Wall 10: Agent Returns a Plan Instead of Results

Asked the agent to research 5 subreddits. Got back: "I'll proceed methodically. First I'll search... then I'll analyze... sit tight while I work!" That was the entire response. The agent told me what it planned to do instead of doing it.

The /v1/chat/completions API returns the first response from the agent loop. If the agent's first message is "here's my plan," that's what you get. The actual work happens in subsequent loop iterations that you never see.

Fix: System prompt: "Complete the entire task and return the FINAL result only. Do NOT return a plan or progress updates."

The Takeaway

The infra side of agents is genuinely harder than the AI side. The model works great out of the box. Getting web search, browser control, scheduled tasks, and persistent memory all configured correctly โ€” that's where the real engineering happens.

If you're building on top of OpenClaw or any agent framework, I hope this saves you some time. Every wall above cost me hours. The fixes are all one-liners โ€” but finding them wasn't.

Skip the setup walls. Try it in 60 seconds โ€” no Docker, no CLI, no config files.

Debug your OpenClaw agent

See every tool call, token, and dollar. Auto-diagnosis with fix suggestions. Free.