Designing Fail-Closed Offline Mode for Agent Safety

Authensor Team · 2026-02-13

What happens to your AI agent's safety guardrails when you lose your internet connection? If the answer is "they stop working," you have a serious problem.

SafeClaw was designed from day one to work entirely offline. No cloud dependency. No phone-home requirement. No degraded safety mode. When you're offline, SafeClaw provides the exact same protection as when you're online. Here's why that matters and how we built it.

The Fail-Open Problem

Many security tools fail open — when they can't reach their backend, they default to allowing everything. The reasoning is that blocking legitimate work is worse than a temporary security gap. For traditional web applications, this trade-off can be reasonable.

For AI agent safety, it's catastrophic. If guardrails disappear when the network does, an agent gains an incentive to disrupt connectivity, whether as an accidental side effect of pursuing its task or, in adversarial scenarios, deliberately. More practically, developers working on planes, in cafes with spotty Wi-Fi, or in air-gapped environments deserve the same safety guarantees as everyone else.

Fail-Closed by Design

SafeClaw's approach is fail-closed: if any component required for a safety decision is unavailable, the action is denied. But we went further than just failing closed on errors — we eliminated the dependencies that could fail in the first place.
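The fail-closed rule can be illustrated with a minimal Python sketch. This is not SafeClaw's actual code: `Decision`, `evaluate_fail_closed`, and the `check` callback are hypothetical names chosen for illustration. The point is only that an exception, or any result other than an explicit `True`, ends in a denial.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def evaluate_fail_closed(action: dict, check: Callable[[dict], bool]) -> Decision:
    """Run a policy check and treat every failure mode as a denial.

    `check` is a hypothetical stand-in for whatever component makes the
    safety decision. If it raises, or returns anything other than True,
    the action is denied.
    """
    try:
        result = check(action)
    except Exception as exc:
        # The component needed for the decision is unavailable or broken.
        return Decision(False, f"denied: check failed ({type(exc).__name__})")
    if result is not True:
        # Ambiguous answers (None, 0, "yes") never count as approval.
        return Decision(False, "denied by policy")
    return Decision(True, "allowed by policy")
```

Note the asymmetry: there is exactly one path to "allowed" and every other path, including ones nobody anticipated, lands on "denied".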

Local Policy Engine: All policy evaluation happens locally. Your policy configuration is a file on disk, parsed and compiled at startup. There's no policy server to query, no cloud-hosted rule set to download. The policy engine needs nothing but the local filesystem.

Local Classification: The action classifier is entirely deterministic and runs locally. No ML models that need API access, no classification services. The classifier is compiled code that runs in-process.

Local Storage: Session data, audit logs, and budget tracking all use local storage. SQLite for structured data, JSON Lines for append-only logs. No database server, no cloud storage.

Local Dashboard: Even the SafeClaw dashboard runs locally. It's a lightweight web server bound to localhost, serving a pre-built frontend. When you open the dashboard, you're talking to your own machine.
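To make the local pieces concrete, here is a rough Python sketch of how a policy file, a deterministic classifier, and a JSON Lines audit log could fit together using only the filesystem. Everything in it is an assumption for illustration: the JSON policy format with an `allow` list, the category names, and the function names are invented here and do not describe SafeClaw's real schema or classifier.

```python
import json
from pathlib import Path


def load_policy(path: Path) -> dict:
    """Parse the policy file once at startup; missing or malformed files raise."""
    with path.open(encoding="utf-8") as fh:
        policy = json.load(fh)
    if not isinstance(policy, dict) or not isinstance(policy.get("allow"), list):
        raise ValueError("policy must be an object with an 'allow' list")
    return policy


def classify(action: dict) -> str:
    """Deterministic, in-process classification: same input, same category."""
    command = action.get("command", "")
    if command.startswith(("rm ", "sudo ")):
        return "destructive"
    if action.get("url"):
        return "network"
    return "read"


def append_audit(log_path: Path, record: dict) -> None:
    """Append one JSON object per line (JSON Lines)."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def decide(policy: dict, action: dict, log_path: Path) -> bool:
    """Classify the action, check it against the policy, and log the outcome."""
    category = classify(action)
    allowed = category in policy["allow"]
    append_audit(log_path, {"category": category, "allowed": allowed})
    return allowed
```

Nothing in this sketch opens a socket, so it behaves the same with or without a network connection.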

What Changes Offline

While core safety enforcement is identical online and offline, a few features are naturally affected:

Webhook Notifications: If you've configured webhooks to Slack or Discord, those obviously can't fire without network access. SafeClaw queues webhook payloads locally and delivers them when connectivity returns. No notifications are lost.

Remote Escalation: If you use mobile push notifications for escalation approvals, those require network access. Offline escalations fall back to the local dashboard and CLI. The escalation itself is never skipped; only the notification channel changes.

Telemetry: If you've opted into anonymous usage telemetry, data is buffered locally and sent when connectivity resumes.
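The queue-and-deliver-later behavior described for webhooks can be sketched with Python's built-in `sqlite3` module. This is an illustrative assumption, not SafeClaw's implementation: the `webhook_queue` table, the function names, and the `send` callback (which is expected to raise `OSError` while offline) are all hypothetical.

```python
import json
import sqlite3
from typing import Callable


def open_queue(path: str) -> sqlite3.Connection:
    """Open (or create) a local queue; ':memory:' works for experiments."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webhook_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "url TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, url: str, payload: dict) -> None:
    """Persist the payload before any delivery attempt is made."""
    conn.execute(
        "INSERT INTO webhook_queue (url, payload) VALUES (?, ?)",
        (url, json.dumps(payload)),
    )
    conn.commit()


def flush(conn: sqlite3.Connection, send: Callable[[str, dict], None]) -> int:
    """Deliver queued payloads oldest first; stop at the first failure."""
    delivered = 0
    rows = conn.execute(
        "SELECT id, url, payload FROM webhook_queue ORDER BY id"
    ).fetchall()
    for row_id, url, payload in rows:
        try:
            send(url, json.loads(payload))
        except OSError:
            break  # still offline: keep this row and all later rows queued
        # Delete only after a successful send, so nothing is dropped.
        conn.execute("DELETE FROM webhook_queue WHERE id = ?", (row_id,))
        conn.commit()
        delivered += 1
    return delivered
```

Deleting a row only after its send succeeds gives at-least-once delivery: a crash between the send and the delete can produce a duplicate notification, but not a lost one.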

Testing Offline Mode

We test offline mode rigorously. Our CI pipeline includes a full test suite that runs with network access disabled at the OS level. Every feature that works online must work offline, or the build fails.

We also test the transitions: going from online to offline mid-session, coming back online with queued webhooks, and starting SafeClaw in a fully air-gapped environment. These transition states are where bugs hide, so we test them explicitly.
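For readers who want a lightweight approximation of a network-disabled test run in their own projects, the sketch below blocks outbound connections inside a single Python process. It is a different and weaker technique than disabling the network at the OS level as described above: it only intercepts `socket.socket.connect`, and the names `no_network` and `NetworkBlocked` are hypothetical.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class NetworkBlocked(RuntimeError):
    """Raised when code under test tries to open a network connection."""


def _blocked_connect(self, address):
    raise NetworkBlocked(f"unexpected connection attempt to {address!r}")


@contextmanager
def no_network():
    """Make every socket.connect() call in this process fail loudly.

    In-process approximation only: it does not cover connect_ex(),
    connectionless sends, or subprocesses, which OS-level isolation would.
    """
    with mock.patch.object(socket.socket, "connect", _blocked_connect):
        yield
```

Wrapping a test body in `with no_network():` turns any accidental network dependency into an immediate, descriptive failure rather than a timeout.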

Why This Architecture

The zero-dependency, local-first architecture wasn't just about offline support. It's about trust. When SafeClaw runs entirely on your machine, you can verify everything it does. There's no opaque cloud service making decisions about your agent's behavior. No data leaves your machine unless you explicitly configure it to.

This is also why SafeClaw is open source — so you can read every line of code that stands between your AI agent and your system. Check it out on GitHub, and read the full architecture documentation on our docs site.

Offline mode isn't a feature we added. It's a consequence of building SafeClaw the right way — local-first, dependency-free, and fail-closed.