Introducing SafeClaw 1.0 Beta: Safe-by-Default AI Agents

Authensor Team · 2026-02-13

Today we are releasing SafeClaw 1.0 Beta. It is the first open-source tool that gates every action an AI agent takes -- file writes, shell commands, network requests, MCP tool calls -- through a policy engine before execution.

Install it in one command:

```bash
npx @authensor/safeclaw
```

Your browser opens. A setup wizard walks you through provider configuration. You are running a safe-by-default agent in under sixty seconds.

What SafeClaw Does

AI agents are powerful. They can write code, manage files, run commands, make HTTP requests, and interact with external services. But without gating, every one of those capabilities is also a risk vector.

SafeClaw intercepts every tool call your agent makes, classifies it, evaluates it against your safety policy, and enforces the result. Three outcomes are possible:

Allow: the action proceeds immediately
Deny: the action is blocked and the agent is told why
Require approval: the action is paused while you review and decide

Nothing executes without passing through this gate. The default policy denies unknown actions, allows safe reads, and requires your approval for everything else.
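
To make the three outcomes concrete, here is a minimal sketch of such a gate. It is an illustration only, not SafeClaw's source: the `ToolCall` shape, the action kind strings, and the `decide` and `gate` helpers are all names assumed for this example.

```typescript
// Hypothetical types and names; none of these come from SafeClaw's source.
type Decision = "allow" | "deny" | "require_approval";

interface ToolCall {
  kind: string;     // e.g. "file.read", "file.write", "shell.exec"
  resource: string; // path, command line, or URL
}

// Assumed action kinds, chosen only for this example.
const KNOWN_KINDS = new Set(["file.read", "file.write", "shell.exec", "network.request"]);
const SAFE_READS = new Set(["file.read"]);

// Mirrors the default policy described above: unknown actions are denied,
// safe reads are allowed, and everything else waits for a human.
function decide(call: ToolCall): Decision {
  if (!KNOWN_KINDS.has(call.kind)) return "deny";
  if (SAFE_READS.has(call.kind)) return "allow";
  return "require_approval";
}

// The gate wraps execution so nothing runs without a decision.
async function gate<T>(
  call: ToolCall,
  execute: () => Promise<T>,
  askHuman: (call: ToolCall) => Promise<boolean>,
): Promise<T> {
  const decision = decide(call);
  if (decision === "deny") {
    // The error message is what tells the agent why the action was blocked.
    throw new Error(`Denied by policy: ${call.kind} on ${call.resource}`);
  }
  if (decision === "require_approval" && !(await askHuman(call))) {
    throw new Error(`Rejected by reviewer: ${call.kind} on ${call.resource}`);
  }
  return execute();
}
```

The point of the wrapper shape is that the agent never holds a direct reference to the side effect: it can only reach `execute` through `gate`.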

Feature Highlights

Multi-provider support. SafeClaw works with Claude (via the Anthropic Agent SDK) and OpenAI (via a custom GPT-4o agent loop we built with zero dependencies). Pick your provider in the setup wizard or switch with `safeclaw init --provider openai`.

Browser dashboard. A full PWA served from localhost. It includes a setup wizard, task runner with live streaming, approval center, analytics with cost tracking, a visual policy editor with versioning and rollback, and a diagnostics clinic. It works on desktop and mobile.

Deny-by-default policy engine. Rules are evaluated in first-match-wins order. Conditions support the `eq`, `startsWith`, `contains`, `matches` (regex with ReDoS protection), and `in` operators. Rules can have UTC time schedules and auto-expire timestamps. Every policy change is auto-versioned with backup and rollback.
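
First-match-wins evaluation with these five operators can be sketched as follows. The rule schema (`Rule`, `Condition`, `expiresAt`) is an assumption made for illustration and may not match SafeClaw's real policy format; time schedules and ReDoS protection are left out to keep the sketch short.

```typescript
// Hypothetical rule shape; SafeClaw's real policy schema may differ.
type Operator = "eq" | "startsWith" | "contains" | "matches" | "in";
type Effect = "allow" | "deny" | "require_approval";
type Action = { kind: string; resource: string };

interface Condition {
  field: "kind" | "resource";
  op: Operator;
  value: string | string[];
}

interface Rule {
  effect: Effect;
  conditions: Condition[];
  expiresAt?: string; // ISO 8601 UTC timestamp; the rule is skipped afterwards
}

function testCondition(c: Condition, action: Action): boolean {
  const actual = action[c.field];
  const value = c.value;
  if (c.op === "in") return Array.isArray(value) && value.includes(actual);
  if (typeof value !== "string") return false; // other operators take one string
  switch (c.op) {
    case "eq": return actual === value;
    case "startsWith": return actual.startsWith(value);
    case "contains": return actual.includes(value);
    // A real engine would guard this against ReDoS; this sketch does not.
    case "matches": return new RegExp(value).test(actual);
    default: return false;
  }
}

function evaluate(rules: Rule[], action: Action, now: Date = new Date()): Effect {
  for (const rule of rules) {
    if (rule.expiresAt && Date.parse(rule.expiresAt) <= now.getTime()) continue;
    // The first rule whose conditions all hold decides the outcome.
    if (rule.conditions.every((c) => testCondition(c, action))) return rule.effect;
  }
  return "deny"; // deny by default when no rule matches
}

// Order matters: the .env rule sits above the broader read rule.
const exampleRules: Rule[] = [
  { effect: "deny", conditions: [{ field: "resource", op: "contains", value: ".env" }] },
  { effect: "allow", conditions: [{ field: "kind", op: "eq", value: "file.read" }] },
];
```

With `exampleRules`, reading `src/.env` is denied even though a later rule allows reads, because evaluation stops at the first match.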

Tamper-proof audit ledger. Every gateway decision -- allow, deny, or approval -- is logged to an append-only JSONL file with SHA-256 hash chaining. Run `safeclaw audit verify` to cryptographically verify that the entire chain has not been tampered with.
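
The idea behind hash chaining is that each entry commits to the hash of the one before it, so editing any earlier line breaks every later link. Below is a small sketch using Node's built-in `node:crypto` module. The entry layout (`LedgerEntry`, `prevHash`, the all-zero genesis value) is an assumption for illustration, not the actual ledger format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry layout; the real ledger format is not shown in this post.
interface LedgerEntry {
  decision: string;
  action: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string;     // SHA-256 over prevHash plus this entry's payload
}

const GENESIS = "0".repeat(64);

function entryHash(prevHash: string, decision: string, action: string): string {
  return createHash("sha256")
    .update(JSON.stringify([prevHash, decision, action]))
    .digest("hex");
}

// Returns the ledger with one more JSONL line; a real writer would append to a file.
function appendEntry(lines: string[], decision: string, action: string): string[] {
  const prevHash = lines.length === 0
    ? GENESIS
    : (JSON.parse(lines[lines.length - 1]) as LedgerEntry).hash;
  const entry: LedgerEntry = {
    decision,
    action,
    prevHash,
    hash: entryHash(prevHash, decision, action),
  };
  return [...lines, JSON.stringify(entry)];
}

// Recomputes every hash and checks each link back to the genesis value.
function verifyChain(lines: string[]): boolean {
  let expectedPrev = GENESIS;
  for (const line of lines) {
    const e = JSON.parse(line) as LedgerEntry;
    if (e.prevHash !== expectedPrev) return false;
    if (e.hash !== entryHash(e.prevHash, e.decision, e.action)) return false;
    expectedPrev = e.hash;
  }
  return true;
}
```

Changing a single field in any line makes `verifyChain` return `false`, which is the property a verify command relies on.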

Risk signals. The classifier detects five categories of suspicious patterns: `obfuscated_execution`, `pipe_to_external`, `credential_adjacent`, `broad_destructive`, and `persistence_mechanism`. These show as advisory badges on approval requests so you know why a command deserves extra scrutiny.
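
As a rough illustration of how a command could be tagged with these categories, consider the pattern-based sketch below. Only the five category names come from this post; every regular expression is an assumption invented for the example and is far cruder than a production classifier would need to be.

```typescript
// The five category names come from the post; every pattern below is an
// illustrative assumption, not SafeClaw's actual classifier.
type RiskSignal =
  | "obfuscated_execution"
  | "pipe_to_external"
  | "credential_adjacent"
  | "broad_destructive"
  | "persistence_mechanism";

const PATTERNS: Array<[RiskSignal, RegExp]> = [
  // Decoding base64 and piping the result into a shell.
  ["obfuscated_execution", /base64\s+(-d|--decode)\b.*\|\s*(ba)?sh\b/],
  // Piping local output into a network client.
  ["pipe_to_external", /\|\s*(curl|wget|nc)\b/],
  // Touching files that commonly hold secrets.
  ["credential_adjacent", /(\.env\b|\.ssh\/|\.aws\/credentials)/],
  // Recursive delete aimed at a root, home, or wildcard target.
  ["broad_destructive", /\brm\s+-\w*r\w*\s+(\/|~|\*)(\s|$)/],
  // Common ways to make a command survive a reboot or new shell.
  ["persistence_mechanism", /(crontab\b|\.bashrc\b|systemctl\s+enable\b)/],
];

// Returns every signal whose pattern matches, in table order.
function riskSignals(command: string): RiskSignal[] {
  return PATTERNS.filter(([, re]) => re.test(command)).map(([signal]) => signal);
}
```

Because the signals are advisory, the function returns labels rather than a verdict; the decision itself stays with the policy and the reviewer.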

Container mode. Run your agent inside a Docker or Podman container with a read-only root filesystem, resource limits (2 GB memory, 2 CPUs, 256 PIDs), and a mounted workspace volume. API keys are passed via environment variables, never baked into the image. Enable it with `safeclaw run --container "your task"`.
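
For readers curious what those limits look like at the container runtime level, the sketch below builds an equivalent `docker run` or `podman run` argument list. The flags are standard runtime options, but the image name, the `/workspace` mount point, and the `containerArgs` helper are placeholders assumed for this example; SafeClaw's real invocation may differ.

```typescript
// Hypothetical helper; the option names here are invented for this sketch.
interface ContainerOptions {
  runtime: "docker" | "podman";
  image: string;         // placeholder image name supplied by the caller
  workspace: string;     // host directory to mount
  envVarNames: string[]; // names only; values are read from the host env
}

function containerArgs(opts: ContainerOptions, task: string): string[] {
  return [
    opts.runtime, "run", "--rm",
    "--read-only",         // read-only root filesystem
    "--memory", "2g",      // 2 GB memory limit
    "--cpus", "2",         // 2 CPUs
    "--pids-limit", "256", // at most 256 processes
    "-v", `${opts.workspace}:/workspace`,
    // `-e NAME` without a value forwards the variable from the host
    // environment, so the key is neither in the image nor in the argv.
    ...opts.envVarNames.flatMap((name) => ["-e", name]),
    opts.image,
    task,
  ];
}
```

Passing only the variable name keeps the secret out of shell history and process listings, which fits the goal of never baking keys into the image.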

Mobile-first approvals. The dashboard is a PWA you can install on your phone. Swipe right to approve, left to reject. Browser notifications alert you to pending actions. For teams that need SMS, we have Twilio integration built in.

Budget controls. Set daily, weekly, or monthly spending caps. When a cap is approached or exceeded, SafeClaw can warn, require approval, or block further actions.
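
The cap logic can be pictured as a small threshold check. In this sketch the `BudgetCap` shape, the `warnAt` fraction, and the `checkBudget` function are assumptions made for illustration; the post does not specify how "approached" is defined.

```typescript
type BudgetAction = "ok" | "warn" | "require_approval" | "block";

// Hypothetical configuration shape for one daily, weekly, or monthly cap.
interface BudgetCap {
  limitUsd: number;
  warnAt: number; // fraction of the cap that counts as "approached" (assumed)
  onExceeded: "require_approval" | "block";
}

function checkBudget(spentUsd: number, cap: BudgetCap): BudgetAction {
  if (spentUsd >= cap.limitUsd) return cap.onExceeded;
  if (spentUsd >= cap.limitUsd * cap.warnAt) return "warn";
  return "ok";
}
```

For example, with a 10 USD cap and `warnAt: 0.8`, spending 9 USD yields a warning, and reaching 10 USD triggers whichever action `onExceeded` names.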

Webhooks. Get notified in Slack, Discord, or any HTTP endpoint when approvals are needed or actions are taken.
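
A webhook notification mostly comes down to shaping one JSON body per target. Slack incoming webhooks read a `text` field and Discord webhooks read a `content` field; the generic payload, the `ApprovalEvent` type, and both function names below are assumptions made for this sketch rather than SafeClaw's actual format.

```typescript
type WebhookKind = "slack" | "discord" | "generic";

// Hypothetical event shape for a pending approval.
interface ApprovalEvent {
  id: string;
  kind: string;
  resource: string;
}

// Slack incoming webhooks read `text`; Discord webhooks read `content`.
// The generic shape is an assumption made for this sketch.
function buildPayload(target: WebhookKind, event: ApprovalEvent): Record<string, unknown> {
  const summary = `Approval needed: ${event.kind} on ${event.resource} (id ${event.id})`;
  if (target === "slack") return { text: summary };
  if (target === "discord") return { content: summary };
  return { type: "approval_required", ...event };
}

// Posts the payload with the global fetch available in Node 18 and later.
async function notify(url: string, target: WebhookKind, event: ApprovalEvent): Promise<boolean> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildPayload(target, event)),
  });
  return res.ok;
}
```

Keeping payload construction separate from the network call makes the formatting easy to test without hitting a real endpoint.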

446 tests across 24 files. We test the gateway, classifier, policy engine, audit chain, security boundaries, API endpoints, and integration flows. The full suite runs in CI on every commit.

What We Ship

SafeClaw's client is fully open source under the MIT license. The source is 25 files in src/, a browser dashboard in ui/, policy templates in policies/, and 24 test files. Zero third-party runtime dependencies. The only devDependency is vitest for testing.



The Authensor control plane is a hosted service that evaluates action metadata against your policy. It sees action types and sanitized resource strings. It never sees your API keys, file contents, prompts, or data.

Browse the source: github.com/AUTHENSOR/SafeClaw

Getting Started

```bash
npx @authensor/safeclaw
```

The wizard provisions a demo Authensor token automatically. Pick Claude or OpenAI, paste your API key, and give the agent a task. The default policy will require your approval for file writes, shell commands, and network requests.

For a detailed walkthrough, see our 60-second quickstart.

What Comes Next

This is a beta. We are actively working on team policy management, role-based access control, an API integration mode for headless environments, and a plugin system for custom classifiers. See our 2026 roadmap for details.

We built SafeClaw because we believe action-level gating should be the default for every AI agent deployment. This beta is our starting point. Your feedback shapes what comes next.

Try it. Break it. Tell us what is missing. We are listening.