Authensor

From Internal Tool to Open Source: SafeClaw's Journey

Authensor Team · 2026-02-13

SafeClaw was not planned as a product. It started as a script we wrote for ourselves because we needed to stop our own AI agents from doing things we did not authorize.

The Internal Need

In January 2026, we were using AI agents internally at Authensor for code generation, documentation, and infrastructure tasks. The agents were productive. They were also unpredictable.

One agent, tasked with "cleaning up the test directory," deleted a fixtures folder that other tests depended on. Another agent, asked to "set up a development environment," ran npm install with a package it hallucinated, pulling in an unknown dependency. A third agent, working on documentation, made a network request to fetch example data from a URL it constructed from the prompt context -- a URL that did not exist and could have been anything.

None of these were catastrophic. All of them were preventable. The pattern was clear: agents have broad tool access, and the tools execute immediately with no intermediate check. We needed a way to review what agents wanted to do before they did it.

Version Zero

The first version was a single file: a PreToolUse hook for the Claude Agent SDK that logged every tool call to stderr and blocked anything involving shell execution or file writes outside of /tmp. It was crude. It worked.
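For illustration, a hook along those lines might look like the sketch below. This is a hypothetical reconstruction, not SafeClaw's actual code; the tool names, input fields, and decision shape are assumptions about how such a PreToolUse-style callback could be wired up.

```typescript
// Hypothetical sketch of the version-zero check, not SafeClaw's actual code.
// Assumes a PreToolUse-style callback that receives the tool name and input
// and returns an allow/block decision; tool and field names are illustrative.

type ToolCall = { toolName: string; input: Record<string, unknown> };
type Decision = { allow: boolean; reason?: string };

const SHELL_TOOLS = new Set(["Bash"]);          // assumed shell tool name
const WRITE_TOOLS = new Set(["Write", "Edit"]); // assumed file-write tool names

export function preToolUseCheck(call: ToolCall): Decision {
  // Log every tool call to stderr so there is at least a paper trail.
  console.error(`[safeclaw-v0] ${call.toolName} ${JSON.stringify(call.input)}`);

  // Block all shell execution outright.
  if (SHELL_TOOLS.has(call.toolName)) {
    return { allow: false, reason: "shell execution is not permitted" };
  }

  // Block file writes that land outside /tmp.
  if (WRITE_TOOLS.has(call.toolName)) {
    const path = String(call.input["file_path"] ?? "");
    if (!path.startsWith("/tmp/")) {
      return { allow: false, reason: `write outside /tmp: ${path}` };
    }
  }

  return { allow: true };
}
```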

Within a day, we added a policy file so we could configure which actions were allowed without editing code. Within two days, we had a classifier that mapped tool names to action types. By the end of the first week, we had the gateway, the Authensor control plane integration, a CLI, and the foundation of what would become SafeClaw.
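To make the policy-file and classifier ideas concrete, here is a minimal sketch of that separation. The schema, action types, and tool names are illustrative assumptions, not SafeClaw's actual policy format.

```typescript
// Illustrative only: an assumed policy shape and tool classifier,
// not SafeClaw's real schema or classification logic.

type ActionType = "shell" | "file_write" | "file_read" | "network" | "other";
type PolicyDecision = "allow" | "deny" | "ask";

// Classifier: map raw tool names to coarse action types.
const TOOL_ACTIONS: Record<string, ActionType> = {
  Bash: "shell",
  Write: "file_write",
  Edit: "file_write",
  Read: "file_read",
  WebFetch: "network",
};

// Policy loaded from a config file instead of hardcoded in the hook.
interface Policy {
  default: PolicyDecision;
  actions: Partial<Record<ActionType, PolicyDecision>>;
}

const policy: Policy = {
  default: "ask",
  actions: { shell: "deny", file_read: "allow", network: "ask" },
};

export function decide(toolName: string): PolicyDecision {
  const action: ActionType = TOOL_ACTIONS[toolName] ?? "other";
  return policy.actions[action] ?? policy.default;
}
```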

We were customer zero. Every feature we built, we built because we needed it.

The Sequence

Here is how SafeClaw grew from that initial script:

Week 1 (Jan 27-31). The core: gateway hook, classifier, CLI, basic policy templates, container mode, standalone approvals UI. We shipped v0.1.0 on January 31 with just enough to be useful.

Week 2 (Feb 1-7). Rapid iteration based on our own daily use. We added the browser dashboard because approving actions from the CLI was slow. We added OpenAI support because not all our internal tools use Claude. We built the audit ledger because we wanted a record of what our agents had done. Budget controls came after an agent ran an expensive sequence of API calls we did not anticipate. Analytics came because we wanted to understand usage patterns across our team.

Week 3 (Feb 8-13). Hardening for public release. Policy versioning and rollback, time-based rules, the PWA with swipe approvals, security hardening (CSRF, ReDoS protection, secret redaction, rate limiting), integration tests against a real HTTP server, and the risk signal system. The test count grew from 250 to 446.

Every feature in SafeClaw traces back to a real problem we encountered while running AI agents internally. The policy engine exists because hardcoded rules did not scale. The offline cache exists because our control plane had a brief outage and all agent work stopped. The doctor command exists because we got tired of debugging configuration issues manually.

The Decision to Open Source

Two weeks into building SafeClaw, we had a tool that solved a real problem and was stable enough that we relied on it daily. We had two options: keep it internal or open source it.

The argument for keeping it internal was weak. We are a small team. The value of SafeClaw is not in keeping it proprietary -- it is in making agent safety a solved problem so that teams trust agents enough to use them, including with services like Authensor.

The argument for open sourcing was strong:

  • Security software should be auditable. Closed-source safety tools require blind trust. Open-source safety tools can be verified.
  • The problem is industry-wide. Every team running AI agents faces the same risks. Limiting the solution to our team when the problem affects everyone felt wrong.
  • Community feedback makes it better. We built SafeClaw for our workflow. Other teams have different agents, different tools, different risk profiles. Their feedback expands what SafeClaw handles.
  • MIT license removes friction. We wanted zero barriers to adoption. No license negotiations, no vendor lock-in concerns, no "can we use this in production" legal reviews.

We published the repository on GitHub, wrote documentation, and launched the beta. The entire client is open: 25 source files, the dashboard, the policy templates, the Dockerfile, and all 446 tests.

What We Learned as Customer Zero

Being our own first user taught us things we would not have learned from hypothetical use cases:

Approval fatigue is real. In the first week, we approved dozens of actions per task. We learned to write precise policy rules quickly. This directly influenced our decision to build the policy simulation feature and the policy recommendation system on our roadmap.

Agents call tools you do not expect. We anticipated file writes and shell commands. We did not anticipate agents making web searches to look up library documentation, calling MCP tools from servers we forgot we had configured, or attempting to read SSH keys as part of a "check the environment" step. The classifier's broad coverage and the risk signal system both came from these surprises.

Audit logs are invaluable. When something went wrong, the audit trail let us reconstruct exactly what happened. When we showed the audit log to a colleague considering agent adoption, their first reaction was "I want this for compliance."

Container mode is worth the overhead. For tasks involving untrusted prompts or experimental agents, the combination of policy gating and container isolation gave us confidence to let agents run without hovering over the terminal.

Where We Are Now

SafeClaw is in beta. It is the tool we use every day internally, and now it is available to everyone. We are iterating quickly -- two beta releases in three days -- and our roadmap is driven by the same principle that created the project: solve real problems for people running real AI agents.

If you are running AI agents without gating, you are where we were in late January: one confident agent decision away from a preventable incident.

Try SafeClaw:

```bash
npx @authensor/safeclaw
```

Browse the source: github.com/AUTHENSOR/SafeClaw