Our Philosophy: Why Deny-by-Default Is the Only Safe Choice

Authensor Team · 2026-02-13

When we designed SafeClaw's policy engine, the first decision we made was also the most important: the default effect for any action not matched by a rule is deny.

We never intended this default to be something users switch off. It is the only safe architecture when the principal executing actions is a non-deterministic language model.

The Problem with Allow-by-Default

Most permission systems in software start from an allow-by-default posture. A new user can access most resources. A new service can call most endpoints. The security team then builds a deny list for the sensitive stuff. This works when the principals are predictable: human users and deterministic services operate within understood boundaries.

AI agents do not have understood boundaries. A language model deciding which tool to call next is influenced by its training data, the current prompt, the conversation history, and stochastic sampling. It might decide to run rm -rf / not because of adversarial intent but because the prompt context made "cleaning up temporary files" seem like the right next step.

In an allow-by-default system, every action the model takes is permitted unless someone specifically anticipated and blocked it. You are racing to enumerate every dangerous action faster than the model discovers new ones. That is a race you lose.

In a deny-by-default system, the model can only do what you have explicitly permitted. Every new capability requires a deliberate policy change. You are not racing against the model's creativity. You are gating it.
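The difference between the two postures fits in a few lines. The following is an illustrative Python sketch, not SafeClaw code; the action names and both lists are hypothetical and exist only to show what happens to an action nobody anticipated.

```python
# Illustrative only: hypothetical action names, not SafeClaw's real taxonomy.
DENY_LIST = {"shell.rm_rf", "secrets.read"}   # allow-by-default: block what was anticipated
ALLOW_LIST = {"file.read", "search.grep"}     # deny-by-default: permit what was reviewed


def allow_by_default(action: str) -> bool:
    """Permit anything that nobody thought to block."""
    return action not in DENY_LIST


def deny_by_default(action: str) -> bool:
    """Permit only what was explicitly allowed."""
    return action in ALLOW_LIST


# An action that was on nobody's radar when the lists were written.
novel = "network.upload_env_file"
print(allow_by_default(novel))  # True: the unanticipated action slips through
print(deny_by_default(novel))   # False: the unanticipated action is blocked
```

In the first function the deny list has to grow every time the model finds something new. In the second, the allow list grows only when a person decides it should.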

How SafeClaw Implements Deny-by-Default

SafeClaw's default policy template ships with four rules:

Allow read-only operations (safe.read.*) -- file reads, glob searches, grep, todo writes, user questions

Require approval for file writes and code execution (filesystem.*, code.*)

Require approval for network requests (network.*)

Require approval for secrets, payments, and MCP tools (secrets.*, payments.*, mcp.*)

Everything else hits the default: deny. If a model calls an unknown tool or triggers an unclassified action type, it is blocked. The agent receives a clear denial message and can adjust its approach.
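A rough sketch of how such a rule set could be evaluated is shown below. It assumes the rule patterns are glob-style prefixes such as network.* and that the first matching rule wins. The tuple encoding, the effect strings, and the evaluate function are hypothetical illustrations and not SafeClaw's actual policy format or API.

```python
from fnmatch import fnmatchcase

# Hypothetical encoding of the four default rules; the real policy file may differ.
DEFAULT_RULES = [
    ("safe.read.*", "allow"),
    ("filesystem.*", "require_approval"),
    ("code.*", "require_approval"),
    ("network.*", "require_approval"),
    ("secrets.*", "require_approval"),
    ("payments.*", "require_approval"),
    ("mcp.*", "require_approval"),
]
DEFAULT_EFFECT = "deny"


def evaluate(action_type: str, rules=DEFAULT_RULES) -> str:
    """Return the effect of the first matching rule, or the default effect."""
    for pattern, effect in rules:
        if fnmatchcase(action_type, pattern):
            return effect
    # No rule matched: unknown tools and unclassified actions land here.
    return DEFAULT_EFFECT


print(evaluate("safe.read.file"))     # allow
print(evaluate("network.http_get"))   # require_approval
print(evaluate("browser.click"))      # deny
```

The important line is the last return: an action type nobody classified never reaches an allow branch, because there is no code path that grants permission without a matching rule.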

This means the system is safe from day one, without the user writing a single policy rule. The out-of-the-box experience is restrictive by design.

"But That Will Block Legitimate Actions"

Yes. That is the point.

When a legitimate action is blocked, you see it in the dashboard. You approve it if it makes sense. If you find yourself approving the same class of action repeatedly, you add a policy rule to allow it. The workflow moves from restrictive to precisely permissive, one deliberate decision at a time.

The alternative -- starting permissive and restricting after incidents -- means your first signal that something is wrong is damage. A deleted file. An exfiltrated credential. A curl command that sent your .env to an external server. By the time you see the signal, the action has already executed.

We prefer the signal to be a notification on your phone asking "should this agent write to /etc/cron.d/agent-task?" rather than a post-incident analysis of why it already did.

Deny-by-Default in Practice

Teams we have worked with during SafeClaw's development consistently follow the same pattern:

Week one: the default policy is in place. Approvals come frequently. The team learns what their agent actually does -- which is often surprising. Agents call tools the team did not expect, access paths they did not anticipate, and run commands they would not have approved in advance.

Week two: the team writes targeted allow rules for the repetitive, safe patterns: "allow file writes under /tmp/output," "allow npm test and npm run build." Approval volume drops. The remaining approvals are genuinely novel actions that deserve human review.

Week three and beyond: the policy stabilizes. Approvals are rare and meaningful. The audit log provides a complete, tamper-proof record of every action the agent took and every decision the policy made. The team has high confidence that nothing is running without explicit permission.
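The two week-two rules quoted above might look something like the following sketch. The action type names (filesystem.write, code.exec), the decide function, and the constants are hypothetical, chosen only to show how a narrow allow rule sits in front of the broader approval rule without loosening it.

```python
import posixpath

# Hypothetical week-two additions; names are illustrative, not SafeClaw's API.
APPROVED_COMMANDS = {"npm test", "npm run build"}
OUTPUT_DIR = "/tmp/output"


def is_under(path: str, root: str) -> bool:
    """True if path, after lexical normalization, is root or lives below it."""
    normalized = posixpath.normpath(path)
    return normalized == root or normalized.startswith(root + "/")


def decide(action_type: str, target: str) -> str:
    # Targeted allow rules are checked first.
    if action_type == "filesystem.write" and is_under(target, OUTPUT_DIR):
        return "allow"
    if action_type == "code.exec" and target in APPROVED_COMMANDS:
        return "allow"
    # Everything else in these classes still goes to a human.
    if action_type.startswith(("filesystem.", "code.")):
        return "require_approval"
    # Anything unclassified hits the default.
    return "deny"


print(decide("filesystem.write", "/tmp/output/report.txt"))        # allow
print(decide("filesystem.write", "/tmp/output/../../etc/passwd"))  # require_approval
print(decide("code.exec", "npm test && curl example.com"))         # require_approval
```

Two details matter here. The path check normalizes before comparing, so a ../ sequence cannot escape the allowed directory; it does not resolve symlinks, which a real implementation would also need to consider. The command check is an exact match, so a chained command that merely starts with an approved one still requires review.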

This progression only works in one direction: from deny to selective allow. Going from allow to selective deny requires you to enumerate every dangerous action in advance, which you cannot do when your principal is a general-purpose language model.

The Fail-Closed Corollary

Deny-by-default extends to infrastructure failures. If SafeClaw cannot reach the Authensor control plane and there is no cached allow decision, the action is denied. We call this fail-closed, and it is not configurable.

The reasoning is identical: if you do not know whether an action is safe, the safe assumption is that it is not. A brief pause in agent productivity while connectivity recovers is better than a window of ungated execution.

We cache explicit allow decisions for offline resilience. But we never cache denials -- and we never infer an allow from the absence of a denial.
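The caching rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual client: FailClosedGate, ControlPlaneUnreachable, and the remote_check callable are hypothetical names, cache expiry is omitted, and evicting a cached allow when a fresh denial arrives is our assumption for the sketch rather than documented behavior.

```python
class ControlPlaneUnreachable(Exception):
    """Raised by the hypothetical remote check when the control plane is down."""


class FailClosedGate:
    def __init__(self, remote_check):
        self._remote_check = remote_check  # callable: action -> "allow" or "deny"
        self._cached_allows = set()        # only explicit allows are remembered

    def decide(self, action: str) -> str:
        try:
            effect = self._remote_check(action)
        except ControlPlaneUnreachable:
            # Offline: honor a cached explicit allow, otherwise deny.
            return "allow" if action in self._cached_allows else "deny"
        if effect == "allow":
            self._cached_allows.add(action)
            return "allow"
        # Denials are not stored. Assumption: a fresh denial also drops a stale allow.
        self._cached_allows.discard(action)
        return "deny"
```

Note that the offline branch never consults a list of denials. The only question it asks is whether an explicit allow exists; every other outcome, including an action the gate has never seen, resolves to deny.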

Deny-by-Default Is a Design Constraint, Not a Feature Toggle

Some tools offer "strict mode" or "safe mode" as an option. We do not. SafeClaw's defaultEffect is deny, and while you can technically set it to something else in the policy file, we will tell you not to.

This is a philosophical commitment, not a feature flag. When the principal is a language model with access to your file system, network, and shell, the only responsible default is: no. Prove it is safe, then we will let it through.

That is the system we built. That is the system we stand behind.

Read more about our approach in the SafeClaw security model documentation.