Home / Blog / NemoClaw Policy Engine§ Security

Security

§ Technical · 15 min read

Securing OpenClaw agents in production: lessons from NemoClaw's policy engine

The design patterns NemoClaw establishes are worth studying regardless of whether you adopt NemoClaw itself. Here is what it gets right, where it stops, and the full architecture enterprises need.

Katonic AI Engineering

Security

March 23, 2026

Announced at GTC 2026

lessons for production agent security

The takeaway:

patterns are not products. Build the complete stack.

NVIDIA's NemoClaw, announced at GTC 2026, introduced a policy engine for OpenClaw agents that is worth studying regardless of whether you adopt NemoClaw itself.

The design patterns it establishes - declarative policies, hot-reloadable controls, operator-in-the-loop approvals, and infrastructure-layer enforcement - represent the direction agent security is heading.

This post examines those patterns, identifies where they stop short of what production deployments require, and lays out the full policy architecture needed to run OpenClaw agents safely in enterprise environments.

Default-Deny Network Policy Is Non-Negotiable

NemoClaw's most important architectural decision is its default-deny network posture. When a sandbox starts, the agent can reach exactly one external endpoint: its configured inference provider. Every other outbound connection is blocked.

This inverts the typical approach. Most teams start with full internet access and try to restrict it later. NemoClaw starts with nothing and requires explicit approval to open each destination.

What this means for production

Default-deny is the correct baseline. But NemoClaw's approved endpoints are session-scoped - lost on restart. Production needs persistent policy that survives restarts. You also need tenant-aware policy. NemoClaw's flat YAML does not scale to 50 agents across 10 teams with different network requirements.

Inference Routing Must Be an Infrastructure Concern

NemoClaw treats inference routing as infrastructure, not application logic. Every AI model call is intercepted by the OpenShell gateway and routed to the configured provider. The agent never contacts an LLM endpoint directly.

This has three benefits production architectures should replicate:

Model governance: You control which models agents can use. A rogue agent cannot discover and call an unauthorized endpoint. You can switch models at runtime.

Cost control: With a single routing point, you can meter every inference call, attribute costs, and enforce budget limits.

Audit trail: Every model call passes through one gateway, creating a complete log.

What this means for production

Production deployments need multiple providers with automatic failover, tiered routing by complexity and cost, provider health tracking with circuit breakers, and rate limiting per agent, per team, and per organization.

Interactive Approval Is a Starting Point, Not a Destination

NemoClaw's TUI presents blocked requests to a human operator for real-time approval. This is useful for development. For production, interactive approval does not scale. You cannot have a human watching a TUI for every agent around the clock.

The production policy engine model

A production policy engine operates at the tool call level, not just the network level. Every tool invocation passes through policy evaluation before execution. The engine supports multiple actions beyond allow/deny:

Production Policy Actions

Policy Action	Behavior	Use Case
Allow	Tool call executes immediately with no intervention	Low-risk tools used by trusted agents in production
Block	Tool call is rejected. Agent receives a denial response.	Dangerous tools, unauthorized integrations, unapproved operations
Requires Approval	Tool call paused until a human approver grants permission, with configurable expiry	High-risk: financial transactions, data exports, production deployments
Rate Limit	Allowed up to a defined frequency. Excess calls queued or rejected.	Preventing runaway agents from flooding external APIs or exceeding budgets
PII Scan	Tool call arguments scanned for PII before execution. Matches trigger configurable actions.	Any tool sending data externally: email, CRM updates, API calls, file exports
Redact Output	Tool executes normally, but PII is stripped from the result before the agent sees it	Database queries, knowledge retrieval, any tool returning sensitive records

How a Tool Call Flows Through the Policy Engine

Policy Must Be Multi-Dimensional

NemoClaw's policies are per-sandbox. Enterprise environments need policy that accounts for:

Tenant scope: Different orgs have different compliance requirements, approved tools, and data sensitivity levels.

Team scope: Engineering teams might have broader tool access than marketing. Finance agents might have stricter PII controls.

Tool scope: Policy expressible at the tool server level (all Jira tools) and individual tool level (only read operations from Jira).

Risk classification: Tools carry risk levels that interact with policy. Low-risk reads auto-approve while high-risk writes require HITL approval.

Wildcard matching: Rules like org:acme/team:*/server:jira/tool:read_* to express broad rules concisely.

Observability Is a Security Control

NemoClaw provides a TUI showing live network activity. For production, observability needs to be structured, persistent, and integrated into existing security infrastructure:

Every tool call logged with full context: who, which agent, which tenant, arguments, result, duration, cost
Every policy decision logged: which rule matched, what action was taken, human approval status
Every guardrail event: which rail triggered, what was flagged, what action followed
Cost attribution at every level with budget alerts and anomaly detection
All flowing into structured analytics tables with retention policies, not ephemeral terminal output

Guardrails and Policy Are Different Concerns

NemoClaw conflates security (what the agent can reach) with the absence of content governance (what the agent can say or process). These are separate architectures.

Policy engine (what NemoClaw addresses)

Controls what tools the agent can use, what endpoints it can reach, what models it can call. Network and tool level. Binary or approval-based enforcement.

Guardrails engine (what NemoClaw does not address)

Controls content: prompt injection detection with trained models, PII detection on every message with 13+ entity types, content safety classification using trained NIM models, grounding checks, instruction leak prevention, and topic enforcement. A production deployment needs both.

Two Separate Architectures

§ Architecture

The full production architecture

NemoClaw Coverage vs. Production Requirements

Detailed Coverage Matrix

Layer	Function	NemoClaw Coverage	Production Gap	Priority
Runtime Sandbox	Process isolation, filesystem confinement, syscall filtering	Strong: Landlock + seccomp + netns	None - well covered	Covered
Network Policy	Egress control, per-destination approval, presets	Strong: declarative YAML, interactive TUI	Session-scoped, single-tenant, flat	High
Inference Gateway	Model routing, provider management, cost metering	Partial: single provider, runtime switching	No failover, tiering, rate limiting, cost attribution	High
Tool Governance	Per-tool policy, HITL approvals, PII scanning, risk levels	None: network layer only	Full gap - needs governance proxy	Critical
Content Guardrails	Prompt injection, PII, content safety, grounding, topics	None: no message inspection	Full gap - needs guardrails engine	Critical
Observability	Audit trails, cost attribution, budget alerts, anomaly detection	Minimal: TUI monitoring only	Full gap - needs analytics service	High

§ Recommendations

Applying these lessons

Adopt default-deny as your baseline.Start with everything blocked and open selectively. This is the single most impactful pattern NemoClaw demonstrates.

Route all inference through an infrastructure gateway.Centralized model governance, cost control, and audit in one architectural decision.

Implement tool-level governance, not just network-level.Network egress tells you where the agent connects. Tool governance tells you what it is doing. The second matters more.

Deploy trained guardrails, not regex rules.Content safety and prompt injection require models trained on real attack patterns.

Build structured observability from day one.Every tool call, policy decision, guardrail event, and cost charge in structured storage.

Design for multi-tenancy from the start.Retrofitting multi-tenancy is significantly harder than designing for it.

§ Summary

Conclusion

NemoClaw's policy engine represents a meaningful step forward for agent security. Its default-deny posture, infrastructure-level inference routing, and interactive approval model establish patterns every production deployment should adopt.

But patterns are not products. Moving from alpha sandbox to production requires multi-dimensional tool governance, trained content guardrails, structured observability, and multi-tenant policy management.

The teams that deploy agents successfully at scale are the ones that study what NemoClaw gets right, understand where it stops, and build the complete stack.

§ About

About Katonic AI

Katonic 7.0 is an enterprise AI platform built for organizations that need autonomous AI agents with full governance, security, and data sovereignty. The platform deploys entirely on your infrastructure with zero data egress. It includes 8 guardrail types powered by NVIDIA NeMo NIM models, infrastructure-layer tool governance with human-in-the-loop approvals and PII scanning, permission-aware knowledge retrieval across 50+ enterprise connectors, and complete cost attribution from day one.

To learn how Katonic approaches enterprise agent security, visit katonic.ai

Share this article

Katonic AI Engineering

The Operating System for Sovereign AI

Katonic enables enterprises to deploy AI agents, copilots, and models that run 100% on their own infrastructure with full governance, security, and data sovereignty.

Learn how Katonic approaches enterprise agent security →

§ Related articles

Keep reading.

Agent Sandbox

Security12 min read

NemoClaw vs. Docker Isolation: What Agent-Specific Sandboxing Actually Adds

Containers isolate processes from the host. Agent-specific sandboxing controls what the agent does inside the container. Two fundamentally different layers of defense.

Katonic AI12 min read

OpenClaw Risk

Security8 min read

OpenClaw Is Powerful but Risky. Here's How NVIDIA Is Fixing That.

250,000 GitHub stars. Zero enterprise security model. NVIDIA identified the gap and announced NemoClaw at GTC 2026.

Katonic AI8 min read

Agent Runtime

AI Strategy9 min read

AI Agents at Work: What NemoClaw Means for Enterprise Security Teams

AI agents are a new category of authorized internal actor. Most security frameworks were not built for this. NemoClaw is the first infrastructure-level attempt to address it.

Katonic AI9 min read

Full agent governance, on your infrastructure.

Katonic 7.0 delivers tool governance, content guardrails, and structured observability for production AI agents.

Book a Demo View Architecture

Home / Blog / NemoClaw Policy Engine§ Security

Security

§ Technical · 15 min read

Securing OpenClaw agents in production: lessons from NemoClaw's policy engine

The design patterns NemoClaw establishes are worth studying regardless of whether you adopt NemoClaw itself. Here is what it gets right, where it stops, and the full architecture enterprises need.

Katonic AI Engineering

Security

March 23, 2026

Announced at GTC 2026

lessons for production agent security

The takeaway:

patterns are not products. Build the complete stack.

NVIDIA's NemoClaw, announced at GTC 2026, introduced a policy engine for OpenClaw agents that is worth studying regardless of whether you adopt NemoClaw itself.

Default-Deny Network Policy Is Non-Negotiable

This inverts the typical approach. Most teams start with full internet access and try to restrict it later. NemoClaw starts with nothing and requires explicit approval to open each destination.

What this means for production

Inference Routing Must Be an Infrastructure Concern

This has three benefits production architectures should replicate:

Model governance: You control which models agents can use. A rogue agent cannot discover and call an unauthorized endpoint. You can switch models at runtime.

Cost control: With a single routing point, you can meter every inference call, attribute costs, and enforce budget limits.

Audit trail: Every model call passes through one gateway, creating a complete log.

What this means for production

Interactive Approval Is a Starting Point, Not a Destination

The production policy engine model

Production Policy Actions

Policy Action	Behavior	Use Case
Allow	Tool call executes immediately with no intervention	Low-risk tools used by trusted agents in production
Block	Tool call is rejected. Agent receives a denial response.	Dangerous tools, unauthorized integrations, unapproved operations
Requires Approval	Tool call paused until a human approver grants permission, with configurable expiry	High-risk: financial transactions, data exports, production deployments
Rate Limit	Allowed up to a defined frequency. Excess calls queued or rejected.	Preventing runaway agents from flooding external APIs or exceeding budgets
PII Scan	Tool call arguments scanned for PII before execution. Matches trigger configurable actions.	Any tool sending data externally: email, CRM updates, API calls, file exports
Redact Output	Tool executes normally, but PII is stripped from the result before the agent sees it	Database queries, knowledge retrieval, any tool returning sensitive records

How a Tool Call Flows Through the Policy Engine

Policy Must Be Multi-Dimensional

NemoClaw's policies are per-sandbox. Enterprise environments need policy that accounts for:

Tenant scope: Different orgs have different compliance requirements, approved tools, and data sensitivity levels.

Team scope: Engineering teams might have broader tool access than marketing. Finance agents might have stricter PII controls.

Tool scope: Policy expressible at the tool server level (all Jira tools) and individual tool level (only read operations from Jira).

Risk classification: Tools carry risk levels that interact with policy. Low-risk reads auto-approve while high-risk writes require HITL approval.

Wildcard matching: Rules like org:acme/team:*/server:jira/tool:read_* to express broad rules concisely.

Observability Is a Security Control

NemoClaw provides a TUI showing live network activity. For production, observability needs to be structured, persistent, and integrated into existing security infrastructure:

Every tool call logged with full context: who, which agent, which tenant, arguments, result, duration, cost
Every policy decision logged: which rule matched, what action was taken, human approval status
Every guardrail event: which rail triggered, what was flagged, what action followed
Cost attribution at every level with budget alerts and anomaly detection
All flowing into structured analytics tables with retention policies, not ephemeral terminal output

Guardrails and Policy Are Different Concerns

NemoClaw conflates security (what the agent can reach) with the absence of content governance (what the agent can say or process). These are separate architectures.

Policy engine (what NemoClaw addresses)

Controls what tools the agent can use, what endpoints it can reach, what models it can call. Network and tool level. Binary or approval-based enforcement.

Guardrails engine (what NemoClaw does not address)

Two Separate Architectures

§ Architecture

The full production architecture

NemoClaw Coverage vs. Production Requirements

Detailed Coverage Matrix

Layer	Function	NemoClaw Coverage	Production Gap	Priority
Runtime Sandbox	Process isolation, filesystem confinement, syscall filtering	Strong: Landlock + seccomp + netns	None - well covered	Covered
Network Policy	Egress control, per-destination approval, presets	Strong: declarative YAML, interactive TUI	Session-scoped, single-tenant, flat	High
Inference Gateway	Model routing, provider management, cost metering	Partial: single provider, runtime switching	No failover, tiering, rate limiting, cost attribution	High
Tool Governance	Per-tool policy, HITL approvals, PII scanning, risk levels	None: network layer only	Full gap - needs governance proxy	Critical
Content Guardrails	Prompt injection, PII, content safety, grounding, topics	None: no message inspection	Full gap - needs guardrails engine	Critical
Observability	Audit trails, cost attribution, budget alerts, anomaly detection	Minimal: TUI monitoring only	Full gap - needs analytics service	High

§ Recommendations

Applying these lessons

Adopt default-deny as your baseline.Start with everything blocked and open selectively. This is the single most impactful pattern NemoClaw demonstrates.

Route all inference through an infrastructure gateway.Centralized model governance, cost control, and audit in one architectural decision.

Implement tool-level governance, not just network-level.Network egress tells you where the agent connects. Tool governance tells you what it is doing. The second matters more.

Deploy trained guardrails, not regex rules.Content safety and prompt injection require models trained on real attack patterns.

Build structured observability from day one.Every tool call, policy decision, guardrail event, and cost charge in structured storage.

Design for multi-tenancy from the start.Retrofitting multi-tenancy is significantly harder than designing for it.

§ Summary

Conclusion

The teams that deploy agents successfully at scale are the ones that study what NemoClaw gets right, understand where it stops, and build the complete stack.

§ About

About Katonic AI

To learn how Katonic approaches enterprise agent security, visit katonic.ai

Share this article

Katonic AI Engineering

The Operating System for Sovereign AI

Katonic enables enterprises to deploy AI agents, copilots, and models that run 100% on their own infrastructure with full governance, security, and data sovereignty.

Learn how Katonic approaches enterprise agent security →

§ Related articles

Keep reading.

Agent Sandbox

Security12 min read

NemoClaw vs. Docker Isolation: What Agent-Specific Sandboxing Actually Adds

Containers isolate processes from the host. Agent-specific sandboxing controls what the agent does inside the container. Two fundamentally different layers of defense.

Katonic AI12 min read

OpenClaw Risk

Security8 min read

OpenClaw Is Powerful but Risky. Here's How NVIDIA Is Fixing That.

250,000 GitHub stars. Zero enterprise security model. NVIDIA identified the gap and announced NemoClaw at GTC 2026.

Katonic AI8 min read

Agent Runtime

AI Strategy9 min read

AI Agents at Work: What NemoClaw Means for Enterprise Security Teams

AI agents are a new category of authorized internal actor. Most security frameworks were not built for this. NemoClaw is the first infrastructure-level attempt to address it.

Katonic AI9 min read

Full agent governance, on your infrastructure.

Katonic 7.0 delivers tool governance, content guardrails, and structured observability for production AI agents.

Book a Demo View Architecture