Agent Guardrails
Deterministic safety enforcement for autonomous agent execution. Prevents destructive commands, credential leaks, and runaway loops through infrastructure-level controls that agents cannot bypass.
Concepts
PreToolUse and PostToolUse hooks that intercept tool calls before and after execution.
How It Works
Guardrails operate at four layers:
1. Bash Command Blocking
The PreToolUse hook on Bash matches commands against a deny-list of dangerous patterns:
| Pattern | Example | Reason |
|---|---|---|
| rm -rf / or ~ | rm -rf /home | Recursive deletion |
| chmod 777 | chmod -R 777 /var | World-writable permissions |
| curl \| sh | curl example.com \| bash | Piping remote content to shell |
| git push --force | git push -f origin main | Force push to remote |
| mkfs.* | mkfs.ext4 /dev/sda1 | Formatting filesystems |
| Fork bombs | :(){ :\|:& };: | Process explosion |
| shutdown, reboot | shutdown -h now | Host shutdown |
When a command is blocked, the agent sees a clear denial message with the reason. The event is logged to /logs/guardrails.jsonl.
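The deny-list check described above can be sketched as a small pattern matcher. This is a minimal illustration, not the platform's actual hook: the regular expressions and the `check_command` helper are assumptions written to cover the examples in the table, and the real patterns may be broader or stricter.

```python
import re
from typing import Optional

# Hypothetical deny-list: (compiled pattern, reason). The reasons come from the
# table above; the regular expressions themselves are illustrative assumptions.
DENY_PATTERNS = [
    (re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+(/|~)"), "Recursive deletion"),
    (re.compile(r"\bchmod\s+(-R\s+)?777\b"), "World-writable permissions"),
    (re.compile(r"\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b"), "Piping remote content to shell"),
    (re.compile(r"\bgit\s+push\b.*\s(--force|-f)\b"), "Force push to remote"),
    (re.compile(r"\bmkfs\.\w+"), "Formatting filesystems"),
    (re.compile(r":\(\)\s*\{\s*:\|:&\s*\}\s*;\s*:"), "Process explosion"),
    (re.compile(r"\b(shutdown|reboot)\b"), "Host shutdown"),
]


def check_command(command: str) -> Optional[str]:
    """Return the denial reason for a blocked command, or None if it is allowed."""
    for pattern, reason in DENY_PATTERNS:
        if pattern.search(command):
            return reason
    return None
```

A caller would deny the tool call and report the returned reason whenever `check_command` returns something other than `None`. Regex matching of this kind is easy to evade with quoting or variable expansion, so a sketch like this is best read as one layer rather than a complete filter.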
2. Credential File Protection
The PreToolUse hook on Edit, Write, and NotebookEdit blocks modifications to sensitive paths:
.env, .env.*: Environment files with secrets
.mcp.json: MCP server configuration
~/.ssh/*, ~/.aws/*, ~/.gcp/*: Cloud and SSH credentials
~/.claude/settings.json: Claude Code settings (hook configuration)
/opt/trinity/*: Platform guardrail files
3. Credential Leak Detection
The PostToolUse hook on Bash scans command output for leaked credentials:
| Pattern | Example Prefix |
|---|---|
| Anthropic API keys | sk-ant-... |
| OpenAI API keys | sk-proj-... |
| GitHub PATs | ghp_..., github_pat_... |
| AWS access keys | AKIA... |
| Slack tokens | xoxb-..., xoxp-... |
| Google API keys | AIza... |
Matches are logged (pattern name only, not the actual value) for security review.
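A scan of this kind might look like the sketch below. The prefixes come from the table above; the pattern names, the `find_leaks` helper, and the character classes and minimum lengths after each prefix are assumptions for illustration. The function returns only pattern names, mirroring the rule that the matched value itself is never logged.

```python
import re
from typing import List

# Hypothetical patterns keyed by name. Only the prefixes are taken from the
# table above; the suffix character classes and lengths are assumptions.
LEAK_PATTERNS = {
    "anthropic_api_key": re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}"),
    "openai_api_key": re.compile(r"sk-proj-[A-Za-z0-9_-]{8,}"),
    "github_pat": re.compile(r"(ghp_[A-Za-z0-9]{8,}|github_pat_[A-Za-z0-9_]{8,})"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "slack_token": re.compile(r"xox[bp]-[A-Za-z0-9-]{8,}"),
    "google_api_key": re.compile(r"\bAIza[0-9A-Za-z_-]{35}\b"),
}


def find_leaks(output: str) -> List[str]:
    """Return the sorted names of patterns found in the output, never the values."""
    return sorted(name for name, pattern in LEAK_PATTERNS.items() if pattern.search(output))
```

For example, `find_leaks("export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE")` returns `["aws_access_key"]`, which is the only detail a log entry would need to carry.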
4. Turn Limits
Every Claude Code invocation enforces a maximum turn count via --max-turns:
| Mode | Default | Range |
|---|---|---|
| Chat | 50 turns | 1-500 |
| Task/Headless | 20 turns | 1-500 |
This prevents runaway loops that burn through API credits.
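As a rough sketch, the turn limit can be resolved from the mode defaults in the table and passed on as command-line arguments. The `max_turns_args` helper and the mode keys `"chat"` and `"task"` are hypothetical names; only the `--max-turns` flag, the defaults, and the 1-500 range come from the text above.

```python
from typing import List, Optional

# Defaults and range from the table above; the dictionary keys are assumed names.
DEFAULT_MAX_TURNS = {"chat": 50, "task": 20}
MIN_TURNS, MAX_TURNS = 1, 500


def max_turns_args(mode: str, override: Optional[int] = None) -> List[str]:
    """Build the --max-turns arguments for a mode, using the default unless overridden."""
    turns = DEFAULT_MAX_TURNS[mode] if override is None else override
    if not MIN_TURNS <= turns <= MAX_TURNS:
        raise ValueError(f"max turns must be in {MIN_TURNS}-{MAX_TURNS}, got {turns}")
    return ["--max-turns", str(turns)]
```

Rejecting out-of-range values, rather than silently clamping them, is a design choice made here so that a misconfigured limit is visible to whoever set it.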
Per-Agent Configuration
Owners can tighten guardrails for specific agents. Overrides are additive: you can add restrictions, but you cannot remove baseline protections.
Available Overrides
| Field | Type | Description |
|---|---|---|
| max_turns_chat | int (1-500) | Max turns for chat mode |
| max_turns_task | int (1-500) | Max turns for headless tasks |
| execution_timeout_sec | int (60-7200) | Execution time limit |
| extra_bash_deny | list (max 50) | Additional bash patterns to block |
| extra_path_deny | list (max 50) | Additional paths to protect |
| disallowed_tools | list (max 50) | Claude Code tools to disable |
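The constraints in the table can be checked before an override is saved. The following is a minimal sketch under stated assumptions: `validate_overrides` is a hypothetical helper, list entries are assumed to be strings, and unknown fields are assumed to be rejected. The field names, integer ranges, and the 50-item cap come from the table.

```python
from typing import Any, Dict, List

# Ranges and list cap from the table above.
INT_RANGES = {
    "max_turns_chat": (1, 500),
    "max_turns_task": (1, 500),
    "execution_timeout_sec": (60, 7200),
}
LIST_FIELDS = ("extra_bash_deny", "extra_path_deny", "disallowed_tools")
MAX_LIST_ITEMS = 50


def validate_overrides(overrides: Dict[str, Any]) -> List[str]:
    """Return a list of error messages; an empty list means the overrides are valid."""
    errors = []
    for field, value in overrides.items():
        if field in INT_RANGES:
            low, high = INT_RANGES[field]
            # bool is a subclass of int in Python, so it is rejected explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
                errors.append(f"{field}: expected int in {low}-{high}")
        elif field in LIST_FIELDS:
            # Assumption: each list entry is a string (a pattern, path, or tool name).
            is_str_list = isinstance(value, list) and all(isinstance(v, str) for v in value)
            if not is_str_list or len(value) > MAX_LIST_ITEMS:
                errors.append(f"{field}: expected list of at most {MAX_LIST_ITEMS} strings")
        else:
            errors.append(f"{field}: unknown field")
    return errors
```

Collecting every error instead of stopping at the first one lets a caller report all problems with a submitted configuration at once.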
Configure via UI
Open the agent detail page
Go to the Config tab
Expand the Guardrails section
Adjust settings and save
Restart the agent to apply changes
Guardrails API
| Endpoint | Method | Description |
|---|---|---|
| /api/agents/{name}/guardrails | GET | Get per-agent guardrails config |
| /api/agents/{name}/guardrails | PUT | Set per-agent guardrails overrides |
After updating guardrails, stop and start the agent to apply changes. The container is recreated with the new configuration.
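A PUT request to the endpoint above might be built as in this sketch. Only the endpoint path and method come from the table; the base URL, the bearer-token authentication, the flat JSON body of override fields, and the `build_guardrails_request` helper are all assumptions, so check them against the actual API before use.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict


def build_guardrails_request(
    base_url: str, agent_name: str, overrides: Dict[str, Any], token: str
) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets per-agent guardrail overrides."""
    # Quote the agent name so that special characters cannot alter the path.
    name = urllib.parse.quote(agent_name, safe="")
    url = f"{base_url.rstrip('/')}/api/agents/{name}/guardrails"
    return urllib.request.Request(
        url,
        data=json.dumps(overrides).encode("utf-8"),  # assumed body shape: flat JSON object
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed authentication scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The new settings would then take effect only after the agent is stopped and started, as noted above.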
For Agents
Guardrails are enforced at the infrastructure layer. Agents cannot:
Modify hook scripts (/opt/trinity/hooks/ is root-owned)
Edit ~/.claude/settings.json (protected path)
Exceed --max-turns limits
Bypass protections with --dangerously-skip-permissions (hooks still fire)
When a tool call is blocked, the agent receives a structured error, so it can acknowledge the denial and try an alternative approach.