Rulepacks

Rulepacks are YAML files that define the rules AISentinel uses to audit candidates.

Structure

A rulepack is a YAML file with a top-level rules key containing a list of rule objects.

Each rule has the following fields:

name: A unique identifier for the rule (string)
when: The condition that triggers the rule (string)
action: What to do when the condition is met: block, warn, require_approval, suggest_alternative, redact_output, quarantine, escalate, or auto_fix
message: A human-readable message explaining the violation
severity: The severity level: low, medium, or high
phase: When to check the rule: pre (before execution), post (after execution), or final (on final output)
tags: Optional list of tags for categorization, e.g., ["pii", "high"]
recommendation: Optional suggested remediation steps (string, supports template variables)
remediation_config: Optional configuration for automated remediation
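
As a rough illustration of the field constraints listed above, the following Python sketch checks a rule dictionary against the allowed values. The function name validate_rule and the choice to treat name, when, action, and message as required are assumptions made for this example; this is not AISentinel's own validation logic.

```python
# Hypothetical validator. The allowed values are taken from the field list
# above; the required/optional split is an assumption for illustration.
ALLOWED_ACTIONS = {
    "block", "warn", "require_approval", "suggest_alternative",
    "redact_output", "quarantine", "escalate", "auto_fix",
}
ALLOWED_SEVERITIES = {"low", "medium", "high"}
ALLOWED_PHASES = {"pre", "post", "final"}


def validate_rule(rule):
    """Return a list of problems found in a rule; an empty list means none."""
    problems = []
    # Assumed to be required, non-empty strings.
    for field in ("name", "when", "action", "message"):
        value = rule.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or non-string field: {field}")
    # Enumerated fields must use one of the documented values when present.
    checks = (
        ("action", ALLOWED_ACTIONS),
        ("severity", ALLOWED_SEVERITIES),
        ("phase", ALLOWED_PHASES),
    )
    for field, allowed in checks:
        value = rule.get(field)
        if isinstance(value, str) and value not in allowed:
            problems.append(f"unknown {field}: {value}")
    # tags is an optional list of strings.
    tags = rule.get("tags", [])
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        problems.append("tags must be a list of strings")
    return problems
```

Running a check like this over each entry under the top-level rules key before loading a rulepack is one way to catch typos such as an unknown action or phase early.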

Conditions

The when field is a string that specifies the condition using a simple DSL:

[not] dotpath op [value]

not: Optional negation
dotpath: Path to the field in the candidate object, e.g., args.url, estimate.risk
op: Operator, e.g., equals, regex, contains, or not_regex (same as not ... regex); to negate any other operator, prefix the condition with not. See Operators for the full list
value: The value to compare against; omitted for type checking operators such as is_empty

Examples:

args.url regex ^https://safe\.
not output regex \[source \d+\]
tool equals calc
estimate.cost > 1.0 (numeric comparison)
estimate.risk between 0.1,0.8 (range check)
args.query len_gt 100 (length check)
tool in calc,web,search (set membership)
args.email endswith @company.com (string operations)
args.data is_empty (type checking)
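
To make the shape of a condition concrete, here is a minimal Python sketch of how a single condition could be evaluated against a candidate. The helper names resolve_dotpath and evaluate_simple are hypothetical, only the equals, contains, and regex operators are covered, and treating a missing field as an empty string is an assumption rather than documented AISentinel behavior.

```python
import re


def resolve_dotpath(candidate, dotpath):
    """Walk a dotted path such as 'args.url' through nested dicts."""
    current = candidate
    for key in dotpath.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # assumption: a missing field resolves to None
        current = current[key]
    return current


def evaluate_simple(condition, candidate):
    """Evaluate '[not] dotpath op value' for equals, contains, and regex."""
    negate = condition.split(None, 1)[0] == "not"
    if negate:
        condition = condition.split(None, 1)[1]
    # Split into three parts only, so the value may contain spaces.
    dotpath, op, value = condition.split(None, 2)
    target = resolve_dotpath(candidate, dotpath)
    text = "" if target is None else str(target)
    if op == "equals":
        result = text == value
    elif op == "contains":
        result = value in text
    elif op == "regex":
        result = re.search(value, text) is not None
    else:
        raise ValueError(f"operator not covered by this sketch: {op}")
    # A leading 'not' flips the result.
    return result != negate


candidate = {
    "tool": "calc",
    "args": {"url": "https://safe.example.com"},
    "output": "see [source 1]",
}
print(evaluate_simple(r"args.url regex ^https://safe\.", candidate))
print(evaluate_simple(r"not output regex \[source \d+\]", candidate))
```

With this candidate, the first condition holds because the URL starts with https://safe., and the second does not hold because the output does contain a [source N] marker.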

Operators

String and Basic Operators

equals: Exact string match
contains: Substring check
icontains: Case-insensitive substring check
startswith: String starts with value
endswith: String ends with value
regex: Regular expression match
not_regex: Negated regex (same as not regex)

Numeric Operators

>, gt: Greater than
<, lt: Less than
>=, gte: Greater than or equal
<=, lte: Less than or equal
between: Range check (format: "min,max")

Length Operators

len_gt: Length greater than
len_lt: Length less than
len_gte: Length greater than or equal
len_lte: Length less than or equal
len_eq: Length equals

Set and List Operators

in: Value is in comma-separated list
not_in: Value is not in comma-separated list

Type Checking Operators

is_string: Target is a string
is_number: Target is numeric
is_list: Target is a list/tuple
is_empty: Target is empty or null
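
The numeric, length, set, and type checking operators listed above could be interpreted roughly as in the following Python sketch. The function apply_operator is hypothetical, and several details are assumptions made for illustration: between is treated as inclusive, list items are stripped of surrounding whitespace, and booleans are not counted as numbers.

```python
import operator

# Operator names come from the lists above; the exact semantics are assumed.
NUMERIC_OPS = {
    ">": operator.gt, "gt": operator.gt,
    "<": operator.lt, "lt": operator.lt,
    ">=": operator.ge, "gte": operator.ge,
    "<=": operator.le, "lte": operator.le,
}
LENGTH_OPS = {
    "len_gt": operator.gt, "len_lt": operator.lt,
    "len_gte": operator.ge, "len_lte": operator.le,
    "len_eq": operator.eq,
}


def apply_operator(target, op, value=None):
    """Apply one operator to an already-resolved target value."""
    if op in NUMERIC_OPS:
        return NUMERIC_OPS[op](float(target), float(value))
    if op == "between":
        low, high = (float(part) for part in value.split(","))
        return low <= float(target) <= high  # assumption: inclusive bounds
    if op in LENGTH_OPS:
        return LENGTH_OPS[op](len(target), int(value))
    if op in ("in", "not_in"):
        members = [item.strip() for item in value.split(",")]
        return (str(target) in members) == (op == "in")
    if op == "is_string":
        return isinstance(target, str)
    if op == "is_number":
        # assumption: True/False are not treated as numbers
        return isinstance(target, (int, float)) and not isinstance(target, bool)
    if op == "is_list":
        return isinstance(target, (list, tuple))
    if op == "is_empty":
        return target is None or (hasattr(target, "__len__") and len(target) == 0)
    raise ValueError(f"operator not covered by this sketch: {op}")
```

For example, apply_operator(0.5, "between", "0.1,0.8") is true under these assumptions, while apply_operator(0.9, "between", "0.1,0.8") is false.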

Logical Operators

AND: Logical AND (both conditions must be true)
OR: Logical OR (at least one condition must be true)
not: Logical NOT (negates the following condition)
(, ): Parentheses for grouping expressions

Complex Conditions

You can combine multiple conditions using logical operators:

# Multiple conditions with AND
when: 'tool equals web AND args.url contains malicious'

# Either/or conditions  
when: 'estimate.cost > 5.0 OR estimate.risk > 0.8'

# Complex nested conditions
when: '(tool equals email AND args.to endswith @external.com) OR tool equals api_call'

# Negation with complex conditions
when: 'not (tool in calc,search AND args.query len_gt 100)'
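
The combined conditions above can be read as a small boolean expression language. The Python sketch below shows one possible way to evaluate such expressions with a recursive-descent parser. It is not AISentinel's parser: the names tokenize and evaluate are hypothetical, AND is assumed to bind tighter than OR, and individual conditions are assumed not to contain parentheses or the standalone words AND, OR, or not. Each single condition is passed to a caller-supplied eval_leaf function.

```python
KEYWORDS = {"AND", "OR", "not", "(", ")"}


def tokenize(expression):
    """Split into keywords and leaf conditions such as 'tool equals web'."""
    words = expression.replace("(", " ( ").replace(")", " ) ").split()
    tokens, leaf = [], []
    for word in words:
        if word in KEYWORDS:
            if leaf:
                tokens.append(" ".join(leaf))
                leaf = []
            tokens.append(word)
        else:
            leaf.append(word)
    if leaf:
        tokens.append(" ".join(leaf))
    return tokens


def evaluate(expression, eval_leaf):
    """Evaluate an expression; eval_leaf decides each single condition."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # lowest precedence
        value = parse_and()
        while peek() == "OR":
            advance()
            right = parse_and()
            value = value or right
        return value

    def parse_and():
        value = parse_unary()
        while peek() == "AND":
            advance()
            right = parse_unary()
            value = value and right
        return value

    def parse_unary():  # 'not', parentheses, or a single condition
        if peek() == "not":
            advance()
            return not parse_unary()
        if peek() == "(":
            advance()
            value = parse_or()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return value
        if peek() is None or peek() in KEYWORDS:
            raise ValueError(f"unexpected token: {peek()!r}")
        return bool(eval_leaf(advance()))

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return result


# Stand-in leaf results, so the example runs without a real candidate.
truth = {
    "tool equals web": True,
    "estimate.risk > 0.8": False,
    "estimate.cost > 15.0": True,
}
expr = "(tool equals web AND estimate.risk > 0.8) OR estimate.cost > 15.0"
print(evaluate(expr, truth.__getitem__))  # prints True
```

Keeping the leaf evaluation behind a callback separates the boolean structure from the operator semantics, so the same parser works whichever set of operators the leaf evaluator supports.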

Advanced Examples

rules:
# Cost management
- name: expensive_operation
  when: 'estimate.cost > 10.0'
  action: block
  message: "Operation too expensive"
  severity: high

# Risk assessment  
- name: high_risk_approval
  when: 'estimate.risk between 0.7,1.0'
  action: require_approval
  message: "High risk operation needs approval"

# Content validation
- name: empty_query_block
  when: 'args.query is_empty'
  action: block
  message: "Empty queries not allowed"

# Tool allowlist
- name: allowed_tools_only
  when: 'tool not_in calc,web,search,email,slack_api'
  action: block
  message: "Tool not in approved list"

# Email security
- name: company_email_only
  when: 'tool equals email AND not args.to endswith @company.com'
  action: block
  message: "Only company emails allowed"

# Numeric validation for search results
- name: reasonable_search_count
  when: 'args.k > 10'
  action: warn
  message: "Large result count may be expensive"
  severity: low
  phase: pre

# Complex logical conditions
- name: high_risk_web_or_expensive
  when: '(tool equals web AND estimate.risk > 0.8) OR estimate.cost > 15.0'
  action: require_approval
  message: "High-risk web operation or very expensive operation"
  severity: high
  phase: pre

# Multi-tool security check
- name: sensitive_data_protection
  when: 'tool in email,api_call,slack_api AND (args.body icontains password OR args.data icontains ssn)'
  action: block
  message: "Sensitive data detected in external communication"
  severity: high
  phase: pre

# Length and content validation
- name: reasonable_code_length
  when: 'tool equals python_exec AND args.code len_gt 1000'
  action: require_approval
  message: "Large code blocks require approval"
  severity: medium
  phase: pre

# Domain and path restrictions
- name: restricted_domains
  when: 'tool equals web AND (args.url startswith https://internal. OR args.url contains /admin/)'
  action: block
  message: "Access to internal/admin resources blocked"
  severity: high
  phase: pre

Field Reference

Candidate Object

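The candidate object has these fields, as used in the examples in this document:

type: The kind of candidate, e.g., tool_call
tool: The name of the tool being called, e.g., web
args: The arguments passed to the tool (object), e.g., args.url
estimate: Machine-generated guidance about the candidate (object)
evidence_expect: Expected evidence (list of strings)

Estimate object

The estimate object provides machine-generated guidance about the candidate. Common fields:

cost: Estimated cost (number), e.g., 0.12
risk: Estimated risk (number), e.g., 0.05
steps_left: Remaining work, given either as an integer count (e.g. 3, meaning three discrete steps remaining) or as a fraction between 0.0 and 1.0 (e.g. 0.25, representing a normalized portion of remaining effort)
confidence: Confidence score (number), e.g., 0.85

Guidance: rules can reference estimate fields directly, for example estimate.steps_left < 1 or estimate.risk > 0.8.

Example candidate snippet:

{
  "type": "tool_call",
  "tool": "web",
  "args": {"url": "https://example.com/search?q=foo"},
  "estimate": {"cost": 0.12, "risk": 0.05, "steps_left": 2, "confidence": 0.85}
}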

evidence_expect (details)

The evidence_expect field is an advisory list of tokens that indicate what kind of evidence the candidate expects from a tool. It helps adapters, the LLM service, and rulepacks interpret and validate tool outputs.

Common tokens and their intended meaning:

results — Structured results array (for example search results). Use when the tool returns a list of items with metadata.
result — A single structured result object.
text — Plain textual output; no structured items expected.
citation — Expect citation or source metadata alongside textual output.

How evidence_expect is used:

Adapters parse and map tool outputs according to the tokens in evidence_expect so the governor can evaluate them.
Rulepacks can reference evidence_expect to require or check for certain evidence types (example: when: 'evidence_expect contains citation').
The field is advisory and optional; omit it or leave it empty when no particular evidence format is required.

Example candidate showing evidence_expect:

{
  "type": "tool_call",
  "tool": "web",
  "args": {"url": "https://example.com/search?q=foo"},
  "estimate": {},
  "evidence_expect": ["results"]
}

Actions

block: Prevent the candidate from executing
warn: Allow but log a warning
require_approval: Require manual approval before proceeding
suggest_alternative: Suggest a safer alternative approach
redact_output: Automatically redact sensitive information
quarantine: Allow execution but isolate the results
escalate: Route to human review for complex cases
auto_fix: Apply automated remediation

Phases

pre: Checked before tool execution
post: Checked after tool execution
final: Checked on the final output of the run

Remediation Features

AISentinel supports automated remediation for detected violations. Rules can include remediation guidance and configuration for automatic fixes.

Recommendation Field

The optional recommendation field provides human-readable guidance for addressing violations:

- name: pii_email_detection
  when: 'output regex [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
  action: warn
  message: "Email address found in output"
  severity: medium
  phase: final
  recommendation: "Email detected in output. Consider using user IDs or anonymized identifiers."

Recommendations support template variables that are populated with violation context:

{detected_email} - The actual email found
{tool} - The tool that triggered the violation
{severity} - The violation severity level
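
Template substitution of this kind can be sketched in Python with str.format_map. The function render_recommendation and the choice to leave unknown placeholders untouched are assumptions for this example; how AISentinel itself handles a missing variable is not stated here.

```python
class _KeepMissing(dict):
    """Dict that leaves unknown placeholders in place instead of failing."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_recommendation(template, context):
    """Fill {name} placeholders in a recommendation from violation context."""
    return template.format_map(_KeepMissing(context))


context = {"detected_email": "alice@example.com", "tool": "web", "severity": "medium"}
print(render_recommendation("Email {detected_email} found by {tool}.", context))
# prints: Email alice@example.com found by web.
```

Leaving unknown placeholders visible, rather than raising an error, makes a misspelled variable name easy to spot in the rendered text.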

Remediation Configuration

The optional remediation_config field enables automated remediation:

- name: pii_email_detection
  when: 'output regex [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
  action: warn
  message: "Email address found in output"
  severity: medium
  phase: final
  recommendation: "Email detected in output. Consider using user IDs or anonymized identifiers."
  remediation_config:
    auto_redact: true
    redaction_pattern: '[REDACTED_EMAIL]'
    requires_approval: false

Remediation Config Options:

auto_redact: Automatically apply redaction patterns (boolean)
redaction_pattern: Pattern to replace sensitive data (string)
requires_approval: Whether remediation needs manual approval (boolean)
auto_suggest: Provide alternative suggestions (boolean)
suggestion_type: Type of suggestion to provide (string)
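
As an illustration of how auto_redact and redaction_pattern might work together, the Python sketch below applies a rule's regex to an output string and replaces each match with the configured placeholder. The function apply_redaction is hypothetical. It assumes the rule's condition has the form 'output regex <pattern>', that redaction is skipped when requires_approval is true, and that [REDACTED] is used when no redaction_pattern is given; none of these details are confirmed AISentinel behavior.

```python
import re


def apply_redaction(output, rule):
    """Return output with matches of the rule's regex replaced, if enabled."""
    config = rule.get("remediation_config", {})
    if not config.get("auto_redact"):
        return output
    if config.get("requires_approval"):
        return output  # assumption: wait for manual approval before changing
    prefix = "output regex "
    when = rule["when"]
    if not when.startswith(prefix):
        return output  # this sketch only handles 'output regex <pattern>'
    pattern = when[len(prefix):]
    placeholder = config.get("redaction_pattern", "[REDACTED]")
    # A function replacement keeps the placeholder literal (no backreferences).
    return re.sub(pattern, lambda _match: placeholder, output)


rule = {
    "name": "pii_email_detection",
    "when": r"output regex [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "remediation_config": {
        "auto_redact": True,
        "redaction_pattern": "[REDACTED_EMAIL]",
        "requires_approval": False,
    },
}
print(apply_redaction("Contact alice@example.com today", rule))
# prints: Contact [REDACTED_EMAIL] today
```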

Built-in Remediation

AISentinel includes built-in remediation for common violation types:

PII Violations (pii_* rules):

Automatically redacts emails, phone numbers, and SSNs
Replaces sensitive data with [REDACTED_*] placeholders

Cost Violations (cost_* rules):

Suggests lower-cost alternatives
Provides optimization recommendations
May require approval for significant changes

Remediation in Reports

Remediation information appears in:

Security Audit Reports (SAR): HTML reports show remediation actions and results
Dashboard Reports: Interactive remediation panels with one-click fixes
API Responses: Remediation options via /api/remediation/options

Example Complete Rule

rules:
  - name: comprehensive_pii_protection
    when: 'output regex [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
    action: redact_output
    message: "Email addresses detected in output - automatically redacting"
    severity: medium
    phase: final
    tags: ["pii", "privacy", "automated"]
    recommendation: |
      Email address {detected_email} found in output.
      Automatically redacted with pattern: [REDACTED_EMAIL]
      Consider using anonymized user identifiers instead.
    remediation_config:
      auto_redact: true
      redaction_pattern: '[REDACTED_EMAIL]'
      requires_approval: false

The tags field is an optional list used for categorization; for example, ["pii", "high"] marks a PII-related rule with high risk.