Agentic AI Security, Part 4: Data Classification, Model Integrity, and Runtime Monitoring

Modules 10–12 tie together data handling (classification, PII redaction, log hygiene), model supply chain and weight verification, and runtime monitoring. This is the stack teams need when agents read documents, call tools, and leave rich audit trails.

Data Classification and PII Redaction: Never Let Sensitive Data Hit Logs

Original module on ClawQL Docs: Data Classification and PII Redaction: Never Let Sensitive Data Hit Logs.

Even with strong runtime protection and sandboxing (Modules 8–9), sensitive data inevitably flows through agent sessions, documents, and tool calls. This module explains how to prevent PII, financial data, and other sensitive information from ever reaching persistent log stores.

Classification vs Redaction

Data classification and redaction are distinct but complementary controls:

- Classification tells you what data is sensitive and how it should be handled.
- Redaction ensures sensitive data is removed or masked before it is written to any queryable or long-term storage.

Both are required. Classification without redaction leaves raw PII in logs. Redaction without classification leaves you unable to reason about your data holdings.

Organizations should maintain a formal data classification policy with tiers (Public, Internal, Confidential, Restricted) that maps to redaction rules.
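As a minimal sketch of how tiers can map to redaction rules, the Python below assigns each log field a tier and applies one rule per tier. The field catalog, the rule names (`keep`, `partial`, `token`), and the choice to treat unknown fields as Restricted are illustrative assumptions, not part of any specific policy.

```python
from enum import Enum


class Tier(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


# Hypothetical rule per tier; a real policy would define its own actions.
RULES = {
    Tier.PUBLIC: "keep",
    Tier.INTERNAL: "keep",
    Tier.CONFIDENTIAL: "partial",  # keep only the last 4 characters
    Tier.RESTRICTED: "token",      # replace the value with a typed token
}

# Hypothetical field catalog mapping log fields to tiers.
FIELD_TIERS = {
    "request_id": Tier.PUBLIC,
    "tool_name": Tier.INTERNAL,
    "account_number": Tier.CONFIDENTIAL,
    "ssn": Tier.RESTRICTED,
}


def partial_mask(value: str) -> str:
    """Mask everything except the last 4 characters; mask short values fully."""
    if len(value) <= 4:
        return "*" * len(value)
    return "*" * (len(value) - 4) + value[-4:]


def apply_policy(record: dict) -> dict:
    """Return a copy of the record that is safe to hand to the log pipeline."""
    out = {}
    for field, value in record.items():
        # Unclassified fields fail closed: they are treated as Restricted.
        tier = FIELD_TIERS.get(field, Tier.RESTRICTED)
        rule = RULES[tier]
        if rule == "keep":
            out[field] = value
        elif rule == "partial":
            out[field] = partial_mask(str(value))
        else:
            out[field] = f"[REDACTED_{field.upper()}]"
    return out
```

Failing closed on unclassified fields means a newly added field is redacted until someone classifies it, which keeps the policy and the logs from drifting apart silently.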

Presidio in the Fluent Bit Pipeline

Reference stacks often run Microsoft Presidio as a pipeline stage in Fluent Bit — not as per-pod sidecars.

Why pipeline-level redaction?

- One consistent redaction engine for all log sources.
- Fewer failure modes and surfaces to maintain.
- Redaction happens before logs reach Loki.

Presidio identifies and redacts PII (names, SSNs, credit cards, medical records, etc.) and financial data in real time as logs are collected.

Redaction-Before-Write for WORM Compliance

All security-relevant logs are written to WORM (write once, read many) storage. Because redaction occurs before the write:

- No raw sensitive data ever lands in persistent stores.
- WORM compliance is maintained without record deletion, which WORM storage by design does not allow.
- Forensic value is preserved: enough context remains for investigation while PII is removed.

Forensic-Friendly Logging Design

Redaction rules are tuned to balance privacy and usability:

- Entities are replaced with tokens (e.g., [REDACTED_SSN]) rather than removed outright.
- Context around redacted fields is retained where possible.
- Where unredacted data must be retained for incident response, it is kept outside the standard log stores and is available only through strict break-glass procedures with multi-party approval.
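The token-replacement and redaction-before-write ideas above can be sketched together in a few lines of Python. This is not Presidio's API: the two regular expressions are simplified stand-ins for its recognizers, and `AppendOnlyLog` is a hypothetical in-memory substitute for WORM storage, used only to show that the write path has no route around the redactor.

```python
import re

# Simplified, illustrative recognizers (assumptions, not Presidio's own).
PATTERNS = [
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def redact(line: str) -> str:
    """Replace each detected entity with a typed token, keeping the context."""
    for label, pattern in PATTERNS:
        line = pattern.sub(f"[REDACTED_{label}]", line)
    return line


class AppendOnlyLog:
    """Hypothetical stand-in for WORM storage: append and read, nothing else."""

    def __init__(self) -> None:
        self._records = []

    def append(self, raw_line: str) -> int:
        # Redaction happens inside the only write path, so a raw line
        # cannot reach the store by accident.
        self._records.append(redact(raw_line))
        return len(self._records) - 1

    def read(self, index: int) -> str:
        return self._records[index]
```

Appending `"user alice@example.com submitted SSN 123-45-6789 via tool=form_fill"` stores `"user [REDACTED_EMAIL] submitted SSN [REDACTED_SSN] via tool=form_fill"`: the entity types and the surrounding context survive for investigators, while the values do not.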

Key Takeaways

Redaction must happen before data reaches any persistent log store — never after. Pipeline-level Presidio integration provides consistent, maintainable coverage across the entire platform. Classification policy + redaction-before-write satisfies both privacy regulations and forensic requirements. This approach ensures sensitive data never becomes a liability in logs, even during full incident investigations.

Proper data handling completes the protection of information in motion and at rest, enabling safe monitoring and response in the following modules.

Next module: Model Integrity – Verifying Weights Before Inference.



Model Integrity: Verifying Weights Before Inference

Original module on ClawQL Docs: Model Integrity: Verifying Weights Before Inference.

Model weights represent one of the largest and most overlooked attack surfaces in AI platforms. Traditional container scanning misses them entirely because they are large binary blobs fetched at runtime. This module explains how an init-container verification pattern closes the “model-in-the-middle” attack vector with cryptographic verification before any inference begins.

The Model Weight Gap

Container images can be verified with Cosign and Kyverno, but model weights (Ollama models, Hugging Face checkpoints, custom fine-tunes) are typically downloaded directly and bypass image scanning. A poisoned weight file can contain backdoors that activate only during inference, exfiltrate data, or alter agent behavior.

Treat model weights with the same rigor as container images.

Init-Container Verification Pattern

Every inference or agent pod that loads model weights runs a mandatory init container that performs verification before the main container starts.

Core Verification Steps:

1. Cosign blob signature verification of the manifest.
2. SHA-256 hash validation of each weight file against the signed manifest.

The manifest is stored in Harbor alongside the weights.

Example Init Container:

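initContainers:
  - name: verify-weights
    # Pin this image by digest in production; a mutable tag weakens the trust chain.
    image: registry.internal.example/clawql/weight-verifier:latest
    command:
      - /bin/sh
      - -c
      - |
        # Exit on the first failure so a bad signature cannot be masked
        # by a later command that succeeds.
        set -eu
        # 1. Verify the manifest signature against the trusted public key.
        cosign verify-blob \
          --key /etc/signing-keys/cosign.pub \
          --signature /weights/manifest.sig \
          /weights/manifest.json
        # 2. Check each weight file against the hashes in the verified manifest.
        #    Assumes the manifest uses sha256sum's "<hash>  <path>" line format
        #    with paths relative to /weights; a JSON manifest needs parsing first.
        cd /weights
        sha256sum -c manifest.json
    volumeMounts:
      - name: model-weights
        mountPath: /weights
        # The verifier only reads the weights.
        readOnly: true
      - name: signing-keys
        mountPath: /etc/signing-keys
        readOnly: true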

The main inference container only starts if the init container succeeds.
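The hash-validation step can also be sketched in Python for the case where the manifest is a JSON document. The `{"files": {"<relative path>": "<hex digest>"}}` layout and the function names are assumptions made for illustration; the source does not specify a manifest schema. The sketch covers only the hash check and assumes the manifest's signature has already been verified.

```python
import hashlib
import hmac
import json
from pathlib import Path
from typing import List


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(weights_dir: Path, manifest_path: Path) -> List[str]:
    """Return the manifest entries that are missing or whose hash differs.

    An empty list means every listed file matched its expected digest.
    """
    # Hypothetical manifest layout: {"files": {"model.bin": "<sha256 hex>"}}
    manifest = json.loads(manifest_path.read_text())
    failures = []
    for rel_path, expected in manifest["files"].items():
        target = weights_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        actual = sha256_file(target)
        # Constant-time comparison of the two hex digests.
        if not hmac.compare_digest(actual, expected.lower()):
            failures.append(rel_path)
    return failures
```

An init container built around this would exit non-zero whenever the returned list is non-empty, which keeps the main container from starting.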

Harbor Manifest Storage

Signed manifests and weights are stored in Harbor:

- One unified trust root for images and models.
- Replication and scanning policies apply uniformly.
- Kyverno policies can extend verifyImages logic to model-related init containers.

Key Takeaways

Model weights must be verified on every pod start, not just on first download. The init-container pattern combined with Cosign + SHA-256 provides strong cryptographic assurance. Storing manifests in Harbor unifies supply chain controls for both containers and models. This control closes a critical gap that standard container security tools cannot address.

Model integrity ensures the AI brains running your agents are exactly the ones you authorized and have not been tampered with.

Next module: Runtime Monitoring and Observability – Falco, Wazuh, Prometheus, and Merkle Metrics.



Runtime Monitoring and Observability: Falco, Wazuh, Prometheus, and Merkle Metrics

Original module on ClawQL Docs: Runtime Monitoring and Observability: Falco, Wazuh, Prometheus, and Merkle Metrics.

Strong prevention and containment are incomplete without comprehensive visibility. This module covers the runtime monitoring stack, which provides deep observability into system behavior, detects anomalies, and correlates events across layers.

The Observability Stack

Reference architectures often deploy a full observability suite:

- Falco (eBPF): syscall-level monitoring and Kubernetes audit log integration. Detects suspicious activity such as unexpected shells, file modifications, or network connections inside containers.
- Wazuh: OSS SIEM for log correlation, rule-based alerting, vulnerability detection, and compliance reporting.
- Prometheus: metrics collection with custom exporters for Merkle root verification and Cuckoo filter health.
- Loki: log aggregation (receives only redacted logs from the Presidio pipeline).
- Tempo: distributed tracing for request flows through the intelligent MCP gateway.
- Kiali: Istio service mesh topology and traffic visualization.

Alert Tuning and Ownership

Wazuh and Falco generate high volumes of events by default. Runbooks should require:

- A named owner responsible for alert tuning.
- Tiered response (low-confidence → alert only; high-confidence → auto-quarantine via Talon).
- Regular tuning sessions to reduce noise while preserving signal.
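The tiered-response rule can be sketched as a small routing function. Everything specific here is an assumption for illustration: the numeric confidence score, the 0.9 threshold, the rule names, the owner names, and the extra safeguard that rules without a named owner never trigger automation. It does not call Falco, Wazuh, or Talon; it only returns the decision a responder would act on.

```python
from dataclasses import dataclass

# Hypothetical cut-off between "alert only" and "auto-quarantine".
QUARANTINE_THRESHOLD = 0.9

# Hypothetical ownership table: every tuned rule has a named owner.
RULE_OWNERS = {
    "shell_in_container": "runtime-security",
    "unexpected_outbound_connection": "network-security",
}


@dataclass(frozen=True)
class Alert:
    rule: str
    confidence: float  # assumed score in the range 0.0 to 1.0
    pod: str


@dataclass(frozen=True)
class Decision:
    action: str  # "alert_only" or "quarantine"
    owner: str


def route(alert: Alert) -> Decision:
    """Map an alert to a response tier and the owner who tunes its rule."""
    if not 0.0 <= alert.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    owner = RULE_OWNERS.get(alert.rule, "UNOWNED")
    # Assumed safeguard: a rule nobody owns has not been tuned, so it may
    # page a human but may not quarantine a workload on its own.
    if owner != "UNOWNED" and alert.confidence >= QUARANTINE_THRESHOLD:
        return Decision("quarantine", owner)
    return Decision("alert_only", owner)
```

Keeping the threshold and the ownership table as plain data makes them easy to review in the regular tuning sessions the runbook calls for.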

Node Pinning Strategy

Observability workloads are pinned to dedicated non-GPU nodes using node selectors together with taints and matching tolerations. This keeps monitoring overhead from affecting inference latency or competing for resources on the GPU nodes that agents need.

Merkle and Cuckoo Metrics

Custom Prometheus metrics expose:

- Merkle root verification success/failure rates.
- Cuckoo filter false-positive rates (critical for security paths).
- Audit trail completeness.

These metrics ensure cryptographic integrity is actively monitored, not assumed.
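To make the Merkle metric concrete, here is a minimal sketch of recomputing a Merkle root over a batch of audit entries and counting the outcome. The tree construction is an assumption, since the source does not specify one: it borrows the leaf and interior-node prefixes (0x00 and 0x01) from RFC 6962 but uses a simpler tree shape that carries an unpaired node up unchanged. `VerificationStats` is a hypothetical plain-Python stand-in for the counters a Prometheus exporter would publish.

```python
import hashlib
import hmac
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Compute a Merkle root with domain-separated leaf and node hashes."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Prefix 0x00 for leaves and 0x01 for interior nodes, so a leaf can
    # never be reinterpreted as an interior node.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            nxt.append(level[-1])  # carry the unpaired node up unchanged
        level = nxt
    return level[0]


class VerificationStats:
    """Hypothetical stand-in for success/failure counters in an exporter."""

    def __init__(self) -> None:
        self.success = 0
        self.failure = 0

    def record(self, ok: bool) -> None:
        if ok:
            self.success += 1
        else:
            self.failure += 1

    def failure_rate(self) -> float:
        total = self.success + self.failure
        return self.failure / total if total else 0.0


def verify_batch(entries: List[bytes], expected_root: bytes,
                 stats: VerificationStats) -> bool:
    """Recompute the root, compare in constant time, and count the result."""
    ok = hmac.compare_digest(merkle_root(entries), expected_root)
    stats.record(ok)
    return ok
```

Alerting on a non-zero failure rate, rather than only logging failures, is what turns the integrity check into something that is monitored instead of assumed.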

Key Takeaways

Runtime monitoring turns the platform into a sensor that detects compromise early. Layered tools (Falco for low-level, Wazuh for correlation, Prometheus for metrics) provide comprehensive coverage with different strengths. Alert tuning and node pinning are operational requirements, not optional. Merkle and Cuckoo metrics bring cryptographic controls into day-to-day observability.

Effective monitoring enables the automated response and containment covered in the next module.

Next module: Automated Response and Containment – Falco + Talon Quarantine, Panguard Blocking.


Canonical curriculum and module-by-module versions: Agentic AI Security Curriculum (ClawQL Docs)