Skip to main content
Sandworm AI Security

Security for the AI you ship.

Prompt and output scanning, jailbreak detection, an agent firewall, guardrails, an AI-BOM, and red-teaming — defend the AI applications you build and run.

AI / LLM Security
Prompt scanningOutput scanningJailbreak detectionAgent firewallAI-BOMRed-team suite

What Sandworm AI Security does.

Prompt & output scanning

Inspect every prompt and every model response at runtime. Detect injected instructions, sensitive data in outputs, and policy violations before they reach end users.

Jailbreak detection

Classify and block known jailbreak patterns in real time using a continuously updated taxonomy of techniques — direct, indirect, multi-turn, and roleplay-based.

Agent firewall

Gate the tools and APIs your AI agents can call. Enforce least-privilege at the agentic layer where standard perimeter controls do not reach — before a tool call executes.

Guardrail authoring & policy simulator

Write content and safety policies in config, replay a corpus of historical traffic against them, and validate changes before shipping — without touching model weights.

AI-BOM & model lineage

Maintain a signed inventory of every model, fine-tune, adapter, and embedding in your stack — the AI equivalent of an SBOM for compliance, audit, and incident response.

Red-team suite & scheduled eval runs

Run structured adversarial campaigns against your own models on a schedule. Surface regressions in jailbreak resistance and safety before they reach production or an external researcher.

How Sandworm AI Security works

  1. 1

    Intercept at the LLM gateway

    Traffic from your application passes through Sandworm AI Security's inline scanning layer before reaching the model API. No SDK changes are required — configure your LLM gateway endpoint and scanning begins immediately.

  2. 2

    Scan prompts and outputs against your policy

    Each request and response is evaluated against your configured guardrail policies: injection patterns, sensitive-data classifiers, topic blocklists, and jailbreak signatures. Violations are blocked or flagged based on the action you configure per rule.

  3. 3

    Gate agent tool calls via the agent firewall

    For agentic workflows, every tool call is checked against an allowlist before execution. Calls outside the permitted scope are denied and logged, limiting blast radius if a prompt-injection attack succeeds.

  4. 4

    Log, alert, and maintain the AI-BOM

    Every scan, violation, and tool-call decision is written to a tamper-evident log. Model versions, adapters, and embeddings are automatically tracked in a signed AI-BOM you can export for audits.

Built for…

Teams shipping LLM features to production

Add runtime guardrails and scanning without slowing the release cadence your team already has. Drop in the gateway endpoint and go.

AI governance and model-risk review

Give compliance and legal a signed AI-BOM and a policy audit trail they can hand to auditors, regulators, or enterprise procurement reviewers.

Security teams responsible for agentic pipelines

Agentic AI can call APIs, read files, and send messages. The agent firewall enforces a least-privilege boundary so a compromised prompt cannot pivot across your environment.

Red-teaming your own models before someone else does

Run structured adversarial campaigns on a schedule and track coverage against known jailbreak taxonomies. Find your own regressions before an external researcher does.

Integrations

  • LLM gateways
  • OpenAI
  • Anthropic
  • Azure OpenAI
  • App SDKs
  • LangChain
  • LlamaIndex
  • REST / HTTP proxy

Frequently asked questions

How does Sandworm AI Security integrate with my existing LLM application?

Sandworm AI Security runs as a Docker container acting as an inline proxy between your application and the model API. Change one environment variable — swap the endpoint URL from the model provider to the Sandworm AI Security gateway — and scanning begins. No SDK changes are required for prompt and output scanning. Agentic pipelines require a lightweight hook into your framework's tool-dispatch layer to enable the agent firewall.

Does Sandworm AI Security log my prompts or model outputs?

By default, Sandworm AI Security records structured violation metadata — rule name, matched classifier, action taken — but not the raw prompt or response text. Full content logging is an opt-in setting and, when enabled, stores data within your own deployment boundary. Sandworm has no access to your prompt or response content at any tier.

What does Sandworm AI Security cost?

Sandworm AI Security is available in the Sandworm platform bundles and as a standalone add-on. See /pricing for current bundle and à-la-carte rates. There is no per-token or per-request surcharge layered on top of what your model provider already charges you.

Why not rely on the safety filters the model API already provides?

Provider safety filters are inside the model and not configurable: you cannot adjust policy rules, change the action from block to flag, add custom classifiers, or export an audit log you own. Sandworm AI Security runs outside the model and gives you all of that. It also covers multi-model stacks and self-hosted deployments where provider filters don't exist.

Does jailbreak detection cover indirect prompt injection?

Yes — indirect injection is a first-class detection category. The scanner inspects content retrieved from external sources (documents, web pages, tool output) that is inserted into the context window, not just the user's direct message. Agentic pipelines that retrieve-then-reason are the primary threat model here.

What's next for Sandworm AI Security?

The Mendicant AI engine — currently in development; frontier models handle production today — will add adaptive jailbreak classification that continuously learns from traffic patterns specific to your deployment. Additional planned work includes multi-modal content scanning for vision-enabled models and a self-service red-team corpus builder to help teams harden custom fine-tunes against known attack categories.

The rest of the platform

Also in Sandworm.

CNAPP

CloudGuard

Cloud-native application protection across AWS, Azure, and GCP.

See CloudGuard
SIEM

Sandworm SIEM

Security information and event management with real-time correlation.

See Sandworm SIEM
NGFW

Stillsuit

Packet filter · stateful · NGFW · WAF · IPS — one engine

See Stillsuit

Defend your AI apps.

Add runtime protection, policy guardrails, and an AI-BOM to the models you already run.