CompFly AI | The Trust Control Plane for Autonomous Agents

TL;DR

What it does: Sends adversarial prompts at your AI agent (prompt injections, tool misuse, jailbreaks, data exfiltration) and tells you exactly where it breaks.
Who it's for: Teams shipping agents with tools, memory, or multi-turn conversations who need to know their attack surface before production.
Killer features: Multi-turn escalation attacks that mimic real adversaries, agent-specific scenario generation that targets your agent's actual tools and integrations, and scenario import so you can bring your own attack datasets.
Try it: docker compose up -d + a few curl commands. Full OWASP Agentic AI Top 10 coverage in under 2 minutes.

The Problem

Your DevSecOps pipeline has a blind spot.

Traditional application security is well-understood: SAST catches insecure code patterns, SCA flags vulnerable dependencies, DAST probes runtime endpoints, and policy gates block bad builds from shipping. For conventional software, this works.

AI agents break every assumption this pipeline is built on.

An agent's security flaws don't live in the code. They emerge from the interaction between a model, its tools, its memory, and the prompts it receives. Static analysis can't catch a prompt injection that tricks an agent into calling an internal API it shouldn't. Dependency scanning can't detect that an agent will leak PII from its knowledge base when asked the right way. DAST can't simulate a multi-turn conversation that builds rapport before attempting data exfiltration.

This creates real pain points that teams are hitting right now:

Prompt injection isn't a CVE. Security teams have mature tooling for code vulnerabilities. They have almost nothing for adversarial prompt attacks against deployed agents.
Agents are over-privileged by default. They get access to tools, APIs, and data stores with broad permissions because "just give it access" is faster than defining boundaries. Nobody tests whether the agent can be tricked into misusing that access.
Model updates silently change behavior. A provider updates the underlying model, and your agent's safety characteristics shift without any code change on your side. There's no regression test for this.
Traditional security scans produce noise, not signal. SAST/DAST findings for AI agent logic are largely irrelevant. The real risks - tool misuse, data leakage through conversation, multi-turn escalation - don't show up in conventional scans.
Nobody owns agent safety. Is it the ML team? AppSec? DevOps? Most organizations haven't figured this out, so it falls through the cracks.

The gap is specific: between your CI security scans and your production monitoring, there's no layer that tests the agent system - the full combination of model, tools, memory, and conversation flow - against the attacks it will actually face.

Crosswind is that missing layer.

What We're Launching

Crosswind is an open-source security evaluation platform that tests AI agents against adversarial scenarios and produces actionable reports.

Core capabilities:

Evaluate any agent over HTTP, WebSocket, Google A2A, or MCP—no SDK integration required
Run multi-turn adversarial conversations that escalate attacks across turns, mimicking real adversary behavior
Generate agent-specific attack scenarios based on your agent's registered tools, memory configuration, and uploaded context documents
Import your own scenarios in a structured JSON format—bring custom attack datasets, internal red team prompts, or compliance-specific test cases and run them through the same evaluation and judgment pipeline
Score with a tiered judgment pipeline that routes easy cases through keyword detection and reserves expensive LLM judges for ambiguous responses
Map results to compliance frameworks (OWASP Agentic AI Top 10, EU AI Act, NIST AI RMF) automatically
Produce self-contained HTML reports with per-category breakdowns, severity analysis, and prioritized remediation steps

Architecture

Crosswind API

Orchestration & Auth

▶

▼

MongoDB

State

▶

Redis

Queue

Job Distribution

▶

Python

Eval Worker

Attack Logic

▼

Target

Your Agent

Mongo

State

Target

Agent

Three components, cleanly separated:

API server (Go): REST API for agent registration, eval orchestration, report serving. Handles credential encryption (AES-256) and job queuing through Redis.
Eval Worker (Python): Picks jobs off the Redis queue. Manages protocol adapters, rate limiting, the judgment pipeline, multi-turn session state, and crash recovery via checkpointing. Writes results to MongoDB and an analytics backend (DuckDB locally, ClickHouse for production).
Context Processor (Python): Extracts text from uploaded documents (PDFs, spreadsheets) using Docling. Feeds into the scenario generator so attack scenarios can reference your agent's actual domain knowledge.

Why This Is Different: Three Design Choices

1. Multi-turn attacks, not just single prompts

Most eval tools send a prompt, get a response, and score it. Real attackers don't work that way. They build rapport, probe boundaries, and escalate over multiple turns.

Crosswind's multi-turn evaluator runs adversarial conversations of up to 5 turns. Each turn is evaluated independently for the agent's stance (refused, deflected, partial compliance, full compliance) and attack success. Follow-up strategies are chosen adaptively - persist, escalate, reframe, build rapport, or exploit an opening - based on how the agent responded.

2. Tiered judgment pipeline

Running every response through a large LLM judge is slow and expensive. Crosswind's judgment pipeline has four tiers:

Keyword detection: Regex patterns catch obvious refusals and compliance. Confidence threshold: 0.98.
Embedding similarity (optional): Fast semantic matching for borderline cases.
Fast LLM judge (gpt-4o-mini): Compact prompt, confidence threshold 0.85.
Accurate LLM judge (gpt-5.2): Detailed prompt with extensive examples. Final fallback.

3. Agent-specific scenario generation

Generic red team datasets test generic attacks. Your agent's real attack surface is specific: the tools it has access to, the data it can reach, the policies it should follow. Crosswind acts as automated threat modeling, generating attack prompts that target your specific surfaces.

Where Crosswind Fits in Your Pipeline

CI: Quick Eval Gate

Run 60 curated tasks on every PR. Fail build if security score drops.

Staging: Custom Scenarios

Run tasks targeting specific tools and integrations.

Pre-Prod: Deep Audit

Run extensive tasks for compliance evidence (OWASP, EU AI Act).

Production: Continuous

Periodic re-evals to catch regressions from model updates.

Get Involved

Crosswind is Apache 2.0 licensed. Contributions, bug reports, and feedback are welcome.

View on GitHub

If you're shipping an AI agent to production, run Crosswind against it first. You'll learn something.

Introducing Crosswind: Security Evaluation for AI Agents