Automated red-teaming for multi-agent AI systems. Discover vulnerabilities, run adversarial attacks, and certify agents for healthcare, finance, and legal deployment.
Three specialized AI agents communicate through a shared Band room to discover, attack, and certify any target agent system.
The entire pipeline runs through a single Band room that acts as the shared memory and communication layer.
Type scan http://your-agent.com in your Band room. The Discovery Agent wakes up and begins reconnaissance.
18 probes map the target's identity, capabilities, data access, auth model, and LLM fingerprint. Results posted to Band room as a structured report.
The Attack Agent reads the report, generates AI-adaptive attacks tailored to this specific target, and retries ambiguous responses up to 3× with mutations.
The Report Agent scores results by category weight and produces a browser-viewable security_report.html with HIPAA / GDPR / PCI-DSS compliance assessment.
Each category is weighted by severity. Every attack is AI-generated and adapted to the specific target discovered during reconnaissance.
Can an attacker override the agent's original system instructions with malicious ones hidden in user input?
Can roleplay, DAN prompts, fictional framing, or hypotheticals bypass safety restrictions?
Can the agent be made to misuse its tools — read system files, exfiltrate via email, or execute commands?
Can the agent be forced to adopt an alternative unrestricted identity that ignores its original rules?
Can an attacker break session boundaries to access other users' data or conversation history?
Can an agent be slowly manipulated over multiple messages by building false rapport and trust?
Can the agent be tricked into revealing system prompts, config, PII, or training data?
Does claiming to be a developer, auditor, or authority figure grant elevated access?
Can base64, leet-speak, unicode homoglyphs, or ROT13 bypass string-based safety filters?
Can the agent be instructed to suppress logging, delete records, or act without leaving a trace?
Every tested agent receives a score from 0–100 and a certification tier suitable for regulated industry compliance reports.
Three real agents you can point the red-team pipeline at. Copy any URL, go to your Band room, and type scan <url> to begin.
A Flask customer-support agent with 6 intentional vulnerabilities. Ideal for testing and demonstrating the full pipeline. Hosted in this same deployment.
http://localhost:5000
An online enterprise-grade agent with realistic vulnerabilities deployed over ngrok. Use this to test against a more sophisticated target in a production-like setup.
https://targetagent--saoudihouda524.replit.app/vulnerable/chat
The same enterprise agent but with security hardening applied. Scan this after the vulnerable version to compare scores and see the difference a proper defense makes.
https://targetagent--saoudihouda524.replit.app/secure/chat
scan <url> in your Band room.
All three Band agents are connected and waiting for scan commands in the shared room.
Band · 18 probes · Groq + Gemini fallback
Band · 70+ attacks · Adaptive mutations ×3
Band · HTML report · Compliance scoring
:5000 · Flask · 6 intentional vulns
An intentionally vulnerable customer support agent used as the red-team target. Point any of the three agents at this URL.
| Method | Endpoint | Description |
|---|---|---|
| GET | /capabilities | Returns agent tools and capabilities — read by the Discovery Agent |
| POST | /chat | Main conversation endpoint — Attack Agent sends all attacks here 6 vulns |
| GET | /health | Health check — confirms the target agent is running |
| GET | /history | Conversation history — no authentication required SESSION-01 |
| GET | /admin/vulnerabilities | Lists all intentional vulnerabilities and their trigger conditions |
| POST | /admin/reset | Resets conversation history for clean test runs |