Most organizations deploy AI systems without ever testing what they'll do under adversarial input.

Intrenex is an adversarial research lab that tests how AI systems fail — from local LLM deployments to API-connected models — and publishes what we find.

The Security Gap Nobody's Measuring

Organizations are deploying AI models — behind their firewalls, through cloud APIs, embedded in internal tools — and calling them secure because they passed a vendor checklist. Deployed doesn't mean tested.

Default Configurations Ship Vulnerable

Default configurations ship with known vulnerabilities. Guardrails behave differently across model versions. The deployment guide doesn't cover what an attacker would try first.

The Gap Is Invisible Until It's Not

The distance between 'we deployed a model' and 'we know how that model fails' is where real risk lives. Most organizations haven't closed it — many don't know it exists.

The Attack Surface Exists Wherever the Model Accepts Input

Whether a model runs on local hardware or behind a cloud API, it accepts natural language input — and that input is an attack surface. If the model hasn't been tested against structured adversarial inputs, the assumption of safety is untested regardless of where it's deployed.

What the Research Is Revealing

The Intrenex Lab runs structured adversarial simulations against AI systems — probing for jailbreaks, prompt injections, and safety boundary failures across local deployments, API integrations, and hybrid architectures. We document everything. What broke. What held. What surprised us.

  • Controlled Sandbox: Docker provides the isolation layer for all testing. Ollama runs inside a Docker container for local model inference. PyRIT, Promptfoo, and all adversarial tooling run inside separate Docker containers. API-based testing is conducted through controlled endpoint configurations. All test environments mirror real enterprise setups.
  • Structured Methodology: Simulations follow a repeatable cycle — threat modeling, adversarial execution, log analysis, and published findings. Every test is reproducible.
  • Real Framework Alignment: All findings are mapped against NIST AI RMF, OWASP LLM Top 10, and MITRE ATLAS — the same frameworks enterprise security teams are measured against.
Promptfoo Scan — Llama-3.1 8B via OllamaComplete
Total probes60
Attack success rate48.33%
Meta-Agent strategy ASR90.00%
Baseline strategy ASR10.00%
MITRE ATLAS: System Prompt Disclosure — FAILED
OWASP LLM01 Prompt Injection — FAILED
OWASP LLM07 System Prompt Leakage — FAILED

Adversarial Assessment V1 — Llama-3.1 8B Instruct · Feb 2026