Prompt injection
Direct and indirect, including multi-turn jailbreaks and payloads buried in the documents, web pages, and tool output your agent quietly trusts.
Redproof attacks your LLM and agent products the way a real adversary would, then hands you the evidence your EU AI Act self-assessment needs. Far better you find the holes now than a regulator or a customer's procurement team finds them later.
Most of the EU AI Act becomes operative for high-risk and general-purpose AI systems. By then adversarial testing is no longer a nice-to-have. Procurement teams in regulated sectors already ask for a red-team report before they sign anything. Around 3,200 Dutch businesses fall directly in scope, and the work itself takes weeks. The real risk is leaving it too late.
The base model is rarely the weak point. The real gaps sit in your prompts, your retrieval pipeline, and the tools your agent is allowed to call. That is where we go looking.
Direct and indirect, including multi-turn jailbreaks and payloads buried in the documents, web pages, and tool output your agent quietly trusts.
Coaxing the system into leaking secrets, training data, internal errors, or other users' information.
Unsafe or unescaped output — XSS, SSRF, leaked stack traces — that the app around the model trusts and renders.
Getting an agent to call APIs, move money, or take actions it was never meant to take.
Extracting your system prompt and the tool schema that is supposed to protect it.
The exploits specific to your product — your pricing, your workflow, your data boundaries — beyond the OWASP checklist.
We agree the target system, the threat model, and the rules of engagement, all in writing.
Automated breadth, then manual depth where the real findings hide.
Each finding ranked by severity, with a working proof of exploit.
Plain-language findings mapped to OWASP LLM and the relevant EU AI Act articles.
You patch, we re-run, and your evidence shows the fix held.
Most teams start with a Full Engagement, then move to a quarterly retainer as the product changes.
Enterprise vendors start around €15k for one engagement, and a junior often does the actual testing. With Redproof the person who understands your system is the person testing it. Priced for the stage you are at, not theirs.
Redproof is the practice of a production AI engineer who builds and evaluates large models for a living. Most security shops point a tool at your endpoint and email you the printout. Redproof works the way an attacker actually does, because building and breaking these systems is the day job here, not a side service.
The person who scopes your engagement is the person who runs the attacks and writes the report. No handoff to a junior, no account manager in the middle. As the work grows, Redproof brings in vetted specialists for the larger jobs, but the standard holds on every test: senior hands, start to finish.
Start with a short call to scope the work. You get a fixed quote within a day, and findings in your hands a couple of weeks later.