How To Build AI Red Teams That Actually Work

Jeff Pollard, VP, Principal Analyst

Sep 30 2025

Generative AI is everywhere. It’s in your customer support workflows, embedded in your analytics dashboards, and quietly powering your internal tools. But while the business rushes to deploy, security teams are left trying to secure systems they didn’t design, didn’t know about, and can’t easily test. That’s where AI red teaming comes in.

AI red teaming blends offensive security tactics with safety evaluations for bias, toxicity, and reputational harm. It’s messy, fragmented and, most of all, necessary. Why? Because multimodal attacks are already here. GenAI now spans text, image, video, and audio. That means new attack vectors. If your red team isn’t testing multimodal inputs, you’re leaving gaps. Visual content can bypass filters, inject payloads, or trigger unintended behaviors.

Red Teaming Satisfies Stakeholders And Protects GenAI Investments

AI red teaming supports more than security. It delivers governance, compliance, and customer trust. AI red teaming should uncover security issues and bias, fairness, and privacy problems. This also helps meet GDPR and EU AI Act requirements. Use the following to get started on an AI red team that actually works:

AI red teaming is more than prompt bombing. Spamming prompts is a tactic, not a strategy. The real value comes from using AI against AI via “agentic red teaming.” Agentic red teaming uses adaptive multiflow agents that mimic adversarial behavior to uncover systemic weaknesses. These bot battles test more than the model and the prompt. They can assess the application stack: infrastructure, APIs, the SDLC, and everything in between.
Red-team before (and after) the system is fully built. You won’t always have a fully built system to test. That’s OK. Premature red teaming on prototypes will surface critical issues and help you build internal momentum. Jailbreaking a proof-of-concept agent might not give you a full risk profile, but it can spotlight systemic flaws and justify deeper investment.
Threat models must match the application context. A chatbot, a drug discovery engine, and a help desk tool may all use generative AI, but they don’t share the same risk profile. Threat modeling must reflect the specific use case.
Infrastructure still matters. Prompt jailbreaking grabs headlines. But attackers still target infrastructure, APIs, and CI/CD pipelines. These components often go untested due to cost constraints. That’s a mistake. You must assess the full stack. As one interviewee put it, “replace the word ‘AI’ with any software, and you would assess these controls.”
Shift to probabilistic risk modeling. AI is inconsistent — a prompt can succeed today and fail tomorrow. You need probabilistic testing. Run prompts multiple times, track success rates, and report risk as a probability. This is an enormous shift from the old “found it, fix it” mentality with traditional penetration testing.
Tie red teaming to revenue. Security leaders often struggle to show business value. AI red teaming is a clear opportunity. Preventing embarrassment protects brand reputation. Customers want safety reports. Regulators demand governance. AI red teaming delivers all of these outcomes. Use it to prove your value.

Red Teaming Costs Vary Widely — Read The Full Report To Get The Most For The Money

Expect to pay from $25,000 for basic automated testing to $200,000 for full stack assessments. Scope, scale, and methodology drive pricing. Incomplete testing leaves blind spots. Don’t cheap out. But also, don’t engage in AI red teaming without being prepared. We can help! For a complete playbook on structuring AI red team engagements, selecting vendors, and aligning testing with business goals, read Use AI Red Teaming To Evaluate The Security Posture Of AI-Enabled Applications.

Come To Security & Risk Summit 2025

Our Security & Risk Summit runs November 5–7 in Austin, Texas. I’ll be delivering a session about “Demystifying AI Red Teaming” in the application security track, starting at 2:35 p.m. Central Time on November 6. See you there!

To discuss our recommendations further, reach out to schedule a guidance session.

Get The Insights At Work Newsletter

Country*

Yes, I’d like to receive Forrester’s Insights At Work newsletter and receive occasional survey invitations and marketing communications.

Thanks for signing up.

Stay tuned for updates from the Forrester blogs.

Red Teaming Satisfies Stakeholders And Protects GenAI Investments

Red Teaming Costs Vary Widely — Read The Full Report To Get The Most For The Money

Come To Security & Risk Summit 2025

Categories

Get The Insights At Work Newsletter

Thanks for signing up.

Top 10 Emerging Technologies Fueling Growth

Discover the top technologies that will reshape industries. Hear Forrester’s Brian Hopkins on ROI, use cases, and strategies for quick wins and lasting growth.

Introducing Forrester’s ServiceNow Services Landscape: Partner Selection Becomes More Complex With Enterprise Workflows and Agentic Transformation

The Real Deal: A Black Friday-Inspired RFP Template For Vetting AI SaaS Vendors

Get The Insights At Work Newsletter

Thanks for signing up.