On March 24, 2026, a supply chain attack hit LiteLLM, one of the most widely used open-source AI gateways in production. A threat group called TeamPCP pulled off a cascading CI/CD compromise that began with Aqua Security's Trivy scanner, stole a PyPI publishing token, and pushed two malicious package versions containing a multi-stage credential stealer.
The malware harvested SSH keys, cloud credentials, Kubernetes secrets, and .env files from every system that installed the compromised versions during a roughly five-hour window. According to Wiz, LiteLLM is present in 36% of cloud environments. The incident was assigned CVE-2026-33634, with a CVSS score of 9.4.
This article isn't a post-mortem. It's about the question the incident forced every engineering team to ask: what should you actually look for in an AI gateway in 2026?
Why your gateway choice is a security decision
An AI gateway sits between your application and every LLM provider you use. It holds credentials for each provider, routes every request, and often sees every prompt and response.
The LiteLLM incident made the implication explicit: your gateway is one of the highest-value targets in your infrastructure. How it's deployed — self-hosted package, managed service, edge proxy — determines your attack surface.
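Before comparing products, it helps to see what "routing" means mechanically. The sketch below is a gateway in miniature: provider credentials live in one table on the gateway's side, and requests fail over down a priority list. Every name, key, and endpoint here is a placeholder, not any specific product's API.

```python
from typing import Callable

# Illustrative provider table -- in a real gateway these keys live server-side,
# never in the application's environment. Names and endpoints are placeholders.
PROVIDERS: dict[str, dict] = {
    "primary":  {"key": "sk-...",  "endpoint": "https://api.primary.example/v1"},
    "fallback": {"key": "sk2-...", "endpoint": "https://api.fallback.example/v1"},
}

def route(prompt: str, call: Callable[[str, dict], str],
          order: tuple[str, ...] = ("primary", "fallback")) -> str:
    """Try providers in priority order, failing over on any provider error."""
    last_err: Exception | None = None
    for name in order:
        try:
            return call(prompt, PROVIDERS[name])
        except Exception as err:  # real gateways classify retryable vs. fatal errors
            last_err = err
    raise RuntimeError("all providers failed") from last_err
```

A production gateway layers retries, health checks, rate limits, and logging on top of this loop — which is exactly why it ends up holding so many credentials, and why it is such a valuable target.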
We evaluated six gateways across security, observability, routing, platform breadth, ease of adoption, and governance.
1. Respan — full-stack LLM engineering platform
Type: Managed cloud
Respan bundles an AI gateway with observability, evaluations, prompt management, and optimization in one platform. The gateway is managed infrastructure — not a package you install. Provider credentials live on Respan's side, never in your environment variables or CI/CD secrets. The class of supply chain attack that hit LiteLLM doesn't apply here.
With 300+ models across all major providers, Respan matches the breadth of marketplaces like OpenRouter — but at lower cost and with built-in observability and evals that OpenRouter doesn't offer. Built-in evals test model outputs before deployment. Prompt management provides versioning, A/B testing, and optimization. Cost tracking gives engineering and finance shared visibility. SOC 2 compliant, with a free tier available.
- Strengths: No supply chain risk; 300+ models at lower cost than aggregators; 99.99% uptime SLA; unlimited rate limits; only option combining gateway + observability + evals + prompt optimization; SOC 2; provider credentials never touch your infra; free tier.
- Limitations: Managed-only (no self-hosted option); newest models from smaller providers and open-source ecosystems may appear on OpenRouter first.
- Best for: Teams that want one platform for the full LLM lifecycle without assembling separate tools.
2. OpenRouter — managed LLM marketplace
Type: Managed API aggregator
OpenRouter gives you one API key and credit balance for 300+ models. No infrastructure, no provider accounts. Provider pricing is passed through with a 5.5% fee on credit purchases. Automatic fallback handles provider outages.
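The single-endpoint model is simple enough to show with the standard library alone. The path below is OpenRouter's OpenAI-compatible chat completions route; the model slug and API key are placeholders.

```python
import json
import urllib.request

def openrouter_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build one chat-completion request; switching models is just a string change."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

# urllib.request.urlopen(req) would send it; the same request shape reaches
# any of the 300+ models behind the one key.
req = openrouter_request("provider/model-slug", "Hello", "sk-or-PLACEHOLDER")
```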
- Strengths: Broadest model selection (300+); zero setup; no inference markup; free tier with 25+ models; BYOK option.
- Limitations: Minimal observability; no governance or access controls; no guardrails, evals, or prompt management; 100–150ms added latency; no public SLA.
- Best for: Individual developers and early-stage teams prototyping across multiple models.
3. Vercel AI Gateway — frontend-ecosystem gateway
Type: Managed (Vercel platform)
Vercel's AI Gateway provides a unified endpoint for ~100 models, tightly integrated with their deployment platform and open-source AI SDK. Excellent DX within the Next.js ecosystem. No markup on token pricing.
- Strengths: Seamless Next.js integration; no token markup; built on the popular AI SDK; load balancing and failover; $5/mo free credit.
- Limitations: Limited model selection (~100 models); tightly coupled to Vercel; observability tuned for web vitals, not AI metrics; no semantic caching; serverless timeouts limit agentic workflows; no evals or prompt management.
- Best for: Frontend teams on Next.js/Vercel who want the fastest path to shipping AI features.
4. LiteLLM — open-source self-hosted gateway
Type: Open-source (Python)
LiteLLM provides an OpenAI-compatible interface to 100+ providers with ~97M monthly PyPI downloads and 40K+ GitHub stars. The team responded to the March breach with transparency and speed — engaging Mandiant and publishing detailed remediation. They were victims of a broader campaign, not negligent.
But the incident exposed a structural risk: it's a pip package, and any system that ran an unpinned install during the attack window was compromised. Self-hosting also means you own dependency management, security patching, scaling, and incident response.
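One concrete mitigation for that attack window is to pin the exact version and require hash verification, so pip refuses any artifact that doesn't match what you reviewed. The version number and hashes below are placeholders, not real values:

```
# requirements.txt -- version and hashes are placeholders, not real values
litellm==1.0.0 \
    --hash=sha256:WHEEL_HASH_PLACEHOLDER \
    --hash=sha256:SDIST_HASH_PLACEHOLDER
```

Installing with `pip install --require-hashes -r requirements.txt` fails closed: a republished or tampered artifact with a different hash is rejected rather than installed.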
- Strengths: Broadest provider coverage (100+); fully open source; mature codebase; free.
- Limitations: PyPI supply chain exposure; Python performance ceiling under concurrency; basic observability; no evals or prompt management; 800+ open GitHub issues.
- Best for: Teams with strong DevOps/security capabilities willing to invest in hardening their own infrastructure.
5. Cloudflare AI Gateway — edge caching gateway
Type: Managed edge proxy
Cloudflare AI Gateway routes LLM traffic through their global edge network with response caching, basic analytics, and rate limiting. Setup is a one-line URL change.
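The "one-line URL change" is literal: you repoint an OpenAI-style client's base URL at your gateway slug. The URL shape below follows Cloudflare's documented pattern, but verify it against their current docs; the account and gateway IDs are placeholders.

```python
def cf_gateway_base(account_id: str, gateway_id: str, provider: str = "openai") -> str:
    """Base URL that routes an OpenAI-style client through Cloudflare's AI Gateway."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"

# Swap this in as the client's base_url; no other application code changes.
base_url = cf_gateway_base("ACCOUNT_ID", "GATEWAY_ID")
```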
- Strengths: Zero infrastructure; edge caching reduces duplicate API costs; simple setup; Cloudflare security layer; free tier.
- Limitations: Exact-match caching only; 10–50ms proxy latency; limited observability; no guardrails, evals, or prompt tooling.
- Best for: Teams already on Cloudflare who want lightweight caching and traffic management.
6. Portkey — gateway + observability control plane
Type: Open-source + managed
Portkey is a gateway with deep observability, guardrails, and governance. It went fully open source on March 24, 2026, and reports processing 1T+ tokens daily across 24,000+ organizations. Every request is logged with full context. Guardrails include PII redaction and jailbreak detection. Governance covers RBAC, audit trails, and budget enforcement.
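To give a sense of what a PII-redaction guardrail does before a prompt ever leaves your network, here is a deliberately minimal sketch. Production guardrails like Portkey's use far more robust detection than these two regexes; this only illustrates the shape of the transform.

```python
import re

# Two illustrative patterns; real guardrails cover many more PII classes
# and use detection methods well beyond regex matching.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace matched PII with a typed placeholder before forwarding upstream."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt
```

The gateway applies this (or jailbreak detection, or policy checks) inline on every request, which is why it is the natural enforcement point for governance.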
- Strengths: Best-in-class observability; strong governance; 1,600+ model integrations; MCP Gateway for agents.
- Limitations: Log-based pricing can get expensive at scale; no built-in evals or prompt optimization; enterprise features gated to higher tiers.
- Best for: Teams that need robust observability and governance but already have eval tooling.
Comparison table
| | Respan | OpenRouter | Vercel GW | LiteLLM | Cloudflare GW | Portkey |
|---|---|---|---|---|---|---|
| Type | Managed | Managed | Managed | Self-hosted OSS | Managed edge | OSS + Managed |
| Supply chain risk | None | None | None | High (PyPI) | None | Medium |
| Observability | Built-in | Minimal | Basic | Basic | Basic | Best-in-class |
| Evals & testing | Built-in | — | — | — | — | — |
| Prompt management | Built-in | — | — | — | — | Yes |
| Cost tracking | Built-in | Per-model | Per-model | Basic | Basic | Built-in |
| Guardrails | Yes | — | — | — | — | Yes |
| Reliability / uptime | 99.99% SLA | No public SLA | Vercel SLA | Self-managed | Cloudflare SLA | 99.99% SLA |
| Rate limits | Unlimited | Provider limits | Platform limits | Self-managed | Per-gateway limits | Tier-based |
| Free tier | Yes | Yes | $5/mo credit | Free (OSS) | Yes | Yes |
| SOC 2 | Yes | — | Enterprise | * | Cloudflare | Enterprise |
*LiteLLM's SOC 2 was certified via Delve, a compliance startup facing allegations of inaccurate reporting.
Which one should you choose?
Prototyping with multiple models? Start with OpenRouter — lowest friction, 300+ models, no infrastructure.
Building on Next.js/Vercel? The Vercel AI Gateway is the natural fit.
Need a caching layer on Cloudflare? One URL change, edge caching, done.
Observability and governance are the priority? Portkey has the deepest tracing and policy enforcement in the market.
Want full control and have a strong platform team? LiteLLM remains capable — but after March 24, "self-hosted" means "self-secured."
Want one platform for gateway, observability, evals, and prompts? That's the problem Respan was built to solve. Managed architecture, no supply chain risk, SOC 2 built in. Try it free.
The lesson from the LiteLLM breach isn't "don't use open source." The lesson is that AI gateways are critical infrastructure now, and they deserve the same rigor we apply to databases and auth systems. The teams that ask where their credentials live and who controls the release pipeline are the ones that won't be scrambling when the next attack lands.


