Community (Open Source)
$0
Perpetual
- MIT license
- Up to 10k probes/month
- All core features
- CLI and browser UI
- Self-hosted
- CI/CD integration
Promptfoo is an open-source tool for testing prompts, agents, and RAG pipelines, with AI red teaming, penetration testing, and vulnerability scanning for LLMs. Released under the MIT license, it was originally developed for LLM apps serving more than 10 million users in production. It compares model performance across GPT, Claude, Gemini, Llama, and others using simple declarative configs, and it runs from the command line or inside CI/CD pipelines. The Community version includes up to 10,000 probes per month at no charge; infrastructure costs for hosting and LLM API calls typically run USD 50-500 per month. Developers praise Promptfoo for its speed, quality-of-life features such as live reloads and caching, security features including red teaming, and its budget-friendly open-source model. However, the CLI-focused approach creates friction for non-technical team members, and the platform lacks end-to-end observability, version control for prompts, and the test management features needed for complex production agents.
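To put the cost figures above in perspective, here is a rough sketch of the effective infrastructure cost per probe. It uses only the numbers quoted in this listing (10,000 probes, USD 50-500 per month) and assumes the full monthly probe allowance is consumed, which may not match real usage; the function name `cost_per_probe` is hypothetical and not part of Promptfoo.

```python
# Figures come from the listing above. Dividing by the full 10,000-probe
# allowance is an illustrative assumption, not a published price.
PROBES_PER_MONTH = 10_000
INFRA_COST_USD_LOW = 50
INFRA_COST_USD_HIGH = 500


def cost_per_probe(monthly_cost_usd: float, probes: int = PROBES_PER_MONTH) -> float:
    """Return the effective infrastructure cost of one probe, in USD."""
    if probes <= 0:
        raise ValueError("probes must be positive")
    return monthly_cost_usd / probes


low = cost_per_probe(INFRA_COST_USD_LOW)
high = cost_per_probe(INFRA_COST_USD_HIGH)
print(f"Effective cost per probe: ${low:.3f} to ${high:.3f}")
```

Under these assumptions the range works out to roughly half a cent to five cents per probe. Teams that run fewer probes would see a higher per-probe figure, since the hosting portion of the cost does not shrink with usage.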
What this tool does well, and the limitations to keep in mind.
Pros
Cons
What's included in each plan, and how the tiers compare.
$0 (perpetual)
$50-500 per month
Custom (annual contract)
Integrate Promptfoo's open-source LLM testing with Respan to add comprehensive evaluation and red teaming to your AI workflows. Test prompts, agents, and RAG pipelines against security threats and performance benchmarks. With Respan orchestrating Promptfoo alongside your AI services, you can verify quality and security before deploying to production.
Top companies in Observability, Prompts & Evals you can use instead of Promptfoo.
Respan
LLM tracing, evals, and gateway
LangSmith
Trace visualization for LLM chains
Weights & Biases
ML experiment tracking
MLflow
OpenTelemetry-native tracing
Langfuse
Open-source LLM observability
Arize AI
ML observability with LLM support
Helicone
Traceloop
OpenTelemetry
Datadog LLM
LLM monitoring within Datadog platform
HoneyHive
Prompt management
Braintrust
Real-time LLM logging and tracing
Phoenix
OpenTelemetry-based LLM and agent tracing
Patronus AI
Automated LLM evaluation platform
Humanloop
Portkey
Sentry
DeepEval
Ragas
RAG-specific evaluation framework
LangWatch
Multi-turn agent simulation testing
Galileo AI
LLM output quality evaluation
PromptLayer
Maxim AI
Distributed tracing for LLM and agent apps
Confident AI
DeepEval open-source evaluation framework
Opik
Agenta
Future AGI
Multimodal evaluation (text, image, audio, video)
Lunary
Parea AI
Moda
Hallucination detection
Ashr
Multi-modal synthetic testing
Sentrial
Agent failure root cause analysis
Athina AI
Chamber
ML infrastructure automation
Side-by-side comparisons with other tools in this category.
Companies from adjacent layers in the AI stack that work well with Promptfoo.
Last verified: March 10, 2026