Developer (Free)
- All platform features
- 50K logs/month
- 14-day data retention
- 2 users
- 3 Scenarios + 3 Simulations
- Community support
LangWatch is an open-source LLMOps platform focused on testing, evaluating, and monitoring AI agents. Founded in 2023 in Amsterdam by Rogerio Chaves (CTO, ex-Booking.com, ex-Lightspeed) and Manouk Draisma (CEO), the company raised EUR 1M in pre-seed funding led by Passion Capital with participation from Volta Ventures and Antler.
LangWatch's standout differentiator is its Scenario framework — an open-source agent testing library (804 GitHub stars) that enables multi-turn, simulation-based testing of AI agents. Unlike static input/output evaluations, Scenario provides a User Simulator Agent that generates realistic conversations against your agent, with a Judge Agent evaluating pass/fail at every turn. Available in Python, TypeScript, and Go, it works with any agent framework (LangGraph, CrewAI, Pydantic AI, OpenAI, Vercel AI SDK, Google ADK).
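The simulation loop described above can be sketched in plain Python. This is a minimal illustration of the idea, not the Scenario library's API: `run_simulation`, `SimulationResult`, and the scripted stand-ins for the user simulator, agent, and judge are all hypothetical names, and a real setup would back the simulator and judge with LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Message = Tuple[str, str]  # (role, text); hypothetical transcript format


@dataclass
class SimulationResult:
    passed: bool
    turns: int
    transcript: List[Message] = field(default_factory=list)
    reason: str = ""


def run_simulation(
    agent: Callable[[List[Message]], str],
    user_simulator: Callable[[List[Message]], Optional[str]],
    judge: Callable[[List[Message]], Optional[str]],
    max_turns: int = 5,
) -> SimulationResult:
    """Drive a multi-turn conversation and judge it after every agent reply."""
    transcript: List[Message] = []
    for turn in range(1, max_turns + 1):
        user_msg = user_simulator(transcript)
        if user_msg is None:  # the simulated user has nothing more to say
            return SimulationResult(True, turn - 1, transcript, "conversation finished")
        transcript.append(("user", user_msg))
        transcript.append(("assistant", agent(transcript)))
        failure = judge(transcript)  # None means "keep going", a string is a failure reason
        if failure is not None:
            return SimulationResult(False, turn, transcript, failure)
    return SimulationResult(True, max_turns, transcript, "max turns reached")


# Scripted stand-ins (assumptions for illustration only).
def scripted_user(transcript: List[Message]) -> Optional[str]:
    script = ["I want a refund", "Order 1234"]
    asked = sum(1 for role, _ in transcript if role == "user")
    return script[asked] if asked < len(script) else None


def toy_agent(transcript: List[Message]) -> str:
    last_user_message = transcript[-1][1]
    if "refund" in last_user_message.lower():
        return "Sure, what is your order number?"
    return "Thanks, your refund is on its way."


def non_empty_judge(transcript: List[Message]) -> Optional[str]:
    reply = transcript[-1][1]
    return None if reply.strip() else "agent returned an empty reply"


result = run_simulation(toy_agent, scripted_user, non_empty_judge)
silent_result = run_simulation(lambda t: "", scripted_user, non_empty_judge)
```

Judging after each turn, rather than once at the end, lets a run stop at the first turn where the agent goes wrong, which is the contrast with static input/output checks drawn above.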
The platform combines OpenTelemetry-native tracing, custom evaluators with real-time scoring, prompt and model management with version control, and dataset management that converts production traces into reusable test cases. LangWatch processes 900K+ daily evaluations, has 780K+ monthly package installs, and holds ISO 27001 and SOC 2 certifications. It supports self-hosted deployment via Docker and Kubernetes with no feature gating.
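As a rough illustration of turning production traces into reusable test cases, the sketch below filters and deduplicates trace records. The record shape (`input`, `output`, `score`) and the `traces_to_test_cases` helper are assumptions made for this example and do not reflect LangWatch's actual trace schema or API.

```python
from typing import Dict, Iterable, List


def traces_to_test_cases(traces: Iterable[Dict], min_score: float = 0.8) -> List[Dict]:
    """Keep well-scored traces with unique, non-empty inputs as test cases."""
    seen_inputs = set()
    cases: List[Dict] = []
    for trace in traces:
        prompt = trace.get("input", "").strip()
        if not prompt or prompt in seen_inputs:
            continue  # skip empty inputs and inputs already captured
        if trace.get("score", 0.0) < min_score:
            continue  # only keep responses an evaluator rated highly
        seen_inputs.add(prompt)
        cases.append({"input": prompt, "expected_output": trace.get("output", "")})
    return cases


# Hypothetical trace records for illustration.
sample_traces = [
    {"input": "hi", "output": "hello", "score": 0.9},
    {"input": "hi", "output": "hey", "score": 0.95},
    {"input": "bye", "output": "??", "score": 0.2},
    {"input": "", "output": "ignored", "score": 1.0},
]
test_cases = traces_to_test_cases(sample_traces)
```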
What's included in each plan, and how the tiers compare.
- Free
- €59/seat/month, billed monthly
- Custom: contact sales for a quote
AI teams building and testing LLM-powered agents
LangWatch and Respan serve complementary roles in the AI observability stack. LangWatch excels at pre-production agent testing and evaluation through its simulation framework, while Respan provides production-grade LLM monitoring, cost tracking, and gateway management. Together they cover the full agent lifecycle from testing to production.
Top companies in Observability, Prompts & Evals you can use instead of LangWatch.
- Respan: LLM tracing, evals, and gateway
- LangSmith: Trace visualization for LLM chains
- Weights & Biases: ML experiment tracking
- MLflow: OpenTelemetry-native tracing
- Arize AI: ML observability with LLM support
- Langfuse: Open-source LLM observability
- Helicone
- Datadog LLM: LLM monitoring within Datadog platform
- Traceloop: OpenTelemetry
- Braintrust: Real-time LLM logging and tracing
- HoneyHive: Prompt management
- Patronus AI: Automated LLM evaluation platform
- Promptfoo
- Phoenix: OpenTelemetry-based LLM and agent tracing
- Portkey
- Humanloop
- Sentry
- DeepEval
- Ragas: RAG-specific evaluation framework
- Galileo AI: LLM output quality evaluation
- PromptLayer
- Maxim AI: Distributed tracing for LLM and agent apps
- Confident AI: DeepEval open-source evaluation framework
- Opik
- Agenta
- Lunary
- Future AGI: Multimodal evaluation (text, image, audio, video)
- Parea AI
- Moda: Hallucination detection
- Chamber: ML infrastructure automation
- Ashr: Multi-modal synthetic testing
- Sentrial: Agent failure root cause analysis
- Athina AI
Last verified: March 27, 2026