LangWatch — Observability, Prompts & Evals Platform

Observability, Prompts & Evals | Layer 4 | Open Source + Cloud

Founded 2023 | Amsterdam, Netherlands | 2-10 employees

What is LangWatch?

LangWatch is an open-source LLMOps platform focused on testing, evaluating, and monitoring AI agents. Founded in 2023 in Amsterdam by Rogerio Chaves (CTO, ex-Booking.com, ex-Lightspeed) and Manouk Draisma (CEO), the company raised EUR 1M in pre-seed funding led by Passion Capital with participation from Volta Ventures and Antler.

LangWatch's standout differentiator is its Scenario framework — an open-source agent testing library (804 GitHub stars) that enables multi-turn, simulation-based testing of AI agents. Unlike static input/output evaluations, Scenario provides a User Simulator Agent that generates realistic conversations against your agent, with a Judge Agent evaluating pass/fail at every turn. Available in Python, TypeScript, and Go, it works with any agent framework (LangGraph, CrewAI, Pydantic AI, OpenAI, Vercel AI SDK, Google ADK).
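To make the simulation idea concrete, here is a minimal, dependency-free sketch of the loop described above: a user simulator produces the next message, the agent replies, and a judge checks the conversation after every turn. This is not the Scenario API. The names `run_simulation`, `scripted_user`, `toy_agent`, and `keyword_judge` are hypothetical, and the simulator and judge are deterministic stand-ins for what would be LLM-backed agents in a real setup.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Message = Tuple[str, str]  # (role, text)


@dataclass
class SimulationResult:
    passed: bool
    turns: int
    transcript: List[Message] = field(default_factory=list)


def run_simulation(
    agent: Callable[[List[Message]], str],
    user_simulator: Callable[[List[Message]], Optional[str]],
    judge: Callable[[List[Message]], bool],
    max_turns: int = 5,
) -> SimulationResult:
    """Drive a multi-turn conversation and judge it after every turn."""
    transcript: List[Message] = []
    for turn in range(1, max_turns + 1):
        user_msg = user_simulator(transcript)
        if user_msg is None:  # the simulated user has nothing more to say
            return SimulationResult(True, turn - 1, transcript)
        transcript.append(("user", user_msg))
        transcript.append(("assistant", agent(transcript)))
        if not judge(transcript):  # fail fast on the first bad turn
            return SimulationResult(False, turn, transcript)
    return SimulationResult(True, max_turns, transcript)


# Hypothetical stand-ins (a real setup would back these with LLM calls).
def scripted_user(transcript: List[Message]) -> Optional[str]:
    script = ["I want a refund", "Order 123"]
    asked = sum(1 for role, _ in transcript if role == "user")
    return script[asked] if asked < len(script) else None


def toy_agent(transcript: List[Message]) -> str:
    last_user_text = transcript[-1][1].lower()
    if "refund" in last_user_text:
        return "Sure, what is your order number?"
    return "Thanks, your refund request has been filed."


def keyword_judge(transcript: List[Message]) -> bool:
    # Pass as long as the latest assistant reply does not refuse outright.
    return "cannot help" not in transcript[-1][1].lower()


result = run_simulation(toy_agent, scripted_user, keyword_judge)
```

With these stand-ins the conversation runs for two turns and passes. Swapping in an agent that always answers "I cannot help with that" makes the judge fail the run on the first turn, which is the per-turn pass/fail behavior the paragraph describes.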

The platform combines OpenTelemetry-native tracing, custom evaluators with real-time scoring, prompt and model management with version control, and dataset management that converts production traces into reusable test cases. LangWatch processes 900K+ daily evaluations, has 780K+ monthly package installs, and holds ISO 27001 and SOC2 certifications. It supports self-hosted deployment via Docker and Kubernetes with no feature gating.
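As an illustration of turning production traces into reusable test cases, the sketch below filters a list of trace records and keeps one test case per distinct input. The record shape (`input`, `output`, `score`, `error` keys), the `DatasetRow` type, and the `traces_to_dataset` helper are assumptions made for this example; they do not reflect LangWatch's actual trace schema or export format.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class DatasetRow:
    input: str
    expected_output: str


def traces_to_dataset(
    traces: Iterable[Dict[str, Any]], min_score: float = 0.8
) -> List[DatasetRow]:
    """Keep well-scored, error-free traces and drop duplicate inputs."""
    seen = set()
    rows: List[DatasetRow] = []
    for trace in traces:
        if trace.get("error") or trace.get("score", 0.0) < min_score:
            continue  # skip failed or low-quality interactions
        key = trace["input"].strip().lower()
        if key in seen:
            continue  # one test case per distinct input
        seen.add(key)
        rows.append(DatasetRow(trace["input"].strip(), trace["output"].strip()))
    return rows


# Hypothetical trace records for illustration only.
sample_traces = [
    {"input": "Reset my password", "output": "Link sent.", "score": 0.95},
    {"input": "reset my password ", "output": "Link sent!", "score": 0.90},
    {"input": "Cancel order", "output": "", "score": 0.10},
    {"input": "Track package", "output": "In transit.", "score": 0.85, "error": "timeout"},
    {"input": "Opening hours?", "output": "9 to 5.", "score": 0.88},
]

dataset = traces_to_dataset(sample_traces)
```

Here only the first and last records survive: the second is a duplicate input, the third scores too low, and the fourth carries an error. The score threshold is an arbitrary choice for the example.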

Key Features

✓ Multi-turn agent simulation testing
✓ LLM evaluation with custom evaluators
✓ OpenTelemetry-native tracing
✓ Prompt & model management
✓ Dataset management from production traces

Pros & Cons

Pros

+ Unique agent simulation testing via the Scenario framework: multi-turn, stateful agent testing unmatched by LangSmith or Langfuse
+ Fully open-source under the MIT license, with no feature gating on self-hosted deployments
+ All-in-one platform combining tracing, evaluation, prompt management, and PII redaction
+ Framework-agnostic, with OpenTelemetry-native integration across all major agent frameworks
+ Enterprise-ready, with ISO 27001, SOC2, GDPR compliance, and on-prem deployment options

Cons

- Very small team (~5 employees) with only EUR 1M in pre-seed funding, so long-term viability is less proven than that of well-funded competitors
- Smaller community and fewer tutorials than the LangSmith or Langfuse ecosystems
- Production self-hosting requires Kubernetes/Helm, with known issues on reverse proxy deployments
- Only supports LLM-as-a-judge evaluations, with no heuristic or rule-based evaluation options

LangWatch Pricing

Free trial available

Developer (Free): Free

✓ All platform features
✓ 50K logs/month
✓ 14-day data retention
✓ 2 users
✓ 3 Scenarios + 3 Simulations
✓ Community support

Growth: EUR 59/seat/month (billed monthly)

✓ 200K events included
✓ EUR 0.0005/additional event
✓ 30-day retention
✓ Unlimited lite-users
✓ Unlimited evals/simulations
✓ Private Slack/Teams support
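
As a rough worked example of the Growth pricing above, the sketch below estimates a monthly bill from a seat count and an event volume. It assumes the 200K included events are pooled across the workspace rather than granted per seat, which the listing does not specify; the function name `growth_monthly_cost` is hypothetical.

```python
from decimal import Decimal

SEAT_PRICE_EUR = Decimal("59")            # per seat, per month
INCLUDED_EVENTS = 200_000                 # assumed pooled per workspace
OVERAGE_PER_EVENT_EUR = Decimal("0.0005")


def growth_monthly_cost(seats: int, events: int) -> Decimal:
    """Estimate the monthly Growth bill in EUR under the stated assumptions."""
    if seats < 1 or events < 0:
        raise ValueError("seats must be >= 1 and events must be >= 0")
    extra_events = max(0, events - INCLUDED_EVENTS)
    return SEAT_PRICE_EUR * seats + OVERAGE_PER_EVENT_EUR * extra_events


cost = growth_monthly_cost(seats=3, events=500_000)
print(f"EUR {cost:.2f}")  # prints EUR 327.00
```

For three seats and 500K events, the seats cost 3 × 59 = EUR 177 and the 300K events beyond the included 200K cost 300,000 × 0.0005 = EUR 150, for EUR 327 in total. Check the official pricing for how included events are actually allocated.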

Enterprise: Custom

✓ On-prem/hybrid/self-hosted
✓ Custom retention
✓ SSO/RBAC
✓ Audit logs
✓ Uptime & support SLA
✓ Forward Deployed Engineer


Common Use Cases

AI teams building and testing LLM-powered agents

• Agent testing before deployment
• Hallucination detection
• Regression testing for LLM apps
• Production monitoring
• Prompt optimization

Using LangWatch with Respan

LangWatch and Respan serve complementary roles in the AI observability stack. LangWatch excels at pre-production agent testing and evaluation through its simulation framework, while Respan provides production-grade LLM monitoring, cost tracking, and gateway management. Together they cover the full agent lifecycle from testing to production.

✓ Use LangWatch Scenario testing during development and Respan monitoring in production for full lifecycle coverage
✓ Combine LangWatch agent evaluations with Respan production metrics for comprehensive quality tracking
✓ Feed Respan production traces back into LangWatch for regression testing and evaluation
✓ Monitor LLM costs through Respan while using LangWatch to evaluate output quality
