Create an experiment

Create an experiment and start asynchronous workflow execution over a dataset.

Headers

Authorization (string, required)

Bearer token. Use Bearer YOUR_API_KEY for API key auth or Bearer <JWT> for dashboard auth.
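
A minimal sketch of building this header in Python. The helper name auth_headers is hypothetical and not part of any SDK; it only applies the Bearer scheme described above.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Return the Authorization header for an API key or a dashboard JWT."""
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    # Both auth modes use the same "Bearer <credential>" scheme.
    return {"Authorization": f"Bearer {token}"}


headers = auth_headers("YOUR_API_KEY")
```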

Request

Creates an experiment and runs it asynchronously. Provide exactly one scoring path: either evaluator_ids (or its alias evaluator_slugs) or evaluator_workflow_ids.
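
The "exactly one scoring path" rule can be checked on the client before sending, as in the sketch below. The function name scoring_path is hypothetical, and treating an empty list as "not provided" is an assumption; the server may validate differently.

```python
def scoring_path(body: dict) -> str:
    """Return the scoring path used by a request body, or raise if it is not exactly one."""
    # Assumption: a missing key and an empty list both count as "not provided".
    uses_evaluators = bool(body.get("evaluator_ids") or body.get("evaluator_slugs"))
    uses_workflows = bool(body.get("evaluator_workflow_ids"))
    if uses_evaluators == uses_workflows:
        # Either both paths were given or neither was.
        raise ValueError(
            "provide exactly one of evaluator_ids/evaluator_slugs "
            "or evaluator_workflow_ids"
        )
    return "evaluators" if uses_evaluators else "evaluator_workflows"
```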

dataset_id (string, required)
Dataset ID to process.
workflow (list of objects, required)
Workflow tasks to run for each dataset row.
evaluator_ids (list of strings, optional)

Preferred evaluator identifiers for scoring. Mutually exclusive with evaluator_workflow_ids.

evaluator_slugs (list of strings, optional)

Backward-compatible alias for evaluator_ids. If both are provided, evaluator_ids takes precedence.

evaluator_workflow_ids (list of strings, optional)

WorkflowVersion IDs configured for eval-only scoring. Mutually exclusive with evaluator_ids and evaluator_slugs.

experiment_id (string, optional)

Optional client-provided experiment ID. The backend generates one when omitted.

name (string, optional)
Experiment name.
description (string, optional)
Experiment description.
span_workflow_name (string, optional, defaults to workflow)
Root workflow span name.
enable_tracing (boolean, optional, defaults to true)
Whether to create trace logs.
batch_size (integer, optional, defaults to 100)
Batch size for processing.
concurrency (integer, optional, defaults to 15)
Number of concurrent workers.
generation_method (string, optional)
Optional evaluation generation method override.
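
As an illustration, a request body could be assembled as in the sketch below. The helper build_experiment_body and all sample values are hypothetical. The shape of a workflow task object is not described in this section, so the task shown is a placeholder only. Optional fields that are left out are simply omitted so that the documented defaults apply.

```python
import json

# Optional request fields documented above.
OPTIONAL_FIELDS = (
    "evaluator_ids", "evaluator_slugs", "evaluator_workflow_ids",
    "experiment_id", "name", "description", "span_workflow_name",
    "enable_tracing", "batch_size", "concurrency", "generation_method",
)


def build_experiment_body(dataset_id, workflow, **options):
    """Build the JSON-serializable body; unset options are left to server defaults."""
    unknown = set(options) - set(OPTIONAL_FIELDS)
    if unknown:
        raise TypeError(f"unknown fields: {sorted(unknown)}")
    body = {"dataset_id": dataset_id, "workflow": list(workflow)}
    body.update({k: v for k, v in options.items() if v is not None})
    # evaluator_ids takes precedence over its alias, so send only one of them.
    if "evaluator_ids" in body:
        body.pop("evaluator_slugs", None)
    return body


body = build_experiment_body(
    "DATASET_ID",
    [{"type": "placeholder_task"}],  # hypothetical task object, schema not documented here
    evaluator_ids=["EVALUATOR_ID"],
    evaluator_slugs=["legacy-slug"],
    name="baseline run",
    concurrency=5,
)
print(json.dumps(body, indent=2))
```

Dropping the alias on the client mirrors the documented precedence and keeps the payload unambiguous; it is a design choice, not a requirement stated by the API.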

Response

Created experiment.
id (string)
Experiment ID.
name (string)
Experiment name.
status (string)
Experiment execution status.
created_at (datetime)
description (string or null)
Experiment description.
dataset (string or null)
Dataset ID associated with the experiment.
dataset_id (string or null)
Dataset ID associated with the experiment.
dataset_name (string or null)
Dataset name, when available.
workflow_count (integer)
Number of workflow steps.
progress (double)
Execution progress percentage.
started_at (datetime or null)
completed_at (datetime or null)
tags (list of maps from strings to any)
Tags attached to the experiment.
workflow (list of objects)
Workflow tasks configured for the experiment.
evaluator_ids (list of strings)
Evaluator IDs used for scoring.
evaluator_slugs (list of strings)

Backward-compatible evaluator identifiers stored by the backend.

evaluator_workflow_ids (list of strings)

Eval-only workflow versions used for scoring.

batch_size (integer)
concurrency (integer)
enable_tracing (boolean)
error_message (string or null)
Failure details when status is failed.
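
A sketch of how a client might summarize this response once it has been decoded into a dict. The function describe_experiment is hypothetical. Only the failed status is named in this section, so the running value in the sample is a made-up placeholder, and reading progress on a 0 to 100 scale is an assumption based on the word "percentage".

```python
def describe_experiment(resp: dict) -> str:
    """Return a one-line summary of a created-experiment response."""
    label = resp.get("name") or resp["id"]
    if resp["status"] == "failed":
        # error_message carries the failure details when status is failed.
        return f"{label}: failed ({resp.get('error_message') or 'no details'})"
    # Assumption: progress is a percentage on a 0-100 scale.
    return f"{label}: {resp['status']}, {resp['progress']:.0f}% complete"


sample = {"id": "EXPERIMENT_ID", "name": "baseline run",
          "status": "running", "progress": 42.0}  # hypothetical values
print(describe_experiment(sample))
```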

Errors

400: Bad Request Error
401: Unauthorized Error
404: Not Found Error
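
One way a client could map these status codes to exceptions is sketched below. The exception classes and the check_status helper are hypothetical, and the format of the error response body is not documented here, so detail is passed in as plain text.

```python
class ExperimentAPIError(Exception):
    """Base class for errors returned by the create-experiment call."""


class BadRequestError(ExperimentAPIError):
    """HTTP 400."""


class UnauthorizedError(ExperimentAPIError):
    """HTTP 401."""


class NotFoundError(ExperimentAPIError):
    """HTTP 404."""


_ERRORS = {400: BadRequestError, 401: UnauthorizedError, 404: NotFoundError}


def check_status(status_code: int, detail: str = "") -> None:
    """Raise a typed exception for a non-2xx status code; return None otherwise."""
    if 200 <= status_code < 300:
        return
    # Undocumented codes fall back to the generic base error.
    exc_type = _ERRORS.get(status_code, ExperimentAPIError)
    message = f"HTTP {status_code}" + (f": {detail}" if detail else "")
    raise exc_type(message)
```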