Creates a new empty dataset. Dataset creation is a two-step process:
Step 1 — Create the dataset (this endpoint)
Returns a datasetUploadInfo object containing a signed uploadUrl and required extraHeaders.
Step 2 — Upload your CSV via the signed URL
PUT your labelled CSV file directly to the uploadUrl. The extraHeaders field contains
additional headers required for the request (e.g. X-Goog-Content-Length-Range).
The signed URL is valid for 1 hour. Maximum file size: 500 MB.
curl -X PUT \
-H "Content-Type: text/csv" \
-H "X-Goog-Content-Length-Range: 10,534773760" \
--data-binary @/path/to/dataset.csv \
"{uploadUrl}"
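The same upload can be prepared from Python. The sketch below only builds the PUT request from a datasetUploadInfo object, so it runs without network access. It assumes extraHeaders is a JSON object mapping header names to values, which the description above does not confirm. The example URL and the helper name build_upload_request are hypothetical. Copying the header values from the response, rather than hard-coding them as in the curl example, keeps the request in line with whatever the signed URL expects.

```python
import urllib.request


def build_upload_request(upload_info: dict, csv_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT request for the signed upload URL."""
    # Assumption: extraHeaders is a mapping of header name to header value.
    headers = {"Content-Type": "text/csv"}
    headers.update(upload_info.get("extraHeaders") or {})
    return urllib.request.Request(
        upload_info["uploadUrl"],
        data=csv_bytes,
        headers=headers,
        method="PUT",
    )


# Hypothetical datasetUploadInfo value, shaped like the curl example.
example_info = {
    "uploadUrl": "https://storage.example.com/signed-upload",
    "extraHeaders": {"X-Goog-Content-Length-Range": "10,534773760"},
}
request = build_upload_request(example_info, b"feature,label\n1.0,0\n")
# To actually send it: urllib.request.urlopen(request, timeout=60)
```

urllib normalises the capitalisation of stored header names; HTTP header names are case-insensitive, so this should not affect the upload.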
After the PUT completes, the dataset moves to PROCESSING automatically.
Poll GET /api/v1/datasets/{datasetKey} until status is COMPLETED.
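A minimal polling sketch follows. The status lookup is passed in as a callable, so the loop itself stays independent of any HTTP client. In real use that callable would issue GET /api/v1/datasets/{datasetKey} and return the status string. The function name, the 5-second interval, and the 10-minute timeout are illustrative assumptions, not values taken from the API.

```python
import time
from typing import Callable


def wait_for_dataset(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns COMPLETED; raise on FAILED or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == "COMPLETED":
            return status
        if status == "FAILED":
            raise RuntimeError("dataset processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"dataset still {status} after {timeout_s} s")
        sleep(interval_s)


# Offline demonstration with canned statuses instead of real HTTP calls.
canned = iter(["PENDING", "PROCESSING", "PROCESSING", "COMPLETED"])
waits = []
final_status = wait_for_dataset(lambda: next(canned), interval_s=2.0, sleep=waits.append)
```

Treating FAILED as an exception rather than a return value stops callers from starting training on a dataset that never finished processing.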
Obtain a token from POST /api/login. The token is valid for 1 hour.
Unique name for the dataset
Length: 3 - 30 characters. Example: "My Training Dataset"
Optional plain-text description
Length: 3 - 150 characters.
Controls data granularity for Cellular Balanced Learning. Start with 10–20. Higher values capture more detail but require more training time. See the Hyperparameter Tuning guide for guidance.
Range: 4 <= x <= 1000. Example: 10
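The constraints on the request fields can be checked client-side before calling the endpoint. The sketch below reads the limits as: name 3 to 30 characters, optional description 3 to 150 characters, and a granularity value between 4 and 1000. The parameter names name, description, and granularity are hypothetical, because the field names are not shown in this section. The server remains the authority on what is accepted.

```python
from typing import List, Optional


def validate_dataset_params(
    name: str, granularity: int, description: Optional[str] = None
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not 3 <= len(name) <= 30:
        errors.append("name must be 3 to 30 characters long")
    # The description is optional, so only check it when one is supplied.
    if description is not None and not 3 <= len(description) <= 150:
        errors.append("description must be 3 to 150 characters long")
    if not 4 <= granularity <= 1000:
        errors.append("granularity must be between 4 and 1000")
    return errors


ok = validate_dataset_params("My Training Dataset", 10)
bad = validate_dataset_params("ab", 3, description="x")
```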
Dataset created successfully
"2b0a9a83-d549-4f25-b7ed-8174e0c955cd"
Processing status of a dataset:
| Status | Description |
|---|---|
| PENDING | Created, awaiting data upload |
| PROCESSING | Data uploaded, preparation in progress |
| COMPLETED | Ready for training |
| FAILED | Processing failed |
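Client code may find it convenient to model these four statuses as an enumeration, so that an unexpected value fails loudly instead of being compared as a raw string. The class name DatasetStatus and the is_terminal helper are illustrative and not part of the API.

```python
from enum import Enum


class DatasetStatus(str, Enum):
    PENDING = "PENDING"        # created, awaiting data upload
    PROCESSING = "PROCESSING"  # data uploaded, preparation in progress
    COMPLETED = "COMPLETED"    # ready for training
    FAILED = "FAILED"          # processing failed

    @property
    def is_terminal(self) -> bool:
        """True once polling can stop (COMPLETED or FAILED)."""
        return self in (DatasetStatus.COMPLETED, DatasetStatus.FAILED)


# Parsing a status string from a response; unknown values raise ValueError.
parsed = DatasetStatus("PROCESSING")
```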
Allowed values: PENDING, PROCESSING, COMPLETED, FAILED
URL to poll for dataset processing status
Present when the dataset was created with a large-file upload path.
Use the uploadUrl to PUT your CSV file directly for files over 32 MB (up to 500 MB).
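A quick size check before uploading can catch files outside the stated range. The sketch below interprets "MB" as 1024 * 1024 bytes, which is an assumption: the documentation does not say whether decimal or binary megabytes are meant. The function name is hypothetical. How files of 32 MB or less are handled is not described in this section, so the function only reports whether a file falls in the large-file range.

```python
MIB = 1024 * 1024
LARGE_FILE_THRESHOLD = 32 * MIB   # files over this size use the uploadUrl
MAX_UPLOAD_SIZE = 500 * MIB       # stated maximum file size


def in_signed_upload_range(size_bytes: int) -> bool:
    """True if the file is over 32 MB and within the 500 MB maximum."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_UPLOAD_SIZE:
        raise ValueError("file exceeds the 500 MB maximum")
    return size_bytes > LARGE_FILE_THRESHOLD


small = in_signed_upload_range(10 * MIB)
large = in_signed_upload_range(100 * MIB)
# In practice the size would come from os.path.getsize("/path/to/dataset.csv").
```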