POST /api/v1/datasets
Create dataset
curl --request POST \
  --url https://api.mathfi.ai/api/v1/datasets \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "datasetName": "Example Dataset",
  "numberOfBuckets": 10
}
'
{
  "datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
  "numberOfBuckets": 10,
  "status": "PENDING",
  "datasetCreationProgressUrl": "https://api.mathfi.ai/api/v1/datasets/bbb4d0fb-7287-44b0-860d-d81bea692648",
  "datasetUploadInfo": {
    "uploadUrl": "https://storage.googleapis.com/...",
    "extraHeaders": "X-Goog-Content-Length-Range:10,534773760"
  }
}
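
For readers calling the endpoint from Python, the curl request above could be sketched as follows. This is a minimal illustration using only the standard library. The function names are hypothetical; the URL, headers, and body fields are taken from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.mathfi.ai/api/v1/datasets"


def build_create_dataset_request(token, dataset_name, number_of_buckets=10):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {"datasetName": dataset_name, "numberOfBuckets": number_of_buckets}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_dataset(token, dataset_name, number_of_buckets=10):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_create_dataset_request(token, dataset_name, number_of_buckets)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling create_dataset(token, "Example Dataset") should return a dict shaped like the example response, assuming the token is valid and the name is not already taken.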

Authorizations

Authorization
string
header
required

Bearer token. Obtain one from POST /api/login; each token is valid for 1 hour.
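
Because tokens expire after an hour, a long-running client may want to reuse a token until shortly before it expires. The sketch below shows one way to do that. The login request format is not documented here, so fetch_token is a hypothetical callable that the caller supplies to wrap POST /api/login; the 60-second safety margin is likewise an assumption.

```python
import time

TOKEN_LIFETIME_SECONDS = 3600  # tokens are documented as valid for 1 hour


class TokenCache:
    """Reuse a token until shortly before its documented expiry."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60):
        # fetch_token: hypothetical caller-supplied function wrapping POST /api/login.
        self._fetch_token = fetch_token
        self._clock = clock
        self._safety_margin = safety_margin  # assumed margin, in seconds
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a cached token, fetching a new one when the old one is stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = now + TOKEN_LIFETIME_SECONDS - self._safety_margin
        return self._token
```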

Body

application/json
datasetName
string
required

Unique name for the dataset

Required string length: 3 - 30
Example:

"My Training Dataset"

description
string | null

Optional plain-text description

Required string length: 3 - 150
numberOfBuckets
integer

Controls data granularity for Cellular Balanced Learning. Start with 10–20. Higher values capture more detail but require more training time. See the Hyperparameter Tuning guide for details.

Required range: 4 <= x <= 1000
Example:

10
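
The constraints above can be checked on the client before sending a request. The following is a minimal sketch, not the server's validation logic: the function name and error messages are hypothetical, and uniqueness of datasetName can only be verified by the API itself.

```python
def validate_create_dataset_body(body):
    """Return a list of problems found in a request body (empty if none)."""
    errors = []

    name = body.get("datasetName")
    if not isinstance(name, str):
        errors.append("datasetName is required and must be a string")
    elif not 3 <= len(name) <= 30:
        errors.append("datasetName must be 3 to 30 characters long")

    # description is optional and may be null.
    description = body.get("description")
    if description is not None:
        if not isinstance(description, str):
            errors.append("description must be a string or null")
        elif not 3 <= len(description) <= 150:
            errors.append("description must be 3 to 150 characters long")

    # numberOfBuckets is not marked required, so only check it when present.
    buckets = body.get("numberOfBuckets")
    if buckets is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(buckets, bool) or not isinstance(buckets, int):
            errors.append("numberOfBuckets must be an integer")
        elif not 4 <= buckets <= 1000:
            errors.append("numberOfBuckets must be between 4 and 1000")

    return errors
```

An empty list means the body satisfies the documented length and range limits; the server may still reject it, for example when the name is already in use.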

Response

Dataset created successfully

datasetKey
string<uuid>
required
Example:

"2b0a9a83-d549-4f25-b7ed-8174e0c955cd"

status
enum<string>
required

Processing status of a dataset:

Status      Description
PENDING     Created, awaiting data upload
PROCESSING  Data uploaded, preparation in progress
COMPLETED   Ready for training
FAILED      Processing failed

Available options: PENDING, PROCESSING, COMPLETED, FAILED
datasetCreationProgressUrl
string
required

URL to poll for dataset processing status
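
Polling this URL could look roughly like the loop below. The response format of the progress URL is not described in this section, so the sketch takes a caller-supplied fetch_status function (hypothetical) that returns the current status string. The interval and attempt limit are arbitrary defaults, not recommended values. Since PENDING means the dataset is still awaiting data, polling is only useful once the upload has been made.

```python
import time

# Statuses after which no further change is expected.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_dataset(fetch_status, interval_seconds=5.0, max_attempts=120,
                     sleep=time.sleep):
    """Call fetch_status() until it returns COMPLETED or FAILED.

    fetch_status is a hypothetical callable that requests
    datasetCreationProgressUrl and returns the status string.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("dataset did not reach a terminal status in time")
```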

datasetUploadInfo
object
required

Upload details, present when the dataset was created with the large-file upload path. For CSV files over 32 MB (up to 500 MB), send the file directly to uploadUrl with an HTTP PUT request.
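
A direct upload might be prepared as in the sketch below. It assumes that extraHeaders holds a single header in "Name:value" form, as in the example response, and that this header must be sent along with the PUT; neither point is confirmed in this section. The function names are hypothetical.

```python
import urllib.request


def parse_extra_header(extra_headers):
    """Split a single 'Name:value' string into a one-entry header dict."""
    name, _, value = extra_headers.partition(":")
    return {name.strip(): value.strip()}


def size_allowed(extra_headers, size_in_bytes):
    """Check a file size against X-Goog-Content-Length-Range, if present."""
    headers = {k.lower(): v for k, v in parse_extra_header(extra_headers).items()}
    range_value = headers.get("x-goog-content-length-range")
    if range_value is None:
        return True  # no size range advertised
    minimum, maximum = (int(part) for part in range_value.split(","))
    return minimum <= size_in_bytes <= maximum


def build_upload_request(upload_url, extra_headers, csv_bytes):
    """Build (but do not send) the PUT request for the CSV upload."""
    return urllib.request.Request(
        upload_url,
        data=csv_bytes,
        headers=parse_extra_header(extra_headers),
        method="PUT",
    )
```

The request can be sent with urllib.request.urlopen. Note that urllib adds a default Content-Type header when a body is present; whether the upload URL expects a particular Content-Type is not stated here, so that may need adjusting.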

numberOfBuckets
integer