These examples show how to use the API for typical workflows in MathFi.ai. See also the API overview for more details on the API.
Auth and smoke test
- Log in to obtain a token
- List datasets to verify access
curl -i -XPOST 'https://api.mathfi.ai/api/login' \
--header 'Content-Type: application/json' \
--data-raw '{ "username": "[email protected]", "password": "password"}'
{"access_token":"eyJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJidXR0ZX...","token_type":"Bearer","expires_in":3600,"username":"[email protected]"}
curl -i -XGET 'https://api.mathfi.ai/api/v1/datasets?offset=0&limit=10' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
HTTP/2 200
{"datasets":[{"datasetKey":"feebb52a-f098-4e0e-b1eb-459d833e5aa4","datasetName":"oilplantv4","numberOfBuckets":10,"status":"COMPLETED","createdOn":"2025-10-24T08:35:36Z"},{"datasetKey":"c2bd4dbd-b9e9-4fd2-9e52-36652b082060","datasetName":"oilplantv3","numberOfBuckets":10,"status":"COMPLETED","createdOn":"2025-10-02T18:09:48Z"}],"offset":0,"limit":10,"total":219}
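The same two calls can be scripted. The following is a minimal sketch using only the Python standard library; the helper names (`build_login_request`, `build_list_datasets_request`, `send`) are illustrative and not part of any MathFi.ai SDK, and the base URL is simply the one used in the curl examples above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.mathfi.ai"  # taken from the curl examples above


def build_login_request(username, password, base_url=BASE_URL):
    """Build the POST /api/login request (hypothetical helper)."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_list_datasets_request(token, offset=0, limit=10, base_url=BASE_URL):
    """Build the GET /api/v1/datasets request with a bearer token."""
    query = urllib.parse.urlencode({"offset": offset, "limit": limit})
    return urllib.request.Request(
        f"{base_url}/api/v1/datasets?{query}",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires valid credentials and network access):
#   token = send(build_login_request(username, password))["access_token"]
#   listing = send(build_list_datasets_request(token))
#   print(listing["total"], "datasets visible")
```

Keeping request construction separate from `send` makes it possible to inspect the URL, headers, and body before any network traffic happens.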
Create a Dataset
- First, create an empty dataset
- Then, add CSV data to the created dataset
- Lastly, poll for progress until COMPLETED
curl -i -XPOST 'https://{baseUrl}/api/v1/datasets' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN" \
--data '{
"datasetName": "OilPlantAnomalyV22",
"numberOfBuckets": 10
}'
{
"datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
"numberOfBuckets": 10,
"status": "PENDING",
"datasetCreationProgressUrl": "https://{baseUrl}/api/datasets/bbb4d0fb-7287-44b0-860d-d81bea692648",
"datasetUploadInfo": {
"uploadUrl": "https://storage.googleapis.com/...",
"extraHeaders": "X-Goog-Content-Length-Range:10,534773760"
}
}
Take the datasetUploadInfo > uploadUrl URL and send a PUT request with your CSV data, passing the extraHeaders value as a request header:
curl -i -XPUT '{uploadUrl}' \
--header 'X-Goog-Content-Length-Range: 10,534773760' \
--header 'Content-Type: text/csv' \
--header "Authorization: Bearer $TOKEN" \
--data-binary '@/path/to/dataset-anomaly-gas-oil-plant.csv'
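The extraHeaders value arrives as a single "Name:value" string, so a client has to split it before attaching it to the PUT request. The sketch below does that and checks a file size against the range up front. It assumes the string always holds exactly one header, as in the sample response, and that the range reads as "minimum,maximum" size in bytes, which is how Google Cloud Storage documents X-Goog-Content-Length-Range. The function names are illustrative.

```python
def parse_extra_headers(extra_headers):
    """Split a 'Name:value' string into a one-entry header dict.

    Assumption: the API returns a single header in this field, as in the
    sample response; a multi-header format is not covered here.
    """
    name, _, value = extra_headers.partition(":")
    return {name.strip(): value.strip()}


def content_length_range(headers):
    """Return (min_bytes, max_bytes) from X-Goog-Content-Length-Range."""
    low, high = headers["X-Goog-Content-Length-Range"].split(",")
    return int(low), int(high)


def ensure_size_allowed(size_bytes, headers):
    """Raise ValueError when the upload size falls outside the allowed range."""
    low, high = content_length_range(headers)
    if not low <= size_bytes <= high:
        raise ValueError(
            f"upload is {size_bytes} bytes, allowed range is {low}..{high}"
        )


# Usage with a local file (os.path.getsize returns the size in bytes):
#   import os
#   headers = parse_extra_headers(upload_info["extraHeaders"])
#   ensure_size_allowed(os.path.getsize("/path/to/data.csv"), headers)
```

Checking the size locally avoids starting a large upload that the storage backend would reject anyway.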
Finally, poll the dataset until its status is COMPLETED (or FAILED):
curl --location 'https://{baseUrl}/api/v1/datasets/bbb4d0fb-7287-44b0-860d-d81bea692648' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
{
"datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
"datasetName": "OilPlantAnomalyV22",
"status": "COMPLETED",
"createdOn": "2025-10-08T21:43:25Z",
"numberOfBuckets": 10
}
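Polling until COMPLETED or FAILED is easy to get subtly wrong (no upper bound, no pause between calls). A minimal sketch of a bounded polling loop follows. `fetch_status` is a caller-supplied function that performs the GET shown above and returns the decoded JSON; the interval and attempt limit are arbitrary illustrative defaults, not values recommended by the API.

```python
import time

# Statuses at which dataset polling stops, as named in the text above.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def poll_until_done(fetch_status, interval_seconds=5.0, max_attempts=120,
                    sleep=time.sleep):
    """Call fetch_status() until the payload reports a terminal status.

    Returns the final payload, or raises TimeoutError when max_attempts
    calls have been made without reaching a terminal status.
    """
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # pause before the next poll
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")


# Usage, given some function get_dataset(key) that wraps the GET request:
#   final = poll_until_done(lambda: get_dataset(dataset_key))
#   if final["status"] != "COMPLETED":
#       raise RuntimeError("dataset creation failed")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.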
Key values to note for next stages:
- The datasetKey uniquely identifies the dataset
- The datasetName is a descriptive name that distinguishes the dataset from others, or from other versions of the same underlying data
- The numberOfBuckets is a key hyperparameter that determines how data is shaped for training
Train a Model
Once the dataset is created and in COMPLETED state, training can be executed with the following commands and endpoints.
- Target 0.90 performance via performanceThreshold (see Glossary for metric definitions)
- Pass 19 as the scaling factor
curl -i -XPOST 'https://{baseUrl}/api/v1/training/datasets/bbb4d0fb-7287-44b0-860d-d81bea692648' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN" \
--data '{
"performanceThreshold": 0.90,
"scalingFactor": 19
}'
{
"trainingJobKey": "8ae21b68-a65d-4993-974e-264f742457eb",
"datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
"status": "PENDING",
"trainingJobProgressUrl": "https://{baseUrl}/api/v1/training/8ae21b68-a65d-4993-974e-264f742457eb"
}
Monitor training progress by polling the trainingJobProgressUrl directly, or append /progress to it for a more detailed view:
curl -i -XGET 'https://{baseUrl}/api/v1/training/8ae21b68-a65d-4993-974e-264f742457eb' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
{
"trainingJobKey": "8ae21b68-a65d-4993-974e-264f742457eb",
"datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
"status": "RUNNING",
"scalingFactor": 19,
"targetPerformance": 0.90
}
curl -i -XGET 'https://{baseUrl}/api/v1/training/8ae21b68-a65d-4993-974e-264f742457eb/progress' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
{
"status": "RUNNING",
"jobs": [
{
"trainingJobKey": "23de1e06-a126-4719-90a8-8874d9f63a92",
"algorithm": "BSEV02",
"status": "COMPLETED",
"latestPerformance": 0.9000122,
"recentPerformances": [0.8674287, 0.89866984, 0.89997154, 0.89997154, 0.9000122]
},
{
"trainingJobKey": "54af946f-1f89-4560-99ef-f4934bfab237",
"algorithm": "BSEV01",
"status": "RUNNING",
"latestPerformance": 0.59744537,
"recentPerformances": [0.59744537, 0.59744537, 0.59744537, 0.59744537, 0.59744537]
}
]
}
As the output shows, each algorithm runs as its own training job (this example launches 4, all of which appear in the completed response further down), and one overall status covers the whole training process.
- The job/algorithm with the highest achieved performance is the winner (the first to reach COMPLETED status)
- When a job doesn’t reach the target performance within the timeout (currently 1h), it’s marked with TIMEOUT status
- When a job doesn’t improve by at least 0.02 over a given period (5 minutes), it’s stopped and marked as NOT_COMPLETED (stalled)
- When a job hits an unrecoverable failure, it’s marked as FAILED and stopped
- The overall process stops only when all jobs are out of PROCESSING. As long as 1 job completes successfully, the overall process is COMPLETED
- A champion model is created from the successful training, marking it as the best performing model so far for the dataset. If training is repeated with different hyperparameters and better performance is achieved, the champion model is replaced by the better one
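The rules above can be applied to a /progress payload on the client side. The sketch below decides whether every job has stopped and picks the winning job. Treating COMPLETED, TIMEOUT, NOT_COMPLETED, and FAILED as the full set of terminal job statuses is an assumption drawn from this list, and the function names are illustrative.

```python
# Job statuses that the rules above describe as end states (assumed complete).
FINISHED_JOB_STATUSES = {"COMPLETED", "TIMEOUT", "NOT_COMPLETED", "FAILED"}


def is_finished(progress):
    """True when the payload has jobs and every one is in an end state."""
    jobs = progress.get("jobs", [])
    return bool(jobs) and all(
        job["status"] in FINISHED_JOB_STATUSES for job in jobs
    )


def pick_winner(progress):
    """Return the COMPLETED job with the highest latestPerformance, or None."""
    completed = [
        job for job in progress.get("jobs", []) if job["status"] == "COMPLETED"
    ]
    if not completed:
        return None
    return max(completed, key=lambda job: job["latestPerformance"])


# Usage, given a decoded /progress response:
#   if is_finished(progress):
#       winner = pick_winner(progress)
#       print(winner["algorithm"] if winner else "no job completed")
```

`pick_winner` returns None when no job completed; the text does not say what overall status the API reports in that case, so the sketch leaves that decision to the caller.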
Once the process has completed, the response from the progress monitoring endpoint looks like this:
{
"status": "COMPLETED",
"jobs": [
{
"trainingJobKey": "23de1e06-a126-4719-90a8-8874d9f63a92",
"algorithm": "BSEV02",
"status": "COMPLETED",
"latestPerformance": 0.9000122,
"recentPerformances": [0.8674287, 0.89866984, 0.89997154, 0.89997154, 0.9000122]
},
{
"trainingJobKey": "54af946f-1f89-4560-99ef-f4934bfab237",
"algorithm": "BSEV01",
"status": "NOT_COMPLETED",
"latestPerformance": 0.6260017,
"recentPerformances": [0.6260017, 0.625961, 0.6260017, 0.625961, 0.6260017]
},
{
"trainingJobKey": "c46820a8-f5fc-49ab-aa34-bdf71186eff1",
"algorithm": "BFIF01",
"status": "NOT_COMPLETED",
"latestPerformance": 0.7718656,
"recentPerformances": [0.7715808, 0.7716215, 0.77174354, 0.77178425, 0.7718656]
},
{
"trainingJobKey": "ea4bd529-e886-4167-b800-8a97a147396b",
"algorithm": "BSIX01",
"status": "NOT_COMPLETED",
"latestPerformance": 0.87388635,
"recentPerformances": [0.87388635, 0.87388635, 0.87388635, 0.87388635, 0.87388635]
}
]
}
The winner is BSEV02 with a latest performance of 0.9000122. Retrieve the training detail to get the model key:
curl -i -XGET 'https://{baseUrl}/api/v1/training/8ae21b68-a65d-4993-974e-264f742457eb' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
{
"trainingJobKey": "8ae21b68-a65d-4993-974e-264f742457eb",
"datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
"status": "COMPLETED",
"scalingFactor": 19,
"targetPerformance": 0.9,
"achievedPerformance": 0.89604133,
"trainingPerformance": 0.9000122,
"testPerformance": 0.8920705,
"modelKey": "8a86bd31-ff0c-47c1-bdb8-d08331904508"
}
The model with modelKey=8a86bd31-ff0c-47c1-bdb8-d08331904508 can then be used to run predictions on unseen data.
Run Predictions
Once the desired champion model is trained, you can run predictions with it. Before running a prediction, check the model details:
curl -i -XGET 'https://{baseUrl}/api/v1/models/8a86bd31-ff0c-47c1-bdb8-d08331904508' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
{
"modelKey": "8a86bd31-ff0c-47c1-bdb8-d08331904508",
"datasetKey": "bbb4d0fb-7287-44b0-860d-d81bea692648",
"version": 1,
"achievedPerformance": 0.89604133,
"trainingPerformance": 0.9000122,
"testPerformance": 0.8920705,
"createdOn": "2025-10-08T22:21:06Z"
}
Then, create a prediction using this model. The blind CSV file to run the prediction on must not exceed 32MB:
curl -i -XPOST 'https://{baseUrl}/api/v1/predictions/models/8a86bd31-ff0c-47c1-bdb8-d08331904508' \
--header 'Content-Type: text/csv' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN" \
--form '_file=@"/path/to/blind-anomaly-gas-oil-plant.csv"'
{
"predictionKey": "c87c26cd-8b5f-4e5f-8fd9-7b3a5f281c34",
"status": "PENDING",
"predictionCreationProgressUrl": "https://{baseUrl}/api/v1/predictions/c87c26cd-8b5f-4e5f-8fd9-7b3a5f281c34"
}
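Because of the 32MB limit on the blind CSV, a quick local size check before submitting can save a rejected request. The sketch below is one way to do it. Whether the API counts a megabyte as 10^6 or 2^20 bytes is not stated, so the limit is read conservatively as 32 * 10^6 bytes, which stays within the limit under either interpretation. The names are illustrative.

```python
import os

# Assumption: "32MB" is interpreted conservatively as 32 * 10**6 bytes,
# which is also below 32 * 2**20 bytes.
PREDICTION_LIMIT_BYTES = 32 * 1000 * 1000


def within_prediction_limit(size_bytes, limit_bytes=PREDICTION_LIMIT_BYTES):
    """True when a non-empty file of size_bytes fits under the limit."""
    return 0 < size_bytes <= limit_bytes


def check_blind_csv(path):
    """Return the file size in bytes, or raise ValueError when it is too large or empty."""
    size = os.path.getsize(path)
    if not within_prediction_limit(size):
        raise ValueError(
            f"{path} is {size} bytes; expected 1..{PREDICTION_LIMIT_BYTES}"
        )
    return size


# Usage before calling the predictions endpoint:
#   check_blind_csv("/path/to/blind-anomaly-gas-oil-plant.csv")
```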
Poll the URL for progress on the prediction result:
curl -i -XGET 'https://{baseUrl}/api/v1/predictions/c87c26cd-8b5f-4e5f-8fd9-7b3a5f281c34/progress' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $TOKEN"
Once the status moves to COMPLETED, the result CSV can be downloaded:
curl --location 'https://{baseUrl}/api/v1/predictions/c87c26cd-8b5f-4e5f-8fd9-7b3a5f281c34/download' \
--header 'Content-Type: text/csv' \
--header "Authorization: Bearer $TOKEN"
Hyperparameter Tuning
Hyperparameter tuning combines the endpoints above to improve performance incrementally by changing the hyperparameters:
- Create new versions of the dataset with a higher or lower number of buckets
- Re-train with a higher or lower scaling factor, monitoring progress until the result is COMPLETED
- Gradually raise or lower the performance threshold until no further improvement is visible while still obtaining COMPLETED results
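The tuning steps above can be automated as a search over hyperparameter combinations. The sketch below uses a plain exhaustive grid search over numberOfBuckets and scalingFactor. This is one simple strategy chosen for illustration; the steps above describe a more manual, incremental process and do not prescribe it. `train_and_score` is a hypothetical caller-supplied function that creates the dataset, runs training for one combination, and returns the achieved performance, or None when training did not end in COMPLETED.

```python
from itertools import product


def tune(train_and_score, bucket_options, scaling_options):
    """Try every (numberOfBuckets, scalingFactor) pair and keep the best.

    Returns a dict describing the best combination, or None when no
    combination produced a score.
    """
    best = None
    for buckets, scaling in product(bucket_options, scaling_options):
        score = train_and_score(buckets, scaling)
        if score is None:
            continue  # training did not complete for this combination
        if best is None or score > best["performance"]:
            best = {
                "numberOfBuckets": buckets,
                "scalingFactor": scaling,
                "performance": score,
            }
    return best


# Usage, given a train_and_score(buckets, scaling) built on the endpoints above:
#   best = tune(train_and_score, bucket_options=[5, 10, 20], scaling_options=[9, 19])
```

Each combination triggers a full dataset creation and training run, so the grid should stay small; the option values in the usage comment are placeholders, not recommendations.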