Model Training

This page describes Step 4 in the dAIve web app.

Starting training

Go to Model Training and click Start Training.

The page contains:

a training progress chart (left panel)
a data split panel (right panel, top)
a results table (right panel, bottom)

Before starting:

confirm that Step 1 inputs and outputs are final
confirm that Step 3 validation settings are intentional
confirm that the selected model family still fits the task

Data split controls (right panel)

In Automatic Split, dAIve creates train / validation / test files from the training dataset. In Manual Upload, these controls do not modify your uploaded validation/test files.

Random Seed

controls the deterministic shuffle used for auto split
same dataset + same split settings + same seed => same split
changing the seed is useful to check whether results are stable across different random partitions
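
The seed behavior can be illustrated with a minimal sketch. This is not dAIve's actual implementation: the function name seeded_split and the 15% validation and test shares are assumptions chosen for the example.

```python
import random

def seeded_split(n_rows, seed, val_share=0.15, test_share=0.15):
    """Return (train, val, test) row-index lists from a seeded shuffle.

    Hypothetical helper; the share defaults are illustrative assumptions.
    """
    indices = list(range(n_rows))
    # A private RNG instance makes the shuffle depend only on the seed.
    random.Random(seed).shuffle(indices)
    n_val = round(n_rows * val_share)
    n_test = round(n_rows * test_share)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, val, test

split_a = seeded_split(100, seed=42)
split_b = seeded_split(100, seed=42)
print(split_a == split_b)  # True: same data size, settings, and seed
```

Calling the function with a different seed produces a different partition, which is what makes repeated runs with several seeds a useful stability check.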

Extrapolation Risk warning

After each auto split, dAIve checks numeric input columns:

it calculates the min/max range seen in the train split
it counts values in validation/test that are outside that train range
if such values exist, a warning appears in the split panel

This warning means the model may need to extrapolate on those samples.
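
The check described above can be sketched as follows. The function name count_out_of_range and the column names are hypothetical, and the sketch only mirrors the documented steps, not dAIve's internal code.

```python
def count_out_of_range(train_values, other_values):
    """Count values that fall outside the min/max range seen in train."""
    lo, hi = min(train_values), max(train_values)
    return sum(1 for v in other_values if v < lo or v > hi)

# Illustrative numeric input columns (made-up values).
train = {"temperature": [20.0, 25.0, 30.0], "pressure": [1.0, 1.2, 1.1]}
test = {"temperature": [22.0, 35.0, 18.0], "pressure": [1.05, 1.15, 1.2]}

counts = {col: count_out_of_range(train[col], test[col]) for col in train}
flagged = {col: n for col, n in counts.items() if n > 0}
print(flagged)  # {'temperature': 2}: 35.0 and 18.0 lie outside [20.0, 30.0]
```

Any column that ends up in flagged would correspond to a warning: the model has never seen inputs that far out during training.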

Force boundary coverage (advanced)

this option is strictly opt-in and is off by default
when enabled, dAIve can move feature min/max boundary rows into train
this can reduce extrapolation risk, but it can also make validation/test metrics look more optimistic

Safety behavior:

disabled in time-series mode (to avoid leakage)
boundary candidates are deduplicated by row index
only a limited share of train rows can be forced (hard cap: up to 50% of planned train rows)
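
A rough sketch of how such a safeguarded move could work is shown below. Everything here is an assumption for illustration: the function name force_boundary_rows, the row representation, and the reading of the cap as "forced rows may not exceed 50% of the planned train size". The real implementation may differ, for example by swapping rows instead of only moving them.

```python
def force_boundary_rows(rows, train_idx, holdout_idx,
                        max_share=0.5, time_series=False):
    """Move rows holding a feature min/max from holdout into train.

    rows: list of dicts mapping column name to a numeric value.
    Hypothetical helper; returns new (train_idx, holdout_idx) lists.
    """
    if time_series:
        # Mirrors the documented safety rule: no forcing in time-series mode.
        return list(train_idx), list(holdout_idx)
    candidates = set()  # a set deduplicates candidates by row index
    for col in rows[0]:
        values = [row[col] for row in rows]
        lo, hi = min(values), max(values)
        for i in holdout_idx:
            if rows[i][col] == lo or rows[i][col] == hi:
                candidates.add(i)
    cap = int(len(train_idx) * max_share)  # assumed meaning of the hard cap
    moved = sorted(candidates)[:cap]
    new_train = sorted(set(train_idx) | set(moved))
    new_holdout = [i for i in holdout_idx if i not in moved]
    return new_train, new_holdout

rows = [
    {"x": 1, "y": 10}, {"x": 5, "y": 2}, {"x": 3, "y": 5}, {"x": 9, "y": 7},
    {"x": 4, "y": 6}, {"x": 0, "y": 4}, {"x": 2, "y": 8}, {"x": 6, "y": 3},
]
new_train, new_holdout = force_boundary_rows(rows, [1, 2, 3, 4, 6], [0, 5, 7])
print(new_train, new_holdout)  # [0, 1, 2, 3, 4, 5, 6] [7]
```

In the example, row 0 holds the maximum of y and row 5 holds the minimum of x, so both leave the holdout set. The holdout set loses exactly its hardest samples, which is why the resulting metrics can look more optimistic.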

During training

While a run is active:

the chart shows live training progress and keeps updating until the run ends
Stop Training cancels the run
metrics update only after training finishes

Use Stop Training when:

the run was started with clearly wrong settings
the selected data split is invalid for the intended comparison
you need to cancel a long optimizer or batch run

Result panels

Depending on the configuration, the results table can show several panels. Use the arrow navigation to cycle between them.

Standard result panels

Training Results — metrics on the training set
Validation Results — metrics on the validation set
Test Results — metrics on the held-out test set

When multiple outputs exist, results are shown both as averages and per output.

Cross Validation Results

When cross-validation reporting is active, dAIve shows metrics as average ± standard deviation across all folds.

Hyperparameter Tuning (Optuna)

When the optimizer was used, a tuning results panel appears with:

Best Metric and the metric name
CV Std Dev (when cross-validation was active)
Trials Completed
Validation Strategy (e.g. 5 folds (Stratified K-Fold))

When you compare runs, keep these constant if possible:

the same target outputs
the same split strategy
the same validation mode
the same feature set

Exporting models

After successful training, the current trained model can be exported as a .dvm file.

Export a model when:

you want to run prediction without opening the full project
you want to archive a selected trained model separately
you want to move a trained model into a prediction-only workflow

Metrics & Interpretation

The training results are easier to compare when you know which metric fits the current task and validation setup.

Possible result panels

Depending on the selected validation setup, dAIve can show:

Training Results
Validation Results
Test Results
Cross Validation Results
Hyperparameter Tuning (Optuna) (when the optimizer was used)

Use the arrow navigation to cycle between panels when multiple are available. When multiple outputs exist, both average and per-output views are provided.

Classification

Accuracy

share of correct predictions
higher is better
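
As a minimal illustration of the definition (the function name accuracy and the labels are made up for this example):

```python
def accuracy(y_true, y_pred):
    """Share of predictions that exactly match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

print(accuracy(["a", "b", "a", "c"], ["a", "b", "c", "c"]))  # 0.75
```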

Regression

MSE

mean of the squared errors; large errors are penalized more strongly
lower is better

RMSE

square root of MSE, so it is in the same unit as the target
lower is better

MAE

average absolute error
lower is better

MAPE

mean absolute error expressed as a percentage of the true value
lower is better
unstable when true target values are close to zero

R Squared

share of the target variance explained by the model
higher is better
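
The regression metrics above follow standard textbook definitions, sketched below in plain Python. The function name regression_metrics is hypothetical, and dAIve's exact conventions (for example how MAPE handles zero targets) are not documented here, so treat this as an illustration only.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, MAPE (in percent) and R squared."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the true value, so it fails or explodes near zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mae": mae, "mape": mape, "r2": r2}

m = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0])
print(m["mse"], m["mae"], round(m["r2"], 3))  # 0.5 0.5 0.9
```

For the same data, RMSE is about 0.707 (in target units) and MAPE is about 16.7%, which shows why the metrics should not be compared with each other, only across runs.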

Cross-validation results

When CV reporting is active, dAIve shows average ± standard deviation for each metric.

Interpretation:

a good average (high for Accuracy and R Squared, low for error metrics) combined with a low standard deviation is ideal
lower standard deviation is safer when averages are similar
high standard deviation means unstable performance across splits
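
The following sketch shows how an average ± standard deviation summary can separate two candidates with the same average. The fold scores are made up, and the use of the sample standard deviation (statistics.stdev) is an assumption; dAIve may use a different convention.

```python
from statistics import mean, stdev

def summarize_folds(scores):
    """Return (average, sample standard deviation) of per-fold scores."""
    return mean(scores), stdev(scores)

# Hypothetical per-fold R Squared values for two candidate models.
model_a = [0.80, 0.81, 0.79, 0.80, 0.80]
model_b = [0.95, 0.60, 0.90, 0.70, 0.85]

for name, scores in (("A", model_a), ("B", model_b)):
    avg, std = summarize_folds(scores)
    print(f"model {name}: {avg:.2f} ± {std:.2f}")
```

Both models average 0.80, but model B varies far more between folds, so model A is the safer choice under the interpretation rules above.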

Optuna tuning results

When the optimizer was used, the tuning panel shows:

Best Metric — the best value found during the search, with the metric name
CV Std Dev — standard deviation across folds (only when CV was active)
Trials Completed — total number of trials that ran
Validation Strategy — the strategy used (e.g. 5 folds (Stratified K-Fold))

Practical reading order:

check whether results came from training, validation, test, or CV
identify the one metric that matters most for the task
compare candidate runs on the same setup only
check stability before choosing a final model

Credit usage

Training in the web app consumes compute credits.

The effective cost grows with:

number of models in the experiment set (each model in a batch run counts separately)
number of optimizer trials (each trial counts as a run)
number of folds and repeats in robust validation (each fold/repeat counts, plus the final model fit)

See the credit cost reference on the Profile page for exact per-operation costs.
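
To get a feeling for how these factors multiply, the sketch below counts model fits, not credits. It assumes that the factors combine multiplicatively and that every run with robust validation adds one final fit; both are assumptions made for illustration, and the function name estimate_fits is hypothetical. The Profile page remains the reference for actual costs.

```python
def estimate_fits(n_models=1, n_trials=1, n_folds=0, n_repeats=1):
    """Rough count of model fits; n_folds=0 means no robust validation.

    Assumption: models, trials, and folds/repeats multiply.
    """
    if n_folds > 0:
        # Each fold/repeat counts, plus the final model fit.
        fits_per_run = n_folds * n_repeats + 1
    else:
        fits_per_run = 1
    return n_models * n_trials * fits_per_run

print(estimate_fits())                                     # 1
print(estimate_fits(n_models=3))                           # 3
print(estimate_fits(n_trials=20, n_folds=5))               # 120
print(estimate_fits(n_models=2, n_folds=5, n_repeats=3))   # 32
```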
