Model Training
This page describes Step 4 in the desktop app.
Starting training
Go to Model Training and click Start Training.
The page contains:
- a training progress chart (left panel)
- a data split panel (right panel, top)
- a results table (right panel, bottom)
Before starting:
- confirm that Step 1 inputs and outputs are final
- confirm that Step 3 validation settings are intentional
- confirm that the selected model family still fits the task
Data split controls (right panel)
In Automatic Split, dAIve creates train / validation / test files from the training dataset. In Manual Upload, these controls do not modify your uploaded validation/test files.
Random Seed
- controls the deterministic shuffle used for auto split
- same dataset + same split settings + same seed => same split
- changing the seed is useful to check whether results are stable across different random partitions
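The seed behavior can be illustrated with a minimal sketch. This is not dAIve's actual implementation: the function name `auto_split` and the 15% validation/test shares are assumptions chosen for illustration only.

```python
import random

def auto_split(n_rows, seed, val_share=0.15, test_share=0.15):
    """Hypothetical seeded split: returns (train, val, test) row-index lists."""
    indices = list(range(n_rows))
    # A dedicated Random instance makes the shuffle depend only on the seed.
    random.Random(seed).shuffle(indices)
    n_val = round(n_rows * val_share)
    n_test = round(n_rows * test_share)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, val, test

split_a = auto_split(100, seed=42)
split_b = auto_split(100, seed=42)  # same seed: identical partition
split_c = auto_split(100, seed=7)   # different seed: different partition
```

Training once per seed and comparing the resulting metrics is one way to see how sensitive a result is to the particular partition.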
Extrapolation Risk warning
After each auto split, dAIve checks numeric input columns:
- it calculates the min/max range seen in the train split
- it counts values in validation/test that are outside that train range
- if such values exist, a warning appears in the split panel
This warning means the model may need to extrapolate on those samples.
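The check described above can be sketched as follows. This is an illustration under assumptions, not dAIve's code: rows are modeled as plain dictionaries, and `count_out_of_range` is a hypothetical helper name.

```python
def count_out_of_range(train_rows, other_rows, numeric_columns):
    """Count values in other_rows that fall outside the train min/max, per column."""
    counts = {}
    for col in numeric_columns:
        train_values = [row[col] for row in train_rows]
        lo, hi = min(train_values), max(train_values)
        counts[col] = sum(1 for row in other_rows if not lo <= row[col] <= hi)
    return counts

train = [{"temp": 10.0, "load": 1.0}, {"temp": 25.0, "load": 3.0}]
test = [{"temp": 30.0, "load": 2.0}, {"temp": 12.0, "load": 0.5}]

outside = count_out_of_range(train, test, ["temp", "load"])
# A warning would be shown if any column has at least one out-of-range value.
warn = any(n > 0 for n in outside.values())
```

Here `temp = 30.0` lies above the train maximum and `load = 0.5` lies below the train minimum, so both columns contribute one out-of-range value.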
Force boundary coverage (advanced)
- this option is opt-in and off by default
- when enabled, dAIve can move feature min/max boundary rows into train
- this can reduce extrapolation risk, but it can also make validation/test metrics look more optimistic, because the most extreme rows are no longer evaluated
Safety behavior:
- disabled in time-series mode (to avoid leakage)
- boundary candidates are deduplicated by row index
- the number of forced rows is limited by a hard cap of 50% of the planned train rows
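A rough sketch of how boundary selection with these safety rules could work is shown below. The helper `boundary_rows_to_force`, the tie-breaking rule, and the way the cap truncates the candidate list are all assumptions made for illustration; the source does not describe how dAIve resolves these details.

```python
def boundary_rows_to_force(rows, numeric_columns, planned_train_size,
                           time_series=False, cap_share=0.5):
    """Return row indices to move into train (hypothetical helper)."""
    if time_series:
        return []  # option is disabled in time-series mode to avoid leakage
    candidates, seen = [], set()
    for col in numeric_columns:
        values = [row[col] for row in rows]
        # Assumption: on ties, the first row holding the min/max is used.
        for idx in (values.index(min(values)), values.index(max(values))):
            if idx not in seen:  # deduplicate by row index
                seen.add(idx)
                candidates.append(idx)
    # Hard cap: at most cap_share of the planned train rows may be forced.
    max_forced = int(planned_train_size * cap_share)
    return candidates[:max_forced]

rows = [{"x": 5, "y": 1}, {"x": 1, "y": 9}, {"x": 9, "y": 9}, {"x": 3, "y": 0}]
forced = boundary_rows_to_force(rows, ["x", "y"], planned_train_size=10)
capped = boundary_rows_to_force(rows, ["x", "y"], planned_train_size=4)
skipped = boundary_rows_to_force(rows, ["x", "y"], 10, time_series=True)
```

Row 1 holds both the minimum of `x` and a maximum of `y`, so it appears only once in the candidate list. With a planned train size of 4, the cap allows only two forced rows.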
During training
While a run is active:
- the chart shows live training progress and updates continuously
- Stop Training cancels the run
- metrics update after training finishes
Use Stop Training when:
- the run was started with clearly wrong settings
- the selected data split is invalid for the intended comparison
- you need to cancel a long optimizer or batch run
Result panels
Depending on the configuration, the results table can show several panels. Use the arrow navigation to cycle between them.
Standard result panels
- Training Results — metrics on the training set
- Validation Results — metrics on the validation set
- Test Results — metrics on the held-out test set
When multiple outputs exist, results are shown both as averages and per output.
Cross Validation Results
When cross-validation reporting is active, dAIve shows metrics as average ± standard deviation across all folds.
Hyperparameter Tuning (Optuna)
When the optimizer was used, a tuning results panel appears with:
- Best Metric and the metric name
- CV Std Dev (when cross-validation was active)
- Trials Completed
- Validation Strategy (e.g. 5 folds (Stratified K-Fold))
When you compare runs, keep these constant if possible:
- the same target outputs
- the same split strategy
- the same validation mode
- the same feature set
Exporting models
After successful training, the current trained model can be exported as a .dvm file.
Export a model when:
- you want to run prediction without opening the full project
- you want to archive a selected trained model separately
- you want to move a trained model into a prediction-only workflow
Metrics & Interpretation
The training results are easier to compare when you know which metric fits the current task and validation setup.
Possible result panels
Depending on the selected validation setup, dAIve can show:
- Training Results
- Validation Results
- Test Results
- Cross Validation Results
- Hyperparameter Tuning (Optuna) (when the optimizer was used)
Use the arrow navigation to cycle between panels when multiple are available. When multiple outputs exist, both average and per-output views are provided.
Classification
Accuracy
- share of correct predictions
- higher is better
Regression
MSE
- mean of the squared errors; large errors are penalized more strongly
- lower is better
RMSE
- square root of MSE, expressed in the same unit as the target
- lower is better
MAE
- average absolute error
- lower is better
MAPE
- mean absolute error as a percentage of the true value
- lower is better
- unstable when true values are at or near zero
R Squared
- share of the target variance explained by the model
- higher is better
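For reference, the regression metrics above follow the standard textbook definitions, sketched here in plain Python. The function name `regression_metrics` is illustrative, and whether dAIve reports MAPE as a percentage or as a fraction is an assumption.

```python
import math

def regression_metrics(y_true, y_pred):
    """Standard definitions of MSE, RMSE, MAE, MAPE (in percent) and R squared."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Dividing by the true value is what makes MAPE unstable near zero
    # (and undefined when a true value is exactly zero).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "MAPE": mape,
        "R2": 1.0 - ss_res / ss_tot,
    }

metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0])
# Same absolute error of 0.1, but on a true value of 0.01 it dominates MAPE.
near_zero = regression_metrics([0.01, 4.0], [0.11, 4.0])
```

In the second call, an absolute error of only 0.1 on a true value of 0.01 pushes MAPE to roughly 500%, while MAE stays at 0.05.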
Cross-validation results
When CV reporting is active, dAIve shows average ± standard deviation for each metric.
Interpretation:
- a good average (high for Accuracy and R Squared, low for error metrics) combined with a low standard deviation is ideal
- lower standard deviation is safer when averages are similar
- high standard deviation means unstable performance across folds
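The following sketch shows why the standard deviation matters when averages are similar. The fold scores are invented for illustration, and the use of the population standard deviation is an assumption; the source does not state which variant dAIve reports.

```python
import statistics

def summarize_folds(fold_scores):
    """Return (mean, standard deviation) of one metric across CV folds."""
    mean = statistics.mean(fold_scores)
    std = statistics.pstdev(fold_scores)  # assumption: population std dev
    return mean, std

# Two hypothetical runs with the same average accuracy over 5 folds.
stable = summarize_folds([0.90, 0.91, 0.89, 0.90, 0.90])
unstable = summarize_folds([0.99, 0.75, 0.95, 0.82, 0.99])

for name, (mean, std) in (("stable", stable), ("unstable", unstable)):
    print(f"{name}: {mean:.3f} ± {std:.3f}")
```

Both runs average 0.90, but the second one varies far more between folds, so the first is the safer choice.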
Optuna tuning results
When the optimizer was used, the tuning panel shows:
- Best Metric — the best value found during the search, with the metric name
- CV Std Dev — standard deviation across folds (only when CV was active)
- Trials Completed — total number of trials that ran
- Validation Strategy — the strategy used (e.g. 5 folds (Stratified K-Fold))
Practical reading order:
- check whether results came from training, validation, test, or CV
- identify the one metric that matters most for the task
- compare candidate runs on the same setup only
- check stability before choosing a final model
