Model Training

This page describes Step 4 in the dAIve web app.

Starting training

Go to Model Training and click Start Training.

The page contains:

  • a training progress chart (left panel)
  • a data split panel (right panel, top)
  • a results table (right panel, bottom)

Before starting:

  • confirm that Step 1 inputs and outputs are final
  • confirm that Step 3 validation settings are intentional
  • confirm that the selected model family still fits the task

Data split controls (right panel)

In Automatic Split, dAIve creates train / validation / test files from the training dataset. In Manual Upload, these controls do not modify your uploaded validation/test files.

Random Seed

  • controls the deterministic shuffle used for auto split
  • same dataset + same split settings + same seed => same split
  • changing the seed is useful to check whether results are stable across different random partitions
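
The seed behavior can be illustrated with a minimal sketch. This is not dAIve's implementation: the function name `seeded_split`, the split shares, and the use of Python's `random` module are assumptions chosen for illustration only.

```python
import random

def seeded_split(n_rows, seed, val_share=0.15, test_share=0.15):
    """Shuffle row indices with a seeded RNG and cut them into three parts.

    Hypothetical helper; the shares are illustrative defaults.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # deterministic for a given seed
    n_test = round(n_rows * test_share)
    n_val = round(n_rows * val_share)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

split_a = seeded_split(100, seed=42)
split_b = seeded_split(100, seed=42)  # same settings, same seed
split_c = seeded_split(100, seed=7)   # same settings, different seed

print(split_a == split_b)  # True: the split is reproducible
print(split_a == split_c)  # False: a new seed gives a new partition
```

Training once per seed and comparing the resulting metrics is one way to judge whether a result depends on a lucky partition.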

Extrapolation Risk warning

After each auto split, dAIve checks numeric input columns:

  • it calculates the min/max range seen in the train split
  • it counts values in validation/test that are outside that train range
  • if such values exist, a warning appears in the split panel

This warning means the model may need to extrapolate on those samples, that is, predict for input values outside the range it saw during training.
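
The check described above can be sketched as follows. The function `count_out_of_range`, the row format (one dict per row), and the column names are assumptions for illustration; the actual dAIve check may differ in detail.

```python
def count_out_of_range(train_rows, other_rows, numeric_columns):
    """Count values in other_rows that fall outside the train min/max.

    Returns a dict with one entry per column that has at least one
    out-of-range value. Hypothetical helper, not a dAIve API.
    """
    counts = {}
    for col in numeric_columns:
        train_values = [row[col] for row in train_rows]
        lo, hi = min(train_values), max(train_values)
        outside = sum(1 for row in other_rows if not lo <= row[col] <= hi)
        if outside:
            counts[col] = outside
    return counts

train = [{"temp": 10.0, "load": 1.0}, {"temp": 30.0, "load": 5.0}]
test = [
    {"temp": 35.0, "load": 2.0},  # temp above the train maximum
    {"temp": 20.0, "load": 3.0},  # inside the train range
    {"temp": 5.0, "load": 4.0},   # temp below the train minimum
]

warnings = count_out_of_range(train, test, ["temp", "load"])
print(warnings)  # {'temp': 2}
```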

Force boundary coverage (advanced)

  • this option is strictly opt-in and is off by default
  • when enabled, dAIve can move feature min/max boundary rows into train
  • this can reduce extrapolation risk, but it can also make validation/test metrics look more optimistic, because the most extreme rows no longer appear in those splits

Safety behavior:

  • disabled in time-series mode (to avoid leakage)
  • boundary candidates are deduplicated by row index
  • only a limited share of train rows can be forced (hard cap: up to 50% of planned train rows)
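
A rough sketch of how boundary rows could be selected, deduplicated, and capped is shown below. The helper names, the choice of the first row at each minimum/maximum, and the rule for which candidates are dropped once the cap is reached are all assumptions; only the deduplication by row index and the 50% cap come from the description above.

```python
def boundary_rows(rows, numeric_columns):
    """Return the deduplicated indices of rows holding a column min or max."""
    picked = set()  # a set deduplicates candidates by row index
    for col in numeric_columns:
        values = [row[col] for row in rows]
        picked.add(values.index(min(values)))  # first row at the minimum
        picked.add(values.index(max(values)))  # first row at the maximum
    return sorted(picked)

def forced_train_rows(rows, numeric_columns, planned_train_size, cap_share=0.5):
    """Keep at most cap_share of the planned train size as forced rows.

    Assumption: surplus candidates are simply dropped in index order.
    """
    candidates = boundary_rows(rows, numeric_columns)
    max_forced = int(planned_train_size * cap_share)
    return candidates[:max_forced]

rows = [
    {"x": 1, "y": 9},
    {"x": 5, "y": 2},  # max of x and min of y: one candidate, not two
    {"x": 3, "y": 9},
    {"x": 0, "y": 4},
]

print(boundary_rows(rows, ["x", "y"]))                            # [0, 1, 3]
print(forced_train_rows(rows, ["x", "y"], planned_train_size=4))  # [0, 1]
```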

During training

While a run is active:

  • the chart shows live training progress and keeps updating until the run ends
  • Stop Training cancels the run
  • metrics update after training finishes

Use Stop Training when:

  • the run was started with clearly wrong settings
  • the selected data split is invalid for the intended comparison
  • you need to cancel a long optimizer or batch run

Result panels

Depending on the configuration, the results table can show several panels. Use the arrow navigation to cycle between them.

Standard result panels

  • Training Results — metrics on the training set
  • Validation Results — metrics on the validation set
  • Test Results — metrics on the held-out test set

When multiple outputs exist, results are shown both as averages and per output.

Cross Validation Results

When cross-validation reporting is active, dAIve shows metrics as average ± standard deviation across all folds.

Hyperparameter Tuning (Optuna)

When the optimizer was used, a tuning results panel appears with:

  • Best Metric and the metric name
  • CV Std Dev (when cross-validation was active)
  • Trials Completed
  • Validation Strategy (e.g. 5 folds (Stratified K-Fold))

When you compare runs, keep these constant if possible:

  • the same target outputs
  • the same split strategy
  • the same validation mode
  • the same feature set

Exporting models

After successful training, the current trained model can be exported as a .dvm file.

Export a model when:

  • you want to run prediction without opening the full project
  • you want to archive a selected trained model separately
  • you want to move a trained model into a prediction-only workflow

Metrics & Interpretation

The training results are easier to compare when you know which metric fits the current task and validation setup.

Possible result panels

Depending on the selected validation setup, dAIve can show:

  • Training Results
  • Validation Results
  • Test Results
  • Cross Validation Results
  • Hyperparameter Tuning (Optuna) (when the optimizer was used)

Use the arrow navigation to cycle between panels when multiple are available. When multiple outputs exist, both average and per-output views are provided.

Classification

Accuracy

  • share of correct predictions
  • higher is better

Regression

MSE

  • mean of the squared prediction errors, in squared target units
  • lower is better

RMSE

  • square root of MSE, in the same unit as the target
  • lower is better

MAE

  • average absolute error
  • lower is better

MAPE

  • mean absolute error expressed as a percentage of the true value
  • lower is better
  • unstable when true values are at or near zero

R Squared

  • share of the target variance explained by the model
  • higher is better
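
For reference, the regression metrics above follow their standard textbook definitions, sketched here in plain Python. The function name `regression_metrics` and the sample values are illustrative; dAIve's internal computation is not documented on this page and may handle edge cases differently.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, MAPE (in percent) and R Squared."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    squared_error_sum = sum(e * e for e in errors)
    mse = squared_error_sum / n
    mean_true = sum(y_true) / n
    total_variation = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        # Divides by the true value: fails at zero, explodes near zero.
        "MAPE": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R Squared": 1 - squared_error_sum / total_variation,
    }

y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [3.0, 4.0, 5.0, 8.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this example the error of 1.0 on the true value 2.0 contributes far more to MAPE than the same error on 6.0, which is the instability near zero noted above.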

Cross-validation results

When CV reporting is active, dAIve shows average ± standard deviation for each metric.

Interpretation:

  • a good average (high for Accuracy and R Squared, low for the error metrics) with a low standard deviation is ideal
  • lower standard deviation is safer when averages are similar
  • high standard deviation means unstable performance across splits
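
The following sketch shows why the standard deviation matters when averages are similar. The fold scores are made-up illustrative values, and the use of the population standard deviation (`pstdev`) is an assumption; this page does not state which variant dAIve reports.

```python
from statistics import mean, pstdev

def summarize_folds(scores):
    """Return (average, standard deviation) of per-fold scores."""
    return mean(scores), pstdev(scores)

# Hypothetical per-fold accuracy values for two candidate runs.
stable = [0.90, 0.91, 0.89, 0.90, 0.90]
unstable = [0.99, 0.80, 0.95, 0.78, 0.98]

for name, scores in [("stable", stable), ("unstable", unstable)]:
    avg, std = summarize_folds(scores)
    print(f"{name}: {avg:.3f} ± {std:.3f}")
```

Both runs have the same average, but the second one varies far more between folds, so the first is the safer choice.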

Optuna tuning results

When the optimizer was used, the tuning panel shows:

  • Best Metric — the best value found during the search, with the metric name
  • CV Std Dev — standard deviation across folds (only when CV was active)
  • Trials Completed — total number of trials that ran
  • Validation Strategy — the strategy used (e.g. 5 folds (Stratified K-Fold))

Practical reading order:

  1. check whether results came from training, validation, test, or CV
  2. identify the one metric that matters most for the task
  3. compare candidate runs on the same setup only
  4. check stability before choosing a final model

Credit usage

Training in the web app consumes compute credits.

The effective cost grows with:

  • number of models in a batch run (each model in the experiment set counts separately)
  • number of optimizer trials (each trial counts as a run)
  • number of folds and repeats in robust validation (each fold/repeat counts, plus the final model fit)
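
As a rough aid for the last point, the number of model fits for one robustly validated model can be counted as below. The helper `robust_validation_fits` is hypothetical and counts fits, not credits; how this count combines with batch runs and optimizer trials, and the credit price per fit, are not specified here.

```python
def robust_validation_fits(folds, repeats=1):
    """Count fits for one model: every fold of every repeat, plus the final fit."""
    return folds * repeats + 1

print(robust_validation_fits(folds=5))             # 6
print(robust_validation_fits(folds=5, repeats=3))  # 16
```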

See the credit cost reference on the Profile page for exact per-operation costs.

dAIve customer documentation for web app and desktop app