
Training Data Assessment

This page describes Step 5 in the desktop app.

Overview

This step provides two side-by-side analysis panels that help you assess the training data and model after a successful training run. Both analyses retrain the model under controlled conditions to produce diagnostic results.

Data Size Analysis

The left panel answers the question: do I need more data?

dAIve retrains the model on progressively smaller subsets of the training data and shows how performance changes with data volume.

Interpretation:

  • rising curve: performance is still improving at the largest data size, so more data could help
  • plateau: performance has leveled off, so more data is unlikely to help much
  • early sharp drop in error: the model learns basic patterns from relatively little data

Use this analysis when:

  • you are unsure whether the model is data-limited
  • you need to decide between collecting more data and redesigning the model
  • stakeholders ask whether more samples are worth the effort
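
The idea behind the data size analysis can be illustrated with a minimal Python sketch. It is not dAIve's implementation: the 1-nearest-neighbour model, the synthetic noise-free data, the function names, and the 5% plateau tolerance are all assumptions chosen to keep the example small and self-contained.

```python
import random

def one_nn_predict(train, x):
    # 1-nearest-neighbour stand-in for the real model (assumption).
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def mean_abs_error(train, val):
    # Mean absolute error on the validation set after "fitting" on train.
    return sum(abs(one_nn_predict(train, x) - y) for x, y in val) / len(val)

def data_size_curve(train, val, sizes=(400, 200, 100, 40)):
    """Refit on progressively smaller subsets; return (n_samples, error) sorted by size."""
    curve = []
    for n in sizes:
        subset = train[:n]  # nested subsets: each one is a prefix of the data
        curve.append((n, mean_abs_error(subset, val)))
    return sorted(curve)

def label_curve(curve, tolerance=0.05):
    """Hypothetical rule: 'plateau' if the last step cut the error by less than `tolerance`."""
    (_, previous), (_, last) = curve[-2], curve[-1]
    if previous == 0:
        return "plateau"
    relative_gain = (previous - last) / previous
    return "plateau" if relative_gain < tolerance else "still improving"

# Synthetic, noise-free data: y = 3x with x drawn uniformly from [0, 1).
rng = random.Random(0)
points = [(x, 3.0 * x) for x in (rng.random() for _ in range(500))]
train, val = points[:400], points[400:]

curve = data_size_curve(train, val)
for n, err in curve:
    print(n, round(err, 4))
print(label_curve(curve))
```

The sketch tracks an error metric, so improvement shows up as a falling curve. With a score metric the same reasoning applies with the direction reversed.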

Input Dropout Analysis

The right panel removes one feature at a time and measures the impact on performance.

Use it to:

  • find important features
  • remove weak features
  • explain model behavior
  • find noisy or problematic columns

Good follow-up actions:

  • remove clearly unhelpful inputs
  • return to Step 1 to simplify the feature set
  • retrain after any important feature selection change
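
The one-feature-at-a-time procedure described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not dAIve's implementation: a small k-nearest-neighbours regressor stands in for the real model, the data is synthetic, and the function and feature names are hypothetical.

```python
import random

def knn_predict(train_x, train_y, query, k=5):
    # Rank training rows by squared Euclidean distance, then average the k nearest targets.
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return sum(train_y[i] for i in order[:k]) / k

def mse(train_x, train_y, val_x, val_y):
    # Mean squared error on the validation rows.
    errors = [(knn_predict(train_x, train_y, q) - y) ** 2 for q, y in zip(val_x, val_y)]
    return sum(errors) / len(errors)

def drop_column(rows, j):
    # Return a copy of the rows without column j.
    return [row[:j] + row[j + 1:] for row in rows]

def input_dropout(train_x, train_y, val_x, val_y, names):
    """Refit without each feature in turn; report the change in error against the baseline."""
    baseline = mse(train_x, train_y, val_x, val_y)
    impact = {}
    for j, name in enumerate(names):
        reduced = mse(drop_column(train_x, j), train_y, drop_column(val_x, j), val_y)
        impact[name] = reduced - baseline  # positive: the model is worse without it
    return impact

# Synthetic data: the target depends strongly on column 0, weakly on column 1,
# and not at all on column 2.
rng = random.Random(0)
names = ["strong", "weak", "noise"]
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(250)]
targets = [10.0 * r[0] + 1.0 * r[1] for r in rows]
train_x, val_x = rows[:200], rows[200:]
train_y, val_y = targets[:200], targets[200:]

impact = input_dropout(train_x, train_y, val_x, val_y, names)
for name, delta in sorted(impact.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {delta:+.3f}")
```

A large positive change marks an important feature. A change near zero or below zero marks a candidate for removal.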

Availability

This step is available only after Step 4 has completed successfully.

Retraining invalidates the assessment so that the analysis always matches the current model.

Compute credits

Both analyses consume compute credits. The credit cost depends on the model type:

  • NN models (FNN/RNN): 2.0 credits per analysis run
  • RF/XGB models: 1.0 credits per analysis run

The credit cost is multiplied by the number of runs. In batch mode, each trained model counts as a separate run. For optimizer runs, each trial counts as a run. For cross-validation, each fold or repeat counts as a run.
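
As a rough way to estimate the cost of one analysis before starting it, the rule above can be written as a small Python helper. The rates come from the list above. The function name and the lookup table are hypothetical, and the caller supplies the total run count, because how batch, optimizer, and cross-validation counts combine is not specified here.

```python
# Credits per analysis run, taken from the rates listed above.
CREDITS_PER_RUN = {"FNN": 2.0, "RNN": 2.0, "RF": 1.0, "XGB": 1.0}

def analysis_credits(model_type: str, run_count: int = 1) -> float:
    """Estimate the credits for one analysis: per-run rate times the number of runs."""
    if run_count < 1:
        raise ValueError("run_count must be at least 1")
    return CREDITS_PER_RUN[model_type.upper()] * run_count

# Example: a neural network (FNN) trained with 5 cross-validation folds.
print(analysis_credits("FNN", 5))  # 10.0
```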

dAIve customer documentation for web app and desktop app