Training Data Import

This page describes Step 1 in the desktop app.

Desktop-specific notes:

  • the uploaded CSV is copied into the local project workspace
  • the file is stored inside the .dvp archive when the project is saved
  • replacing the training file resets all project state except the project path and name

Uploading the training file

Use the Choose .csv file drop zone to load the training dataset. You can either click the zone to browse for a file or drag and drop a CSV directly onto it.

The file should:

  • be a CSV file
  • use the first row as column names
  • contain one record per row

After upload, dAIve auto-detects:

  • file separator: tab, semicolon (;), comma (,), or pipe (|)
  • decimal separator: point (.) or comma (,)

If the preview is wrong, adjust the separator controls manually using the buttons under Choose a file separator and Choose a decimal separator.
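
To make the auto-detection less of a black box, here is a minimal Python sketch of one way such detection can work. It is not dAIve's actual algorithm; the function names and the heuristics (a consistent field count for the file separator, a count of number-like cells for the decimal separator) are assumptions chosen for illustration.

```python
import csv
import re

# The four file separators listed above.
CANDIDATE_DELIMITERS = ["\t", ";", ",", "|"]


def detect_file_separator(sample: str) -> str:
    """Hypothetical heuristic: prefer the candidate that splits every
    sampled line into the same number of fields (and more than one)."""
    lines = [ln for ln in sample.splitlines() if ln.strip()]
    best, best_fields = ",", 1
    for delim in CANDIDATE_DELIMITERS:
        counts = {len(row) for row in csv.reader(lines, delimiter=delim)}
        if len(counts) == 1:
            n = counts.pop()
            if n > best_fields:
                best, best_fields = delim, n
    return best


def detect_decimal_separator(sample: str, delimiter: str) -> str:
    """Hypothetical heuristic: count cells that look like '1,5' versus
    '1.5' in the data rows and return the more frequent style."""
    lines = [ln for ln in sample.splitlines() if ln.strip()]
    rows = list(csv.reader(lines, delimiter=delimiter))
    comma = dot = 0
    for row in rows[1:]:  # skip the header row
        for cell in row:
            cell = cell.strip()
            if re.fullmatch(r"-?\d+,\d+", cell):
                comma += 1
            elif re.fullmatch(r"-?\d+\.\d+", cell):
                dot += 1
    return "," if comma > dot else "."


sample = "temp;pressure\n1,5;2,25\n3,0;4,75\n"
sep = detect_file_separator(sample)
dec = detect_decimal_separator(sample, sep)
print(repr(sep), repr(dec))  # prints ';' ','
```

Heuristics like these can fail on small or unusual files, which is why the manual separator controls exist.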

What this step defines:

  • which file is the training source for the project
  • which columns are model inputs
  • which columns are targets
  • whether the project should be treated as a time-series task

Show Header

Use Show Header to load the column table. If the table is already shown, the button reads Update Header instead.

The button is disabled until a training file has been loaded.

The main table is labeled Choose Inputs and Outputs for the Prediction.

Inputs and outputs

Select:

  • Inputs (one checkbox per column, plus a "select all" header checkbox)
  • Outputs (one checkbox per column, plus a "select all" header checkbox)

Each column header shows a running count, for example Inputs (3) and Outputs (1).

To continue:

  • at least one input must be selected
  • at least one output must be selected

Practical recommendation:

  • start with the smallest sensible set of inputs
  • avoid clearly redundant or administrative columns
  • keep output selection stable once you move into training

Time Series

The Time Series toggle sits at the top of the column table.

Enabling time series changes the setup:

  • in normal mode, a column cannot be both input and output
  • in time-series mode, overlap is allowed (a column can be both input and output)
  • Step 3 switches to the Recurrent Neural Network path

If you disable time series while some columns are selected as both input and output, the overlap is removed: those columns are kept as outputs only.
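
The selection rules in this section and the previous one can be summarized in a short Python sketch. The class and method names are hypothetical and do not reflect dAIve's internal code; only the rules themselves come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ColumnSelection:
    """Hypothetical model of the Step 1 column table."""
    inputs: Set[str] = field(default_factory=set)
    outputs: Set[str] = field(default_factory=set)
    time_series: bool = False

    def problems(self) -> List[str]:
        """Return the reasons why the selection cannot continue."""
        issues = []
        if not self.inputs:
            issues.append("select at least one input")
        if not self.outputs:
            issues.append("select at least one output")
        overlap = self.inputs & self.outputs
        if overlap and not self.time_series:
            # In normal mode a column cannot be both input and output.
            issues.append(
                "columns used as both input and output: "
                + ", ".join(sorted(overlap))
            )
        return issues

    def disable_time_series(self) -> None:
        """Resolve overlap by keeping overlapping columns as outputs only."""
        self.inputs -= self.outputs
        self.time_series = False


sel = ColumnSelection(inputs={"load", "temp"}, outputs={"load"}, time_series=True)
print(sel.problems())        # prints []
sel.disable_time_series()
print(sorted(sel.inputs))    # prints ['temp']
print(sorted(sel.outputs))   # prints ['load']
```

Note that under this model, removing the overlap can leave the input set empty, in which case at least one input has to be selected again before continuing.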

Use time-series mode when:

  • row order carries meaning
  • past values are part of the prediction context
  • the model should learn temporal structure instead of plain tabular relationships

Common issues

  • wrong separator selected
  • decimal comma vs decimal point mismatch
  • missing header row
  • duplicate column names
  • mixed data types in a column
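
Several of these issues can be caught before upload with a small script. The following Python sketch is one possible pre-check, not part of dAIve; the function name, the warning texts, and the heuristics (for example, treating an all-numeric first row as a missing header) are assumptions.

```python
import csv
import io
from collections import Counter
from typing import List


def _is_number(text: str, decimal: str = ".") -> bool:
    """True if the cell parses as a number with the given decimal separator."""
    try:
        float(text.strip().replace(decimal, "."))
        return True
    except ValueError:
        return False


def precheck(text: str, delimiter: str = ",", decimal: str = ".") -> List[str]:
    """Return warnings for common CSV problems (hypothetical heuristics)."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        return ["file is empty"]
    header, records = rows[0], rows[1:]
    warnings = []
    if len(header) == 1:
        warnings.append("only one column found: check the file separator")
    if all(_is_number(name, decimal) for name in header):
        warnings.append("first row looks numeric: header row may be missing")
    dupes = [name for name, n in Counter(header).items() if n > 1]
    if dupes:
        warnings.append("duplicate column names: " + ", ".join(dupes))
    for idx, name in enumerate(header):
        # Ignore empty cells; compare numeric cells against all filled cells.
        cells = [r[idx] for r in records if idx < len(r) and r[idx].strip()]
        numeric = sum(_is_number(c, decimal) for c in cells)
        if 0 < numeric < len(cells):
            warnings.append(f"column '{name}' mixes numeric and non-numeric values")
    return warnings


for warning in precheck("id,temp,temp\n1,20.5,ok\n2,n/a,ok\n"):
    print(warning)
```

For the sample above, the script reports the duplicate column name temp and the mixed values in the first temp column.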

Changing the uploaded file or the input/output selection intentionally invalidates all later steps.

That invalidation is important because:

  • Step 2 depends on the selected columns
  • Step 3 depends on the chosen task structure
  • trained models become outdated as soon as the source columns change

Steps with stale data display a yellow warning badge in the stepper.
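
The invalidation behavior can be pictured as a small state model. The Python sketch below is illustrative only: the class, the state names, and the step count of three are assumptions, not a description of how dAIve implements its stepper.

```python
from enum import Enum
from typing import Dict, List


class StepState(Enum):
    PENDING = "pending"
    DONE = "done"
    STALE = "stale"  # would be shown with a warning badge


class Stepper:
    """Hypothetical stepper that marks later steps stale after a change."""

    def __init__(self, step_count: int) -> None:
        self.states: Dict[int, StepState] = {
            n: StepState.PENDING for n in range(1, step_count + 1)
        }

    def complete(self, step: int) -> None:
        self.states[step] = StepState.DONE

    def change(self, step: int) -> None:
        """Record a change in `step`: every completed later step becomes stale."""
        for n, state in self.states.items():
            if n > step and state is StepState.DONE:
                self.states[n] = StepState.STALE

    def stale_steps(self) -> List[int]:
        return [n for n, s in sorted(self.states.items()) if s is StepState.STALE]


stepper = Stepper(step_count=3)  # step count assumed for the example
stepper.complete(1)
stepper.complete(2)
stepper.change(1)                # e.g. a different output column is selected
print(stepper.stale_steps())     # prints [2]
```

In this model, a step that was never completed stays pending rather than stale, since it holds no results that could be outdated.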

dAIve customer documentation for web app and desktop app