Training Data Import

This page describes Step 1 in the dAIve web app.

Web-specific note:

the uploaded CSV is loaded into the browser workspace; it is stored in the cloud project when you save

Uploading the training file

Use the Choose .csv file drop zone to load the training dataset. You can either click the zone to browse for a file or drag and drop a CSV directly onto it.

The file should:

be a CSV file
use the first row as column names
contain one record per row

After upload, dAIve auto-detects:

file separator: Tab, semicolon (;), comma (,), or pipe (|)
decimal separator: point (.) or comma (,)

If the preview is wrong, adjust the separator controls manually using the buttons under Choose a file separator and Choose a decimal separator.
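To make the auto-detection more concrete, here is a minimal Python sketch of one way such a guess could work. It is not dAIve's actual implementation: the function names, the 20-line sample size, and the "most columns with a consistent width" heuristic are all assumptions made for illustration.

```python
import csv
import re

# The four file separators listed above.
CANDIDATE_SEPARATORS = ["\t", ";", ",", "|"]

COMMA_DECIMAL = re.compile(r"^-?\d+,\d+$")
POINT_DECIMAL = re.compile(r"^-?\d+\.\d+$")


def _sample(text, sample_lines):
    # Keep only the first non-empty lines.
    return [ln for ln in text.splitlines() if ln.strip()][:sample_lines]


def guess_file_separator(text, sample_lines=20):
    """Hypothetical heuristic: choose the separator that yields the most
    columns while giving every sampled line the same column count."""
    lines = _sample(text, sample_lines)
    best, best_cols = ",", 1  # fall back to comma if nothing splits cleanly
    for sep in CANDIDATE_SEPARATORS:
        widths = {len(row) for row in csv.reader(lines, delimiter=sep)}
        if len(widths) == 1:
            cols = widths.pop()
            if cols > best_cols:
                best, best_cols = sep, cols
    return best


def guess_decimal_separator(text, file_separator, sample_lines=20):
    """Hypothetical heuristic: count cells that look like 1,5 versus 1.5."""
    if file_separator == ",":
        # Simplification: a comma-separated file is assumed to use points.
        return "."
    commas = points = 0
    for row in csv.reader(_sample(text, sample_lines), delimiter=file_separator):
        for cell in row:
            cell = cell.strip()
            if COMMA_DECIMAL.match(cell):
                commas += 1
            elif POINT_DECIMAL.match(cell):
                points += 1
    return "," if commas > points else "."
```

A heuristic like this can be fooled by short files or by text columns that contain separator characters, which is why the manual separator controls matter.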

What this step defines:

which file is the training source for the project
which columns are model inputs
which columns are targets
whether the project should be treated as a time-series task

Show Header

Use Show Header to load the column table. If the table is already shown, the button reads Update Header instead.

The button is disabled until a training file has been loaded.

The main table is labeled Choose Inputs and Outputs for the Prediction.

Inputs and outputs

Select:

Inputs (one checkbox per column, plus a "select all" header checkbox)
Outputs (one checkbox per column, plus a "select all" header checkbox)

The header shows a running count, for example Inputs (3) and Outputs (1).

To continue:

at least one input must be selected
at least one output must be selected
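
The running count and the two conditions above can be expressed in a few lines. This is only an illustrative sketch in Python; `header_labels` and `can_continue` are hypothetical names, not part of dAIve.

```python
def header_labels(inputs, outputs):
    """Build the running-count labels, e.g. 'Inputs (3)' and 'Outputs (1)'."""
    return f"Inputs ({len(inputs)})", f"Outputs ({len(outputs)})"


def can_continue(inputs, outputs):
    """Continuing requires at least one input and at least one output."""
    return len(inputs) >= 1 and len(outputs) >= 1
```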

Practical recommendations:

start with the smallest sensible set of inputs
avoid clearly redundant or administrative columns
keep output selection stable once you move into training

Time Series

The Time Series toggle sits at the top of the column table.

Enabling time series changes the setup:

in normal mode, a column cannot be both input and output
in time-series mode, overlap is allowed (a column can be both input and output)
Step 3 switches to the Recurrent Neural Network path

Disabling time series while columns overlap removes the overlap by keeping those columns as outputs only.
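
The overlap rules can be summarized as a small sketch. The Python below is an illustration under assumptions, not the app's code: selections are modeled as sets of column names, and all function names are hypothetical.

```python
def overlap(inputs, outputs):
    """Columns selected as both input and output."""
    return set(inputs) & set(outputs)


def selection_is_allowed(inputs, outputs, time_series):
    """Overlap is only permitted when time-series mode is enabled."""
    return time_series or not overlap(inputs, outputs)


def disable_time_series(inputs, outputs):
    """Resolve overlap by keeping the shared columns as outputs only."""
    shared = overlap(inputs, outputs)
    return set(inputs) - shared, set(outputs)
```

For example, with inputs {temp, load} and outputs {load}, disabling time series would leave inputs {temp} and outputs {load}. If every input was also an output, the input set ends up empty, and the rule that at least one input must be selected then applies again.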

Use time-series mode when:

row order carries meaning
past values are part of the prediction context
the model should learn temporal structure instead of plain tabular relationships

Common issues

wrong separator selected
decimal comma vs decimal point mismatch
missing header row
duplicate column names
mixed data types in a column
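
Several of these issues can be checked before upload. The following Python sketch shows one possible pre-check under assumptions: the file has already been split into a header list and row lists, and the function names and heuristics are hypothetical rather than anything dAIve exposes.

```python
from collections import Counter


def duplicate_columns(header):
    """Return column names that appear more than once in the header row."""
    counts = Counter(name.strip() for name in header)
    return sorted(name for name, n in counts.items() if n > 1)


def looks_numeric(cell, decimal="."):
    """True if the cell parses as a number with the given decimal separator."""
    try:
        float(cell.strip().replace(decimal, "."))
        return True
    except ValueError:
        return False


def mixed_type_columns(header, rows, decimal="."):
    """Return columns where some non-empty cells are numeric and some are not."""
    mixed = []
    for i, name in enumerate(header):
        cells = [r[i] for r in rows if i < len(r) and r[i].strip()]
        numeric = sum(looks_numeric(c, decimal) for c in cells)
        if 0 < numeric < len(cells):
            mixed.append(name)
    return mixed


def header_looks_like_data(header, decimal="."):
    """Heuristic for a missing header row: every 'name' is itself a number."""
    return bool(header) and all(looks_numeric(h, decimal) for h in header)
```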

Changing the uploaded file or the input/output selection intentionally invalidates later steps.

That invalidation is important because:

Step 2 depends on the selected columns
Step 3 depends on the chosen task structure
trained models become outdated as soon as the source columns change

Steps with stale data display a yellow warning badge in the stepper.
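
One way to picture this invalidation is a fingerprint of the Step 1 settings that later steps remember. The Python sketch below is purely illustrative and assumes a design the documentation does not confirm: the fingerprint fields, the use of the file name instead of the file content, and both function names are hypothetical.

```python
import hashlib
import json


def step1_fingerprint(file_name, inputs, outputs, time_series):
    """Hash the Step 1 choices so that any change produces a new value."""
    payload = json.dumps(
        {
            "file": file_name,
            "inputs": sorted(inputs),    # sorted so selection order is irrelevant
            "outputs": sorted(outputs),
            "time_series": time_series,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_stale(saved_fingerprint, current_fingerprint):
    """A later step is stale if it was built from different Step 1 choices."""
    return saved_fingerprint != current_fingerprint
```

Under this model, a step that stored the fingerprint when it was completed would show the warning badge as soon as the current fingerprint no longer matches.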
