Training Data Import
This page describes Step 1 in the dAIve web app.
Web-specific note:
- the uploaded CSV is loaded into the browser workspace and stored in the cloud project when you save
Uploading the training file
Use the Choose .csv file drop zone to load the training dataset. You can either click the zone to browse for a file or drag and drop a CSV directly onto it.
The file should:
- be a CSV file
- use the first row as column names
- contain one record per row
After upload, dAIve auto-detects:
- file separator: Tab, semicolon (;), comma (,), or pipe (|)
- decimal separator: point (.) or comma (,)
If the preview is wrong, adjust the separator controls manually using the buttons under Choose a file separator and Choose a decimal separator.
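To check a file before uploading it, the detection can be approximated with a short Python sketch. This is not dAIve's actual detection logic: the function names and both heuristics are assumptions made for illustration, and quoted fields are ignored for brevity.

```python
import re

# Candidate separators taken from the list above: Tab, ;, , and |
FIELD_SEPARATORS = ["\t", ";", ",", "|"]


def guess_field_separator(lines):
    """Pick the candidate that splits every line into the same number of fields.

    Hypothetical heuristic: among candidates that give a consistent field
    count on all lines, prefer the one producing the most fields.
    Falls back to a comma when nothing splits the lines consistently.
    """
    best, best_fields = ",", 1
    for sep in FIELD_SEPARATORS:
        counts = {line.count(sep) + 1 for line in lines}
        if len(counts) == 1:
            fields = counts.pop()
            if fields > best_fields:
                best, best_fields = sep, fields
    return best


def guess_decimal_separator(lines, field_sep):
    """Count data cells that look like '1,5' versus '1.5' (header row skipped)."""
    if field_sep == ",":
        # A comma cannot separate both fields and decimals in this sketch.
        return "."
    comma = dot = 0
    for line in lines[1:]:
        for cell in line.split(field_sep):
            cell = cell.strip()
            if re.fullmatch(r"-?\d+,\d+", cell):
                comma += 1
            elif re.fullmatch(r"-?\d+\.\d+", cell):
                dot += 1
    return "," if comma > dot else "."


sample = "temp;pressure\n21,5;1,01\n22,0;0,99".splitlines()
field_sep = guess_field_separator(sample)
decimal_sep = guess_decimal_separator(sample, field_sep)
print(repr(field_sep), repr(decimal_sep))
```

If a sketch like this and the preview in dAIve disagree, the file probably mixes conventions, and setting the separators manually is the safer choice.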
What this step defines:
- which file is the training source for the project
- which columns are model inputs
- which columns are targets
- whether the project should be treated as a time-series task
Show Header
Use Show Header to load the column table. If the table is already shown, the button reads Update Header instead.
The button is disabled until a training file has been loaded.
The main table is labeled Choose Inputs and Outputs for the Prediction.
Inputs and outputs
Select:
- Inputs (one checkbox per column, plus a "select all" header checkbox)
- Outputs (one checkbox per column, plus a "select all" header checkbox)
The header shows a running count of the selected columns, for example Inputs (3) and Outputs (1).
To continue:
- at least one input must be selected
- at least one output must be selected
Practical recommendations:
- start with the smallest sensible set of inputs
- avoid clearly redundant or administrative columns
- keep output selection stable once you move into training
Time Series
The Time Series toggle sits at the top of the column table.
Enabling time series changes the setup:
- in normal mode, a column cannot be both input and output
- in time-series mode, overlap is allowed (a column can be both input and output)
- Step 3 switches to the Recurrent Neural Network path
Disabling time series while columns overlap removes the overlap by keeping those columns as outputs only.
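The selection rules can be summarized in a minimal Python model. The class and method names are hypothetical and do not mirror dAIve's internal code. The sketch only restates the rules described on this page: at least one input and one output, no overlap in normal mode, and overlap resolved in favor of outputs when time series is switched off.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnSelection:
    """Hypothetical model of the Step 1 input/output selection."""

    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)
    time_series: bool = False

    def can_continue(self) -> bool:
        # At least one input and at least one output must be selected.
        return bool(self.inputs) and bool(self.outputs)

    def is_valid(self) -> bool:
        # In normal mode a column cannot be both input and output;
        # in time-series mode the overlap is allowed.
        return self.time_series or not (self.inputs & self.outputs)

    def set_time_series(self, enabled: bool) -> None:
        if not enabled:
            # Overlapping columns are kept as outputs only.
            self.inputs -= self.outputs
        self.time_series = enabled


selection = ColumnSelection(
    inputs={"timestamp", "load"}, outputs={"load"}, time_series=True
)
selection.set_time_series(False)
print(sorted(selection.inputs), sorted(selection.outputs))
```

In this example the column load is both input and output while time series is enabled. After disabling it, load remains an output and timestamp is the only input left.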
Use time-series mode when:
- row order carries meaning
- past values are part of the prediction context
- the model should learn temporal structure instead of plain tabular relationships
Common issues
- wrong separator selected
- decimal comma vs decimal point mismatch
- missing header row
- duplicate column names
- mixed data types in a column
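Several of these issues can be caught before upload with a small check. The following Python sketch is an illustration under assumptions, not a dAIve feature: the function name find_csv_issues and the heuristics (for example, treating an all-numeric first row as a missing header) are hypothetical.

```python
import csv
import io
from collections import Counter


def _is_number(cell, decimal):
    """True if the cell parses as a number with the given decimal separator."""
    try:
        float(cell.strip().replace(decimal, "."))
        return True
    except ValueError:
        return False


def find_csv_issues(text, delimiter=",", decimal="."):
    """Return a list of human-readable warnings for a CSV given as text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    issues = []

    # Heuristic: a header made only of numbers is probably a data row.
    if header and all(_is_number(c, decimal) for c in header):
        issues.append("first row looks like data, not column names")

    duplicates = [name for name, n in Counter(header).items() if n > 1]
    if duplicates:
        issues.append(f"duplicate column names: {duplicates}")

    # A column is "mixed" when it holds both numeric and non-numeric cells.
    for idx, name in enumerate(header):
        kinds = set()
        for row in data:
            if idx < len(row) and row[idx].strip():
                kinds.add("number" if _is_number(row[idx], decimal) else "text")
        if len(kinds) > 1:
            issues.append(f"mixed data types in column {name!r}")
    return issues


for warning in find_csv_issues("id,value,value\n1,2.5,x\n2,abc,y\n"):
    print(warning)
```

Empty cells are skipped in the type check, so missing values alone do not trigger a mixed-type warning.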
Changing the uploaded file or the input/output selection invalidates later steps on purpose.
That invalidation is important because:
- Step 2 depends on the selected columns
- Step 3 depends on the chosen task structure
- trained models become outdated as soon as the source columns change
Steps with stale data display a yellow warning badge in the stepper.
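The invalidation rule amounts to "everything after the changed step must be redone". A minimal Python sketch of that idea follows. The function name and the use of plain step numbers are assumptions for illustration, not dAIve's implementation.

```python
def stale_steps_after_change(changed_step, completed_steps):
    """Return the completed steps that come after the changed step.

    Hypothetical helper: steps are identified by their number in the stepper.
    """
    return {step for step in completed_steps if step > changed_step}


# Changing the file or the column selection in Step 1 after Steps 2 and 3
# were completed leaves both of them stale.
print(sorted(stale_steps_after_change(1, {1, 2, 3})))
```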
