Transformation operations
Transformation operations modify tabular data files by restructuring, filtering, or augmenting their content. These operations are the primary tools for preparing event files for analysis, converting between encodings, and cleaning datasets.
What are transformations?
Transformations process data files sequentially, applying changes to each file independently. They can:
Restructure data: Reorder, rename, or reorganize columns
Filter data: Remove unwanted rows or columns
Augment data: Add computed columns (e.g., factor vectors)
Convert formats: Transform between different data representations
Merge/split: Combine or separate events
Key characteristics:
Modify the input data files directly
Process files independently (stateless)
Can be chained together in sequences
Changes are saved to the output location
Common transformation workflows
Data cleaning pipeline:
Remove columns - Remove unnecessary columns
Remove rows - Filter out unwanted events
Rename columns - Standardize column names
Reorder columns - Establish consistent column order
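As a rough illustration of what the four cleaning steps do to a table, the sketch below applies them to a small in-memory table using only the Python standard library. It is not the library's implementation: the sample rows and the helper functions are assumptions made for illustration, and the real operations are configured through JSON rather than called like this.

```python
# Illustrative only: plain-Python stand-ins for the four cleaning steps.
# The sample data and helper signatures are assumptions, not the library's API.

rows = [
    {"trial_type": "go", "onset": "0.5", "dur": "0.2", "temp_col": "x"},
    {"trial_type": "n/a", "onset": "1.5", "dur": "0.2", "temp_col": "y"},
    {"trial_type": "stop", "onset": "2.5", "dur": "0.3", "temp_col": "z"},
]


def remove_columns(rows, names):
    """Drop the named columns from every row."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


def remove_rows(rows, column, values):
    """Drop rows whose value in `column` is one of `values`."""
    return [row for row in rows if row.get(column) not in values]


def rename_columns(rows, mapping):
    """Rename columns according to an old-name -> new-name mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def reorder_columns(rows, order):
    """Put the columns listed in `order` first, keeping any others after them."""
    result = []
    for row in rows:
        ordered = {k: row[k] for k in order if k in row}
        ordered.update({k: v for k, v in row.items() if k not in order})
        result.append(ordered)
    return result


# Each step returns a new table, so the original `rows` list is left untouched.
cleaned = remove_columns(rows, {"temp_col"})
cleaned = remove_rows(cleaned, "trial_type", {"n/a"})
cleaned = rename_columns(cleaned, {"dur": "duration"})
cleaned = reorder_columns(cleaned, ["onset", "duration", "trial_type"])

for row in cleaned:
    print(row)
```

Because each helper returns a new list instead of mutating its input, the steps can be reordered or dropped without side effects, which mirrors the chaining behavior described above.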
Factor vector generation:
Factor column - Create factors from categorical columns
Factor HED tags - Create factors from HED tag queries
Factor HED type - Create factors from HED type tags
Event restructuring:
Split rows - Convert trial-level to event-level encoding
Merge consecutive - Combine consecutive identical events
Remap columns - Map value combinations to new encodings
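To make the idea of merging consecutive identical events concrete, here is a minimal plain-Python sketch. It assumes that runs are detected on a single column and that the merged row keeps the first onset and spans through the end of the last event in the run. Both the sample events and that duration rule are illustrative assumptions; the actual operation has its own parameters and may behave differently.

```python
from itertools import groupby

# Hypothetical sample events: three back-to-back "rest" rows, then "go", then "rest".
events = [
    {"onset": 0.0, "duration": 1.0, "trial_type": "rest"},
    {"onset": 1.0, "duration": 1.0, "trial_type": "rest"},
    {"onset": 2.0, "duration": 1.0, "trial_type": "rest"},
    {"onset": 3.0, "duration": 0.5, "trial_type": "go"},
    {"onset": 4.0, "duration": 1.0, "trial_type": "rest"},
]


def merge_consecutive(rows, column):
    """Collapse each run of consecutive rows sharing the same `column` value."""
    merged = []
    # groupby only groups adjacent items, so non-adjacent repeats stay separate.
    for _, run in groupby(rows, key=lambda row: row[column]):
        run = list(run)
        first, last = run[0], run[-1]
        combined = dict(first)
        # Assumed rule: the merged event ends where the last event in the run ends.
        combined["duration"] = last["onset"] + last["duration"] - first["onset"]
        merged.append(combined)
    return merged


merged = merge_consecutive(events, "trial_type")
for row in merged:
    print(row)
```

Note that the final "rest" event is not merged with the first three, because the "go" event separates them.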
Available transformations
Transformation summary
Basic column operations
Remove columns - Remove specified columns from data files
Rename columns - Rename columns using a mapping dictionary
Reorder columns - Reorder columns in a specified sequence
Row operations
Remove rows - Remove rows based on column value criteria
Merge consecutive - Merge consecutive rows with identical values
Split rows - Split trial-level encoding into event-level
Value mapping
Remap columns - Map combinations of source column values to destination columns
Factor generation
Factor column - Create factor vectors from column values
Factor HED tags - Create factor vectors from HED tag search queries
Factor HED type - Create factor vectors from HED type tags (e.g., Condition-variable)
HED operations note
Operations with “HED” in their name require:
A HED schema version specified when creating the Dispatcher
In most cases, a JSON sidecar file containing HED annotations
Data files with HED-annotated columns
See the User guide for details on using HED operations.
Examples
Simple column cleanup:
[
    {
        "operation": "remove_columns",
        "description": "Remove temporary processing columns",
        "parameters": {
            "column_names": ["temp_col", "debug_info"],
            "ignore_missing": true
        }
    },
    {
        "operation": "reorder_columns",
        "description": "Standardize column order",
        "parameters": {
            "column_order": ["onset", "duration", "trial_type"],
            "keep_others": true,
            "ignore_missing": false
        }
    }
]
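The following sketch shows one way to read such an operation list and trace its effect on a table's column names. It uses only the Python standard library and does not call the remodeling library: the `HANDLERS` registry, the two handler functions, and the sample column list are hypothetical, and the handling of `ignore_missing` and `keep_others` reflects an assumed reading of those parameters.

```python
import json

# The same operation list as above, embedded as a string for a self-contained example.
OPERATIONS_JSON = """
[
    {
        "operation": "remove_columns",
        "description": "Remove temporary processing columns",
        "parameters": {
            "column_names": ["temp_col", "debug_info"],
            "ignore_missing": true
        }
    },
    {
        "operation": "reorder_columns",
        "description": "Standardize column order",
        "parameters": {
            "column_order": ["onset", "duration", "trial_type"],
            "keep_others": true,
            "ignore_missing": false
        }
    }
]
"""


def remove_columns(columns, column_names, ignore_missing):
    """Return the column list without `column_names` (assumed semantics)."""
    missing = [name for name in column_names if name not in columns]
    if missing and not ignore_missing:
        raise KeyError(f"Columns not found: {missing}")
    return [name for name in columns if name not in column_names]


def reorder_columns(columns, column_order, keep_others, ignore_missing):
    """Return the column list in `column_order`, optionally keeping the rest."""
    missing = [name for name in column_order if name not in columns]
    if missing and not ignore_missing:
        raise ValueError(f"Columns not found: {missing}")
    ordered = [name for name in column_order if name in columns]
    if keep_others:
        ordered += [name for name in columns if name not in column_order]
    return ordered


# Hypothetical registry mapping operation names to the stand-in handlers.
HANDLERS = {"remove_columns": remove_columns, "reorder_columns": reorder_columns}

# Hypothetical header of an events file before remodeling.
columns = ["trial_type", "onset", "temp_col", "duration", "response_time"]

# Apply the operations in the order they appear in the list.
for op in json.loads(OPERATIONS_JSON):
    columns = HANDLERS[op["operation"]](columns, **op["parameters"])

print(columns)  # ['onset', 'duration', 'trial_type', 'response_time']
```

Here `debug_info` is absent from the sample header, but `ignore_missing` is true for the first operation, so it is skipped silently; `response_time` survives the reorder because `keep_others` is true.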
Factor vector generation:
[
    {
        "operation": "factor_column",
        "description": "Create factors for trial types",
        "parameters": {
            "column_name": "trial_type",
            "factor_values": ["go", "stop"],
            "factor_names": ["is_go", "is_stop"]
        }
    }
]
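To show what a factor vector looks like in practice, the sketch below emulates the effect of the configuration above in plain Python. It assumes each factor is a 0/1 indicator column appended to the table; the sample rows and the `factor_column` helper are hypothetical and are not the library's implementation.

```python
# Hypothetical sample rows from an events file.
rows = [
    {"onset": 0.5, "trial_type": "go"},
    {"onset": 1.5, "trial_type": "stop"},
    {"onset": 2.5, "trial_type": "go"},
    {"onset": 3.5, "trial_type": "n/a"},
]


def factor_column(rows, column_name, factor_values, factor_names):
    """Append one 0/1 indicator column per factor value (assumed encoding)."""
    if len(factor_values) != len(factor_names):
        raise ValueError("factor_values and factor_names must have equal length")
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for value, name in zip(factor_values, factor_names):
            new_row[name] = 1 if row[column_name] == value else 0
        result.append(new_row)
    return result


factored = factor_column(rows, "trial_type", ["go", "stop"], ["is_go", "is_stop"])
for row in factored:
    print(row)
```

A row whose `trial_type` matches neither listed value, such as the `n/a` row here, gets 0 in both factor columns.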
See also
Summarization operations - Operations that summarize data files without modifying them
Quickstart - Getting started guide
User guide - Complete usage documentation