Transformation operations

Transformation operations modify tabular data files by restructuring, filtering, or augmenting their content. These operations are the primary tools for preparing event files for analysis, converting between encodings, and cleaning datasets.

What are transformations?

Transformations process data files sequentially, applying changes to each file independently. They can:

  • Restructure data: Reorder, rename, or reorganize columns

  • Filter data: Remove unwanted rows or columns

  • Augment data: Add computed columns (e.g., factor vectors)

  • Convert formats: Transform between different data representations

  • Merge/split: Combine or separate events

Key characteristics:

  • Modify the input data files directly

  • Process files independently (stateless)

  • Can be chained together in sequences

  • Changes are saved to the output location
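The stateless, chainable behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's implementation: tables are modeled as lists of dictionaries, and `run_chain`, `drop_column`, and `keep_rows_where` are hypothetical helper names introduced here.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Table = List[Row]
Operation = Callable[[Table], Table]


def run_chain(table: Table, operations: List[Operation]) -> Table:
    """Apply each operation in order; each one receives the previous result."""
    for operation in operations:
        table = operation(table)
    return table


def drop_column(name: str) -> Operation:
    """Hypothetical helper: build an operation that removes one column."""
    return lambda table: [
        {key: value for key, value in row.items() if key != name} for row in table
    ]


def keep_rows_where(column: str, value: str) -> Operation:
    """Hypothetical helper: build an operation that keeps only matching rows."""
    return lambda table: [row for row in table if row.get(column) == value]


# Two independent "files"; nothing computed for one is reused for the other.
files = {
    "sub-01_events.tsv": [
        {"onset": "0.5", "trial_type": "go", "debug_info": "x"},
        {"onset": "1.5", "trial_type": "stop", "debug_info": "y"},
    ],
    "sub-02_events.tsv": [
        {"onset": "0.7", "trial_type": "go", "debug_info": "z"},
    ],
}

chain = [drop_column("debug_info"), keep_rows_where("trial_type", "go")]
results = {name: run_chain(table, chain) for name, table in files.items()}
```

Because each operation only sees the table passed to it, the same chain can be applied to any number of files in any order.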

Common transformation workflows

Data cleaning pipeline:

  1. Remove columns - Drop columns that are not needed for analysis

  2. Remove rows - Filter out unwanted events

  3. Rename columns - Standardize column names

  4. Reorder columns - Establish consistent column order
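As a rough illustration of what the four cleaning steps do to a table, the following plain-Python sketch walks one small table through them in order. The column names (`Onset`, `type`, `temp_col`) and the `boundary` value are invented for the example, and the code only mimics the effect of the steps; it does not call the actual operations.

```python
# A tiny event table, modeled as a list of dictionaries (one per row).
rows = [
    {"Onset": "0.5", "duration": "0.2", "type": "go", "temp_col": "a"},
    {"Onset": "1.0", "duration": "0.0", "type": "boundary", "temp_col": "b"},
    {"Onset": "1.5", "duration": "0.3", "type": "stop", "temp_col": "c"},
]

# 1. Remove columns: drop anything not needed downstream.
unwanted_columns = {"temp_col"}
rows = [{k: v for k, v in row.items() if k not in unwanted_columns} for row in rows]

# 2. Remove rows: filter out unwanted events.
unwanted_values = {"boundary"}
rows = [row for row in rows if row["type"] not in unwanted_values]

# 3. Rename columns: standardize the names.
column_mapping = {"Onset": "onset", "type": "trial_type"}
rows = [{column_mapping.get(k, k): v for k, v in row.items()} for row in rows]

# 4. Reorder columns: dictionaries keep insertion order, so rebuild each row.
column_order = ["onset", "duration", "trial_type"]
cleaned = [{name: row[name] for name in column_order} for row in rows]
```

Removing columns and rows first keeps the later renaming and reordering steps from having to account for data that will be discarded anyway.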

Factor vector generation:

  1. Factor column - Create factors from categorical columns

  2. Factor HED tags - Create factors from HED tag queries

  3. Factor HED type - Create factors from HED type tags

Event restructuring:

  1. Split rows - Convert trial-level to event-level encoding

  2. Merge consecutive - Combine consecutive identical events

  3. Remap columns - Map value combinations to new encodings
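To make the idea of merging consecutive identical events concrete, here is a minimal sketch. It assumes that two adjacent rows count as "identical" when they share the value of a single column and that the merged row should span from the first onset to the end of the last event; the real operation may define both points differently. The function name and signature are hypothetical.

```python
from typing import Dict, List, Union

Event = Dict[str, Union[float, str]]


def merge_consecutive(events: List[Event], column: str) -> List[Event]:
    """Collapse runs of adjacent rows that share the same value in `column`.

    Assumption for this sketch: the merged row keeps the first onset and its
    duration is extended to cover the end of the last row in the run.
    """
    merged: List[Event] = []
    for event in events:
        previous = merged[-1] if merged else None
        if previous is not None and previous[column] == event[column]:
            end = float(event["onset"]) + float(event["duration"])
            span = end - float(previous["onset"])
            previous["duration"] = max(float(previous["duration"]), span)
        else:
            merged.append(dict(event))  # copy so the input rows stay untouched
    return merged


events = [
    {"onset": 0.0, "duration": 1.0, "trial_type": "rest"},
    {"onset": 1.0, "duration": 1.0, "trial_type": "rest"},
    {"onset": 2.0, "duration": 0.5, "trial_type": "go"},
    {"onset": 3.0, "duration": 1.0, "trial_type": "rest"},
]
merged_events = merge_consecutive(events, "trial_type")
```

Only adjacent rows are merged: the final `rest` row stays separate because a `go` event sits between it and the earlier run.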

Available transformations

Transformation summary

Basic column operations

Row operations

Value mapping

  • Remap columns - Map combinations of source column values to destination columns
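One way to picture this kind of remapping is as a lookup table keyed by tuples of source values. The sketch below assumes that unmatched combinations receive a placeholder value of `n/a`; the function, its parameters, and the example columns are illustrative and not taken from the tool itself.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]


def remap_columns(
    rows: List[Row],
    source_columns: Sequence[str],
    destination_columns: Sequence[str],
    mapping: Dict[Tuple[str, ...], Tuple[str, ...]],
    default: str = "n/a",  # assumption: placeholder for unmatched combinations
) -> List[Row]:
    """Look up each row's source-value combination and fill the destinations."""
    fallback = tuple(default for _ in destination_columns)
    remapped = []
    for row in rows:
        key = tuple(row[column] for column in source_columns)
        values = mapping.get(key, fallback)
        new_row = dict(row)
        new_row.update(zip(destination_columns, values))
        remapped.append(new_row)
    return remapped


rows = [
    {"response": "left", "accuracy": "correct"},
    {"response": "right", "accuracy": "incorrect"},
    {"response": "none", "accuracy": "n/a"},
]
mapping = {
    ("left", "correct"): ("hit_left",),
    ("right", "incorrect"): ("miss_right",),
}
remapped_rows = remap_columns(rows, ["response", "accuracy"], ["outcome"], mapping)
```

Keying on the full tuple means that the same `response` value can map to different outcomes depending on `accuracy`.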

Factor generation

HED operations note

Operations with “HED” in their name require:

  • A HED schema version specified when creating the Dispatcher

  • Often a JSON sidecar file containing HED annotations

  • Data files with HED-annotated columns

See the User guide for details on using HED operations.

Examples

Simple column cleanup:

[
    {
        "operation": "remove_columns",
        "description": "Remove temporary processing columns",
        "parameters": {
            "column_names": ["temp_col", "debug_info"],
            "ignore_missing": true
        }
    },
    {
        "operation": "reorder_columns",
        "description": "Standardize column order",
        "parameters": {
            "column_order": ["onset", "duration", "trial_type"],
            "keep_others": true,
            "ignore_missing": false
        }
    }
]
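The following sketch shows one plausible reading of how the operation list above could be interpreted. It is a stand-in written for illustration, not the tool's code. The flag semantics are assumptions inferred from the parameter names: `ignore_missing` suppresses errors for absent columns, and `keep_others` appends columns not named in `column_order`.

```python
import json

OPERATIONS_JSON = """
[
    {"operation": "remove_columns",
     "description": "Remove temporary processing columns",
     "parameters": {"column_names": ["temp_col", "debug_info"],
                    "ignore_missing": true}},
    {"operation": "reorder_columns",
     "description": "Standardize column order",
     "parameters": {"column_order": ["onset", "duration", "trial_type"],
                    "keep_others": true,
                    "ignore_missing": false}}
]
"""


def remove_columns(rows, column_names, ignore_missing):
    """Drop the named columns; complain about absent ones unless told not to."""
    present = set().union(*(row.keys() for row in rows))
    missing = [name for name in column_names if name not in present]
    if missing and not ignore_missing:
        raise KeyError(f"columns not found: {missing}")
    return [{k: v for k, v in row.items() if k not in column_names} for row in rows]


def reorder_columns(rows, column_order, keep_others, ignore_missing):
    """Put the named columns first; optionally keep the rest after them."""
    result = []
    for row in rows:
        missing = [name for name in column_order if name not in row]
        if missing and not ignore_missing:
            raise KeyError(f"columns not found: {missing}")
        ordered = {name: row[name] for name in column_order if name in row}
        if keep_others:
            ordered.update({k: v for k, v in row.items() if k not in ordered})
        result.append(ordered)
    return result


# Hypothetical dispatch table from operation name to handler.
HANDLERS = {"remove_columns": remove_columns, "reorder_columns": reorder_columns}


def apply_operations(rows, operations):
    """Run each operation in list order, passing its parameters as keywords."""
    for op in operations:
        rows = HANDLERS[op["operation"]](rows, **op["parameters"])
    return rows


rows = [{"trial_type": "go", "temp_col": "x", "onset": "0.5",
         "response_time": "0.31", "duration": "0.2"}]
cleaned = apply_operations(rows, json.loads(OPERATIONS_JSON))
```

In this example the input has no `debug_info` column, which is tolerated because `ignore_missing` is true for the removal step, and `response_time` is kept after the three ordered columns because `keep_others` is true.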

Factor vector generation:

[
    {
        "operation": "factor_column",
        "description": "Create factors for trial types",
        "parameters": {
            "column_name": "trial_type",
            "factor_values": ["go", "stop"],
            "factor_names": ["is_go", "is_stop"]
        }
    }
]
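To show what a factor vector looks like in practice, the sketch below mimics the effect of the configuration above on a three-row table. It assumes factor columns hold 1 where the row matches the value and 0 otherwise; the function is a hypothetical stand-in, and the actual operation may differ in details such as output types.

```python
from typing import Dict, List, Sequence, Union

Row = Dict[str, Union[str, int]]


def factor_column(
    rows: List[Row],
    column_name: str,
    factor_values: Sequence[str],
    factor_names: Sequence[str],
) -> List[Row]:
    """Add one 0/1 indicator column per requested value of `column_name`."""
    if len(factor_values) != len(factor_names):
        raise ValueError("factor_values and factor_names must have equal length")
    result = []
    for row in rows:
        new_row = dict(row)
        for value, name in zip(factor_values, factor_names):
            new_row[name] = 1 if row[column_name] == value else 0
        result.append(new_row)
    return result


rows = [
    {"onset": "0.5", "trial_type": "go"},
    {"onset": "1.5", "trial_type": "stop"},
    {"onset": "2.5", "trial_type": "rest"},
]
factored = factor_column(rows, "trial_type", ["go", "stop"], ["is_go", "is_stop"])
```

Rows whose value is not listed in `factor_values`, such as `rest` here, get 0 in every factor column, so they remain in the table but are excluded from each factor.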

See also