Remap columns
The remap_columns operation maps combinations of values in m specified columns of a data file into values in n columns using a defined mapping. Remapping is useful during analysis to create columns in event files that are more directly useful or informative for a particular analysis.
Remapping is also important during the initial generation of event files from experimental logs. Experimental control software often writes a code for each type of log entry. Remapping can convert the column containing these codes into one or more columns with more informative values.
Purpose
Use this operation to:
Convert experimental log codes into meaningful categorical values
Combine multiple columns into a single informative column
Split one column into multiple columns based on value mappings
Translate between different coding schemes
Parameters
Parameters for the remap_columns operation.
| Parameter | Type | Description |
|---|---|---|
| source_columns | list | A list of m names of the source columns for the map. |
| destination_columns | list | A list of n names of the destination columns for the map. |
| map_list | list | A list of mappings. Each element is a list of m source values followed by n destination values. |
| ignore_missing | bool | If false, source column values not in the map generate “n/a” |
| integer_sources | list | (Optional) A list of source columns that are integers. |
A column cannot be both a source and a destination, and all source columns must be present in the data files. New columns are created for destination columns that are missing from a data file.
The remap_columns operation only works for columns containing strings or integers, as it is meant for remapping categorical codes. You must specify which source columns contain integers so that n/a values can be handled appropriately.
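The following is a minimal sketch of why integer columns need special care, not the library's own implementation. A column of integer codes that also contains n/a is often read as floating point, so a code such as 1 can appear as 1.0 and fail to match a map key. The helper name `integer_key` is hypothetical.

```python
import math


def integer_key(value):
    """Render an integer-coded cell as a string key, keeping missing values as 'n/a'."""
    if value is None or value == "n/a":
        return "n/a"
    if isinstance(value, float) and math.isnan(value):
        # A missing entry that was read as a floating-point NaN.
        return "n/a"
    # Go through float first so that 3.0 and "3.0" both become "3".
    return str(int(float(value)))


codes = [1, "2", 3.0, "n/a", float("nan")]
keys = [integer_key(v) for v in codes]
print(keys)
```

With this kind of normalization, the integer codes 1, "2", and 3.0 all reduce to plain integer strings that can be compared against the source values in a map.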
The map_list parameter specifies how each unique combination of values from the source columns will be mapped into the destination columns. If there are m source columns and n destination columns, then each entry in map_list must be a list with m + n elements. The first m elements are the key values from the source columns, and the remaining n elements are the corresponding values for the destination columns. The map_list should have targets for all combinations of values that appear in the m source columns unless ignore_missing is true.
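The structure of map_list can be illustrated with a short pure-Python sketch. It is an illustration under stated assumptions rather than the operation's actual code: the function names `build_lookup` and `remap_rows` are hypothetical, and unmapped combinations are simply filled with n/a and reported back, leaving the ignore_missing policy to the caller.

```python
def build_lookup(source_columns, destination_columns, map_list):
    """Split each map_list entry into a source key and its destination values."""
    m, n = len(source_columns), len(destination_columns)
    lookup = {}
    for entry in map_list:
        if len(entry) != m + n:
            raise ValueError(f"Entry {entry} must have {m + n} elements")
        key = tuple(str(v) for v in entry[:m])  # first m elements: source key
        lookup[key] = list(entry[m:])           # last n elements: destination values
    return lookup


def remap_rows(rows, source_columns, destination_columns, lookup):
    """Return remapped copies of the rows plus the set of unmapped source keys."""
    remapped, unmapped = [], set()
    for row in rows:
        key = tuple(str(row[col]) for col in source_columns)
        values = lookup.get(key)
        if values is None:
            # Assumption of this sketch: fill n/a and report the missing key.
            unmapped.add(key)
            values = ["n/a"] * len(destination_columns)
        new_row = dict(row)  # source columns are kept alongside the new ones
        new_row.update(zip(destination_columns, values))
        remapped.append(new_row)
    return remapped, unmapped


sources = ["response_accuracy", "response_hand"]
destinations = ["response_type"]
lookup = build_lookup(sources, destinations,
                      [["correct", "left", "correct_left"],
                       ["correct", "right", "correct_right"]])
rows = [{"response_accuracy": "correct", "response_hand": "right"},
        {"response_accuracy": "incorrect", "response_hand": "left"}]
remapped, unmapped = remap_rows(rows, sources, destinations, lookup)
```

Here the first row matches the key ("correct", "right"), while the second row has no entry in the map and is reported in `unmapped`.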
After remapping, the tabular file will contain both source and destination columns. If you wish to replace the source columns with the destination columns, use a remove_columns transformation after the remap_columns.
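As a hedged sketch of that two-step pattern, the Python snippet below builds a remodeling file that remaps and then removes the source columns. The remove_columns parameter names `column_names` and `ignore_missing` are assumptions that should be checked against that operation's documentation, and the column names are illustrative.

```python
import json

operations = [
    {
        "operation": "remap_columns",
        "description": "Map response_accuracy and response_hand into a single column.",
        "parameters": {
            "source_columns": ["response_accuracy", "response_hand"],
            "destination_columns": ["response_type"],
            "map_list": [["correct", "left", "correct_left"],
                         ["correct", "right", "correct_right"]],
            "ignore_missing": True,
        },
    },
    {
        "operation": "remove_columns",
        "description": "Drop the source columns once response_type exists.",
        "parameters": {
            # Assumed parameter names for remove_columns; verify before use.
            "column_names": ["response_accuracy", "response_hand"],
            "ignore_missing": True,
        },
    },
]

# Serialize the operation list as JSON text, with operations in execution order.
remodel_json = json.dumps(operations, indent=4)
print(remodel_json)
```

The order of the list matters in this sketch: the source columns can only be removed after the remap has read them.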
Example
The remap_columns operation in the following example creates a new column called response_type based on the unique values in the combination of columns response_accuracy and response_hand.
A JSON file with a single remap_columns transformation operation.
    [{
        "operation": "remap_columns",
        "description": "Map response_accuracy and response_hand into a single column.",
        "parameters": {
            "source_columns": ["response_accuracy", "response_hand"],
            "destination_columns": ["response_type"],
            "map_list": [["correct", "left", "correct_left"],
                         ["correct", "right", "correct_right"],
                         ["incorrect", "left", "incorrect_left"],
                         ["incorrect", "right", "incorrect_right"],
                         ["n/a", "n/a", "n/a"]],
            "ignore_missing": true
        }
    }]
In this example there are two source columns and one destination column, so each entry in map_list must be a list with three elements: two source values and one destination value. Since all the values in map_list are strings, the optional integer_sources list is not needed.
Results
The results of executing the previous remap_columns operation on the sample remodel event file are:
Mapping columns response_accuracy and response_hand into a response_type column.
| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | response_type |
|---|---|---|---|---|---|---|---|---|
| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | correct_right |
| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | correct_right |
| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | correct_right |
| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | n/a |
| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | correct_left |
| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | correct_left |
In this example, remap_columns combines the values from columns response_accuracy and response_hand to produce a new column called response_type that specifies both response hand and correctness information using a single code.
Notes
Source and destination columns remain after remapping; use remove_columns to clean up
Each map_list entry must have exactly m + n elements
Source values are treated as strings for matching
Use integer_sources to specify which source columns contain integers
Set ignore_missing to true to handle unmapped value combinations gracefully
Useful for both simplifying (many-to-one) and expanding (one-to-many) column structures