Tools¶
Utility tools and scripts for working with HED data.
Analysis Tools¶
TabularSummary¶
TabularSummary ¶
Summarize the contents of columnar files.
Source code in hed/tools/analysis/tabular_summary.py
extract_sidecar_template ¶
Extract a BIDS sidecar-compatible dictionary.
Returns:
Name | Type | Description |
---|---|---|
dict |
dict
|
A sidecar template that can be converted to JSON. |
Source code in hed/tools/analysis/tabular_summary.py
extract_summary
staticmethod
¶
Create a TabularSummary object from a serialized summary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
summary_info
|
dict or str
|
A JSON string or a dictionary containing contents of a TabularSummary. |
required |
Returns:
Name | Type | Description |
---|---|---|
TabularSummary |
TabularSummary
|
The information in summary_info as a TabularSummary object. |
Source code in hed/tools/analysis/tabular_summary.py
get_columns_info
staticmethod
¶
Extract unique value counts for columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataframe
|
DataFrame
|
The DataFrame to be analyzed. |
required |
skip_cols
|
list
|
List of names of columns to be skipped in the extraction. |
None
|
Returns:
Type | Description |
---|---|
dict[str, dict]
|
dict[str, dict]: A dictionary with keys that are column names (strings) and values that are dictionaries of unique value counts. |
Source code in hed/tools/analysis/tabular_summary.py
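The shape of the returned dictionary can be illustrated without the library. The following is a minimal sketch, not the TabularSummary implementation: it uses a list of row dictionaries as a stand-in for a DataFrame, and `count_column_values` is a hypothetical helper name.

```python
from collections import Counter


def count_column_values(rows, skip_cols=None):
    """Count unique values per column for a list of row dicts (stand-in for a DataFrame)."""
    skip = set(skip_cols or [])
    counts = {}
    for row in rows:
        for column, value in row.items():
            if column in skip:
                continue
            counts.setdefault(column, Counter())[str(value)] += 1
    # Convert the Counters to plain dicts so the result is JSON-friendly.
    return {column: dict(counter) for column, counter in counts.items()}


rows = [
    {"onset": 0.5, "trial_type": "go", "response": "left"},
    {"onset": 1.5, "trial_type": "stop", "response": "n/a"},
    {"onset": 2.5, "trial_type": "go", "response": "right"},
]
info = count_column_values(rows, skip_cols=["onset"])
print(info["trial_type"])
```

Skipping columns such as onset keeps continuous-valued columns from producing one entry per row.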
get_number_unique ¶
Return the number of unique values in columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
column_names
|
(list, None)
|
A list of column names to analyze or all columns if None. |
None
|
Returns:
Name | Type | Description |
---|---|---|
dict |
dict
|
Column names are the keys and the number of unique values in the column are the values. |
Source code in hed/tools/analysis/tabular_summary.py
get_summary ¶
Return the summary in dictionary format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
as_json
|
bool
|
If False, return as a Python dictionary, otherwise convert to a JSON dictionary. |
False
|
Returns:
Type | Description |
---|---|
Union[dict, str]
|
Union[dict, str]: A dictionary containing the summary information or a JSON string if as_json is True. |
Source code in hed/tools/analysis/tabular_summary.py
make_combined_dicts
staticmethod
¶
make_combined_dicts(
file_dictionary, skip_cols=None
) -> tuple[TabularSummary, dict[str, TabularSummary]]
Return combined and individual summaries.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_dictionary
|
FileDictionary
|
Dictionary with file names as keys and full paths as values. |
required |
skip_cols
|
list
|
List of names of columns to be skipped. |
None
|
Returns:
Name | Type | Description |
---|---|---|
tuple |
tuple[TabularSummary, dict[str, TabularSummary]]
|
The combined TabularSummary and a dictionary of the individual TabularSummary objects. |
Source code in hed/tools/analysis/tabular_summary.py
update ¶
Update the counts based on data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data
|
DataFrame, str, or list
|
DataFrame containing data to update. |
required |
name
|
str
|
Name of the summary. |
None
|
Source code in hed/tools/analysis/tabular_summary.py
update_summary ¶
Add TabularSummary values to this object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tab_sum
|
TabularSummary
|
A TabularSummary to be combined. |
required |
Notes
- The value_cols and skip_cols are updated as long as they are not contradictory.
- A new skip column cannot be used.
Source code in hed/tools/analysis/tabular_summary.py
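Conceptually, combining two summaries means adding their per-column value counts. The sketch below shows only that counting step, using plain `Counter` objects. It does not model the value_cols and skip_cols consistency rules described in the notes, and `merge_counts` is a hypothetical name rather than part of the library.

```python
from collections import Counter


def merge_counts(target, other):
    """Add the per-column value counts in `other` to `target` in place."""
    for column, values in other.items():
        target.setdefault(column, Counter()).update(values)
    return target


# Hypothetical per-file counts keyed by column name.
file_a = {"trial_type": Counter({"go": 2, "stop": 1})}
file_b = {"trial_type": Counter({"go": 1}), "response": Counter({"left": 1})}

combined = merge_counts({}, file_a)
combined = merge_counts(combined, file_b)
print(dict(combined["trial_type"]))
```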
Annotation Utilities¶
annotation_util ¶
Utilities to facilitate annotation of events in BIDS.
check_df_columns ¶
check_df_columns(
df,
required_cols=(
"column_name",
"column_value",
"description",
"HED",
),
) -> list[str]
Return a list of the specified columns that are missing from a dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df
|
DataFrame
|
Spreadsheet to check the columns of. |
required |
required_cols
|
tuple
|
List of column names that must be present. |
('column_name', 'column_value', 'description', 'HED')
|
Returns:
Type | Description |
---|---|
list[str]
|
list[str]: List of column names that are missing. |
Source code in hed/tools/analysis/annotation_util.py
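The check amounts to a membership test against the required names. Below is a minimal sketch that takes a list of column names instead of a DataFrame; `missing_columns` is a hypothetical name used only for illustration.

```python
REQUIRED_COLS = ("column_name", "column_value", "description", "HED")


def missing_columns(present_cols, required_cols=REQUIRED_COLS):
    """Return the required column names that are absent, in required order."""
    return [col for col in required_cols if col not in present_cols]


missing = missing_columns(["column_name", "HED", "extra"])
print(missing)
```

An empty list means every required column is present.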
df_to_hed ¶
Create sidecar-like dictionary from a 4-column dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataframe
|
DataFrame
|
A four-column Pandas DataFrame with specific columns. |
required |
description_tag
|
bool
|
If True, the description tag is included. |
True
|
Returns:
Name | Type | Description |
---|---|---|
dict |
dict
|
A dictionary compatible with BIDS JSON tabular file that includes HED. |
Notes
- The DataFrame must have the columns with names: column_name, column_value, description, and HED.
Source code in hed/tools/analysis/annotation_util.py
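To make the four-column layout concrete, here is a simplified sketch of the conversion using a list of row dictionaries in place of a DataFrame. It rests on several assumptions not stated above: rows whose column_value is "n/a" are treated as value columns, descriptions are stored under the BIDS "Levels" and "Description" keys, and the description_tag option is ignored. `rows_to_sidecar` is a hypothetical name, and the HED strings are illustrative rather than validated.

```python
def rows_to_sidecar(rows):
    """Build a sidecar-like dict from rows with the four required keys."""
    sidecar = {}
    for row in rows:
        entry = sidecar.setdefault(row["column_name"], {})
        if row["column_value"] == "n/a":
            # Value column: a single annotation applies to the whole column.
            entry["HED"] = row["HED"]
            entry["Description"] = row["description"]
        else:
            # Categorical column: one annotation per unique value.
            entry.setdefault("HED", {})[row["column_value"]] = row["HED"]
            entry.setdefault("Levels", {})[row["column_value"]] = row["description"]
    return sidecar


rows = [
    {"column_name": "trial_type", "column_value": "go",
     "description": "A go trial", "HED": "Label/Go"},
    {"column_name": "trial_type", "column_value": "stop",
     "description": "A stop trial", "HED": "Label/Stop"},
    {"column_name": "response_time", "column_value": "n/a",
     "description": "Response time in seconds", "HED": "Label/#"},
]
sidecar = rows_to_sidecar(rows)
print(sidecar["trial_type"]["HED"])
```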
extract_tags ¶
Extract all instances of the specified tag from a tag string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hed_string
|
str
|
Tag string from which to extract tag. |
required |
search_tag
|
str
|
HED tag to extract. |
required |
Returns:
Type | Description |
---|---|
tuple[str, list[str]]
|
tuple[str, list[str]]: The tag string without the extracted tags, and a list of the tags that were extracted (for example, descriptions). |
Source code in hed/tools/analysis/annotation_util.py
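The two-part return value is easiest to see on a flat string. The sketch below is a simplification, not the library routine: it assumes a comma-separated string with no parentheses and no commas inside tag values, and `extract_flat_tags` is a hypothetical name.

```python
def extract_flat_tags(hed_string, search_tag):
    """Split a flat, comma-separated tag string into (remainder, extracted)."""
    tags = [tag.strip() for tag in hed_string.split(",")]
    extracted = [tag for tag in tags if tag.startswith(search_tag)]
    # Keep the non-matching, non-empty tags in their original order.
    remainder = ", ".join(
        tag for tag in tags if tag and not tag.startswith(search_tag)
    )
    return remainder, extracted


remainder, extracted = extract_flat_tags(
    "Sensory-event, Description/A red light, Visual-presentation",
    "Description/",
)
print(remainder)
print(extracted)
```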
generate_sidecar_entry ¶
Create a sidecar column dictionary for a column.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
column_name
|
str
|
Name of the column. |
required |
column_values
|
list
|
List of column values. |
None
|
Returns: dict: A dictionary representing a template for a sidecar entry.
Source code in hed/tools/analysis/annotation_util.py
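A sidecar entry template might look like the following sketch. The key names "Description", "Levels", and "HED" follow the BIDS sidecar convention, but the placeholder text, the empty HED placeholders, and the decision to skip "n/a" values are assumptions made for illustration. `sidecar_entry_template` is a hypothetical name, and the library's actual placeholders may differ.

```python
def sidecar_entry_template(column_name, column_values=None):
    """Return a fill-in-the-blanks sidecar entry for one column."""
    entry = {"Description": f"Description for {column_name}", "HED": ""}
    if column_values:
        # Categorical column: one placeholder per value, skipping 'n/a'.
        values = [value for value in column_values if value != "n/a"]
        entry["Levels"] = {
            value: f"Description for {value} of {column_name}" for value in values
        }
        entry["HED"] = {value: "" for value in values}
    return entry


categorical = sidecar_entry_template("trial_type", ["go", "stop", "n/a"])
value_entry = sidecar_entry_template("response_time")
print(categorical["Levels"])
```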
hed_to_df ¶
Return a 4-column dataframe of HED portions of sidecar.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sidecar_dict
|
dict
|
A dictionary conforming to BIDS JSON events sidecar format. |
required |
col_names
|
(list, None)
|
A list of the cols to include in the flattened sidecar. |
None
|
Returns:
Name | Type | Description |
---|---|---|
DataFrame |
DataFrame
|
Four-column spreadsheet representing HED portion of sidecar. |
Notes
- The returned DataFrame has columns: column_name, column_value, description, and HED.
Source code in hed/tools/analysis/annotation_util.py
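Flattening a sidecar into four-column rows can be sketched as below. This is a simplified illustration rather than the library code: it returns a list of row dictionaries instead of a DataFrame, assumes descriptions live under the BIDS "Levels" and "Description" keys, and uses "n/a" for missing cells. `sidecar_to_rows` is a hypothetical name.

```python
def sidecar_to_rows(sidecar):
    """Flatten the HED portions of a sidecar dict into four-column rows."""
    rows = []
    for column_name, entry in sidecar.items():
        hed = entry.get("HED")
        if hed is None:
            continue  # Column carries no HED annotation.
        if isinstance(hed, dict):
            # Categorical column: one row per annotated value.
            levels = entry.get("Levels", {})
            for value, tags in hed.items():
                rows.append({
                    "column_name": column_name,
                    "column_value": value,
                    "description": levels.get(value, "n/a"),
                    "HED": tags,
                })
        else:
            # Value column: a single row with column_value set to 'n/a'.
            rows.append({
                "column_name": column_name,
                "column_value": "n/a",
                "description": entry.get("Description", "n/a"),
                "HED": hed,
            })
    return rows


example_sidecar = {
    "onset": {"Description": "Onset time in seconds"},
    "trial_type": {
        "HED": {"go": "Label/Go", "stop": "Label/Stop"},
        "Levels": {"go": "A go trial"},
    },
    "response_time": {"HED": "Label/#", "Description": "Response time"},
}
flat_rows = sidecar_to_rows(example_sidecar)
for flat_row in flat_rows:
    print(flat_row)
```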
merge_hed_dict ¶
Update a JSON sidecar based on the hed_dict values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sidecar_dict
|
dict
|
Dictionary representation of a BIDS JSON sidecar. |
required |
hed_dict
|
dict
|
Dictionary derived from a dataframe representation of HED in sidecar. |
required |
Source code in hed/tools/analysis/annotation_util.py
series_to_factor ¶
Convert a series to an integer factor list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
series
|
Series
|
Series to be converted to a list. |
required |
Returns:
Type | Description |
---|---|
list[int]
|
list[int]: A list containing 0's and 1's. Empty, 'n/a', and np.nan values are converted to 0. |
Source code in hed/tools/analysis/annotation_util.py
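The mapping to 0's and 1's can be sketched in plain Python. This version works on a list instead of a Series and treats None, empty strings, 'n/a', and NaN as missing. It assumes every other value maps to 1, which may differ from how the library handles values such as 0 or False. `to_factor_list` is a hypothetical name.

```python
import math


def to_factor_list(values):
    """Map missing entries to 0 and everything else to 1."""
    def is_missing(value):
        if value is None or value == "" or value == "n/a":
            return True
        return isinstance(value, float) and math.isnan(value)

    return [0 if is_missing(value) else 1 for value in values]


factors = to_factor_list(["go", "", "n/a", float("nan"), "stop", None])
print(factors)
```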
str_to_tabular ¶
Return a TabularInput from a tsv string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tsv_str
|
str
|
A string representing a tabular input. |
required |
sidecar
|
(Sidecar, str, File or File-like)
|
An optional Sidecar object. |
None
|
Returns: TabularInput: Represents a tabular input object.
Source code in hed/tools/analysis/annotation_util.py
strs_to_hed_objs ¶
Returns a list of HedString objects from a list of strings.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hed_strings
|
string or list
|
String or strings representing HED annotations. |
required |
hed_schema
|
HedSchema or HedSchemaGroup
|
Schema version for the strings. |
required |
Returns:
Type | Description |
---|---|
Union[list[HedString], None]
|
Union[list[HedString], None]: A list of HedString objects or None. |
Source code in hed/tools/analysis/annotation_util.py
strs_to_sidecar ¶
Return a Sidecar from a sidecar as string or as a list of sidecars as strings.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sidecar_strings
|
string or list
|
String or strings representing sidecars. |
required |
Returns:
Type | Description |
---|---|
Union[Sidecar, None]
|
Union[Sidecar, None]: the merged sidecar from the list. |
Source code in hed/tools/analysis/annotation_util.py
to_factor ¶
Convert data to an integer factor list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data
|
Series or DataFrame
|
Series or DataFrame to be converted to a list. |
required |
column
|
str
|
Column name if DataFrame, otherwise column 0 is used. |
None
|
Returns:
Type | Description |
---|---|
list[int]
|
list[int]: A list containing 0's and 1's. Empty, 'n/a', and np.nan values are converted to 0. |
Source code in hed/tools/analysis/annotation_util.py
to_strlist ¶
Convert objects in a list to strings, converting None values to empty strings.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj_list
|
list
|
A list of objects that are None or have a str method. |
required |
Returns:
Type | Description |
---|---|
list[str]
|
list[str]: A list with the objects converted to strings. None values are converted to empty strings. |
Source code in hed/tools/analysis/annotation_util.py
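Going by the Returns description, None entries become empty strings so that the result contains only strings. A minimal sketch of that behavior, with `to_str_list` as a hypothetical stand-in name:

```python
def to_str_list(obj_list):
    """Convert each item with str(), mapping None to an empty string."""
    return [str(item) if item is not None else "" for item in obj_list]


strings = to_str_list([1, None, "Red", 2.5])
print(strings)
```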
Remodeling Operations¶
Base Operations¶
base_op ¶
Base class for remodeling operations.
BaseOp ¶
Bases: ABC
Base class for operations. All remodeling operations should extend this class.
Source code in hed/tools/remodeling/operations/base_op.py
do_op
abstractmethod
¶
Base class method to be overridden by each operation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dispatcher
|
Dispatcher
|
Manages the operation I/O. |
required |
df
|
DataFrame
|
The tabular file to be remodeled. |
required |
name
|
str
|
Unique identifier for the data -- often the original file path. |
required |
sidecar
|
Sidecar or file-like
|
A JSON sidecar needed for HED operations. |
None
|
Source code in hed/tools/remodeling/operations/base_op.py
validate_input_data
abstractmethod
staticmethod
¶
Validate whether operation parameters meet op-specific criteria beyond those captured in the JSON schema.
Example: A check to see whether two input arrays are the same length.
The minimum implementation should return an empty list to indicate no errors were found.
If additional validation is necessary, the method should perform the validation and return a list of user-friendly error strings.
Source code in hed/tools/remodeling/operations/base_op.py
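The contract described above, an abstract do_op plus a static validation hook that returns a list of error strings, can be sketched with the standard abc module. Everything here is hypothetical and self-contained: `SketchBaseOp`, `PairedListsOp`, the constructor signature, and the `sources`/`targets` parameters are illustrative names, not the library's classes.

```python
from abc import ABC, abstractmethod


class SketchBaseOp(ABC):
    """Hypothetical stand-in for a remodeling base class."""

    def __init__(self, parameters):
        self.parameters = parameters

    @abstractmethod
    def do_op(self, dispatcher, df, name, sidecar=None):
        """Return the remodeled data."""

    @staticmethod
    @abstractmethod
    def validate_input_data(parameters):
        """Return a list of user-friendly error strings (empty if none)."""


class PairedListsOp(SketchBaseOp):
    """Example operation whose two list parameters must match in length."""

    def do_op(self, dispatcher, df, name, sidecar=None):
        return df  # No-op, for illustration only.

    @staticmethod
    def validate_input_data(parameters):
        errors = []
        if len(parameters["sources"]) != len(parameters["targets"]):
            errors.append("sources and targets must have the same length.")
        return errors


good = PairedListsOp.validate_input_data({"sources": ["a"], "targets": ["b"]})
bad = PairedListsOp.validate_input_data({"sources": ["a", "c"], "targets": ["b"]})
print(good, bad)
```

Because both methods are abstract, the base class cannot be instantiated directly, so each concrete operation has to supply its own versions.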
Remove Columns¶
remove_columns_op ¶
Remove columns from a columnar file.
RemoveColumnsOp ¶
Bases: BaseOp
Remove columns from a columnar file.
Required remodeling parameters
- column_names (list): The names of the columns to be removed.
- ignore_missing (boolean): If True, names in column_names that are not columns in df should be ignored.
Source code in hed/tools/remodeling/operations/remove_columns_op.py
do_op ¶
Remove indicated columns from a dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dispatcher
|
Dispatcher
|
Manages the operation I/O. |
required |
df
|
DataFrame
|
The DataFrame to be remodeled. |
required |
name
|
str
|
Unique identifier for the dataframe -- often the original file path. |
required |
sidecar
|
Sidecar or file-like
|
Not needed for this operation. |
None
|
Returns:
Type | Description |
---|---|
'pd.DataFrame'
|
pd.DataFrame: A new dataframe after processing. |
Raises: KeyError: If ignore_missing is False and a column not in the data is to be removed.
Source code in hed/tools/remodeling/operations/remove_columns_op.py
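The interaction between column_names and ignore_missing can be sketched on a plain dictionary of columns. This is an illustration of the described behavior rather than the operation's code, and `remove_columns` is a hypothetical name. With pandas, `DataFrame.drop(columns=..., errors="ignore")` or `errors="raise"` gives comparable behavior.

```python
def remove_columns(table, column_names, ignore_missing):
    """Return a new column dict without the named columns."""
    missing = [name for name in column_names if name not in table]
    if missing and not ignore_missing:
        raise KeyError(f"Columns not in the data: {missing}")
    # Build a new dict so the input table is left unchanged.
    return {name: values for name, values in table.items() if name not in column_names}


table = {"onset": [0.5, 1.5], "trial_type": ["go", "stop"], "stim_file": ["a", "b"]}
trimmed = remove_columns(table, ["stim_file", "sample"], ignore_missing=True)
print(list(trimmed))
```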
validate_input_data
staticmethod
¶
Additional validation required of operation parameters not performed by JSON schema validator.
Rename Columns¶
rename_columns_op ¶
Rename columns in a columnar file.
RenameColumnsOp ¶
Bases: BaseOp
Rename columns in a tabular file.
Required remodeling parameters
- column_mapping (dict): Maps the names of the columns to be renamed to their new names.
- ignore_missing (bool): If True, the names in column_mapping that are not columns in the data are ignored.
Source code in hed/tools/remodeling/operations/rename_columns_op.py
do_op ¶
Rename columns as specified in column_mapping dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dispatcher
|
Dispatcher
|
Manages the operation I/O. |
required |
df
|
DataFrame
|
The DataFrame to be remodeled. |
required |
name
|
str
|
Unique identifier for the dataframe -- often the original file path. |
required |
sidecar
|
Sidecar or file-like
|
Not needed for this operation. |
None
|
Returns:
Type | Description |
---|---|
'pd.DataFrame'
|
pd.DataFrame: A new dataframe after processing. |
Raises: KeyError: When ignore_missing is False and column_mapping has columns not in the data.
Source code in hed/tools/remodeling/operations/rename_columns_op.py
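Renaming follows the same pattern as removal: names in column_mapping that are absent from the data either raise a KeyError or are skipped. The sketch below uses a plain dictionary of columns, and `rename_columns` is a hypothetical name. With pandas, `DataFrame.rename(columns=..., errors="ignore")` or `errors="raise"` gives comparable behavior.

```python
def rename_columns(table, column_mapping, ignore_missing):
    """Return a new column dict with keys renamed according to column_mapping."""
    missing = [old for old in column_mapping if old not in table]
    if missing and not ignore_missing:
        raise KeyError(f"Columns not in the data: {missing}")
    # Columns without a mapping keep their original names and order.
    return {column_mapping.get(name, name): values for name, values in table.items()}


table = {"onset": [0.5, 1.5], "stim": ["a", "b"]}
renamed = rename_columns(
    table, {"stim": "stim_file", "resp": "response"}, ignore_missing=True
)
print(list(renamed))
```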
validate_input_data
staticmethod
¶
Additional validation required of operation parameters not performed by JSON schema validator.
BIDS Tools¶
BIDS Dataset Processing¶
bids ¶
Models for BIDS datasets and files.