Tools¶
Utility functions and data processing tools for HED operations.
Analysis Tools¶
EventManager¶
- class hed.tools.analysis.event_manager.EventManager(input_data, hed_schema, extra_defs=None)[source]¶
Bases:
object
Manager of events of temporal extent.
- __init__(input_data, hed_schema, extra_defs=None)[source]¶
Create an event manager for an events file. Manages events of temporal extent.
- Parameters:
input_data (TabularInput) – Represents an events file with its sidecar.
hed_schema (HedSchema) – HED schema used.
extra_defs (DefinitionDict) – Extra definitions not included in the input_data information.
- Raises:
HedFileError – If there are any unmatched offsets.
Notes: Keeps the events of temporal extent by their starting index in the events file. These events are separated from the rest of the annotations, which are contained in self.hed_strings.
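As a rough illustration of what "events of temporal extent" means, the following sketch pairs onset and offset markers by label and records each event by its starting row index. It is not the library's implementation: the `(marker, label)` row format, the `find_extents` helper, and the use of `ValueError` in place of `HedFileError` are all assumptions made for this example.

```python
def find_extents(rows):
    """Pair onset/offset markers and return (label, start_index, end_index) tuples.

    `rows` is a hypothetical simplified input: one list per event row, each
    holding (marker, label) pairs where marker is "onset" or "offset".
    An event still open at the end of the file is closed at len(rows).
    """
    open_events = {}  # label -> starting row index
    extents = []
    for index, markers in enumerate(rows):
        for marker, label in markers:
            if marker == "onset":
                # A repeated onset for the same label ends the earlier event.
                if label in open_events:
                    extents.append((label, open_events.pop(label), index))
                open_events[label] = index
            elif marker == "offset":
                if label not in open_events:
                    raise ValueError(f"Unmatched offset for '{label}' at row {index}")
                extents.append((label, open_events.pop(label), index))
    # Close anything still ongoing at the end of the file.
    for label, start in open_events.items():
        extents.append((label, start, len(rows)))
    return sorted(extents, key=lambda item: item[1])


rows = [
    [("onset", "fixation")],
    [],
    [("offset", "fixation"), ("onset", "stimulus")],
    [],
]
print(find_extents(rows))
```

Indexing each event by its starting row is what lets the remaining annotations be handled separately, row by row.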
- unfold_context(remove_types=[])[source]¶
Unfold the event information into a tuple based on context.
- Parameters:
remove_types (list) – List of types to remove.
- Returns:
Union[list(str), HedString]: The information without the events of temporal extent.
Union[list(str), HedString, None]: The onsets of the events of temporal extent.
Union[list(str), HedString, None]: The ongoing context information.
- Return type:
tuple[Union[list(str), HedString], Union[list(str), HedString, None], Union[list(str), HedString, None]]
- get_type_defs(types)[source]¶
Return a list of definition names (lower case) that correspond to any of the specified types.
HedTagManager¶
- class hed.tools.analysis.hed_tag_manager.HedTagManager(event_manager, remove_types=[], extra_defs=None)[source]¶
Bases:
object
Manager for the HED tags from a columnar file.
- __init__(event_manager, remove_types=[], extra_defs=None)[source]¶
Create a tag manager for one tabular file.
- Parameters:
event_manager (EventManager) – an event manager for the tabular file.
remove_types (list or None) – List of type tags (such as condition-variable) to remove.
- get_hed_objs(include_context=True, replace_defs=False)[source]¶
Return a list of HED string objects of same length as the tabular file.
HedTypeManager¶
- class hed.tools.analysis.hed_type_manager.HedTypeManager(event_manager)[source]¶
Bases:
object
Manager for type factors and type definitions.
- __init__(event_manager)[source]¶
Create a variable manager for one tabular file for all type variables.
- Parameters:
event_manager (EventManager) – An event manager for the tabular file.
- Raises:
HedFileError – On errors such as unmatched onsets or missing definitions.
- property types¶
Return a list of types managed by this manager.
- Returns:
Type tag names.
- Return type:
- add_type(type_name)[source]¶
Add a type variable to be managed by this manager.
- Parameters:
type_name (str) – Type tag name of the type to be added.
- get_factor_vectors(type_tag, type_values=None, factor_encoding='one-hot')[source]¶
Return a DataFrame of factor vectors for the indicated HED tag and values.
- Parameters:
- Returns:
DataFrame containing the factor vectors as the columns.
- Return type:
Union[pd.DataFrame, None]
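To make the idea of factor vectors concrete, here is a minimal sketch of one-hot encoding using only the standard library. The `one_hot_factors` helper and its input (one set of active type values per event row) are hypothetical; the real method returns a pandas DataFrame, and its column naming is not reproduced here.

```python
def one_hot_factors(active_values_per_row, type_values=None):
    """Return {value: [0 or 1 per row]} marking the rows where each value is active."""
    if type_values is None:
        # Default to every value that appears, sorted for stable column order.
        type_values = sorted(set().union(*active_values_per_row))
    return {
        value: [1 if value in active else 0 for active in active_values_per_row]
        for value in type_values
    }


rows = [{"face"}, {"house"}, set(), {"face", "house"}]
factors = one_hot_factors(rows)
for name, column in factors.items():
    print(name, column)
```

Each dictionary entry plays the role of one DataFrame column, with one element per event row.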
- get_type_tag_factor(type_tag, type_value)[source]¶
Return the HedTypeFactors for a specified value and extension.
TabularSummary¶
- class hed.tools.analysis.tabular_summary.TabularSummary(value_cols=None, skip_cols=None, name='')[source]¶
Bases:
object
Summarize the contents of columnar files.
- __init__(value_cols=None, skip_cols=None, name='')[source]¶
Constructor for a BIDS tabular file summary.
- extract_sidecar_template() dict [source]¶
Extract a BIDS sidecar-compatible dictionary.
- Returns:
A sidecar template that can be converted to JSON.
- Return type:
dict
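The sketch below suggests what a sidecar-compatible dictionary could look like when built from a column summary. The layout (a `Levels` and `HED` entry per categorical value, and a single `HED` string with a `#` placeholder for value columns) follows the general BIDS/HED sidecar convention, but the placeholder wording and the `build_template` helper are assumptions, not the actual output of `extract_sidecar_template`.

```python
import json


def build_template(categorical_counts, value_cols):
    """Build a sidecar-style dict with placeholders to be filled in by hand."""
    template = {}
    for column, counts in categorical_counts.items():
        template[column] = {
            "Description": f"Description for {column}",
            "Levels": {value: f"Description for {value} of {column}" for value in counts},
            "HED": {value: f"(Label/{column}, Label/{value})" for value in counts},
        }
    for column in value_cols:
        # Value columns get one annotation; '#' stands for the cell value.
        template[column] = {
            "Description": f"Description for {column}",
            "HED": f"(Label/{column}, Label/#)",
        }
    return template


template = build_template({"trial_type": {"go": 12, "stop": 4}}, ["response_time"])
print(json.dumps(template, indent=4))
```

Because the result contains only dictionaries and strings, it can be serialized directly with `json.dumps`.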
- update_summary(tab_sum)[source]¶
Add TabularSummary values to this object.
- Parameters:
tab_sum (TabularSummary) – A TabularSummary to be combined.
Notes
The value_cols and skip_cols are updated as long as they are not contradictory.
A new skip column cannot be used.
- static extract_summary(summary_info) TabularSummary [source]¶
Create a TabularSummary object from a serialized summary.
- static get_columns_info(dataframe, skip_cols=None) dict[str, dict] [source]¶
Extract unique value counts for columns.
- static make_combined_dicts(file_dictionary, skip_cols=None) tuple[TabularSummary, dict[str, TabularSummary]] [source]¶
Return combined and individual summaries.
- Parameters:
file_dictionary (FileDictionary) – Dictionary with file name keys and full paths as values.
skip_cols (list) – Names of the columns to skip.
- Returns:
A combined summary of all files in the dictionary.
A dictionary where keys are file names and values are individual TabularSummary objects.
- Return type:
tuple[TabularSummary, dict[str, TabularSummary]]
HedType¶
- class hed.tools.analysis.hed_type.HedType(event_manager, name, type_tag='condition-variable')[source]¶
Bases:
object
Manager of a type variable and its associated context.
- __init__(event_manager, name, type_tag='condition-variable')[source]¶
Create a variable manager for one type-variable for one tabular file.
- Parameters:
event_manager (EventManager) – Event manager instance
name (str) – Name of the tabular file as a unique identifier.
type_tag (str) – Lowercase short form of the tag to be managed.
- Raises:
HedFileError – On errors such as unmatched onsets or missing definitions.
- property total_events¶
- get_type_value_factors(type_value)[source]¶
Return the HedTypeFactors associated with type_value, or None if there are none.
- Parameters:
type_value (str) – The tag corresponding to the type’s value (such as the name of the condition variable).
- Returns:
Union[HedTypeFactors, None]
- get_type_value_level_info(type_value)[source]¶
Return type variable corresponding to type_value.
- Parameters:
type_value (str)
Returns:
- property type_variables¶
- get_type_factors(type_values=None, factor_encoding='one-hot')[source]¶
Create a dataframe with the indicated type tag values as factors.
FileDictionary¶
- class hed.tools.analysis.file_dictionary.FileDictionary(collection_name, file_list, key_indices=(0, 2), separator='_')[source]¶
Bases:
object
A file dictionary keyed by entity pair indices.
Notes
The entities are identified as 0, 1, … depending on order in the base filename.
The entity key-value pairs are assumed separated by ‘_’ unless a separator is provided.
- __init__(collection_name, file_list, key_indices=(0, 2), separator='_')[source]¶
Create a dictionary with full paths as values.
- Parameters:
Notes
This dictionary is used for cross listing BIDS style files for different studies.
Examples
If key_indices is (0, 2), the key generated for /tmp/sub-001_task-FaceCheck_run-01_events.tsv is sub-001_run-01.
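The key construction described above can be sketched in a few lines. This assumes the key is simply the selected separator-delimited pieces of the base filename, rejoined with the same separator. The `make_key` helper is illustrative and may differ from the library's internals.

```python
import os


def make_key(file_path, key_indices=(0, 2), separator="_"):
    """Join the entity pieces at key_indices taken from the base filename."""
    pieces = os.path.basename(file_path).split(separator)
    return separator.join(pieces[index] for index in key_indices)


def make_file_dict(file_list, key_indices=(0, 2), separator="_"):
    """Map each generated key to the path it came from."""
    return {make_key(path, key_indices, separator): path for path in file_list}


files = [
    "/tmp/sub-001_task-FaceCheck_run-01_events.tsv",
    "/tmp/sub-001_task-FaceCheck_run-02_events.tsv",
]
for key, path in make_file_dict(files).items():
    print(key, "->", path)
```

Choosing indices that omit the task entity, as here, lets files from different tasks or studies line up under the same subject and run key.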
- property name¶
Name of this dictionary.
- property key_list¶
Keys in this dictionary.
- property file_dict¶
Dictionary of path values in this dictionary.
- property file_list¶
List of path values in this dictionary.
- iter_files()[source]¶
Iterator over the files in this dictionary.
- Yields:
tuple – (key, file): the key into the dictionary (str) and the corresponding file path.
- key_diffs(other_dict)[source]¶
Return symmetric key difference with another dict.
- Parameters:
other_dict (FileDictionary)
- Returns:
The symmetric difference of the keys in this dictionary and the other one.
- Return type:
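A symmetric key difference is the set of keys that appear in exactly one of the two dictionaries. The sketch below uses plain dicts as stand-ins for FileDictionary objects; returning a sorted list is an assumption made for readable output.

```python
def key_diffs(this_dict, other_dict):
    """Return the sorted keys that appear in exactly one of the two dictionaries."""
    return sorted(set(this_dict).symmetric_difference(other_dict))


first = {"sub-001_run-01": "/a/f1.tsv", "sub-001_run-02": "/a/f2.tsv"}
second = {"sub-001_run-02": "/b/f2.tsv", "sub-002_run-01": "/b/f3.tsv"}
print(key_diffs(first, second))  # ['sub-001_run-01', 'sub-002_run-01']
```

An empty result means both collections cover the same keys, which is a quick consistency check before cross-listing files.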
- output_files(title=None, logger=None)[source]¶
Return a string with the output of the list.
- Parameters:
title (None, str) – Optional title.
logger (HedLogger) – Optional HED logger for recording.
- Returns:
The dictionary in string form.
- Return type:
str
Notes
The logger is updated if available.
- static make_file_dict(file_list, key_indices=(0, 2), separator='_')[source]¶
Return a dictionary of files using entity keys.
BIDS Tools¶
BidsDataset¶
- class hed.tools.bids.bids_dataset.BidsDataset(root_path, schema=None, suffixes=['events', 'participants'], exclude_dirs=['sourcedata', 'derivatives', 'code', 'stimuli'])[source]¶
Bases:
object
A BIDS dataset representation primarily focused on HED evaluation.
- schema¶
The schema used for evaluation.
- Type:
- __init__(root_path, schema=None, suffixes=['events', 'participants'], exclude_dirs=['sourcedata', 'derivatives', 'code', 'stimuli'])[source]¶
Constructor for a BIDS dataset.
- Parameters:
root_path (str) – Root path of the BIDS dataset.
schema (HedSchema or HedSchemaGroup) – A schema that overrides the one specified in dataset.
suffixes (list or None) – File name suffixes of items to include. If None or empty, then [‘events’, ‘participants’] is assumed.
exclude_dirs (list) – Directories to exclude when searching for files. The signature default is [‘sourcedata’, ‘derivatives’, ‘code’, ‘stimuli’].
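One way to picture how suffixes and exclude_dirs interact is as a filtered directory walk. The sketch below is a standard-library approximation, not the class's own search code: `find_files_by_suffix`, the restriction to `.tsv` and `.json` files, and the temporary demo tree are all assumptions.

```python
import os
import tempfile


def find_files_by_suffix(root_path, suffixes=("events", "participants"),
                         exclude_dirs=("sourcedata", "derivatives", "code", "stimuli")):
    """Group .tsv and .json files under root_path by their BIDS suffix."""
    found = {suffix: [] for suffix in suffixes}
    for dirpath, dirnames, filenames in os.walk(root_path):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [name for name in dirnames if name not in exclude_dirs]
        for filename in sorted(filenames):
            stem, ext = os.path.splitext(filename)
            if ext not in (".tsv", ".json"):
                continue
            # The BIDS suffix is the last underscore-separated piece of the stem.
            suffix = stem.split("_")[-1]
            if suffix in found:
                found[suffix].append(os.path.join(dirpath, filename))
    return found


# Build a tiny throwaway tree to show the effect of exclude_dirs.
with tempfile.TemporaryDirectory() as root:
    for relative in ("participants.tsv",
                     "sub-01/sub-01_task-x_events.tsv",
                     "derivatives/sub-01_task-x_events.tsv"):
        path = os.path.join(root, relative)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    groups = find_files_by_suffix(root)
    counts = {suffix: len(paths) for suffix, paths in groups.items()}
print(counts)
```

The events file under `derivatives` is never visited, so only one events file is counted.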
- get_file_group(suffix)[source]¶
Return the file group of files with the specified suffix.
- Parameters:
suffix (str) – Suffix of the BidsFileGroup to be returned.
- Returns:
The requested tabular group.
- Return type:
Union[BidsFileGroup, None]
- validate(check_for_warnings=False, schema=None)[source]¶
Validate the dataset.
- Parameters:
check_for_warnings (bool) – If True, check for warnings.
schema (HedSchema or HedSchemaGroup or None) – The schema used for validation.
- Returns:
List of issues encountered during validation. Each issue is a dictionary.
- Return type:
list[dict]
BidsFile¶
- class hed.tools.bids.bids_file.BidsFile(file_path)[source]¶
Bases:
object
A BIDS file with entity dictionary.
Notes
This class may hold the merged sidecar giving metadata for this file as well as contents.
- __init__(file_path)[source]¶
Constructor for a file path.
- Parameters:
file_path (str) – Full path of the file.
- property contents¶
Return the current contents of this object.
- get_key(entities=None)[source]¶
Return a key for this BIDS file given a list of entities.
- Parameters:
entities (tuple) – A tuple of strings representing entities.
- Returns:
A key based on this object.
- Return type:
Notes
If entities is None, then the file path is used as the key.
- set_contents(content_info=None, overwrite=False)[source]¶
Set the contents of this object.
- Parameters:
content_info (Any) – The contents appropriate for this object, such as a JSON dictionary.
overwrite (bool) – If False and the contents are not empty, do nothing.
Notes
The contents are not replaced if they are already set and overwrite is False.
BidsFileGroup¶
- class hed.tools.bids.bids_file_group.BidsFileGroup(root_path, file_list, suffix='events')[source]¶
Bases:
object
Container for BIDS files with a specified suffix.
- suffix¶
The file suffix specifying the class of file represented in this group (e.g., events).
- Type:
- sidecar_dir_dict¶
Dictionary whose keys are directory paths and whose values are lists of sidecars in the corresponding directory.
- Type:
- summarize(value_cols=None, skip_cols=None)[source]¶
Return a TabularSummary of the group files.
- Parameters:
- Returns:
A summary of the number of values in different columns if tabular group.
- Return type:
Union[TabularSummary, None]
Notes
The columns that are not value_cols or skip_cols are summarized by counting
the number of times each unique value appears in that column.
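The counting rule in the note can be sketched with `collections.Counter`. Here a list of row dictionaries stands in for a tabular file, and `summarize_rows` is a hypothetical helper. How value columns are summarized is an assumption (this sketch just counts their entries).

```python
from collections import Counter


def summarize_rows(rows, value_cols=(), skip_cols=()):
    """Count unique values per categorical column and entries per value column."""
    categorical = {}
    value_info = {}
    for row in rows:
        for column, value in row.items():
            if column in skip_cols:
                continue  # Skipped columns are ignored entirely.
            if column in value_cols:
                value_info[column] = value_info.get(column, 0) + 1
            else:
                categorical.setdefault(column, Counter())[value] += 1
    return categorical, value_info


rows = [
    {"onset": "0.5", "trial_type": "go", "response_time": "0.31"},
    {"onset": "1.5", "trial_type": "stop", "response_time": "n/a"},
    {"onset": "2.5", "trial_type": "go", "response_time": "0.28"},
]
categorical, value_info = summarize_rows(
    rows, value_cols=["response_time"], skip_cols=["onset"]
)
print(dict(categorical["trial_type"]))  # {'go': 2, 'stop': 1}
print(value_info)  # {'response_time': 3}
```

Columns such as onset, where nearly every value is unique, are the typical candidates for skip_cols.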
- validate(hed_schema, extra_def_dicts=None, check_for_warnings=False)[source]¶
Validate the sidecars and datafiles and return a list of issues.
- Parameters:
hed_schema (HedSchema) – Schema to apply to the validation.
extra_def_dicts (DefinitionDict) – Extra definitions that come from outside.
check_for_warnings (bool) – If True, include warnings in the check.
- Returns:
A list of validation issues found. Each issue is a dictionary.
- Return type:
list[dict]
- validate_sidecars(hed_schema, extra_def_dicts=None, error_handler=None)[source]¶
Validate merged sidecars.
- Parameters:
hed_schema (HedSchema) – HED schema for validation.
extra_def_dicts (DefinitionDict) – Extra definitions.
error_handler (ErrorHandler) – Error handler to use.
- Returns:
A list of validation issues found. Each issue is a dictionary.
- Return type:
list[dict]
- validate_datafiles(hed_schema, extra_def_dicts=None, error_handler=None)[source]¶
Validate the datafiles and return an error list.
- Parameters:
hed_schema (HedSchema) – Schema to apply to the validation.
extra_def_dicts (DefinitionDict) – Extra definitions that come from outside.
error_handler (ErrorHandler) – Error handler to use.
- Returns:
A list of validation issues found. Each issue is a dictionary.
- Return type:
list[dict]
Notes: This will clear the contents of the datafiles if they were not previously set.
BidsSidecarFile¶
- class hed.tools.bids.bids_sidecar_file.BidsSidecarFile(file_path)[source]¶
Bases:
BidsFile
A BIDS sidecar file.
- __init__(file_path)[source]¶
Construct a BIDS sidecar from a file.
- Parameters:
file_path (str) – The real path of the sidecar.
- is_sidecar_for(obj)[source]¶
Return True if this is a sidecar for obj.
- Parameters:
obj (BidsFile) – A BidsFile object to check.
- Returns:
True if this is a BIDS parent of obj and False otherwise.
- Return type:
bool
Notes
A sidecar is a sidecar for itself.
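Whether a sidecar is a "BIDS parent" of a file can be approximated with the BIDS inheritance idea: the sidecar lives in the file's directory or an ancestor, shares its suffix, and names no entity that conflicts with the file's. The sketch below works on path strings rather than BidsFile objects, and both helpers are hypothetical simplifications rather than the method's actual logic.

```python
from pathlib import PurePosixPath


def split_name(path):
    """Return (entities, suffix) from a name such as sub-01_task-x_events.tsv."""
    stem = PurePosixPath(path).name.split(".")[0]
    pieces = stem.split("_")
    entities = dict(piece.split("-", 1) for piece in pieces[:-1] if "-" in piece)
    return entities, pieces[-1]


def is_sidecar_for(sidecar_path, file_path):
    """Approximate the inheritance check for a JSON sidecar."""
    sidecar_dir = PurePosixPath(sidecar_path).parent
    file_dir = PurePosixPath(file_path).parent
    # The sidecar must sit in the file's directory or one of its ancestors.
    if sidecar_dir != file_dir and sidecar_dir not in file_dir.parents:
        return False
    sidecar_entities, sidecar_suffix = split_name(sidecar_path)
    file_entities, file_suffix = split_name(file_path)
    if sidecar_suffix != file_suffix:
        return False
    # Every entity named by the sidecar must match the file's value for it.
    return all(file_entities.get(key) == value
               for key, value in sidecar_entities.items())


events = "/data/sub-01/sub-01_task-x_events.tsv"
print(is_sidecar_for("/data/task-x_events.json", events))
print(is_sidecar_for("/data/sub-02/sub-02_task-x_events.json", events))
```

Under these rules a path trivially satisfies the check against itself, which is consistent with the note that a sidecar is a sidecar for itself.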
- set_contents(content_info=None, name='unknown', overwrite=False)[source]¶
Set the contents of the sidecar.
- Parameters:
Notes
- The handling of content_info is as follows:
None: This object’s file_path is used.
dict: This is interpreted as a JSON dictionary.
BidsTabularFile¶
- class hed.tools.bids.bids_tabular_file.BidsTabularFile(file_path)[source]¶
Bases:
BidsFile
A BIDS tabular file including its associated sidecar.
- __init__(file_path)[source]¶
Constructor for a BIDS tabular file.
- Parameters:
file_path (str) – Path of the tabular file.
- set_contents(content_info=None, overwrite=False)[source]¶
Set the contents of this tabular file (a TabularInput object). Its sidecar should already be set.
- Parameters:
content_info (None) – This always uses the internal file_path to create the contents.
overwrite (bool) – If False (the default), do not overwrite existing contents if any.
BIDS Utilities¶
- hed.tools.bids.bids_util.parse_bids_filename(file_path)[source]¶
Split a filename into BIDS-relevant components.
- Parameters:
file_path (str) – Path to be parsed.
- Returns:
Dictionary with keys ‘basename’, ‘suffix’, ‘prefix’, ‘ext’, ‘bad’, and ‘entities’.
- Return type:
dict
Notes
Splits into BIDS suffix, extension, and a dictionary of entity name-value pairs.
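The following is a simplified sketch of this kind of split, not the function itself. The name `sketch_parse_bids_filename` is hypothetical. Treating a final piece without a hyphen as the suffix, storing the name without its extension under 'basename', and reporting every non-pair piece under 'bad' are assumptions. The 'prefix' key is left as None because its meaning is not described in this section.

```python
import os


def sketch_parse_bids_filename(file_path):
    """Split a path into suffix, extension, and entity name-value pairs."""
    basename = os.path.basename(file_path)
    stem, _, rest = basename.partition(".")
    ext = "." + rest if rest else ""
    pieces = stem.split("_")
    result = {"basename": stem, "suffix": None, "prefix": None,
              "ext": ext, "bad": [], "entities": {}}
    # Assumption: a final piece without a hyphen is the suffix.
    if pieces and "-" not in pieces[-1]:
        result["suffix"] = pieces.pop()
    for piece in pieces:
        name, sep, value = piece.partition("-")
        if sep and name and value:
            result["entities"][name] = value
        else:
            # Assumption: anything that is not a name-value pair is reported as bad.
            result["bad"].append(piece)
    return result


parsed = sketch_parse_bids_filename("/tmp/sub-001_task-FaceCheck_run-01_events.tsv")
print(parsed["suffix"], parsed["ext"], parsed["entities"])
```

Collecting malformed pieces in a list, rather than raising, lets a caller decide how strict to be about nonconforming file names.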