Models

Core data models for working with HED data structures.

Core Models

The fundamental data structures for HED annotations and tags.

HedString

class hed.models.hed_string.HedString(hed_string, hed_schema, def_dict=None, _contents=None)[source]

Bases: HedGroup

A HED string with its schema and definitions.

OPENING_GROUP_CHARACTER = '('
CLOSING_GROUP_CHARACTER = ')'
__init__(hed_string, hed_schema, def_dict=None, _contents=None)[source]

Constructor for the HedString class.

Parameters:
  • hed_string (str) – A HED string consisting of tags and tag groups.

  • hed_schema (HedSchema) – The schema to use to identify tags.

  • def_dict (DefinitionDict or None) – The def dict to use to identify def/def expand tags.

  • _contents ([HedGroup and/or HedTag] or None) – Create a HedString from this exact list of children. Does not make a copy.

Notes

  • The HedString object parses its component tags and groups into a tree-like structure.

static from_hed_strings(hed_strings) HedString[source]

Create a new HedString from a list of HedStrings.

Parameters:

hed_strings (list or None) – A list of HedString objects to combine. This takes ownership of their children.

Returns:

The newly combined HedString.

Return type:

HedString

property is_group

Always False since the underlying string is not a group with parentheses.

copy() HedString[source]

Return a deep copy of this string.

Returns:

The copied group.

Return type:

HedString

remove_definitions()[source]

Remove definition tags and groups from this string.

This does not validate definitions and will blindly remove invalid ones as well.

shrink_defs() HedString[source]

Replace def-expand tags with def tags.

This does not validate them and will blindly shrink invalid ones as well.

Returns:

self

Return type:

HedString

expand_defs() HedString[source]

Replace def tags with def-expand tags.

This does very minimal validation.

Returns:

self

Return type:

HedString

get_as_original() str[source]

Return the original form of this string.

Returns:

The string with all the tags in their original form.

Return type:

str

Notes

The returned string may have some extraneous spaces removed.

static split_into_groups(hed_string, hed_schema, def_dict=None) list[source]

Split the HED string into a parse tree.

Parameters:
  • hed_string (str) – A HED string consisting of tags and tag groups to be processed.

  • hed_schema (HedSchema) – HED schema to use to identify tags.

  • def_dict (DefinitionDict) – The definitions to identify.

Returns:

A list of HedTag and/or HedGroup.

Return type:

list

Raises:

ValueError – If the string is significantly malformed, such as mismatched parentheses.

Notes

  • The parse tree consists of tag groups, tags, and delimiters.
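The nesting described for split_into_groups can be illustrated with a minimal, schema-free sketch. It is not the library's implementation: the function name parse_groups is hypothetical, tags are kept as plain strings rather than HedTag objects, groups are plain lists rather than HedGroup objects, and delimiters are dropped instead of being kept in the tree.

```python
def parse_groups(hed_string):
    """Parse a HED-like string into nested lists (hypothetical stand-in for HedGroup)."""
    root = []
    stack = [root]   # stack[-1] is the group currently being filled
    token = []       # characters of the tag currently being read

    def flush():
        # Finish the current tag, if any, and add it to the open group.
        text = "".join(token).strip()
        token.clear()
        if text:
            stack[-1].append(text)

    for char in hed_string:
        if char == "(":
            flush()
            group = []
            stack[-1].append(group)
            stack.append(group)
        elif char == ")":
            flush()
            if len(stack) == 1:
                raise ValueError("Unmatched closing parenthesis")
            stack.pop()
        elif char == ",":
            flush()
        else:
            token.append(char)
    flush()
    if len(stack) != 1:
        raise ValueError("Unmatched opening parenthesis")
    return root


tree = parse_groups("Red, (Blue, (Green))")
```

In this sketch a mismatched parenthesis raises ValueError, mirroring the documented behavior for significantly malformed strings.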

static split_hed_string(hed_string) list[tuple[bool, tuple[int, int]]][source]

Split a HED string into delimiters and tags.

Parameters:

hed_string (str) – The HED string to split.

Returns:

A list of tuples where each tuple is (is_hed_tag, (start_pos, end_pos)).

Return type:

list[tuple[bool, tuple[int, int]]]

Notes

  • The tuple format is as follows
    • is_hed_tag (bool): A (possible) HED tag if True, delimiter if not.

    • start_pos (int): Index of start of string in hed_string.

    • end_pos (int): Index of end of string in hed_string.

  • This function does not validate tags or delimiters in any form.
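The tuple format above can be sketched without the library. The following is an illustration only, under stated assumptions: the name split_spans is hypothetical, the delimiters are taken to be commas, parentheses, and the whitespace around tags, and end_pos is treated as an exclusive index. The real method may differ on any of these points.

```python
import re

DELIMITERS = {",", "(", ")"}


def split_spans(hed_string):
    """Return (is_hed_tag, (start_pos, end_pos)) tuples covering the whole string."""
    spans = []
    # Each match is either one delimiter character or a run of non-delimiter text.
    for match in re.finditer(r"[,()]|[^,()]+", hed_string):
        start, end = match.span()
        text = match.group()
        stripped = text.strip()
        if text in DELIMITERS or not stripped:
            spans.append((False, (start, end)))
            continue
        # Whitespace around a tag is reported as delimiter text.
        tag_start = start + (len(text) - len(text.lstrip()))
        tag_end = tag_start + len(stripped)
        if tag_start > start:
            spans.append((False, (start, tag_start)))
        spans.append((True, (tag_start, tag_end)))
        if tag_end < end:
            spans.append((False, (tag_end, end)))
    return spans


source = "Red, (Blue, Duration/3 ms)"
tags = [source[a:b] for is_tag, (a, b) in split_spans(source) if is_tag]
```

As in the documented method, nothing here validates the tags; the spans only locate candidate tags and delimiters in the source string.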

validate(allow_placeholders=True, error_handler=None) list[dict][source]

Validate the string using the schema.

Parameters:
  • allow_placeholders (bool) – Allow placeholders in the string.

  • error_handler (ErrorHandler or None) – The error handler to use, creates a default one if none passed.

Returns:

A list of issues for HED string.

Return type:

list[dict]

find_top_level_tags(anchor_tags, include_groups=2) list[source]

Find top level groups with an anchor tag.

A max of 1 tag located per top level group.

Parameters:
  • anchor_tags (container) – A list/set/etc. of short_base_tags to find groups by.

  • include_groups (0, 1 or 2) – Parameter indicating what return values to include. If 0: return only tags. If 1: return only groups. If 2 or any other value: return both.

Returns:

The returned result depends on include_groups.

Return type:

list

remove_refs()[source]

Remove any refs (tags contained entirely inside curly braces) from the string.

This does NOT validate the contents of the curly braces. This is only relevant when directly editing sidecar strings. Tools will naturally ignore these.
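A minimal sketch of what "contained entirely inside curly braces" means, using plain strings in place of HedTag objects. The helper names and the flat list of tags are assumptions made for illustration; the library operates on the parsed tree.

```python
def is_column_ref(tag_text):
    """Return True if the whole tag is wrapped in curly braces, e.g. '{response}'."""
    text = tag_text.strip()
    return len(text) > 2 and text.startswith("{") and text.endswith("}")


def remove_refs(tags):
    """Return the tags that are not column references; brace contents are not validated."""
    return [tag for tag in tags if not is_column_ref(tag)]


remaining = remove_refs(["Red", "{response}", "Label/{x}"])
```

Here "Label/{x}" is kept because the braces do not enclose the entire tag.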

HedTag

class hed.models.hed_tag.HedTag(hed_string, hed_schema, span=None, def_dict=None)[source]

Bases: object

A single HED tag.

Notes

  • HedTag keeps track of its original value and position, as well as pointers to the corresponding HED schema information when it is available.

__init__(hed_string, hed_schema, span=None, def_dict=None)[source]

Creates a HedTag.

Parameters:
  • hed_string (str) – Source HED string for this tag.

  • hed_schema (HedSchema) – A parameter for calculating canonical forms on creation.

  • span (int, int) – The start and end indexes of the tag in the hed_string.

  • def_dict (DefinitionDict or None) – The def dict to use to identify def/def expand tags.

copy() HedTag[source]

Return a deep copy of this tag.

Returns:

The copied tag.

Return type:

HedTag

property schema_namespace: str

Library namespace for this tag if one exists.

Returns:

The library namespace, including the colon.

Return type:

str

property short_tag: str

Short form including value or extension.

Returns:

The short form of the tag, including value or extension.

Return type:

str

property base_tag: str

Long form without value or extension.

Returns:

The long form of the tag, without value or extension.

Return type:

str

property short_base_tag: str

Short form without value or extension.

Returns:

The short non-extension portion of a tag.

Return type:

str

Notes

  • ParentNodes/Def/DefName would return just “Def”.
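The relationship between the tag forms can be sketched as follows. This is an illustration, not the library's logic: the function tag_forms and the known_nodes set are hypothetical stand-ins for the schema lookup, and the example path is the one used in the note above.

```python
def tag_forms(long_tag, known_nodes):
    """Split a long-form tag into the forms described above.

    known_nodes is a hypothetical stand-in for the schema: the leading path
    components found in it form the base tag; the rest is the value or extension.
    """
    parts = long_tag.split("/")
    base_len = 0
    for part in parts:
        if part not in known_nodes:
            break
        base_len += 1
    base_parts, extension_parts = parts[:base_len], parts[base_len:]
    short_base_tag = base_parts[-1] if base_parts else ""
    extension = "/".join(extension_parts)
    return {
        "long_tag": long_tag,
        "base_tag": "/".join(base_parts),
        "short_base_tag": short_base_tag,
        "extension": extension,
        "short_tag": "/".join(p for p in (short_base_tag, extension) if p),
    }


forms = tag_forms("ParentNodes/Def/DefName", {"ParentNodes", "Def"})
```

With these inputs, forms["short_base_tag"] is "Def" and forms["short_tag"] is "Def/DefName".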

property org_base_tag: str

Original form without value or extension.

Returns:

The original form of the tag, without value or extension.

Return type:

str

Notes

  • Warning: This could be empty if the original tag had a name_prefix prepended, e.g. for a column where “Label/” is prepended, so that the column value has no base portion.

tag_modified() bool[source]

Return True if tag has been modified from original.

Returns:

True if the tag is modified.

Return type:

bool

Notes

  • Modifications can include adding a column name_prefix.

property tag: str

Returns the tag or the original tag if no user form set.

Returns:

The custom set user form of the tag.

Return type:

str

property extension: str

Get the extension or value of tag.

Generally this is just the portion after the last slash. Returns an empty string if no extension or value.

Returns:

The extension or value of the tag.

Return type:

str

Notes

  • This tag must have been computed first.

property long_tag: str

Long form including value or extension.

Returns:

The long form of this tag.

Return type:

str

property org_tag: str

Return the original unmodified tag.

Returns:

The original unmodified tag.

Return type:

str

property expanded: bool

Return True if this tag is currently expanded.

Will always be False unless expandable is set. This is primarily used for Def/Def-expand tags at present.

Returns:

True if this is currently expanded.

Return type:

bool

property expandable: 'HedGroup' | 'HedTag' | None

Return what this expands to.

This is primarily used for Def/Def-expand tags at present.

Lazily set the first time it’s called.

Returns:

The expanded form of this tag.

Return type:

Union[HedGroup,HedTag,None]

is_column_ref() bool[source]

Return True if this tag is a column reference from a sidecar.

You should only see these if you are directly accessing sidecar strings; tools should remove them otherwise.

Returns:

True if this is a column ref.

Return type:

bool

__str__() str[source]

Convert this HedTag to a string.

Returns:

The original tag if a new form has not been set (e.g. short to long).

Return type:

str

lower() str[source]

Convenience function, equivalent to str(self).lower().

casefold() str[source]

Convenience function, equivalent to str(self).casefold().

get_stripped_unit_value(extension_text) tuple[str | None, str | None][source]

Return the extension divided into value and units, if the units are valid.

Parameters:

extension_text (str) – The text to split, in case it’s a portion of a tag.

Returns:

A tuple (value, units): the extension portion with the units removed, or None if the units are invalid; and the units, or None if no units of the right unit class are found.

Return type:

tuple[Union[str, None], Union[str, None]]

Examples

‘Duration/3 ms’ will return (‘3’, ‘ms’)

value_as_default_unit() float | None[source]

Return the value converted to default units if possible or None if invalid.

Returns:

The extension value in default units. If no default units it assumes that the extension value is in default units.

Return type:

Union[float, None]

Examples

‘Duration/300 ms’ will return .3
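The two examples above (splitting off units and converting to default units) can be sketched together. Everything here is an assumption made for illustration: the UNIT_FACTORS table, the function names, and the choice of seconds as the default unit are hypothetical, and the real methods consult the schema's unit classes instead.

```python
# Hypothetical conversion table: unit -> factor to the default unit (seconds).
UNIT_FACTORS = {"s": 1.0, "ms": 0.001}


def split_value_and_unit(extension_text, valid_units):
    """Return (value, unit); (value, None) if no unit; (None, None) if the unit is invalid."""
    value, _, unit = extension_text.partition(" ")
    if not unit:
        return value, None
    if unit not in valid_units:
        return None, None
    return value, unit


def value_as_default_unit(tag_text):
    """Return the numeric value in default units, or None if it cannot be converted."""
    _, _, extension = tag_text.partition("/")
    value, unit = split_value_and_unit(extension, UNIT_FACTORS)
    if value is None:
        return None
    try:
        number = float(value)
    except ValueError:
        return None
    # A value without units is assumed to be in default units already.
    return number * UNIT_FACTORS.get(unit, 1.0)


seconds = value_as_default_unit("Duration/300 ms")
```

In this sketch, seconds is approximately 0.3, and an unrecognized unit yields None rather than an exception.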

property unit_classes: dict

Return a dict of all the unit classes this tag accepts.

Returns:

A dict of unit classes this tag accepts.

Return type:

dict

Notes

  • Returns empty dict if this is not a unit class tag.

  • The dictionary has unit name as the key and HedSchemaEntry as value.

property value_classes: dict

Return a dict of all the value classes this tag accepts.

Returns:

A dictionary of HedSchemaEntry value classes this tag accepts.

Return type:

dict

Notes

  • Returns empty dict if this is not a value class tag.

  • The dictionary has value class name as the key and HedSchemaEntry as value.

property attributes: dict

Return a dict of all the attributes this tag has or empty dict if this is not a value tag.

Returns:

A dict of attributes this tag has.

Return type:

dict

Notes

  • Returns empty dict if this is not a unit class tag.

  • The dictionary has unit name as the key and HedSchemaEntry as value.

tag_exists_in_schema() bool[source]

Return whether the schema entry for this tag exists.

Returns:

True if this tag exists.

Return type:

bool

Notes

  • This does NOT assure this is a valid tag.

is_takes_value_tag() bool[source]

Return True if this is a takes value tag.

Returns:

True if this is a takes value tag.

Return type:

bool

is_unit_class_tag() bool[source]

Return True if this is a unit class tag.

Returns:

True if this is a unit class tag.

Return type:

bool

is_value_class_tag() bool[source]

Return True if this is a value class tag.

Returns:

True if this is a tag with a value class.

Return type:

bool

is_basic_tag() bool[source]

Return True if a known tag with no extension or value.

Returns:

True if this is a known tag without extension or value.

Return type:

bool

has_attribute(attribute) bool[source]

Return True if this is an attribute this tag has.

Parameters:

attribute (str) – Name of the attribute.

Returns:

True if this tag has the attribute.

Return type:

bool

get_tag_unit_class_units() list[source]

Get the unit class units associated with a particular tag.

Returns:

A list containing the unit class units associated with a particular tag or an empty list.

Return type:

list

property default_unit

Get the default unit class unit for this tag.

Only a tag with a single unit class can have default units.

Returns:

The default unit entry for this tag, or None.

Return type:

UnitEntry or None

base_tag_has_attribute(tag_attribute) bool[source]

Check to see if the tag has a specific attribute.

This is primarily used to check for things like TopLevelTag on Definitions and similar.

Parameters:

tag_attribute (str) – A tag attribute.

Returns:

True if the tag has the specified attribute, False otherwise.

Return type:

bool

is_placeholder() bool[source]

Return True if this tag has a placeholder in it.

Returns:

True if it has a placeholder.

Return type:

bool

replace_placeholder(placeholder_value)[source]

If the tag has a placeholder character (#), replace it with the value.

Parameters:

placeholder_value (str) – Value to replace placeholder with.

get_normalized_str()[source]

HedGroup

class hed.models.hed_group.HedGroup(hed_string='', startpos=None, endpos=None, contents=None)[source]

Bases: object

A single parenthesized HED string.

__init__(hed_string='', startpos=None, endpos=None, contents=None)[source]

Return an empty HedGroup object.

Parameters:
  • hed_string (str or None) – Source HED string for this group.

  • startpos (int or None) – Starting index of group (including parentheses) in hed_string.

  • endpos (int or None) – Position after the end (including parentheses) in hed_string.

  • contents (list or None) – A list of HedTags and/or HedGroups that will be set as the contents of this group. Mostly used during definition expansion.

append(tag_or_group)[source]

Add a tag or group to this group.

Parameters:

tag_or_group (HedTag or HedGroup) – The new object to add to this group.

check_if_in_original(tag_or_group) bool[source]

Check if the tag or group is in the original string.

Parameters:

tag_or_group (HedTag or HedGroup) – The HedTag or HedGroup to be looked for in this group.

Returns:

True if in this group.

Return type:

bool

static replace(item_to_replace, new_contents)[source]

Replace an existing tag or group.

Note: This is a static method that relies on the parent attribute of item_to_replace.

Parameters:
  • item_to_replace (HedTag or HedGroup) – The item to replace must exist or this will raise an error.

  • new_contents (HedTag or HedGroup) – Replacement contents.

Raises:

remove(items_to_remove: Iterable[HedTag | HedGroup])[source]

Remove any tags/groups in items_to_remove.

Parameters:

items_to_remove (list) – List of HedGroups and/or HedTags to remove by identity.

Notes

  • Any groups that become empty will also be pruned.

  • If you pass a child and parent group, the child will also be removed from the parent.
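Removal by identity, together with pruning of emptied groups, can be sketched with nested lists standing in for HedGroup objects. The Tag class and remove_items function are hypothetical and only illustrate the notes above.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """Hypothetical stand-in for HedTag; two instances can be equal but distinct."""
    name: str


def remove_items(group, items_to_remove):
    """Remove items from a nested list in place, matching by identity, not equality."""
    ids_to_remove = {id(item) for item in items_to_remove}
    kept = []
    for child in group:
        if id(child) in ids_to_remove:
            continue
        if isinstance(child, list):
            remove_items(child, items_to_remove)
            if not child:
                continue  # prune subgroups left empty
        kept.append(child)
    group[:] = kept


first, second = Tag("Red"), Tag("Red")
inner = [second]
top = [first, inner]
remove_items(top, [second])
```

Although first and second compare equal, only second is removed, and the subgroup that held it is pruned because it became empty.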

copy() HedGroup[source]

Return a deep copy of this group.

Returns:

The copied group.

Return type:

HedGroup

sort()[source]

Sort the tags and groups in this HedGroup in a consistent order.

sorted() HedGroup[source]

Return a sorted copy of this HED group.

Returns:

The sorted copy.

Return type:

HedGroup

property is_group

True if this is a parenthesized group.

get_all_tags() list[source]

Return HedTags, including descendants.

Returns:

A list of all the tags in this group including descendants.

Return type:

list

get_all_groups(also_return_depth=False) list[source]

Return HedGroups, including descendants and self.

Parameters:

also_return_depth (bool) – If True, yield tuples (group, depth) rather than just groups.

Returns:

The list of all HedGroups in this group, including descendants and self.

Return type:

list

tags() list[source]

Return the direct child tags of this group.

Returns:

All tags directly in this group, filtering out HedGroup children.

Return type:

list

groups() list[source]

Return the direct child groups of this group.

Returns:

All groups directly in this group, filtering out HedTag children.

Return type:

list

get_first_group() HedGroup[source]

Return the first group in this HED string or group.

Useful for things like Def-expand where they only have a single group.

Returns:

The first group.

Return type:

HedGroup

Raises:

ValueError – If there are no groups.

get_original_hed_string() str[source]

Get the original HED string.

Returns:

The original string with no modification.

Return type:

str

property span: tuple[int, int]

Return the source span.

Returns:

The start and end index of the group (including parentheses) in the source string.

Return type:

tuple[int, int]

__str__() str[source]

Convert this HedGroup to a string.

Returns:

The group as a string, including any modified HedTags.

Return type:

str

get_as_short() str[source]

Return this HedGroup as a short tag string.

Returns:

The group as a string with all tags as short tags.

Return type:

str

get_as_long() str[source]

Return this HedGroup as a long tag string.

Returns:

The group as a string with all tags as long tags.

Return type:

str

get_as_form(tag_attribute) str[source]

Get the string corresponding to the specified form.

Parameters:

tag_attribute (str) – The hed_tag property to use to construct the string (usually short_tag or long_tag).

Returns:

The constructed string after transformation.

Return type:

str
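Since tag_attribute names a property of each tag, the idea behind get_as_form can be sketched with getattr. The Tag class below is a hypothetical stand-in with only two of the forms, the tag values are illustrative, and group parentheses are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """Hypothetical stand-in exposing two of the HedTag string forms."""
    short_tag: str
    long_tag: str


def get_as_form(tags, tag_attribute):
    """Join the tags using the named attribute of each one."""
    return ", ".join(getattr(tag, tag_attribute) for tag in tags)


tags = [
    Tag("Sensory-event", "Event/Sensory-event"),
    Tag("Agent-action", "Event/Agent-action"),
]
short_form = get_as_form(tags, "short_tag")
long_form = get_as_form(tags, "long_tag")
```

An attribute name that the tag does not have raises AttributeError in this sketch.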

lower()[source]

Convenience function, equivalent to str(self).lower().

casefold()[source]

Convenience function, equivalent to str(self).casefold().

get_as_indented(tag_attribute='short_tag') str[source]

Return the string as a multiline indented format.

Parameters:

tag_attribute (str) – The hed_tag property to use to construct the string (usually short_tag or long_tag).

Returns:

The indented string.

Return type:

str

find_placeholder_tag() HedTag | None[source]

Return a placeholder tag, if present in this group.

Returns:

The placeholder tag if found.

Return type:

Union[HedTag, None]

Notes

  • Assumes a valid HedString with no erroneous “#” characters.

__eq__(other)[source]

Test whether other is equal to this object.

Note: This does not account for sorting. Objects must be in the same order to match.

find_tags(search_tags, recursive=False, include_groups=2) list[source]

Find the base tags and their containing groups. This searches by short_base_tag, ignoring any ancestors or extensions/values.

Parameters:
  • search_tags (container) – A container of short_base_tags to locate.

  • recursive (bool) – If true, also check subgroups.

  • include_groups (0, 1 or 2) – Specify return values. If 0: return a list of the HedTags. If 1: return a list of the HedGroups containing the HedTags. If 2: return a list of tuples (HedTag, HedGroup) for the found tags.

Returns:

The contents of the list depends on the value of include_groups.

Return type:

list
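The effect of recursive and include_groups can be sketched with a small stand-in model. The Tag class, the use of plain lists for groups, and the case-insensitive comparison are assumptions made for illustration, not the library's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for HedTag."""
    short_base_tag: str
    extension: str = ""


def find_tags(group, search_tags, recursive=False, include_groups=2):
    """Find tags by short_base_tag in a nested list of Tag objects."""
    wanted = {name.casefold() for name in search_tags}
    found = []
    pending = [group]
    while pending:
        current = pending.pop(0)
        for child in current:
            if isinstance(child, list):
                if recursive:
                    pending.append(child)  # only descend when asked to
            elif child.short_base_tag.casefold() in wanted:
                found.append((child, current))
    if include_groups == 0:
        return [tag for tag, _ in found]
    if include_groups == 1:
        return [containing for _, containing in found]
    return found


inner = [Tag("Def", "MyDef")]
top = [Tag("Red"), inner]
pairs = find_tags(top, ["Def"], recursive=True)
```

With recursive=False the nested Def tag is not found, because only the direct children of the group are checked.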

find_wildcard_tags(search_tags, recursive=False, include_groups=2) list[source]

Find the tags and their containing groups.

This searches tag.short_tag.casefold(), with an implicit wildcard on the end.

e.g. “Eve” will find Event, but not Sensory-event.

Parameters:
  • search_tags (container) – A container of the starts of short tags to search.

  • recursive (bool) – If True, also check subgroups.

  • include_groups (0, 1 or 2) – Specify return values. If 0: return a list of the HedTags. If 1: return a list of the HedGroups containing the HedTags. If 2: return a list of tuples (HedTag, HedGroup) for the found tags.

Returns:

The contents of the list depends on the value of include_groups.

Return type:

list
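The implicit trailing wildcard amounts to a case-insensitive prefix match on the short tag. A minimal sketch on plain strings, with a hypothetical function name:

```python
def find_wildcard(short_tags, search_prefixes):
    """Return the short tags that start with any of the given prefixes, ignoring case."""
    prefixes = tuple(prefix.casefold() for prefix in search_prefixes)
    return [tag for tag in short_tags if tag.casefold().startswith(prefixes)]


matches = find_wildcard(["Event", "Sensory-event"], ["Eve"])
```

Only "Event" matches, because "Sensory-event" contains the search text but does not start with it.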

find_exact_tags(exact_tags, recursive=False, include_groups=1) list[source]

Find the given tags. This will only find complete matches, any extension or value must also match.

Parameters:
  • exact_tags (list of HedTag) – A container of tags to locate.

  • recursive (bool) – If true, also check subgroups.

  • include_groups (0, 1 or 2) – Specify return values. If 0: Return only tags. If 1: Return only groups. If 2 or any other value: Return both.

Returns:

A list whose contents depend on the value of include_groups.

Return type:

list

find_def_tags(recursive=False, include_groups=3) list[source]

Find def and def-expand tags.

Parameters:
  • recursive (bool) – If true, also check subgroups.

  • include_groups (int, 0, 1, 2, 3) – Options for return values. If 0: Return only def and def-expand tags. If 1: Return only def tags and def-expand groups. If 2: Return only groups containing defs, or def-expand groups. If 3 or any other value: Return all 3 as a tuple.

Returns:

A list whose contents depend on the value of include_groups.

Return type:

list

find_tags_with_term(term, recursive=False, include_groups=2) list[source]

Find any tags that contain the given term.

Note: This can only find identified tags.

Parameters:
  • term (str) – A single term to search for.

  • recursive (bool) – If true, recursively check subgroups.

  • include_groups (0, 1 or 2) – Controls return values. If 0: Return only tags. If 1: Return only groups. If 2 or any other value: Return both.

Returns:

A list whose contents depend on the value of include_groups.

Return type:

list

DefinitionDict

class hed.models.definition_dict.DefinitionDict(def_dicts=None, hed_schema=None)[source]

Bases: object

Gathers definitions from a single source.

__init__(def_dicts=None, hed_schema=None)[source]

Definitions to be considered a single source.

Parameters:
  • def_dicts (str or list or DefinitionDict) – DefDict or list of DefDicts/strings or a single string whose definitions should be added.

  • hed_schema (HedSchema or None) – Required if passing strings or lists of strings, unused otherwise.

Raises:

TypeError – Bad type passed as def_dicts.

add_definitions(def_dicts, hed_schema=None)[source]

Add definitions from dict(s) or string(s) to this dict.

Parameters:
  • def_dicts (list, DefinitionDict, dict, or str) – DefinitionDict or list of DefinitionDicts/strings/dicts whose definitions should be added.

  • hed_schema (HedSchema or None) – Required if passing strings or lists of strings, unused otherwise.

Notes

  • The dict form expects DefinitionEntries in the same form as a DefinitionDict.

  • A str or list of strings is parsed using the hed_schema.

  • Types can be mixed, e.g. [DefinitionDict, str, list of str] is valid input.

Raises:

TypeError – Bad type passed as def_dicts.

get(def_name) DefinitionEntry | None[source]

Get the definition entry for the definition name.

The lookup is not case-sensitive.

Parameters:

def_name (str) – Name of the definition to retrieve.

Returns:

Definition entry for the requested definition.

Return type:

Union[DefinitionEntry, None]
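A case-insensitive lookup of this kind is commonly built by normalizing keys on both insert and lookup. The class below is a hypothetical sketch of that idea using str.casefold; it is not the DefinitionDict implementation, and plain strings stand in for DefinitionEntry objects.

```python
class CaseInsensitiveDefs:
    """Hypothetical store that normalizes definition names with casefold()."""

    def __init__(self):
        self._defs = {}

    def add(self, def_name, entry):
        self._defs[def_name.casefold()] = entry

    def get(self, def_name):
        # Returns None when the name is unknown, like dict.get.
        return self._defs.get(def_name.casefold())


defs = CaseInsensitiveDefs()
defs.add("MyDef", "(Red, Blue)")
entry = defs.get("mydef")
```

Here entry is "(Red, Blue)" even though the lookup used a different capitalization.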

items()[source]

Return the dictionary of definitions.

Alias for .defs.items()

Returns:

A list of definitions.

Return type:

def_entries ({str: DefinitionEntry})

property issues

Return issues about duplicate definitions.

check_for_definitions(hed_string_obj, error_handler=None) list[dict][source]

Check string for definition tags, adding them to self.

Parameters:
  • hed_string_obj (HedString) – A single HED string to gather definitions from.

  • error_handler (ErrorHandler or None) – Error context used to identify where definitions are found.

Returns:

List of issues encountered in checking for definitions. Each issue is a dictionary.

Return type:

list[dict]

get_definition_entry(def_tag)[source]

Get the entry for a given def tag.

Does not validate at all.

Parameters:

def_tag (HedTag) – Source HED tag that may be a Def or Def-expand tag.

Returns:

The definition entry if it exists.

Return type:

DefinitionEntry or None

static get_as_strings(def_dict) dict[str, str][source]

Convert the entries to strings of the contents.

Parameters:

def_dict (dict) – A dict of definitions.

Returns:

Definition name and contents.

Return type:

dict[str,str]

Input Models

Models for handling different types of input data.

BaseInput

class hed.models.base_input.BaseInput(file, file_type=None, worksheet_name=None, has_column_names=True, mapper=None, name=None, allow_blank_names=True)[source]

Bases: object

Superclass representing a basic columnar file.

TEXT_EXTENSION = ['.tsv', '.txt']
EXCEL_EXTENSION = ['.xlsx']
__init__(file, file_type=None, worksheet_name=None, has_column_names=True, mapper=None, name=None, allow_blank_names=True)[source]

Constructor for the BaseInput class.

Parameters:
  • file (str or file-like or pd.DataFrame) – An xlsx/tsv file to open.

  • file_type (str or None) – “.xlsx” (Excel), “.tsv” or “.txt” (tab-separated text). Derived from file if file is a filename. Ignored if pandas dataframe.

  • worksheet_name (str or None) – Name of the Excel worksheet to use. (Not applicable to tsv files.)

  • has_column_names (bool) – True if file has column names. This value is ignored if you pass in a pandas dataframe.

  • mapper (ColumnMapper or None) – Indicates which columns have HED tags. See SpreadsheetInput or TabularInput for examples of how to use a built-in ColumnMapper.

  • name (str or None) – Optional field for how this file will report errors.

  • allow_blank_names (bool) – If True, column names can be blank.

Raises:

HedFileError – For various issues.

Notes: Reasons for raising HedFileError include:
  • file is blank.

  • An invalid dataframe was passed with size 0.

  • An invalid extension was provided.

  • A duplicate or empty column name appears.

  • Cannot open the indicated file.

  • The specified worksheet name does not exist.

  • The sidecar file or tabular file had an invalid format and could not be read.

reset_mapper(new_mapper)[source]

Set mapper to a different view of the file.

Parameters:

new_mapper (ColumnMapper) – A column mapper to be associated with this base input.

property dataframe

The underlying dataframe.

property dataframe_a: DataFrame

Return the assembled dataframe. Probably a placeholder name.

Returns:

The assembled dataframe.

Return type:

pd.DataFrame

property series_a: Series

Return the assembled dataframe as a series.

Returns:

The assembled dataframe with columns merged.

Return type:

pd.Series

property series_filtered: Series | None

Return the assembled dataframe as a series, with rows that have the same onset combined.

Returns:

The assembled dataframe with columns merged and rows that share an onset combined.

Return type:

Union[pd.Series, None]

property onsets

Return the onset column if it exists.

property needs_sorting: bool

Return True if this has an onset column and it needs sorting.

property name: str

Name of the data.

property has_column_names: bool

True if dataframe has column names.

property loaded_workbook

The underlying loaded workbooks.

property worksheet_name

The worksheet name.

convert_to_form(hed_schema, tag_form)[source]

Convert all tags in underlying dataframe to the specified form.

Parameters:
  • hed_schema (HedSchema) – The schema to use to convert tags.

  • tag_form (str) – HedTag property to convert tags to. Most cases should use convert_to_short or convert_to_long below.

convert_to_short(hed_schema)[source]

Convert all tags in underlying dataframe to short form.

Parameters:

hed_schema (HedSchema) – The schema to use to convert tags.

convert_to_long(hed_schema)[source]

Convert all tags in underlying dataframe to long form.

Parameters:

hed_schema (HedSchema or None) – The schema to use to convert tags.

shrink_defs(hed_schema)[source]

Shrinks any def-expand found in the underlying dataframe.

Parameters:

hed_schema (HedSchema or None) – The schema to use to identify defs.

expand_defs(hed_schema, def_dict)[source]

Expands any def tags found in the underlying dataframe.

Parameters:
  • hed_schema (HedSchema or None) – The schema to use to identify defs.

  • def_dict (DefinitionDict) – The definitions to expand.

to_excel(file)[source]

Output to an Excel file.

Parameters:

file (str or file-like) – Location to save this base input.

Raises:
  • ValueError – If empty file object was passed.

  • OSError – If the file cannot be opened.

to_csv(file=None) str | None[source]

Write to file or return as a string.

Parameters:

file (str, file-like, or None) – Location to save this file. If None, return as string.

Returns:

None if file is given or the contents as a str if file is None.

Return type:

Union[str, None]

Raises:

OSError – If the file cannot be opened.

property columns: list[str]

Returns a list of the column names.

Empty if no column names.

Returns:

The column names.

Return type:

list

column_metadata() dict[int, ColumnMetadata][source]

Return the metadata for each column.

Returns:

Number/ColumnMetadata pairs.

Return type:

dict[int, ColumnMetadata]

set_cell(row_number, column_number, new_string_obj, tag_form='short_tag')[source]

Replace the specified cell with transformed text.

Parameters:
  • row_number (int) – The row number of the spreadsheet to set.

  • column_number (int) – The column number of the spreadsheet to set.

  • new_string_obj (HedString) – Object with text to put in the given cell.

  • tag_form (str) – Version of the tags (short_tag, long_tag, base_tag, etc.)

Notes

Any attribute of a HedTag that returns a string is a valid value of tag_form.

Raises:
  • ValueError – If there is not a loaded dataframe.

  • KeyError – If the indicated row/column does not exist.

  • AttributeError – If the indicated tag_form is not an attribute of HedTag.

get_worksheet(worksheet_name=None) Workbook | None[source]

Get the requested worksheet.

Parameters:

worksheet_name (str or None) – The name of the requested worksheet, or None for the first worksheet.

Returns:

The requested worksheet.

Return type:

Union[openpyxl.workbook.Workbook, None]

Notes

If None, returns the first worksheet.

Raises:

KeyError – If the specified worksheet name does not exist.

validate(hed_schema, extra_def_dicts=None, name=None, error_handler=None) list[dict][source]

Creates a SpreadsheetValidator and returns all issues with this file.

Parameters:
  • hed_schema (HedSchema) – The schema to use for validation.

  • extra_def_dicts (list of DefDict or DefDict) – All definitions to use for validation.

  • name (str) – The name to report errors from this file as.

  • error_handler (ErrorHandler) – Error context to use. Creates a new one if None.

Returns:

A list of issues for a HED string.

Return type:

list[dict]

assemble(mapper=None, skip_curly_braces=False) DataFrame[source]

Assembles the HED strings.

Parameters:
  • mapper (ColumnMapper or None) – Generally pass none here unless you want special behavior.

  • skip_curly_braces (bool) – If True, don’t plug in curly brace values into columns.

Returns:

The assembled dataframe.

Return type:

pd.DataFrame

static combine_dataframe(dataframe) Series[source]

Combine all columns in the given dataframe into a single HED string series, skipping empty columns and columns with empty strings.

Parameters:

dataframe (pd.DataFrame) – The dataframe to combine.

Returns:

The assembled series.

Return type:

pd.Series
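The combining step can be sketched without pandas by treating each row as a list of cell strings. This is an assumption-laden illustration: combine_rows is a hypothetical name, the ", " separator is assumed, and only empty strings are skipped here.

```python
def combine_rows(rows):
    """Join the non-empty cells of each row into one HED-like string per row."""
    return [", ".join(cell for cell in row if cell) for row in rows]


combined = combine_rows([
    ["Red", "", "Blue"],
    ["", "", ""],
])
```

A row whose cells are all empty produces an empty string rather than a string of separators.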

get_def_dict(hed_schema, extra_def_dicts=None) DefinitionDict[source]

Return the definition dict for this file.

Note: Baseclass implementation returns just extra_def_dicts.

Parameters:
  • hed_schema (HedSchema) – Identifies tags to find definitions (if needed).

  • extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.

Returns:

A single definition dict representing all the data (and extra def dicts).

Return type:

DefinitionDict

get_column_refs() list[source]

Return a list of column refs for this file.

Default implementation returns empty list.

Returns:

A list of unique column refs found.

Return type:

list

Sidecar

class hed.models.sidecar.Sidecar(files, name=None)[source]

Bases: object

Contents of a JSON file or JSON files.

__init__(files, name=None)[source]

Construct a Sidecar object representing a JSON file.

Parameters:
  • files (str or FileLike or list) – A string or file-like object representing a JSON file, or a list of such.

  • name (str or None) – Optional name identifying this sidecar, generally a filename.

__iter__()[source]

An iterator to go over the individual column metadata.

Returns:

An iterator over the column metadata values.

Return type:

iterator

property all_hed_columns: list[str]

Return all columns that are HED compatible.

Returns:

A list of all valid HED columns by name.

Return type:

list

property def_dict: DefinitionDict

Definitions from this sidecar.

Generally you should instead call get_def_dict to get the relevant definitions.

Returns:

The definitions for this sidecar.

Return type:

DefinitionDict

property column_data

Generate the ColumnMetadata for this sidecar.

Returns:

The column metadata defined by this sidecar.

Return type:

dict ({str: ColumnMetadata})

get_def_dict(hed_schema, extra_def_dicts=None) DefinitionDict[source]

Return the definition dict for this sidecar.

Parameters:
  • hed_schema (HedSchema) – Identifies tags to find definitions.

  • extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.

Returns:

A single definition dict representing all the data (and extra def dicts).

Return type:

DefinitionDict

save_as_json(save_filename)[source]

Save column metadata to a JSON file.

Parameters:

save_filename (str) – Path to save file.

get_as_json_string() str[source]

Return this sidecar’s column metadata as a string.

Returns:

The json string representing this sidecar.

Return type:

str

load_sidecar_file(file)[source]

Load column metadata from a given json file.

Parameters:

file (str or FileLike) – If a string, this is a filename. Otherwise, it will be parsed as a file-like.

Raises:

HedFileError – If the file was not found or could not be parsed into JSON.

load_sidecar_files(files)[source]

Load json from a given file or list.

Parameters:

files (str or FileLike or list) – A string or file-like object representing a JSON file, or a list of such.

Raises:

HedFileError – If the file was not found or could not be parsed into JSON.

validate(hed_schema, extra_def_dicts=None, name=None, error_handler=None) list[dict][source]

Create a SidecarValidator and validate this sidecar with the schema.

Parameters:
  • hed_schema (HedSchema) – The schema to validate against.

  • extra_def_dicts (list or DefinitionDict) – Extra def dicts in addition to sidecar.

  • name (str) – The name to report this sidecar as.

  • error_handler (ErrorHandler) – Error context to use. Creates a new one if None.

Returns:

A list of issues associated with each level in the HED string.

Return type:

list[dict]

extract_definitions(hed_schema, error_handler=None) DefinitionDict[source]

Gather and validate definitions in metadata.

Parameters:
  • hed_schema (HedSchema) – The schema used to identify tags.

  • error_handler (ErrorHandler or None) – The error handler to use for context, uses a default one if None.

Returns:

Contains all the definitions located in the sidecar.

Return type:

DefinitionDict

get_column_refs() list[str][source]

Return a list of column refs found in this sidecar.

This does not validate the refs.

Returns:

A list of unique column refs found.

Return type:

list[str]
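For orientation, a column ref in a HED sidecar is a column name written in curly braces inside a HED string, such as {stim_color}. The sketch below collects the unique names with a regular expression. It is not the library's method: find_column_refs is a hypothetical helper, the pattern is an assumption about the syntax, and, like the method above, it performs no validation.

```python
import re


def find_column_refs(hed_strings):
    """Return the unique curly-brace column names in first-seen order."""
    seen = {}  # dict keys preserve insertion order and drop duplicates
    for text in hed_strings:
        # Assumption: a ref is any brace-free text between one pair of braces.
        for name in re.findall(r"\{([^{}]+)\}", text):
            seen.setdefault(name.strip(), None)
    return list(seen)


refs = find_column_refs([
    "Sensory-event, ({stim_color}, {stim_shape})",
    "Agent-action, {stim_color}",
])
print(refs)
```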

TabularInput

class hed.models.tabular_input.TabularInput(file=None, sidecar=None, name=None)[source]

Bases: BaseInput

A BIDS tabular file with sidecar.

HED_COLUMN_NAME = 'HED'
__init__(file=None, sidecar=None, name=None)[source]

Constructor for the TabularInput class.

Parameters:
  • file (str or FileLike or pd.DataFrame) – A tsv file to open.

  • sidecar (str or Sidecar or FileLike) – A Sidecar or source file/filename.

  • name (str) – The name to display for this file for error purposes.

Raises:
  • HedFileError – For any of the following issues:

      - The file is blank.

      - An invalid dataframe was passed with size 0.

      - An invalid extension was provided.

      - A duplicate or empty column name appears.

  • OSError – If it cannot open the indicated file.

  • ValueError – If this file has no column names.
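Some of the conditions listed above concern the header row of the tsv file. A stand-alone way to check a file for them before constructing a TabularInput might look like the following sketch. It uses only the standard library, check_column_names is a hypothetical helper, and it does not reproduce the library's own checks or error types.

```python
import csv
import io


def check_column_names(tsv_text):
    """Return a list of problems found in the header row of a tsv string."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, None)
    if header is None:
        return ["file is blank"]
    problems = []
    # An empty or whitespace-only header cell is an empty column name.
    if any(not name.strip() for name in header):
        problems.append("empty column name")
    # A set drops repeats, so a shorter set means a duplicated name.
    if len(set(header)) != len(header):
        problems.append("duplicate column name")
    return problems


good = "onset\tduration\tHED\n0.5\t1.0\tSensory-event\n"
bad = "onset\tonset\t\n0.5\t1.0\tSensory-event\n"
print(check_column_names(good))
print(check_column_names(bad))
```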

reset_column_mapper(sidecar=None)[source]

Change the sidecars and settings.

Parameters:

sidecar (str or [str] or Sidecar or [Sidecar]) – One or more json filenames or Sidecar objects to pull sidecar info from.

get_def_dict(hed_schema, extra_def_dicts=None) DefinitionDict[source]

Return the definition dict for this sidecar.

Parameters:
  • hed_schema (HedSchema) – Used to identify tags to find definitions.

  • extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.

Returns:

A single definition dict representing all the data(and extra def dicts).

Return type:

DefinitionDict

get_column_refs() list[str][source]

Return a list of column refs for this file.

Default implementation returns none.

Returns:

A list of unique column refs found.

Return type:

list[str]

get_sidecar() Sidecar | None[source]

Return the sidecar associated with this TabularInput.

SpreadsheetInput

class hed.models.spreadsheet_input.SpreadsheetInput(file=None, file_type=None, worksheet_name=None, tag_columns=None, has_column_names=True, column_prefix_dictionary=None, name=None)[source]

Bases: BaseInput

A spreadsheet of HED tags.

__init__(file=None, file_type=None, worksheet_name=None, tag_columns=None, has_column_names=True, column_prefix_dictionary=None, name=None)[source]

Constructor for the SpreadsheetInput class.

Parameters:
  • file (str or file-like) – An xlsx/tsv file to open or a file object.

  • file_type (str or None) – “.xlsx” for Excel, “.tsv” or “.txt” for tsv data.

  • worksheet_name (str or None) – The name of the Excel workbook worksheet that contains the HED tags. Not applicable to tsv files. If omitted for Excel, the first worksheet is assumed.

  • tag_columns (list) – A list of ints or strs identifying the columns that contain the HED tags. If ints, these are zero-based column numbers, so [1] indicates that only the second column has tags.

  • has_column_names (bool) – True if the file has column names. Validation will skip over the first row of the file if the spreadsheet has column names.

  • column_prefix_dictionary (dict or None) – Dictionary with keys that are column numbers/names and values are HED tag prefixes to prepend to the tags in that column before processing.

Notes

  • If file is a string, file_type is derived from file and this parameter is ignored.

  • column_prefix_dictionary may be deprecated/renamed. Its values are no longer treated as prefixes but are converted to value columns, e.g. {“key”: “Description”, 1: “Label/”} will turn into value columns as {“key”: “Description/#”, 1: “Label/#”}. It will be a validation issue if column 1 is called “key” in the above example. This means these columns no longer accept anything but the value portion.
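The conversion described in the last note can be sketched as a small dictionary transformation. The helper name prefixes_to_value_templates is hypothetical, and normalizing the trailing slash before appending "/#" is an assumption based only on the example in the note, not on the library's code.

```python
def prefixes_to_value_templates(column_prefix_dictionary):
    """Turn prefix entries into value-column templates ending in '/#'."""
    templates = {}
    for column, prefix in column_prefix_dictionary.items():
        # Drop any trailing slash so "Label/" and "Label" give the same result.
        templates[column] = prefix.rstrip("/") + "/#"
    return templates


converted = prefixes_to_value_templates({"key": "Description", 1: "Label/"})
print(converted)
```

Keys may be column names or column numbers, so the sketch leaves them untouched and rewrites only the values.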

Raises:
  • HedFileError – For any of the following issues:

      - The file is blank.

      - An invalid dataframe was passed with size 0.

      - An invalid extension was provided.

      - A duplicate or empty column name appears.

      - Cannot open the indicated file.

      - The specified worksheet name does not exist.