Schema

HED schema management and validation tools.

Core schema classes

HedSchemaBase

class HedSchemaBase

Bases: ABC

Base class for a schema and a schema group.

Implementing the abstract methods allows the schema to be used for validation.

abstractmethod check_compliance(check_for_warnings=True, name=None, error_handler=None)

Check for HED3 compliance of this schema.

Parameters:
  • check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.

  • name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.

  • error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.

Returns:

A list of all warnings and errors found in the file. Each issue is a dictionary.

Return type:

list

abstractmethod find_tag_entry(tag, schema_namespace='')

Find the schema entry for a given source tag.

Parameters:
  • tag (str, HedTag) – Any form of tag to look up. Can have an extension, value, etc.

  • schema_namespace (str) – The schema namespace of the tag, if any.

Returns:

  • The located tag entry for this tag.

  • The remainder of the tag that isn’t part of the base tag.

  • A list of errors while converting.

Return type:

tuple[HedTagEntry | None, str | None, list]

Notes

Works left to right (which is mostly relevant for errors).

abstractmethod get_formatted_version()

Return the HED version string of this schema, including the namespace and library name, if any.

Returns:

The complete version of this schema including library name and namespace.

Return type:

str

abstractmethod get_schema_versions()

Return a list of HED version strings, each including the namespace and library name, if any.

Returns:

The complete version of each schema, including library name and namespace.

Return type:

list

abstractmethod get_tag_entry(name, key_class=HedSectionKey.Tags, schema_namespace='')

Return the schema entry for this tag, if one exists.

Parameters:
  • name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but it’s not required.

  • key_class (HedSectionKey or str) – The type of entry to return.

  • schema_namespace (str) – Only used on Tags. If incorrect, will return None.

Returns:

The schema entry for the given tag.

Return type:

HedSchemaEntry

abstractmethod get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags)

Return tag entries with the given attribute.

Parameters:
  • attribute (str) – A tag attribute: e.g., HedKey.ExtensionAllowed

  • key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.

Returns:

A list of all tags with this attribute.

Return type:

list

Notes

  • The result is cached, so it will be fast after the first call.

property name

User provided name for this schema, defaults to filename or version if no name provided.

property schema_83_props

Return whether this is an 8.3.0 or greater schema.

Returns:

True if standard or partnered schema is 8.3.0 or greater.

Return type:

bool

abstractmethod schema_for_namespace(namespace)

Return the HedSchema for the library namespace.

Parameters:

namespace (str) – A schema library name namespace.

Returns:

The specific schema for this library name namespace, if it exists.

Return type:

HedSchema or None

abstract property valid_prefixes

Return a list of all prefixes this group will accept.

Returns:

A list of strings representing valid prefixes for this group.

Return type:

list[str]

HedSchema

class HedSchema

Bases: HedSchemaBase

A HED schema suitable for processing.

property attributes: HedSchemaSection

Return the attributes schema section.

Returns:

The attributes section.

Return type:

HedSchemaSection

can_save() → bool

Return whether it is legal to save this schema.

You cannot save schemas loaded as merged from multiple library schemas.

Returns:

True if this can be saved.

Return type:

bool

check_compliance(check_for_warnings=True, name=None, error_handler=None) → list[dict]

Check for HED3 compliance of this schema.

Parameters:
  • check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.

  • name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.

  • error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.

Returns:

A list of all warnings and errors found in the file. Each issue is a dictionary.

Return type:

list[dict]
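
Because each issue is a dictionary, the returned list can be summarized with ordinary Python. The helper below is a minimal sketch and not part of the library: `count_issues_by` is a hypothetical name, and the `code` key and the sample values are assumptions made for illustration. Inspect a real issue dictionary to see which keys it actually carries.

```python
from collections import Counter


def count_issues_by(issues, key="code"):
    """Count issue dictionaries by the value stored under `key`."""
    # Issues lacking the key are grouped under a placeholder label.
    return Counter(issue.get(key, "<missing>") for issue in issues)


# Stand-in data; real issues would come from schema.check_compliance().
sample_issues = [
    {"code": "EXAMPLE_A", "message": "first"},
    {"code": "EXAMPLE_A", "message": "second"},
    {"code": "EXAMPLE_B", "message": "third"},
]
counts = count_issues_by(sample_issues)
```

An empty counter then corresponds to a schema for which no issues were reported.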

finalize_dictionaries()

Call to finish loading.

find_tag_entry(tag, schema_namespace='') → tuple[HedTagEntry | None, str | None, list[dict]]

Find the schema entry for a given source tag.

Parameters:
  • tag (str, HedTag) – Any form of tag to look up. Can have an extension, value, etc.

  • schema_namespace (str) – The schema namespace of the tag, if any.

Returns:

  • The located tag entry for this tag.

  • The remainder of the tag that isn’t part of the base tag.

  • A list of errors while converting.

Return type:

tuple[Union[HedTagEntry, None], Union[str, None], list[dict]]

Notes

Works left to right (which is mostly relevant for errors).
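
The three-part return value (entry, remainder, errors) and the left-to-right behaviour can be pictured with a toy lookup. This is only a sketch under stated assumptions: `split_base_and_remainder` and `KNOWN` are hypothetical names, plain strings stand in for HedTagEntry objects, and the matching rule is a simplification rather than the library's algorithm.

```python
def split_base_and_remainder(tag, known_tags):
    """Toy left-to-right lookup returning (base, remainder, issues)."""
    parts = tag.split("/")
    base, matched = None, 0
    for count in range(1, len(parts) + 1):
        candidate = "/".join(parts[:count]).lower()
        if candidate not in known_tags:
            break  # stop at the first path component that is not a known tag
        base, matched = known_tags[candidate], count
    if base is None:
        # Nothing matched: report one error and no entry or remainder.
        return None, None, [{"message": f"No known base tag in '{tag}'"}]
    remainder = "/".join(parts[matched:]) or None
    return base, remainder, []


# Hypothetical lookup table keyed by lower-cased long tag names.
KNOWN = {"event": "Event", "event/sensory-event": "Event/Sensory-event"}
result = split_base_and_remainder("Event/Sensory-event/My-extension", KNOWN)
```

Callers typically unpack all three values and check the error list before using the entry.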

get_as_dataframes(save_merged=False) → dict[DataFrame]

Get a dict of dataframes representing this schema.

Parameters:

save_merged (bool) – If True, return the dataframes as if merged with the standard schema.

Returns:

A dict of dataframes you can load as a schema.

Return type:

dict[DataFrame]

get_as_json_string(save_merged=True) → str

Return the schema as a JSON string.

Parameters:

save_merged (bool) – If True, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Returns:

The schema as a JSON string.

Return type:

str

get_as_mediawiki_string(save_merged=False) → str

Return the schema as a MEDIAWIKI string.

Parameters:

save_merged (bool) – If True, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Returns:

The schema as a string in MEDIAWIKI format.

Return type:

str

get_as_xml_string(save_merged=True) → str

Return the schema as an XML string.

Parameters:

save_merged (bool) – If True, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Returns:

The schema as an XML string.

Return type:

str

get_extras(extras_key) → DataFrame | None

Get the extras corresponding to the given key.

Parameters:

extras_key (str) – The key to check for in the extras dictionary.

Returns:

The DataFrame for this extras key, or None if it doesn’t exist or is empty.

Return type:

Union[pd.DataFrame, None]

get_formatted_version() → str

Return the HED version string of this schema, including the namespace and library name, if any.

Returns:

A json formatted string of the complete version of this schema including library name and namespace.

Return type:

str

get_save_header_attributes(save_merged: bool = False) → dict

Returns the attributes that should be saved.

Parameters:

save_merged (bool) – Whether to save as merged schema.

Returns:

The header attributes dictionary.

Return type:

dict

get_schema_versions() → list[str]

Return a list of HED version strings, each including the namespace and library name, if any.

Returns:

The complete version of each schema, including library name and namespace.

Return type:

list[str]

get_tag_attribute_names_old() → dict[str, HedSchemaEntry]

Return a dict of all allowed tag attributes.

Returns:

A dictionary whose keys are attribute names and whose values are HedSchemaEntry objects.

Return type:

dict[str, HedSchemaEntry]

get_tag_entry(name: str, key_class=HedSectionKey.Tags, schema_namespace: str = '') → HedSchemaEntry | None

Return the schema entry for this tag, if one exists.

Parameters:
  • name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but it’s not required.

  • key_class (HedSectionKey or str) – The type of entry to return.

  • schema_namespace (str) – Only used on Tags. If incorrect, will return None.

Returns:

The schema entry for the given tag, or None if not found.

Return type:

Union[HedSchemaEntry, None]

get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags) → list[HedSchemaEntry]

Return tag entries with the given attribute.

Parameters:
  • attribute (str) – A tag attribute: e.g., HedKey.ExtensionAllowed

  • key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.

Returns:

A list of all tags with this attribute.

Return type:

list[HedSchemaEntry]

Notes

  • The result is cached, so it will be fast after the first call.

has_duplicates()

Return the first duplicate tag/unit/etc. if any section has a duplicate name.

property library: str

The name of this library schema if one exists.

Returns:

Library name if any.

Return type:

str

property merged: bool

Return whether this schema was loaded from a merged file.

Returns:

True if file was loaded from a merged file.

Return type:

bool

property name

User provided name for this schema, defaults to filename or version if no name provided.

property properties: HedSchemaSection

Return the properties schema section.

Returns:

The properties section.

Return type:

HedSchemaSection

save_as_dataframes(base_filename, save_merged=False)

Save as dataframes to a folder of files.

If base_filename has a .tsv suffix, save directly to the indicated location. If base_filename is a directory (does NOT have a .tsv suffix), save the contents into a directory of that name. The subfiles are named the same, e.g. HED8.3.0/HED8.3.0_Tag.tsv.

Parameters:
  • base_filename (str) – Save filename. A suffix will be added to most, e.g. _Tag.

  • save_merged (bool) – If True, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Raises:

OSError – File cannot be saved for some reason.
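
The naming rule described above can be sketched as a small path helper. This is an illustration only: `planned_tsv_path` is a hypothetical function, and the handling of a base name that already ends in .tsv (inserting the section suffix before the extension) is an assumption, since the description only gives the directory example.

```python
from pathlib import PurePosixPath


def planned_tsv_path(base_filename, suffix="Tag"):
    """Return the path one section file might be written to (sketch)."""
    base = PurePosixPath(base_filename)
    if base.suffix.lower() == ".tsv":
        # Assumed behaviour: put the section suffix before ".tsv".
        return str(base.with_name(f"{base.stem}_{suffix}.tsv"))
    # Directory case from the description: <dir>/<dir name>_<suffix>.tsv
    return str(base / f"{base.name}_{suffix}.tsv")


directory_style = planned_tsv_path("HED8.3.0")
file_style = planned_tsv_path("out/HED8.3.0.tsv")
```

The directory case reproduces the documented example HED8.3.0/HED8.3.0_Tag.tsv.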

save_as_json(filename, save_merged=True)

Save as JSON to a file.

Parameters:
  • filename (str) – Save location.

  • save_merged (bool) – If true, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Raises:

OSError – File cannot be saved for some reason.

save_as_mediawiki(filename, save_merged=False)

Save as MEDIAWIKI to a file.

Parameters:
  • filename (str) – Save location.

  • save_merged (bool) – If True, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Raises:

OSError – File cannot be saved for some reason.

save_as_xml(filename, save_merged=True)

Save as XML to a file.

Parameters:
  • filename (str) – Save location.

  • save_merged (bool) – If true, this will save the schema as a merged schema if it is a “withStandard” schema. If it is not a “withStandard” schema, this setting has no effect.

Raises:

OSError – File cannot be saved for some reason.

property schema_83_props

Return whether this is an 8.3.0 or greater schema.

Returns:

True if standard or partnered schema is 8.3.0 or greater.

Return type:

bool

schema_for_namespace(namespace: str) → HedSchema | None

Return HedSchema object for this namespace.

Parameters:

namespace (str) – The schema library name namespace.

Returns:

The HED schema object for this schema, or None if namespace doesn’t match.

Return type:

Union[HedSchema, None]

property schema_namespace: str

Returns the schema namespace prefix.

Returns:

The schema namespace prefix.

Return type:

str

set_schema_prefix(schema_namespace)

Set the library namespace associated with this schema.

Parameters:

schema_namespace (str) – Should be empty or end with a colon. (A colon will be added automatically if missing.)

Raises:

HedFileError – The prefix is invalid.
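
The colon rule for the namespace can be illustrated in isolation. The function below is a minimal sketch with a hypothetical name, `normalize_namespace`; it models only the automatic colon and none of the validity checks that lead to HedFileError.

```python
def normalize_namespace(schema_namespace):
    """Append a trailing colon to a non-empty namespace that lacks one."""
    if schema_namespace and not schema_namespace.endswith(":"):
        return schema_namespace + ":"
    # Empty strings and already-terminated namespaces pass through unchanged.
    return schema_namespace


normalized = [normalize_namespace(value) for value in ("", "sc", "sc:")]
```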

property tags: HedSchemaTagSection

Return the tag schema section.

Returns:

The tag section.

Return type:

HedSchemaTagSection

property unit_classes: HedSchemaUnitClassSection

Return the unit classes schema section.

Returns:

The unit classes section.

Return type:

HedSchemaUnitClassSection

property unit_modifiers: HedSchemaSection

Return the unit modifiers schema section.

Returns:

The unit modifiers section.

Return type:

HedSchemaSection

property units: HedSchemaUnitSection

Return the unit schema section.

Returns:

The unit section.

Return type:

HedSchemaUnitSection

property valid_prefixes: list[str]

Return a list of all prefixes this schema will accept.

Returns:

A list of valid tag prefixes for this schema.

Return type:

list[str]

Notes

  • The return value is always length 1 if using a HedSchema.

property value_classes: HedSchemaSection

Return the value classes schema section.

Returns:

The value classes section.

Return type:

HedSchemaSection

property version: str

The complete schema version, including prefix and library name (if applicable).

Returns:

The complete schema version including library name and namespace.

Return type:

str

property version_number: str

The HED version of this schema.

Returns:

The version of this schema.

Return type:

str

property with_standard: str

The version of the base schema this is extended from, if it exists.

Returns:

HED version or empty string.

Return type:

str

HedSchemaGroup

class HedSchemaGroup(schema_list, name='')

Bases: HedSchemaBase

Container for multiple HedSchema objects.

Notes

  • The container class is useful when library schemas are included.

  • You cannot save/load/etc. the combined schema object directly.

check_compliance(check_for_warnings=True, name=None, error_handler=None) → list[dict]

Check for HED3 compliance of this schema.

Parameters:
  • check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.

  • name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.

  • error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.

Returns:

A list of all warnings and errors found in the file. Each issue is a dictionary.

Return type:

list[dict]

find_tag_entry(tag, schema_namespace='') → tuple[HedTagEntry | None, str | None, list]

Find the schema entry for a given source tag.

Parameters:
  • tag (str, HedTag) – Any form of tag to look up. Can have an extension, value, etc.

  • schema_namespace (str) – The schema namespace of the tag, if any.

Returns:

  • The located tag entry for this tag.

  • The remainder of the tag that isn’t part of the base tag.

  • A list of errors while converting.

Return type:

tuple[HedTagEntry | None, str | None, list]

Notes

Works left to right (which is mostly relevant for errors).

get_formatted_version() → str

Return the HED version string of this schema, including the namespace and library name, if any.

Returns:

The complete version of this schema including library name and namespace.

Return type:

str

get_schema_versions() → list[str]

Return a list of HED version strings for these schemas, each including the namespace and library name, if any.

Returns:

The complete version of each schema, including library name and namespace.

Return type:

list[str]

get_tag_entry(name, key_class=HedSectionKey.Tags, schema_namespace='') → HedSchemaEntry | None

Return the schema entry for this tag, if one exists.

Parameters:
  • name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but it’s not required.

  • key_class (HedSectionKey or str) – The type of entry to return.

  • schema_namespace (str) – Only used on Tags. If incorrect, will return None.

Returns:

The schema entry for the given tag.

Return type:

HedSchemaEntry

get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags) → list

Return tag entries with the given attribute.

Parameters:
  • attribute (str) – A tag attribute: e.g., HedKey.ExtensionAllowed

  • key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.

Returns:

A list of all tags with this attribute.

Return type:

list

Notes

  • The result is cached, so it will be fast after the first call.

property name

User provided name for this schema, defaults to filename or version if no name provided.

property schema_83_props

Return whether this is an 8.3.0 or greater schema.

Returns:

True if standard or partnered schema is 8.3.0 or greater.

Return type:

bool

schema_for_namespace(namespace) → HedSchema | None

Return the HedSchema for the library namespace.

Parameters:

namespace (str) – A schema library name namespace.

Returns:

The specific schema for this library name namespace, if it exists.

Return type:

Union[HedSchema, None]

property valid_prefixes: list[str]

Return a list of all prefixes this group will accept.

Returns:

A list of strings representing valid prefixes for this group.

Return type:

list[str]
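
The way a group routes requests by namespace can be pictured with stand-in classes. This is a rough sketch, not the library's implementation: `ToySchema` and `ToySchemaGroup` are hypothetical, and the version strings and the `sc:` prefix are example values.

```python
from dataclasses import dataclass


@dataclass
class ToySchema:
    """Stand-in for a single schema with a version and a namespace."""
    version: str
    schema_namespace: str = ""


class ToySchemaGroup:
    """Stand-in container that dispatches on the namespace prefix."""

    def __init__(self, schema_list):
        # One schema per namespace; later duplicates would overwrite earlier ones.
        self._by_namespace = {s.schema_namespace: s for s in schema_list}

    @property
    def valid_prefixes(self):
        return list(self._by_namespace)

    def schema_for_namespace(self, namespace):
        # Returns None when no schema in the group uses this namespace.
        return self._by_namespace.get(namespace)


group = ToySchemaGroup([ToySchema("8.3.0"), ToySchema("score_1.0.0", "sc:")])
```

A single schema corresponds to the degenerate case with exactly one prefix.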

Schema entry classes

HedSchemaEntry

class HedSchemaEntry(name, section)

Bases: object

A single node in the HED schema vocabulary.

Every term, unit, unit class, value class, attribute, and property that appears in a loaded HedSchema is represented as a HedSchemaEntry (or one of its subclasses). The entry stores the node’s name, all declared attributes (e.g. takesValue, allowedCharacter), its description, and a back-reference to its containing HedSchemaSection.

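Concrete subclasses add section-specific state:

  • HedTagEntry – full and short tag name forms, plus value-class and unit-class associations.

  • UnitClassEntry – the units in the class and a derivative_units lookup.

  • UnitEntry – unit modifiers, conversion factors, and a reference to the parent unit class.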

Use this class (or its subclasses) directly when you need to:

  • Introspect schema vocabulary (e.g. list all tags with takesValue).

  • Build schema validators, schema browsers, or schema-diff tools.

  • Implement custom HED annotation tooling that looks up tag metadata.

Most users never need this class; get_tag_entry() and get_all_schema_tags() are sufficient for the common lookup patterns.

attribute_has_property(attribute, property_name) → bool

Return True if attribute has property.

Parameters:
  • attribute (str) – Attribute name to check for property_name.

  • property_name (str) – The property value to return.

Returns:

Returns True if this entry has the property.

Return type:

bool

finalize_entry(schema)

Called once after loading to set internal state.

Parameters:

schema (HedSchema) – The schema that holds the rules.

has_attribute(attribute, return_value=False) → bool | Any

Checks for the existence of an attribute in this entry.

Parameters:
  • attribute (str) – The attribute to check for.

  • return_value (bool) – If True, returns the actual value of the attribute. If False, returns a boolean indicating the presence of the attribute.

Returns:

If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.

Return type:

Union[bool, any]

Notes

  • The existence of an attribute does not guarantee its validity.
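
The two return modes of has_attribute can be shown with a toy entry. This is a sketch with a hypothetical `ToyEntry` class and made-up attribute values; it only mirrors the documented contract and is not the library's code.

```python
class ToyEntry:
    """Stand-in entry that stores attributes in a plain dict."""

    def __init__(self, name, attributes):
        self.name = name
        self.attributes = dict(attributes)

    def has_attribute(self, attribute, return_value=False):
        if return_value:
            # Value mode: the stored value, or None when the attribute is absent.
            return self.attributes.get(attribute)
        # Presence mode: a plain boolean.
        return attribute in self.attributes


entry = ToyEntry("Example", {"takesValue": True, "unitClass": "timeUnits"})
present = entry.has_attribute("unitClass")
value = entry.has_attribute("unitClass", return_value=True)
```

In value mode a missing attribute and an attribute stored as None are indistinguishable, so presence mode is the safer test for existence.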

property section_key

Returns the HedSectionKey identifying which schema section owns this entry.

Returns:

The section key for this entry’s parent section.

Return type:

HedSectionKey

HedTagEntry

class HedTagEntry(*args, **kwargs)

Bases: HedSchemaEntry

A vocabulary tag node in the HED schema.

Extends HedSchemaEntry with full/short tag name forms, value-class and unit-class associations, and helper methods for tag-path traversal.

Typical access pattern:

entry = schema.get_tag_entry("Sensory-event")
print(entry.long_tag_name)   # "Event/Sensory-event"
print(entry.takes_value_child)  # child "#" entry if tag takes a value

unit_classes

Unit classes accepted by this tag’s value (non-empty only if takesValue is set).

Type:

dict[str, UnitClassEntry]

value_classes

Value classes that constrain the value format.

Type:

dict[str, HedSchemaEntry]

long_tag_name

The full slash-separated path from the schema root, with any trailing /# stripped.

Type:

str

short_tag_name

The final component of the tag path (short form).

Type:

str
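
The relationship between the two name forms can be sketched with plain string handling. The helper `tag_name_forms` is hypothetical and works on a slash-separated path string only; it does not consult a schema.

```python
def tag_name_forms(schema_path):
    """Return (long_name, short_name) for a slash-separated tag path."""
    # Strip a trailing "/#" placeholder, as described for long_tag_name.
    long_name = schema_path[:-2] if schema_path.endswith("/#") else schema_path
    # The short form is the final path component.
    short_name = long_name.rsplit("/", 1)[-1]
    return long_name, short_name


plain = tag_name_forms("Event/Sensory-event")
placeholder = tag_name_forms("Property/Informational-property/Label/#")
```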

attribute_has_property(attribute, property_name) → bool

Return True if attribute has property.

Parameters:
  • attribute (str) – Attribute name to check for property_name.

  • property_name (str) – The property value to return.

Returns:

Returns True if this entry has the property.

Return type:

bool

base_tag_has_attribute(tag_attribute)

Check if the base tag has a specific attribute.

Parameters:

tag_attribute (str) – A tag attribute.

Returns:

True if the tag has the specified attribute, False otherwise.

Return type:

bool

Notes

This is mostly relevant for tags that take a value.

finalize_entry(schema)

Called once after schema loading to set state.

Parameters:

schema (HedSchema) – The schema that the rules come from.

has_attribute(attribute, return_value=False)

Return the existence or value of an attribute in this entry.

This also checks parent tags for inheritable attributes like ExtensionAllowed.

Parameters:
  • attribute (str) – The attribute to check for.

  • return_value (bool) – If True, returns the actual value of the attribute. If False, returns a boolean indicating the presence of the attribute.

Returns:

If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.

Return type:

Union[bool, any]

Notes

  • The existence of an attribute does not guarantee its validity.
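
The parent check for inheritable attributes can be sketched as a walk up the tag hierarchy. Everything here is illustrative: `ToyTagEntry` and the `INHERITABLE` set are hypothetical, and which attributes are actually inheritable is decided by the schema, not by this list.

```python
# Hypothetical set of inheritable attribute names for this sketch only.
INHERITABLE = {"extensionAllowed"}


class ToyTagEntry:
    """Stand-in tag entry with an optional parent."""

    def __init__(self, name, attributes=None, parent=None):
        self.name = name
        self.attributes = dict(attributes or {})
        self.parent = parent

    def has_attribute(self, attribute, return_value=False):
        node = self
        while node is not None:
            if attribute in node.attributes:
                value = node.attributes[attribute]
                return value if return_value else True
            if attribute not in INHERITABLE:
                break  # non-inheritable attributes are only checked on the tag itself
            node = node.parent
        return None if return_value else False


item = ToyTagEntry("Item", {"extensionAllowed": True, "suggestedTag": "Example"})
child = ToyTagEntry("Object", parent=item)
```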

property parent

Get the parent entry of this tag.

property parent_name

Get the parent tag entry name.

property section_key

Returns the HedSectionKey identifying which schema section owns this entry.

Returns:

The section key for this entry’s parent section.

Return type:

HedSectionKey

UnitClassEntry

class UnitClassEntry(*args, **kwargs)

Bases: HedSchemaEntry

A unit class node in the HED schema (e.g. time, mass, frequency).

Extends HedSchemaEntry with the set of UnitEntry objects that belong to the class and a pre-computed derivative_units dict that maps every accepted surface form (including SI prefixes and plurals) to its canonical UnitEntry.

Typical access pattern:

unit_class = schema.get_tag_entry("time", HedSectionKey.UnitClasses)
for name, unit in unit_class.units.items():
    print(name, unit.attributes)

units

Map from unit name to entry after finalize_entry() is called.

Type:

dict[str, UnitEntry]

derivative_units

Map from every accepted surface form (plural, SI-prefixed, etc.) to the base unit entry.

Type:

dict[str, UnitEntry]

add_unit(unit_entry)

Add the given unit entry to this unit class.

Parameters:

unit_entry (HedSchemaEntry) – Unit entry to add.

attribute_has_property(attribute, property_name) → bool

Return True if attribute has property.

Parameters:
  • attribute (str) – Attribute name to check for property_name.

  • property_name (str) – The property value to return.

Returns:

Returns True if this entry has the property.

Return type:

bool

property children

Alias to get the units for this class.

Returns:

The unit list for this class.

Return type:

list

finalize_entry(schema)

Called once after schema load to set state.

Parameters:

schema (HedSchema) – The object with the schema rules.

get_derivative_unit_entry(units)

Get the (derivative) unit entry if it exists.

Parameters:

units (str) – The unit name to check, can be plural or include a modifier.

Returns:

The unit entry if it exists.

Return type:

Union[UnitEntry, None]
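
A lookup from surface forms to base units can be pictured as a precomputed dictionary. The sketch below is an assumption-laden illustration: `build_derivative_units` is a hypothetical helper, the plural rule is a naive trailing "s", and every modifier is applied to every unit, which a real schema restricts through attributes.

```python
def build_derivative_units(units, modifiers):
    """Map each accepted surface form to its base unit name (sketch)."""
    derived = {}
    for unit in units:
        for form in (unit, unit + "s"):  # singular and naive plural
            derived[form] = unit
            for modifier in modifiers:
                derived[modifier + form] = unit
    return derived


table = build_derivative_units(["second"], ["milli", "kilo"])
found = table.get("milliseconds")
missing = table.get("minute")
```

With such a table, resolving a unit string is a single dictionary lookup that yields None for unknown forms.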

has_attribute(attribute, return_value=False) → bool | Any

Checks for the existence of an attribute in this entry.

Parameters:
  • attribute (str) – The attribute to check for.

  • return_value (bool) – If True, returns the actual value of the attribute. If False, returns a boolean indicating the presence of the attribute.

Returns:

If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.

Return type:

Union[bool, any]

Notes

  • The existence of an attribute does not guarantee its validity.

property section_key

Returns the HedSectionKey identifying which schema section owns this entry.

Returns:

The section key for this entry’s parent section.

Return type:

HedSectionKey

UnitEntry

class UnitEntry(*args, **kwargs)

Bases: HedSchemaEntry

A single unit node in the HED schema (e.g. second, gram, hertz).

Extends HedSchemaEntry with the list of SI unit modifiers that apply to this unit, a pre-computed derivative_units mapping (surface form → conversion factor), and a back-reference to the parent UnitClassEntry.

unit_modifiers

SI modifier entries (e.g. milli, kilo).

Type:

list[HedSchemaEntry]

derivative_units

Map from every accepted surface form to its numeric conversion factor relative to the SI base unit.

Type:

dict[str, float]

unit_class_entry

The parent unit class.

Type:

UnitClassEntry

attribute_has_property(attribute, property_name) → bool

Return True if attribute has property.

Parameters:
  • attribute (str) – Attribute name to check for property_name.

  • property_name (str) – The property value to return.

Returns:

Returns True if this entry has the property.

Return type:

bool

finalize_entry(schema)

Called once after loading to set internal state.

Parameters:

schema (HedSchema) – The schema that the rules come from.

get_conversion_factor(unit_name)

Return the conversion factor from combining this unit with the specified modifier.

Parameters:

unit_name (str or None) – The full name of the unit, including any modifier.

Returns:

The conversion factor, or None.

Return type:

Union[float, None]
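
A mapping from surface form to conversion factor can be sketched as follows. This is not the library's implementation: `build_conversion_factors` and `lookup_factor` are hypothetical helpers, and the modifier factors are ordinary SI values supplied by hand.

```python
def build_conversion_factors(unit, modifiers, unit_factor=1.0):
    """Map a unit and its modified forms to numeric factors (sketch)."""
    factors = {unit: unit_factor}
    for prefix, prefix_factor in modifiers.items():
        factors[prefix + unit] = unit_factor * prefix_factor
    return factors


def lookup_factor(factors, unit_name):
    """Return the factor for unit_name, or None if it is unknown or None."""
    if unit_name is None:
        return None
    return factors.get(unit_name)


factors = build_conversion_factors("second", {"milli": 1e-3, "kilo": 1e3})
milli = lookup_factor(factors, "millisecond")
```

Multiplying a value by its factor then expresses it in the base unit, for example 250 millisecond as 0.25 second.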

has_attribute(attribute, return_value=False) → bool | Any

Checks for the existence of an attribute in this entry.

Parameters:
  • attribute (str) – The attribute to check for.

  • return_value (bool) – If True, returns the actual value of the attribute. If False, returns a boolean indicating the presence of the attribute.

Returns:

If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.

Return type:

Union[bool, any]

Notes

  • The existence of an attribute does not guarantee its validity.

property section_key

Returns the HedSectionKey identifying which schema section owns this entry.

Returns:

The section key for this entry’s parent section.

Return type:

HedSectionKey

Schema section classes

HedSchemaSection

class HedSchemaSection(section_key, case_sensitive=True)

Bases: object

Typed container for all entries in one section of a loaded HED schema.

A HedSchema is divided into sections (tags, unit classes, units, value classes, attributes, properties, unit modifiers). Each section is a HedSchemaSection that maps lower-cased entry names to their HedSchemaEntry objects and tracks which attributes are valid for that section.

The concrete entry type for each section is determined by entries_by_section:

Section key                | Entry type
HedSectionKey.Tags         | HedTagEntry
HedSectionKey.UnitClasses  | UnitClassEntry
HedSectionKey.Units        | UnitEntry
everything else            | HedSchemaEntry

Use this class directly when you need to:

  • Iterate over all entries in a specific schema section.

  • Build schema comparison or diff tools.

  • Access valid_attributes to determine which attributes are legal for a given section.

all_names

Map from lower-cased name to entry.

Type:

dict[str, HedSchemaEntry]

all_entries

Entries in insertion order.

Type:

list[HedSchemaEntry]

valid_attributes

Attribute entries that are declared valid for this section.

Type:

dict[str, HedSchemaEntry]

property duplicate_names

Returns a dict of entries that share the same name within this section.

Returns:

Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.

Return type:

dict
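
The bookkeeping described for a section (a lower-cased name map, entries in insertion order, and a record of name clashes) can be sketched with a small stand-in. `ToySection` is hypothetical, strings stand in for entry objects, and keeping the first entry on a clash is an assumption of this sketch.

```python
class ToySection:
    """Stand-in section tracking names, order, and duplicates."""

    def __init__(self):
        self.all_names = {}        # lower-cased name -> entry
        self.all_entries = []      # entries in insertion order
        self.duplicate_names = {}  # lower-cased name -> conflicting entries

    def add(self, name, entry):
        key = name.lower()
        self.all_entries.append(entry)
        if key in self.all_names:
            # Record the original entry once, then every later clash.
            self.duplicate_names.setdefault(key, [self.all_names[key]]).append(entry)
        else:
            self.all_names[key] = entry

    def get(self, key):
        return self.all_names.get(key.lower())


section = ToySection()
for name, entry in [("Event", "e1"), ("Item", "e2"), ("event", "e3")]:
    section.add(name, entry)
```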

get(key)

Return the name associated with key.

Parameters:

key (str) – The name of the key.

get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') → list[HedSchemaEntry | str]

Return entries or names with given attribute.

Parameters:
  • attribute_name (str) – The name of the attribute (generally a HedKey entry).

  • return_name_only (bool) – If True, return the name as a string rather than the tag entry.

  • schema_namespace (str) – Prepends given namespace to each name if returning names.

Returns:

List of HedSchemaEntry or strings representing the names.

Return type:

list[Union[HedSchemaEntry, str]]
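
The effect of return_name_only and schema_namespace can be shown with a standalone filter. This is a sketch under stated assumptions: `entries_with_attribute` is a hypothetical free function, and entries are plain dictionaries with `name` and `attributes` keys rather than HedSchemaEntry objects.

```python
def entries_with_attribute(entries, attribute_name,
                           return_name_only=False, schema_namespace=""):
    """Filter entries by attribute; optionally return prefixed names."""
    matches = [e for e in entries if attribute_name in e["attributes"]]
    if return_name_only:
        # The namespace is only applied when names are requested.
        return [schema_namespace + e["name"] for e in matches]
    return matches


sample = [
    {"name": "Item", "attributes": {"extensionAllowed": True}},
    {"name": "Event", "attributes": {}},
]
names = entries_with_attribute(sample, "extensionAllowed",
                               return_name_only=True, schema_namespace="sc:")
```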

items()

Return the items.

keys()

The names of the keys.

property section_key

Returns the HedSectionKey identifying this section.

Returns:

The key for this schema section.

Return type:

HedSectionKey

values()

All names of the sections.

HedSchemaUnitSection

class HedSchemaUnitSection(section_key, case_sensitive=True)

Bases: HedSchemaSection

The schema section containing units.

property duplicate_names

Returns a dict of entries that share the same name within this section.

Returns:

Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.

Return type:

dict

get(key)

Return the name associated with key.

Parameters:

key (str) – The name of the key.

get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') → list[HedSchemaEntry | str]

Return entries or names with given attribute.

Parameters:
  • attribute_name (str) – The name of the attribute (generally a HedKey entry).

  • return_name_only (bool) – If True, return the name as a string rather than the tag entry.

  • schema_namespace (str) – Prepends given namespace to each name if returning names.

Returns:

List of HedSchemaEntry or strings representing the names.

Return type:

list[Union[HedSchemaEntry, str]]

items()

Return the items.

keys()

The names of the keys.

property section_key

Returns the HedSectionKey identifying this section.

Returns:

The key for this schema section.

Return type:

HedSectionKey

values()

All names of the sections.

HedSchemaUnitClassSection

class HedSchemaUnitClassSection(section_key, case_sensitive=True)

Bases: HedSchemaSection

The schema section containing unit classes.

property duplicate_names

Returns a dict of entries that share the same name within this section.

Returns:

Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.

Return type:

dict

get(key)

Return the name associated with key.

Parameters:

key (str) – The name of the key.

get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') → list[HedSchemaEntry | str]

Return entries or names with given attribute.

Parameters:
  • attribute_name (str) – The name of the attribute (generally a HedKey entry).

  • return_name_only (bool) – If True, return the name as a string rather than the tag entry.

  • schema_namespace (str) – Prepends given namespace to each name if returning names.

Returns:

List of HedSchemaEntry or strings representing the names.

Return type:

list[Union[HedSchemaEntry, str]]

items()

Return the items.

keys()

The names of the keys.

property section_key

Returns the HedSectionKey identifying this section.

Returns:

The key for this schema section.

Return type:

HedSectionKey

values()

Return the entries stored in this section.

HedSchemaTagSection

class HedSchemaTagSection(*args, case_sensitive=False, **kwargs)[source]

Bases: HedSchemaSection

The schema section containing all tags.

property duplicate_names

Returns a dict of entries that share the same name within this section.

Returns:

Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.

Return type:

dict

get(key)[source]

Return the tag entry for the given long-form key, or None if not found.

Parameters:

key (str) – Long-form tag string to look up.

Returns:

The matching entry, or None if absent.

Return type:

HedTagEntry or None

get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') list[HedSchemaEntry | str]

Return entries or names with given attribute.

Parameters:
  • attribute_name (str) – The name of the attribute (generally a HedKey entry).

  • return_name_only (bool) – If True, return the name as a string rather than the tag entry.

  • schema_namespace (str) – Prepends given namespace to each name if returning names.

Returns:

List of HedSchemaEntry or strings representing the names.

Return type:

list[Union[HedSchemaEntry, str]]

items()

Return the items.

keys()

The names of the keys.

property section_key

Returns the HedSectionKey identifying this section.

Returns:

The key for this schema section.

Return type:

HedSectionKey

values()

Return the entries stored in this section.

Schema IO and caching

Schema loading functions

Utilities for loading and outputting HED schema.

from_dataframes(schema_data, schema_namespace=None, name=None) HedSchema[source]

Create a schema from the given dataframes.

Parameters:
  • schema_data (dict of str or None) – A dict of DF_SUFFIXES:file_as_string_or_df. Should have an entry for all values of DF_SUFFIXES.

  • schema_namespace (str, None) – The name_prefix all tags in this schema will accept.

  • name (str or None) – User-supplied identifier for this schema.

Returns:

The loaded schema.

Return type:

HedSchema

Raises:

Notes

  • The loading is determined by file type.

from_string(schema_string, schema_format='.xml', schema_namespace=None, schema=None, name=None) HedSchema[source]

Create a schema from the given string.

Parameters:
  • schema_string (str) – An XML or MEDIAWIKI file as a single long string.

  • schema_format (str) – The schema format of the source schema string. Allowed normal values: .mediawiki, .xml, .json.

  • schema_namespace (str, None) – The name_prefix all tags in this schema will accept.

  • schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.

  • name (str or None) – User-supplied identifier for this schema.

Returns:

The loaded schema.

Return type:

HedSchema

Raises:

HedFileError

  • If empty string or invalid extension is passed.

  • Other fatal formatting issues with the file.

Notes

  • The loading is determined by file type.

get_hed_xml_version(xml_file_path) str[source]

Get the version number from a HED XML file.

Parameters:

xml_file_path (str) – The path to a HED XML file.

Returns:

The version number of the HED XML file.

Return type:

str

Raises:

HedFileError

  • There is an issue loading the schema.

load_schema(hed_path, schema_namespace=None, schema=None, name=None) HedSchema[source]

Load a schema from the given file or URL path.

Parameters:
  • hed_path (str) – A filepath or URL to open a schema from. If loading TSV files, this should be a single filename of the form basename.tsv, where the actual files are named basename_Struct.tsv, basename_Tag.tsv, etc. Alternatively, you can point to a directory containing the .tsv files.

  • schema_namespace (str or None) – The name_prefix all tags in this schema will accept.

  • schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.

  • name (str or None) – User-supplied identifier for this schema.

Returns:

The loaded schema.

Return type:

HedSchema

Raises:

load_schema_version(xml_version=None, xml_folder=None) HedSchema | HedSchemaGroup[source]

Return a HedSchema or HedSchemaGroup extracted from xml_version.

Parameters:
  • xml_version (str or list) – List or str specifying which official HED schemas to use. A JSON str format is also supported, based on the output of HedSchema.get_formatted_version. Basic format: [schema_namespace:][library_name_]X.Y.Z.

  • xml_folder (str) – Path to a folder containing schema.

Returns:

The schema or schema group extracted.

Return type:

Union[HedSchema, HedSchemaGroup]

Raises:
  • HedFileError – The xml_version is not valid.

  • HedFileError – The specified version cannot be found or loaded.

  • HedFileError – Other fatal errors loading the schema (These are unlikely if you are not editing them locally).

  • HedFileError – The prefix is invalid.
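To make the basic format [schema_namespace:][library_name_]X.Y.Z concrete, here is a minimal sketch that splits such a string into its three parts. The helper split_version_spec and the exact character classes in the pattern (letters only for the namespace and library name) are assumptions for illustration; the library may accept a wider or narrower set of strings.

```python
import re

# Pattern for "[schema_namespace:][library_name_]X.Y.Z" as described above.
VERSION_SPEC = re.compile(
    r"^(?:(?P<namespace>[A-Za-z]+):)?"  # optional namespace followed by ':'
    r"(?:(?P<library>[A-Za-z]+)_)?"     # optional library name followed by '_'
    r"(?P<version>\d+\.\d+\.\d+)$"      # required X.Y.Z version number
)


def split_version_spec(spec):
    """Return (namespace, library, version); missing parts become ''."""
    match = VERSION_SPEC.match(spec)
    if match is None:
        raise ValueError(f"Not a valid version specifier: {spec!r}")
    return (match["namespace"] or "", match["library"] or "", match["version"])


for spec in ("8.3.0", "score_1.0.0", "sc:score_1.0.0"):
    print(spec, split_version_spec(spec))
```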

parse_version_list(xml_version_list) dict[source]

Take a list of XML versions and return a dictionary split by prefix.

Examples:
  • [“score”, “testlib”] will return {“”: “score, testlib”}

  • [“score”, “testlib”, “ol:otherlib”] will return {“”: “score, testlib”, “ol:”: “otherlib”}

Parameters:

xml_version_list (list) – List of str specifying which HED schemas to use

Returns:

A dictionary of version strings split by prefix.

Return type:

dict
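The grouping described above can be sketched in a few lines. This version follows the documented examples literally; the helper name group_versions_by_prefix is hypothetical, and the real function's key format and separator may differ from what is shown here.

```python
from collections import defaultdict


def group_versions_by_prefix(version_list):
    """Group version strings by their 'prefix:' part ('' when there is none)."""
    grouped = defaultdict(list)
    for version in version_list:
        prefix = ""
        if ":" in version:
            namespace, _, version = version.partition(":")
            prefix = namespace + ":"
        grouped[prefix].append(version)
    # Join each group into one comma-separated string, as in the examples.
    return {prefix: ", ".join(versions) for prefix, versions in grouped.items()}


print(group_versions_by_prefix(["score", "testlib", "ol:otherlib"]))
```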

Cache management

Infrastructure for caching HED schema from remote repositories.

cache_local_versions(cache_folder) int | None[source]

Cache all schemas included with the HED installation.

Parameters:

cache_folder (str) – The folder holding the cache.

Returns:

Returns -1 on cache access failure, None otherwise.

Return type:

Union[int, None]

cache_xml_versions(hed_base_urls=('https://api.github.com/repos/hed-standard/hed-schemas/contents/standard_schema',), hed_library_urls=('https://api.github.com/repos/hed-standard/hed-schemas/contents/library_schemas',), skip_folders=('deprecated',), cache_folder=None) float[source]

Cache all schemas at the given URLs.

Parameters:
  • hed_base_urls (str or list) – Path or list of paths. These should point to a single folder.

  • hed_library_urls (str or list) – Path or list of paths. These should point to a folder containing library folders.

  • skip_folders (list) – A list of subfolders to skip over when downloading.

  • cache_folder (str) – The folder holding the cache.

Returns:

Returns -1 if cache failed for any reason, including having been cached too recently.

Returns 0 if it successfully cached this time.

Return type:

float

Notes

get_cache_directory(cache_folder=None) str[source]

Return the current value of HED_CACHE_DIRECTORY.

Parameters:

cache_folder (str) – Optional cache folder override.

Returns:

The cache directory path.

Return type:

str

get_hed_version_path(xml_version, library_name=None, local_hed_directory=None) str | None[source]

Get the HED XML file path for a given version.

Searches the local cache first. If the version is not found and local_hed_directory is the default HED cache, the cache is refreshed from GitHub before a second lookup. No network call is made for custom directories.

Parameters:
  • xml_version (str) – The version string to look up.

  • library_name (str or None) – Optional schema library name.

  • local_hed_directory (str or None) – Path to local HED directory. Defaults to HED_CACHE_DIRECTORY. Passing a custom path disables the automatic GitHub refresh.

Returns:

The path to the requested HED XML file, or None.

Return type:

Union[str, None]
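The lookup order described above (local search first, one refresh and retry only for the default cache) can be sketched as follows. The function find_version_path, its lookup and refresh callables, and the file paths are all stand-ins invented for this example; they are not part of the library.

```python
def find_version_path(version, directory, default_directory, lookup, refresh):
    """Look up locally; refresh and retry only for the default cache directory."""
    path = lookup(version, directory)
    if path is None and directory == default_directory:
        refresh(directory)  # stands in for the GitHub cache refresh
        path = lookup(version, directory)
    return path


# Fake cache contents and helpers, used only to exercise the flow.
cache = {"/default": {}, "/custom": {}}
refresh_calls = []


def lookup(version, directory):
    return cache[directory].get(version)


def refresh(directory):
    refresh_calls.append(directory)
    cache[directory]["8.3.0"] = directory + "/HED8.3.0.xml"  # made-up path


found = find_version_path("8.3.0", "/default", "/default", lookup, refresh)
missing = find_version_path("8.3.0", "/custom", "/default", lookup, refresh)
print(found, missing, refresh_calls)
```

The custom directory never triggers the refresh, which matches the statement that no network call is made for custom directories.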

get_hed_versions(local_hed_directory=None, library_name=None, check_prerelease=False) list | dict[source]

Get the HED versions in the HED directory.

Parameters:
  • local_hed_directory (str) – Directory to check for versions, which defaults to hed_cache.

  • library_name (str or None) – An optional schema library name. None retrieves the standard schema only. Pass “all” to retrieve all standard and library schemas as a dict.

  • check_prerelease (bool) – If True, results can include prerelease schemas. Default is False, returning only released versions.

Returns:

List of version numbers or dictionary {library_name: [versions]}.

Return type:

Union[list, dict]

get_library_data(library_name, cache_folder=None) dict[source]

Retrieve the library data for the given library.

Currently, this is just the valid ID range.

Parameters:
  • library_name (str) – The schema name. “” for standard schema.

  • cache_folder (str) – The cache folder to use if not using the default.

Returns:

The data for a specific library.

Return type:

dict

set_cache_directory(new_cache_dir)[source]

Set default global HED cache directory.

Parameters:

new_cache_dir (str) – Directory to check for versions.

Schema loader base class

class SchemaLoader(filename, schema_as_string=None, schema=None, file_format=None, name='')[source]

Bases: ABC

Base class for schema loading, to handle basic errors and partnered schemas.

Expected usage is SchemaLoaderXML.load(filename).

SchemaLoaderXML(filename) will load just the header_attributes.

static find_rooted_entry(tag_entry, schema, loading_merged)[source]

This semi-validates rooted tags, raising an exception on major errors.

Parameters:
  • tag_entry (HedTagEntry) – The possibly rooted tag.

  • schema (HedSchema) – The schema being loaded.

  • loading_merged (bool) – Whether this schema was already merged before loading.

Returns:

The base tag entry from the standard schema, or None if this tag isn’t rooted.

Return type:

Union[HedTagEntry, None]

Raises:

HedFileError

  • A rooted attribute is found in a non-paired schema

  • A rooted attribute is not a string

  • A rooted attribute was found on a non-root node in an unmerged schema.

  • A rooted attribute is found on a root node in a merged schema.

  • A rooted attribute indicates a tag that doesn’t exist in the base schema.

fix_extra(key)[source]

Normalize an extras dataframe by ensuring required columns are present and in canonical order.

Parameters:

key (str) – The extras dict key identifying which extra dataframe to fix.

Returns:

The normalized dataframe with required columns added and sorted.

Return type:

pd.DataFrame

fix_extras()[source]

Fixes the extras after loading the schema, to ensure they are in the correct format.

classmethod load(filename=None, schema_as_string=None, schema=None, file_format=None, name='')[source]

Loads and returns the schema, including partnered schema if applicable.

Parameters:
  • filename (str or None) – A valid filepath or None.

  • schema_as_string (str or None) – A full schema as text or None.

  • schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.

  • file_format (str or None) – If this is an OWL file being loaded, this is the format. Allowed values include: turtle, json-ld, and owl(xml).

  • name (str or None) – Optional user-supplied identifier; by default, the filename is used.

Returns:

The new schema

Return type:

HedSchema

property schema

The partially loaded schema if you are after just header attributes.

Schema serializers

class Schema2DF[source]

Bases: Schema2Base

Converts a HedSchema to a set of pandas DataFrames, one per schema section.

process_schema(hed_schema, save_merged=False)

Convert a HedSchema object to the subclass’s output format (mediawiki, XML, JSON, or TSV).

This method owns the canonical section-traversal order for all serializers. Each _output_* call delegates to the format-specific subclass hook.

Warning

If a new HedSectionKey is added to the schema, a new _output_* call must be inserted here and the matching hook must be implemented in each of the four serializer subclasses (schema2wiki, schema2xml, schema2json, schema2df).

Parameters:
  • hed_schema (HedSchema) – The schema to be serialized.

  • save_merged (bool) – If True, serialize as a merged (fully expanded) schema when the schema has a withStandard attribute; ignored for standard schemas (which are always saved fully).

Returns:

Format-dependent output object (string, ElementTree, dict, or DataFrame dict, depending on the subclass).

Return type:

Any

Raises:

HedFileError – If the schema cannot be saved (e.g., merged multi-library schema).
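The arrangement described above, where the base class owns the traversal order and each serializer only supplies hooks, is a template-method pattern. A minimal sketch follows; the class names, hook names, and the three-section order are invented for illustration and do not reproduce Schema2Base.

```python
from abc import ABC, abstractmethod


class SerializerSketch(ABC):
    """Toy base class: it fixes the section order, subclasses fill in hooks."""

    SECTION_ORDER = ("tags", "unitClasses", "units")  # illustrative subset only

    def process_schema(self, schema):
        self._start_output()
        for section in self.SECTION_ORDER:
            self._output_section(section, schema.get(section, []))
        return self._finish_output()

    @abstractmethod
    def _start_output(self):
        """Prepare the format-specific output container."""

    @abstractmethod
    def _output_section(self, name, entries):
        """Write one section in the subclass's format."""

    @abstractmethod
    def _finish_output(self):
        """Return the finished output object."""


class TextSerializerSketch(SerializerSketch):
    def _start_output(self):
        self.lines = []

    def _output_section(self, name, entries):
        self.lines.append(f"[{name}]")
        self.lines.extend(entries)

    def _finish_output(self):
        return "\n".join(self.lines)


text = TextSerializerSketch().process_schema({"tags": ["Event"], "units": ["second"]})
print(text)
```

Adding a section to SECTION_ORDER changes every serializer at once, which is why the warning above asks for a matching hook in each subclass.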

class Schema2Wiki[source]

Bases: Schema2Base

Converts a HedSchema to MediaWiki-format text output.

process_schema(hed_schema, save_merged=False)

Convert a HedSchema object to the subclass’s output format (mediawiki, XML, JSON, or TSV).

This method owns the canonical section-traversal order for all serializers. Each _output_* call delegates to the format-specific subclass hook.

Warning

If a new HedSectionKey is added to the schema, a new _output_* call must be inserted here and the matching hook must be implemented in each of the four serializer subclasses (schema2wiki, schema2xml, schema2json, schema2df).

Parameters:
  • hed_schema (HedSchema) – The schema to be serialized.

  • save_merged (bool) – If True, serialize as a merged (fully expanded) schema when the schema has a withStandard attribute; ignored for standard schemas (which are always saved fully).

Returns:

Format-dependent output object (string, ElementTree, dict, or DataFrame dict, depending on the subclass).

Return type:

Any

Raises:

HedFileError – If the schema cannot be saved (e.g., merged multi-library schema).

class Schema2XML[source]

Bases: Schema2Base

Converts a HedSchema to an XML ElementTree representation.

process_schema(hed_schema, save_merged=False)

Convert a HedSchema object to the subclass’s output format (mediawiki, XML, JSON, or TSV).

This method owns the canonical section-traversal order for all serializers. Each _output_* call delegates to the format-specific subclass hook.

Warning

If a new HedSectionKey is added to the schema, a new _output_* call must be inserted here and the matching hook must be implemented in each of the four serializer subclasses (schema2wiki, schema2xml, schema2json, schema2df).

Parameters:
  • hed_schema (HedSchema) – The schema to be serialized.

  • save_merged (bool) – If True, serialize as a merged (fully expanded) schema when the schema has a withStandard attribute; ignored for standard schemas (which are always saved fully).

Returns:

Format-dependent output object (string, ElementTree, dict, or DataFrame dict, depending on the subclass).

Return type:

Any

Raises:

HedFileError – If the schema cannot be saved (e.g., merged multi-library schema).

Schema comparison

SchemaComparer

class SchemaComparer(schema1, schema2)[source]

Bases: object

Class for comparing HED schemas and generating change logs.

ANNOTATION_PROPERTY_EXTERNAL = 'AnnotationPropertyExternal'
EXTRAS_SECTION = 'Extras changes'
HED_ID_SECTION = 'HedId changes'
MISC_SECTION = 'misc'
PREFIXES = 'Prefixes'
SECTION_ENTRY_NAMES = {'AnnotationPropertyExternal': 'AnnotationPropertyExternal', 'HedId changes': 'Modified Hed Ids', 'Prefixes': 'Prefixes', 'Sources': 'Sources', 'misc': 'Misc Metadata', HedSectionKey.Attributes: 'Attribute', HedSectionKey.Properties: 'Property', HedSectionKey.Tags: 'Tag', HedSectionKey.UnitClasses: 'Unit Class', HedSectionKey.UnitModifiers: 'Unit Modifier', HedSectionKey.Units: 'Unit', HedSectionKey.ValueClasses: 'Value Class'}
SECTION_ENTRY_NAMES_PLURAL = {'AnnotationPropertyExternal': 'AnnotationPropertyExternal', 'Extras changes': 'Extras', 'HedId changes': 'Modified Hed Ids', 'Prefixes': 'Prefixes', 'Sources': 'Sources', 'misc': 'Misc Metadata', HedSectionKey.Attributes: 'Attributes', HedSectionKey.Properties: 'Properties', HedSectionKey.Tags: 'Tags', HedSectionKey.UnitClasses: 'Unit Classes', HedSectionKey.UnitModifiers: 'Unit Modifiers', HedSectionKey.Units: 'Units', HedSectionKey.ValueClasses: 'Value Classes'}
SOURCES = 'Sources'
compare_differences(attribute_filter=None, title='')[source]

Compare two schemas and return a formatted report of all differences.

This is a convenience method that combines gather_schema_changes and pretty_print_change_dict to produce a complete, human-readable comparison report.

Parameters:
  • attribute_filter (HedKey or None) – If provided, only entries with this attribute are compared. Set to None to compare all entries. Default is None.

  • title (str) – Custom title for the report. If empty, generates title from schema names. Default is empty string.

Returns:

Formatted markdown string describing all differences between the schemas.

Return type:

str

compare_schemas(attribute_filter='inLibrary', sections=(HedSectionKey.Tags,))[source]

Compare two schemas section by section, categorizing entries by match status.

This is the core comparison method that categorizes all schema entries into four groups: matches (identical entries), entries only in schema1, entries only in schema2, and entries with the same name but different attributes.

Parameters:
  • attribute_filter (HedKey or None) – If provided, only entries with this attribute are compared. Set to None to compare all entries. Default is HedKey.InLibrary.

  • sections (tuple or None) – Tuple of HedSectionKey values to compare. If None, compares all sections including miscellaneous metadata. Default is (HedSectionKey.Tags,).

Returns:

Four dictionaries (matches, not_in_schema1, not_in_schema2, unequal_entries):
  • matches: Entries identical in both schemas

  • not_in_schema1: Entries only in schema2

  • not_in_schema2: Entries only in schema1

  • unequal_entries: Entries with same name but different attributes

Return type:

tuple
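The four-way split can be illustrated on plain dictionaries. In this sketch each schema is a dict mapping entry names to attribute dicts, and categorize_entries is a hypothetical helper; the real method works on schema entry objects and also applies the attribute filter and section selection, which are left out here.

```python
def categorize_entries(entries1, entries2):
    """Split entries into matches, only-in-2, only-in-1, and unequal groups."""
    matches, not_in_schema1, not_in_schema2, unequal_entries = {}, {}, {}, {}
    for name, entry1 in entries1.items():
        if name not in entries2:
            not_in_schema2[name] = entry1  # present only in schema1
        elif entry1 == entries2[name]:
            matches[name] = (entry1, entries2[name])
        else:
            unequal_entries[name] = (entry1, entries2[name])
    for name, entry2 in entries2.items():
        if name not in entries1:
            not_in_schema1[name] = entry2  # present only in schema2
    return matches, not_in_schema1, not_in_schema2, unequal_entries


schema1 = {"Event": {"extensionAllowed": True}, "Action": {}, "Item": {}}
schema2 = {"Event": {"extensionAllowed": True}, "Action": {"reserved": True}, "Agent": {}}
matches, not_in_schema1, not_in_schema2, unequal_entries = categorize_entries(schema1, schema2)
print(sorted(matches), sorted(not_in_schema1), sorted(not_in_schema2), sorted(unequal_entries))
```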

find_matching_tags(sections=(HedSectionKey.Tags,), return_string=True)[source]

Find tags with matching names in both schemas.

This method identifies all entries that exist in both schemas with the same name, regardless of whether their attributes differ.

Parameters:
  • sections (tuple) – Tuple of HedSectionKey values indicating which sections to compare. Default is (HedSectionKey.Tags,).

  • return_string (bool) – If True, return formatted string. If False, return dictionary. Default is True.

Returns:

If return_string is True, returns formatted string listing matching tags.

If False, returns dictionary mapping section keys to dictionaries of matching tag entries.

Return type:

str or dict

gather_schema_changes(attribute_filter=None)[source]

Generate a structured changelog by comparing the two schemas.

This method performs a comprehensive comparison and produces a categorized change dictionary suitable for version control and documentation. Changes are classified by severity (Major, Minor, Patch, Unknown) and organized by schema section.

Parameters:

attribute_filter (HedKey or None) – If provided, only entries with this attribute are compared. Set to None to compare all entries. Default is None.

Returns:

Dictionary mapping section keys to lists of change dictionaries. Each change dictionary contains ‘change_type’, ‘change’ (description), and ‘tag’ (affected entry).

Return type:

dict

pretty_print_change_dict(change_dict, title='Schema changes', use_markdown=True)[source]

Format a change dictionary into a human-readable string.

Converts the structured change dictionary from gather_schema_changes into a formatted text report suitable for display or documentation.

Parameters:
  • change_dict (dict) – Dictionary of changes as returned by gather_schema_changes.

  • title (str) – Title for the change report. Default is “Schema changes”.

  • use_markdown (bool) – If True, use markdown formatting (bold headers, bullet points). If False, use plain text with tabs. Default is True.

Returns:

Formatted string representation of the changes.

Return type:

str
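As a rough picture of how a change dictionary of the documented shape might be turned into text, the sketch below renders one with either markdown or tab-indented plain text. The function format_changes and the exact line layout are assumptions; the report produced by the library will look different in its details.

```python
def format_changes(change_dict, title="Schema changes", use_markdown=True):
    """Render {section: [change dicts]} as markdown or tab-indented text."""
    lines = [f"**{title}**" if use_markdown else title]
    for section, changes in change_dict.items():
        lines.append(f"**{section}:**" if use_markdown else f"{section}:")
        for change in changes:
            text = f"{change['tag']} ({change['change_type']}): {change['change']}"
            lines.append(f"- {text}" if use_markdown else f"\t{text}")
    return "\n".join(lines)


changes = {"Tags": [{"change_type": "Major", "change": "Tag removed", "tag": "Item"}]}
markdown_report = format_changes(changes)
plain_report = format_changes(changes, use_markdown=False)
print(markdown_report)
print(plain_report)
```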

Comparison utilities

Functions supporting comparison of schemas.

Schema validation

Compliance checking

check_compliance(hed_schema, check_for_warnings=True, name=None, error_handler=None)[source]

Check a HED schema for compliance.

Parameters:
  • hed_schema (HedSchema) – HedSchema object to check for HED3 compliance.

  • check_for_warnings (bool) – If True, check for formatting issues like invalid characters, capitalization, etc.

  • name (str) – If present, will use as filename for context.

  • error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.

Returns:

A list of all warnings and errors found. Each issue is a dict.

The returned list has an additional compliance_summary attribute (ComplianceSummary) providing a structured report.

Return type:

list

Raises:

ValueError – If hed_schema is not a HedSchema instance.

class SchemaValidator(hed_schema, error_handler)[source]

Bases: object

Validates a loaded HedSchema for compliance.

The five content sections (Tags, UnitClasses, Units, UnitModifiers, ValueClasses) are validated using range and domain metadata that the schema itself provides in its Attributes and Properties sections.

Typical usage is through check_compliance().

check_annotation_attribute_values()[source]

Validate that annotation attribute values reference valid prefixes, external annotations, and sources.

For each entry that has an annotation attribute, checks that:

  1. The value starts with prefix:id where prefix: is defined in the Prefixes extras section and prefix: + id is a row in the ExternalAnnotations extras section.

  2. If the annotation references dc:source, the remaining text after dc:source must start with a name from the Sources extras section.
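The two rules above can be sketched with plain sets standing in for the extras sections. The helper check_annotation, the set-based representation of Prefixes, ExternalAnnotations, and Sources, and the issue messages are all assumptions for illustration.

```python
def check_annotation(value, prefixes, external_annotations, sources):
    """Return a list of issue strings for one annotation attribute value."""
    issues = []
    head, _, rest = value.partition(" ")
    prefix, colon, identifier = head.partition(":")
    if not colon or prefix + ":" not in prefixes:
        issues.append(f"Unknown prefix in '{head}'")
    elif (prefix + ":", identifier) not in external_annotations:
        issues.append(f"'{head}' is not a known external annotation")
    elif head == "dc:source" and not any(rest.startswith(name) for name in sources):
        issues.append(f"'{rest}' does not start with a known source")
    return issues


prefixes = {"dc:"}
external_annotations = {("dc:", "source"), ("dc:", "contributor")}
sources = {"Wikipedia"}

ok = check_annotation("dc:source Wikipedia article on events",
                      prefixes, external_annotations, sources)
bad_source = check_annotation("dc:source Unknown page",
                              prefixes, external_annotations, sources)
bad_prefix = check_annotation("foo:bar text", prefixes, external_annotations, sources)
print(ok, bad_source, bad_prefix)
```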

check_attributes()[source]

Validate every attribute on every entry in every section.

For each attribute this performs three layers of checking:

  1. Domain — the attribute is valid for the entry’s section. Any attribute not in the section’s valid_attributes was already flagged as _unknown_attributes during loading; those are reported here.

  2. Range — the attribute value matches the range type declared on the attribute’s own definition (boolRange, tagRange, etc.).

  3. Semantic — extra attribute-specific rules (e.g. takesValue requires a placeholder entry, deprecatedFrom version must exist).
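The three layers above can be sketched as one small function. The definitions dictionary layout, the helper check_attribute, and the sample rules (including the range assigned to deprecatedFrom and the list of known versions) are invented for this example and only mimic the idea of domain, range, and semantic checks.

```python
def check_attribute(section, name, value, definitions):
    """Apply domain, range, and semantic checks to one attribute value."""
    definition = definitions.get(name)
    # 1. Domain: the attribute must be declared and valid for this section.
    if definition is None or section not in definition["domains"]:
        return [f"{name}: not valid in section '{section}'"]
    issues = []
    # 2. Range: the value must match the declared range type.
    if definition["range"] == "boolRange" and not isinstance(value, bool):
        issues.append(f"{name}: expected a boolean value")
    # 3. Semantic: optional attribute-specific rule.
    semantic = definition.get("semantic")
    if semantic is not None and not semantic(value):
        issues.append(f"{name}: failed attribute-specific check")
    return issues


known_versions = {"8.0.0", "8.1.0"}  # hypothetical released versions
definitions = {
    "extensionAllowed": {"domains": {"tags"}, "range": "boolRange"},
    "deprecatedFrom": {
        "domains": {"tags", "units"},
        "range": "stringRange",
        "semantic": lambda version: version in known_versions,
    },
}

print(check_attribute("tags", "extensionAllowed", True, definitions))
print(check_attribute("units", "extensionAllowed", True, definitions))
print(check_attribute("tags", "extensionAllowed", "yes", definitions))
print(check_attribute("tags", "deprecatedFrom", "9.9.9", definitions))
```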

check_duplicate_hed_ids()[source]

Check for duplicate hedId values across all schema sections.

check_duplicate_names()[source]

Check for duplicate entry names across library merges.

check_extras_columns()[source]

Validate that all extras DataFrames have non-empty values in required columns.

For each extras section (Sources, Prefixes, ExternalAnnotations), checks that every cell in the required columns defined in df_constants.extras_column_dict has a non-empty value.

Note

Missing columns are automatically added with empty strings during schema loading (see base2schema.fix_extra), so only value presence needs to be checked here.

check_if_prerelease_version()[source]

Warn if this schema version is newer than all known released versions.

check_invalid_characters()[source]

Validate characters in entry names and descriptions.

check_prologue_epilogue()[source]

Validate characters in the prologue and epilogue.

Compliance summary

class ComplianceSummary(schema_name='', schema_version='')[source]

Bases: object

Tracks what was checked during schema compliance validation and the results.

This provides a structured report of all checks performed, how many entries were examined, and how many issues were found per check category.

Use get_summary() for a human-readable text report, or access check_results directly for programmatic use.

add_sub_check(sub_check_name)[source]

Record a named sub-check within the current check.

Parameters:

sub_check_name (str) – Name of the sub-check (e.g. an attribute validator name).

get_summary(verbose=True)[source]

Return a human-readable summary of all compliance checks.

Parameters:

verbose (bool) – If True, include per-section breakdowns and sub-check lists.

Returns:

Formatted multi-line summary report.

Return type:

str

record_issues(issue_count)[source]

Record issues found during the current check.

Parameters:

issue_count (int) – Number of issues found.

record_section(section_key, entries_checked, entries_skipped=0)[source]

Record that a section was examined during the current check.

Parameters:
  • section_key (HedSectionKey or str) – The section that was checked.

  • entries_checked (int) – Number of entries examined in this section.

  • entries_skipped (int) – Number of entries skipped (e.g. deprecated).

start_check(check_name, description='')[source]

Begin tracking a new compliance check.

Parameters:
  • check_name (str) – Short identifier for the check (e.g. “prerelease_version”).

  • description (str) – Human-readable description of what this check validates.

property total_entries_checked

Return total entries checked across all checks.

Returns:

Total number of entries examined.

Return type:

int

property total_issues

Return total issues across all checks.

Returns:

Total number of issues found.

Return type:

int
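The bookkeeping that the methods and properties above describe can be pictured with a toy stand-in. SummarySketch mirrors the documented method names, but its internal storage (a list of dicts) is an assumption and may not match how ComplianceSummary stores check_results.

```python
class SummarySketch:
    """Toy tracker: one record per check, with per-section counts and issues."""

    def __init__(self):
        self.check_results = []

    def start_check(self, check_name, description=""):
        self.check_results.append(
            {"name": check_name, "description": description, "sections": {}, "issues": 0}
        )

    def record_section(self, section_key, entries_checked, entries_skipped=0):
        # Attach the counts to the most recently started check.
        self.check_results[-1]["sections"][section_key] = (entries_checked, entries_skipped)

    def record_issues(self, issue_count):
        self.check_results[-1]["issues"] += issue_count

    @property
    def total_entries_checked(self):
        return sum(
            checked
            for result in self.check_results
            for checked, _skipped in result["sections"].values()
        )

    @property
    def total_issues(self):
        return sum(result["issues"] for result in self.check_results)


summary = SummarySketch()
summary.start_check("invalid_characters", "Characters in names and descriptions")
summary.record_section("tags", 10, entries_skipped=2)
summary.record_section("units", 5)
summary.record_issues(3)
summary.start_check("duplicate_names")
summary.record_section("tags", 10)
summary.record_issues(0)
print(summary.total_entries_checked, summary.total_issues)
```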

HED ID validator

class HedIDValidator(hed_schema)[source]

Bases: object

Support class to validate hedIds in schemas.

verify_tag_id(hed_schema, tag_entry, attribute_name)[source]

Validates the hedId attribute values.

This follows the template from schema_attribute_validators.py.

Parameters:
  • hed_schema (HedSchema) – The schema to use for validation

  • tag_entry (HedSchemaEntry) – The schema entry for this tag.

  • attribute_name (str) – The name of this attribute.

Returns:

A list of issues from validating this attribute.

Return type:

list

Schema constants

HedKey

class HedKey[source]

Bases: object

Known property and attribute names.

Notes

  • These names should match the attribute values in the XML/wiki.

AllowedCharacter = 'allowedCharacter'
AnnotationProperty = 'annotationProperty'
BoolRange = 'boolRange'
ConversionFactor = 'conversionFactor'
DefaultUnits = 'defaultUnits'
DeprecatedFrom = 'deprecatedFrom'
ElementDomain = 'elementDomain'
ExtensionAllowed = 'extensionAllowed'
HedID = 'hedId'
InLibrary = 'inLibrary'
NumericRange = 'numericRange'
Recommended = 'recommended'
RelatedTag = 'relatedTag'
RequireChild = 'requireChild'
Required = 'required'
Reserved = 'reserved'
Rooted = 'rooted'
SIUnit = 'SIUnit'
SIUnitModifier = 'SIUnitModifier'
SIUnitSymbolModifier = 'SIUnitSymbolModifier'
StringRange = 'stringRange'
SuggestedTag = 'suggestedTag'
TagDomain = 'tagDomain'
TagGroup = 'tagGroup'
TagRange = 'tagRange'
TakesValue = 'takesValue'
TopLevelTagGroup = 'topLevelTagGroup'
Unique = 'unique'
UnitClass = 'unitClass'
UnitClassDomain = 'unitClassDomain'
UnitClassRange = 'unitClassRange'
UnitDomain = 'unitDomain'
UnitModifierDomain = 'unitModifierDomain'
UnitPrefix = 'unitPrefix'
UnitRange = 'unitRange'
UnitSymbol = 'unitSymbol'
ValueClass = 'valueClass'
ValueClassDomain = 'valueClassDomain'
ValueClassRange = 'valueClassRange'

HedSectionKey

class HedSectionKey(*values)[source]

Bases: Enum

Keys designating specific sections in a HedSchema object.

Attributes = 'attributes'
Properties = 'properties'
Tags = 'tags'
UnitClasses = 'unitClasses'
UnitModifiers = 'unitModifiers'
Units = 'units'
ValueClasses = 'valueClasses'
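Because the section keys are Enum members, a key can be recovered from its string value and compared by identity. The sketch below uses a local three-member copy named SectionKeySketch so that it runs without the library installed; the member names and values are taken from the listing above.

```python
from enum import Enum


class SectionKeySketch(Enum):
    """Local stand-in with three of the members listed above."""

    Tags = "tags"
    UnitClasses = "unitClasses"
    Units = "units"


key = SectionKeySketch("tags")  # look a member up by its string value
names = [member.name for member in SectionKeySketch]
print(key, key.value, names)
```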

HedKeyOld

class HedKeyOld[source]

Bases: object

Fully deprecated schema attribute key constants retained for backwards compatibility.

BoolProperty = 'boolProperty'
ElementProperty = 'elementProperty'
IsInheritedProperty = 'isInheritedProperty'
NodeProperty = 'nodeProperty'
UnitClassProperty = 'unitClassProperty'
UnitModifierProperty = 'unitModifierProperty'
UnitProperty = 'unitProperty'
ValueClassProperty = 'valueClassProperty'