Schema¶
HED schema management and validation tools.
Core schema classes¶
HedSchemaBase¶
- class HedSchemaBase[source]¶
Bases:
ABC
Base class for schema and schema group.
Implementing the abstract functions will allow you to use the schema for validation.
- abstractmethod check_compliance(check_for_warnings=True, name=None, error_handler=None)[source]¶
Check for HED3 compliance of this schema.
- Parameters:
check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.
name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.
error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.
- Returns:
A list of all warnings and errors found in the file. Each issue is a dictionary.
- Return type:
list[dict]
- abstractmethod find_tag_entry(tag, schema_namespace='')[source]¶
Find the schema entry for a given source tag.
- Parameters:
- Returns:
The located tag entry for this tag.
The remainder of the tag that isn’t part of the base tag.
A list of errors while converting.
- Return type:
tuple[HedTagEntry | None, str | None, list[dict]]
Notes
Works left to right (which is mostly relevant for errors).
- abstractmethod get_formatted_version()[source]¶
The HED version string of this schema, including namespace and library name if any.
- Returns:
The complete version of this schema including library name and namespace.
- Return type:
str
- abstractmethod get_schema_versions()[source]¶
A list of HED version strings for this schema, including namespace and library name if any.
- Returns:
The complete versions of this schema including library name and namespace.
- Return type:
list[str]
- abstractmethod get_tag_entry(name, key_class=HedSectionKey.Tags, schema_namespace='')[source]¶
Return the schema entry for this tag, if one exists.
- Parameters:
name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but it’s not required.
key_class (HedSectionKey or str) – The type of entry to return.
schema_namespace (str) – Only used on Tags. If incorrect, will return None.
- Returns:
The schema entry for the given tag.
- Return type:
- abstractmethod get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags)[source]¶
Return tag entries with the given attribute.
- Parameters:
attribute (str) – A tag attribute: e.g., HedKey.ExtensionAllowed
key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.
- Returns:
A list of all tags with this attribute.
- Return type:
list
Notes
The result is cached so will be fast after first call.
- property name¶
User provided name for this schema, defaults to filename or version if no name provided.
- property schema_83_props¶
Return whether this is an 8.3.0 or greater schema.
- Returns:
True if standard or partnered schema is 8.3.0 or greater.
- Return type:
bool
HedSchema¶
- class HedSchema[source]¶
Bases:
HedSchemaBase
A HED schema suitable for processing.
- property attributes: HedSchemaSection¶
Return the attributes schema section.
- Returns:
The attributes section.
- Return type:
HedSchemaSection
- can_save() bool[source]¶
Return whether it is legal to save this schema.
You cannot save schemas loaded as merged from multiple library schemas.
- Returns:
True if this can be saved.
- Return type:
bool
- check_compliance(check_for_warnings=True, name=None, error_handler=None) list[dict][source]¶
Check for HED3 compliance of this schema.
- Parameters:
check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.
name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.
error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.
- Returns:
A list of all warnings and errors found in the file. Each issue is a dictionary.
- Return type:
list[dict]
- find_tag_entry(tag, schema_namespace='') tuple[HedTagEntry | None, str | None, list[dict]][source]¶
Find the schema entry for a given source tag.
- Parameters:
- Returns:
The located tag entry for this tag.
The remainder of the tag that isn’t part of the base tag.
A list of errors while converting.
- Return type:
tuple[Union[“HedTagEntry”, None], Union[str, None], list[dict]]
Notes
Works left to right (which is mostly relevant for errors).
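The left-to-right behaviour noted above can be illustrated with a small stand-alone sketch. It does not call the HED library: `KNOWN_TAGS`, `split_base_tag`, and the `TAG_NOT_FOUND` code are hypothetical stand-ins, and the real lookup handles more tag forms than this simplified version.

```python
# Hypothetical stand-in for a schema's tag section: long-form tag paths, lower-cased.
KNOWN_TAGS = {"event", "event/sensory-event"}


def split_base_tag(tag):
    """Scan a slash-separated tag left to right.

    Returns (base_tag, remainder, issues), a simplified analogue of the
    documented three-part return value.
    """
    parts = tag.split("/")
    consumed = 0
    for i in range(1, len(parts) + 1):
        if "/".join(parts[:i]).lower() in KNOWN_TAGS:
            consumed = i
        else:
            break  # stop at the first level the schema does not know
    if consumed == 0:
        # "TAG_NOT_FOUND" is an illustrative code, not a real HED error code.
        return None, None, [{"code": "TAG_NOT_FOUND", "tag": tag}]
    base = "/".join(parts[:consumed])
    remainder = "/".join(parts[consumed:]) or None
    return base, remainder, []


found = split_base_tag("Event/Sensory-event/My-extension")
missing = split_base_tag("Bogus/Thing")
```

Because the scan stops at the first unknown level, any error reported refers to the leftmost part of the tag that fails to resolve.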
- get_as_dataframes(save_merged=False) dict[DataFrame][source]¶
Get a dict of dataframes representing this file.
- get_extras(extras_key) DataFrame | None[source]¶
Get the extras corresponding to the given key.
- Parameters:
extras_key (str) – The key to check for in the extras dictionary.
- Returns:
The DataFrame for this extras key, or None if it doesn’t exist or is empty.
- Return type:
Union[pd.DataFrame, None]
- get_formatted_version() str[source]¶
The HED version string of this schema, including namespace and library name if any.
- Returns:
A JSON-formatted string of the complete version of this schema including library name and namespace.
- Return type:
str
- get_save_header_attributes(save_merged: bool = False) dict[source]¶
Returns the attributes that should be saved.
- get_schema_versions() list[str][source]¶
A list of HED version strings for this schema, including namespace and library name if any.
- get_tag_attribute_names_old() dict[str, HedSchemaEntry][source]¶
Return a dict of all allowed tag attributes.
- Returns:
A dictionary whose keys are attribute names and values are HedSchemaEntry object.
- Return type:
dict[str, HedSchemaEntry]
- get_tag_entry(name: str, key_class=HedSectionKey.Tags, schema_namespace: str = '') HedSchemaEntry | None[source]¶
Return the schema entry for this tag, if one exists.
- Parameters:
name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but it’s not required.
key_class (HedSectionKey or str) – The type of entry to return.
schema_namespace (str) – Only used on Tags. If incorrect, will return None.
- Returns:
The schema entry for the given tag, or None if not found.
- Return type:
Union[HedSchemaEntry, None]
- get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags) list[HedSchemaEntry][source]¶
Return tag entries with the given attribute.
- Parameters:
attribute (str) – A tag attribute: e.g., HedKey.ExtensionAllowed
key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.
- Returns:
A list of all tags with this attribute.
- Return type:
list[HedSchemaEntry]
Notes
The result is cached so will be fast after first call.
- has_duplicates()[source]¶
Returns the first duplicate tag/unit/etc. if any section has a duplicate name.
- property library: str¶
The name of this library schema if one exists.
- Returns:
Library name if any.
- Return type:
str
- property merged: bool¶
Return whether this schema was loaded from a merged file.
- Returns:
True if file was loaded from a merged file.
- Return type:
bool
- property name¶
User provided name for this schema, defaults to filename or version if no name provided.
- property properties: HedSchemaSection¶
Return the properties schema section.
- Returns:
The properties section.
- Return type:
HedSchemaSection
- save_as_dataframes(base_filename, save_merged=False)[source]¶
Save as dataframes to a folder of files.
If base_filename has a .tsv suffix, save directly to the indicated location. If base_filename is a directory (does NOT have a .tsv suffix), save the contents into a directory of that name. The subfiles are named after the base name, e.g. HED8.3.0/HED8.3.0_Tag.tsv.
- Parameters:
- Raises:
OSError – File cannot be saved for some reason.
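The naming rule described for base_filename can be sketched as a pure path computation. This is an illustration under assumptions, not the library's code: `dataframe_output_paths` is a hypothetical helper, only the `Struct` and `Tag` suffixes mentioned in this documentation are shown, and the `.tsv` branch assumes files are written next to the given name as `basename_<Suffix>.tsv`.

```python
from pathlib import PurePosixPath


def dataframe_output_paths(base_filename, suffixes=("Struct", "Tag")):
    """Return {suffix: output path} for a hypothetical set of TSV subfiles."""
    path = PurePosixPath(base_filename)
    if path.suffix.lower() == ".tsv":
        # A .tsv name: write beside it, reusing its stem as the base name.
        folder, stem = path.parent, path.stem
    else:
        # Anything else is treated as a directory named after the schema.
        folder, stem = path, path.name
    return {suffix: str(folder / f"{stem}_{suffix}.tsv") for suffix in suffixes}


as_directory = dataframe_output_paths("HED8.3.0")
as_tsv_name = dataframe_output_paths("out/HED8.3.0.tsv")
```

Note that `HED8.3.0` is not mistaken for a file name here: its apparent suffix is `.0`, not `.tsv`, so it falls into the directory branch.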
- property schema_83_props¶
Return whether this is an 8.3.0 or greater schema.
- Returns:
True if standard or partnered schema is 8.3.0 or greater.
- Return type:
bool
- schema_for_namespace(namespace: str) HedSchema | None[source]¶
Return HedSchema object for this namespace.
- property schema_namespace: str¶
Returns the schema namespace prefix.
- Returns:
The schema namespace prefix.
- Return type:
str
- set_schema_prefix(schema_namespace)[source]¶
Set the library namespace associated with this schema.
- Parameters:
schema_namespace (str) – Should be empty, or end with a colon. (A colon is added automatically if missing.)
- Raises:
HedFileError – The prefix is invalid.
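A minimal sketch of the colon handling described for schema_namespace follows. `normalize_namespace` is a hypothetical helper; the alphabetic-only validity check and the use of `ValueError` (in place of HedFileError) are assumptions made only so the example runs without the library.

```python
def normalize_namespace(schema_namespace):
    """Return the namespace with a trailing colon, or "" when empty."""
    if not schema_namespace:
        return ""
    if not schema_namespace.endswith(":"):
        schema_namespace += ":"  # the colon is added when missing
    # Assumption: the part before the colon must be purely alphabetic.
    if not schema_namespace[:-1].isalpha():
        raise ValueError(f"Invalid schema namespace: {schema_namespace!r}")
    return schema_namespace


with_colon = normalize_namespace("lib")
unchanged = normalize_namespace("lib:")
```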
- property tags: HedSchemaTagSection¶
Return the tag schema section.
- Returns:
The tag section.
- Return type:
HedSchemaTagSection
- property unit_classes: HedSchemaUnitClassSection¶
Return the unit classes schema section.
- Returns:
The unit classes section.
- Return type:
HedSchemaUnitClassSection
- property unit_modifiers: HedSchemaSection¶
Return the unit modifiers schema section.
- Returns:
The unit modifiers section.
- Return type:
HedSchemaSection
- property units: HedSchemaUnitSection¶
Return the unit schema section.
- Returns:
The unit section.
- Return type:
HedSchemaUnitSection
- property valid_prefixes: list[str]¶
Return a list of all prefixes this schema will accept.
Notes
The return value is always length 1 if using a HedSchema.
- property value_classes: HedSchemaSection¶
Return the value classes schema section.
- Returns:
The value classes section.
- Return type:
HedSchemaSection
- property version: str¶
The complete schema version, including prefix and library name (if applicable).
- Returns:
The complete schema version including library name and namespace.
- Return type:
str
HedSchemaGroup¶
- class HedSchemaGroup(schema_list, name='')[source]¶
Bases:
HedSchemaBase
Container for multiple HedSchema objects.
Notes
The container class is useful when library schema are included.
You cannot save/load/etc. the combined schema object directly.
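The idea of a container that routes lookups to one of several schemas by namespace can be sketched as follows. This is a toy model, not the HedSchemaGroup implementation: `GroupSketch`, the plain dicts standing in for schemas, and the `lib:` prefix are all hypothetical.

```python
class GroupSketch:
    """Toy container that dispatches tag lookups by namespace prefix."""

    def __init__(self, schemas):
        # schemas: {namespace prefix: {lower-cased tag name: entry}}
        self._by_namespace = dict(schemas)

    @property
    def valid_prefixes(self):
        return list(self._by_namespace)

    def schema_for_namespace(self, namespace):
        return self._by_namespace.get(namespace)

    def get_tag_entry(self, name, schema_namespace=""):
        schema = self.schema_for_namespace(schema_namespace)
        if schema is None:
            return None  # an unknown namespace yields None, not an error
        return schema.get(name.lower())


group = GroupSketch({
    "": {"event": "standard Event entry"},
    "lib:": {"custom-tag": "library Custom-tag entry"},
})
library_hit = group.get_tag_entry("Custom-tag", "lib:")
wrong_namespace = group.get_tag_entry("Event", "xx:")
```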
- check_compliance(check_for_warnings=True, name=None, error_handler=None) list[dict][source]¶
Check for HED3 compliance of this schema.
- Parameters:
check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.
name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.
error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.
- Returns:
A list of all warnings and errors found in the file. Each issue is a dictionary.
- Return type:
list[dict]
- find_tag_entry(tag, schema_namespace='') tuple[HedTagEntry | None, str | None, list][source]¶
Find the schema entry for a given source tag.
- Parameters:
- Returns:
The located tag entry for this tag.
The remainder of the tag that isn’t part of the base tag.
A list of errors while converting.
- Return type:
tuple[HedTagEntry | None, str | None, list]
Notes
Works left to right (which is mostly relevant for errors).
- get_formatted_version() str[source]¶
The HED version string of this schema, including namespace and library name if any.
- Returns:
The complete version of this schema including library name and namespace.
- Return type:
str
- get_schema_versions() list[str][source]¶
A list of HED version strings for these schemas, including namespace and library name if any.
- get_tag_entry(name, key_class=HedSectionKey.Tags, schema_namespace='') HedSchemaEntry | None[source]¶
Return the schema entry for this tag, if one exists.
- Parameters:
name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but it’s not required.
key_class (HedSectionKey or str) – The type of entry to return.
schema_namespace (str) – Only used on Tags. If incorrect, will return None.
- Returns:
The schema entry for the given tag.
- Return type:
HedSchemaEntry | None
- get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags) list[source]¶
Return tag entries with the given attribute.
- Parameters:
attribute (str) – A tag attribute: e.g., HedKey.ExtensionAllowed
key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.
- Returns:
A list of all tags with this attribute.
- Return type:
list
Notes
The result is cached so will be fast after first call.
- property name¶
User provided name for this schema, defaults to filename or version if no name provided.
- property schema_83_props¶
Return whether this is an 8.3.0 or greater schema.
- Returns:
True if standard or partnered schema is 8.3.0 or greater.
- Return type:
bool
Schema entry classes¶
HedSchemaEntry¶
- class HedSchemaEntry(name, section)[source]¶
Bases:
object
A single node in the HED schema vocabulary.
Every term, unit, unit class, value class, attribute, and property that appears in a loaded HedSchema is represented as a HedSchemaEntry (or one of its subclasses). The entry stores the node’s name, all declared attributes (e.g. takesValue, allowedCharacter), its description, and a back-reference to its containing HedSchemaSection.
Concrete subclasses add section-specific state:
HedTagEntry: vocabulary tag nodes.
UnitClassEntry: unit class nodes (e.g. time, mass).
UnitEntry: individual unit nodes (e.g. second, gram).
Use this class (or its subclasses) directly when you need to:
Introspect schema vocabulary (e.g. list all tags with takesValue).
Build schema validators, schema browsers, or schema-diff tools.
Implement custom HED annotation tooling that looks up tag metadata.
Most users never need this class; get_tag_entry() and get_all_schema_tags() are sufficient for the common lookup patterns.
- attribute_has_property(attribute, property_name) bool[source]¶
Return True if attribute has property.
- finalize_entry(schema)[source]¶
Called once after loading to set internal state.
- Parameters:
schema (HedSchema) – The schema that holds the rules.
- has_attribute(attribute, return_value=False) bool | Any[source]¶
Checks for the existence of an attribute in this entry.
- Parameters:
- Returns:
If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.
- Return type:
Union[bool, any]
Notes
The existence of an attribute does not guarantee its validity.
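The two return modes selected by return_value can be shown with a plain dictionary standing in for an entry's attributes. This is a sketch, not the library's code: the free function `has_attribute` and the sample attribute values are hypothetical.

```python
def has_attribute(attributes, attribute, return_value=False):
    """Mimic the documented contract using a plain dict of attributes."""
    if return_value:
        # Value mode: the stored value, or None when the attribute is absent.
        return attributes.get(attribute)
    # Existence mode: a plain boolean.
    return attribute in attributes


# Illustrative attribute values only.
entry_attributes = {"takesValue": True, "allowedCharacter": "letters"}

exists = has_attribute(entry_attributes, "takesValue")
value = has_attribute(entry_attributes, "allowedCharacter", return_value=True)
absent = has_attribute(entry_attributes, "unitClass", return_value=True)
```

In value mode a caller cannot distinguish "absent" from "present with value None", which is one reason to use existence mode when only presence matters.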
- property section_key¶
Returns the HedSectionKey identifying which schema section owns this entry.
- Returns:
The section key for this entry’s parent section.
- Return type:
HedSectionKey
HedTagEntry¶
- class HedTagEntry(*args, **kwargs)[source]¶
Bases:
HedSchemaEntry
A vocabulary tag node in the HED schema.
Extends HedSchemaEntry with full/short tag name forms, value-class and unit-class associations, and helper methods for tag-path traversal.
Typical access pattern:
    # `schema` is an already loaded HedSchema.
    entry = schema.get_tag_entry("Sensory-event")
    print(entry.long_tag_name)      # "Event/Sensory-event"
    print(entry.takes_value_child)  # child "#" entry if tag takes a value
- unit_classes¶
Unit classes accepted by this tag’s value (non-empty only if takesValue is set).
- Type:
- value_classes¶
Value classes that constrain the value format.
- Type:
- long_tag_name¶
The full slash-separated path from the schema root, with any trailing /# stripped.
- Type:
- base_tag_has_attribute(tag_attribute)[source]¶
Check if the base tag has a specific attribute.
- Parameters:
tag_attribute (str) – A tag attribute.
- Returns:
True if the tag has the specified attribute, False otherwise.
- Return type:
bool
Notes
This is mostly relevant for tags that take a value.
- finalize_entry(schema)[source]¶
Called once after schema loading to set state.
- Parameters:
schema (HedSchema) – The schema that the rules come from.
- has_attribute(attribute, return_value=False)[source]¶
Returns the existence or value of an attribute in this entry.
This also checks parent tags for inheritable attributes like ExtensionAllowed.
- Parameters:
- Returns:
If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.
- Return type:
Union[bool, any]
Notes
The existence of an attribute does not guarantee its validity.
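The parent check for inheritable attributes can be pictured as a walk up the tag hierarchy. The sketch below is an assumption-laden illustration: `TagNode`, the `INHERITABLE` set, and the attribute spelling `extensionAllowed` are hypothetical and do not reproduce the library's internals.

```python
# Assumed set of attributes that children inherit from their ancestors.
INHERITABLE = {"extensionAllowed"}


class TagNode:
    """Toy tag node with a parent link and its own attribute dict."""

    def __init__(self, name, attributes=None, parent=None):
        self.name = name
        self.attributes = attributes or {}
        self.parent = parent

    def has_attribute(self, attribute, return_value=False):
        node = self
        while node is not None:
            if attribute in node.attributes:
                return node.attributes[attribute] if return_value else True
            if attribute not in INHERITABLE:
                break  # non-inheritable attributes are checked on this node only
            node = node.parent
        return None if return_value else False


root = TagNode("Item", {"extensionAllowed": True, "suggestedTag": "Thing"})
child = TagNode("Object", parent=root)

inherited = child.has_attribute("extensionAllowed")
not_inherited = child.has_attribute("suggestedTag")
```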
- property parent¶
Get the parent entry of this tag.
- property parent_name¶
Get the name of the parent tag entry.
- property section_key¶
Returns the HedSectionKey identifying which schema section owns this entry.
- Returns:
The section key for this entry’s parent section.
- Return type:
HedSectionKey
UnitClassEntry¶
- class UnitClassEntry(*args, **kwargs)[source]¶
Bases:
HedSchemaEntry
A unit class node in the HED schema (e.g. time, mass, frequency).
Extends HedSchemaEntry with the set of UnitEntry objects that belong to the class and a pre-computed derivative_units dict that maps every accepted surface form (including SI prefixes and plurals) to its canonical UnitEntry.
Typical access pattern:
    # `schema` is an already loaded HedSchema.
    unit_class = schema.get_tag_entry("time", HedSectionKey.UnitClasses)
    for name, unit in unit_class.units.items():
        print(name, unit.attributes)
- units¶
Map from unit name to entry after finalize_entry() is called.
- derivative_units¶
Map from every accepted surface form (plural, SI-prefixed, etc.) to the base unit entry.
- add_unit(unit_entry)[source]¶
Add the given unit entry to this unit class.
- Parameters:
unit_entry (HedSchemaEntry) – Unit entry to add.
- property children¶
Alias to get the units for this class.
- Returns:
The unit list for this class.
- Return type:
list
- finalize_entry(schema)[source]¶
Called once after schema load to set state.
- Parameters:
schema (HedSchema) – The object with the schema rules.
- has_attribute(attribute, return_value=False) bool | Any¶
Checks for the existence of an attribute in this entry.
- Parameters:
- Returns:
If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.
- Return type:
Union[bool, any]
Notes
The existence of an attribute does not guarantee its validity.
- property section_key¶
Returns the HedSectionKey identifying which schema section owns this entry.
- Returns:
The section key for this entry’s parent section.
- Return type:
HedSectionKey
UnitEntry¶
- class UnitEntry(*args, **kwargs)[source]¶
Bases:
HedSchemaEntry
A single unit node in the HED schema (e.g. second, gram, hertz).
Extends HedSchemaEntry with the list of SI unit modifiers that apply to this unit, a pre-computed derivative_units mapping (surface form → conversion factor), and a back-reference to the parent UnitClassEntry.
- unit_modifiers¶
SI modifier entries (e.g. milli, kilo).
- Type:
- derivative_units¶
Map from every accepted surface form to its numeric conversion factor relative to the SI base unit.
- unit_class_entry¶
The parent unit class.
- Type:
- finalize_entry(schema)[source]¶
Called once after loading to set internal state.
- Parameters:
schema (HedSchema) – The schema that the rules come from.
- get_conversion_factor(unit_name)[source]¶
Returns the conversion factor from combining this unit with the specified modifier.
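A surface-form-to-factor mapping of the kind described for derivative_units could be built along these lines. The sketch makes several assumptions: `SI_MODIFIERS` holds only two sample prefixes, plurals are formed by naively appending "s", and both helper functions are hypothetical rather than the library's methods.

```python
# Two sample SI modifiers; a real schema defines many more.
SI_MODIFIERS = {"milli": 1e-3, "kilo": 1e3}


def build_derivative_units(unit_name, base_factor=1.0, modifiers=SI_MODIFIERS):
    """Map each accepted surface form to its factor relative to the base unit."""
    derived = {}
    for form in (unit_name, unit_name + "s"):  # naive plural, for illustration
        derived[form] = base_factor
        for prefix, factor in modifiers.items():
            derived[prefix + form] = base_factor * factor
    return derived


def get_conversion_factor(derived, unit_name):
    """Return the factor for a surface form, or None if it is not accepted."""
    return derived.get(unit_name)


second_forms = build_derivative_units("second")
milli_factor = get_conversion_factor(second_forms, "milliseconds")
unknown_factor = get_conversion_factor(second_forms, "parsec")
```

Precomputing the mapping once keeps each later lookup a single dictionary access.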
- has_attribute(attribute, return_value=False) bool | Any¶
Checks for the existence of an attribute in this entry.
- Parameters:
- Returns:
If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.
- Return type:
Union[bool, any]
Notes
The existence of an attribute does not guarantee its validity.
- property section_key¶
Returns the HedSectionKey identifying which schema section owns this entry.
- Returns:
The section key for this entry’s parent section.
- Return type:
HedSectionKey
Schema section classes¶
HedSchemaSection¶
- class HedSchemaSection(section_key, case_sensitive=True)[source]¶
Bases:
object
Typed container for all entries in one section of a loaded HED schema.
A HedSchema is divided into sections (tags, unit classes, units, value classes, attributes, properties, unit modifiers). Each section is a HedSchemaSection that maps lower-cased entry names to their HedSchemaEntry objects and tracks which attributes are valid for that section.
The concrete entry type for each section is determined by entries_by_section:
Section key | Entry type
HedSectionKey.Tags | HedTagEntry
HedSectionKey.UnitClasses | UnitClassEntry
HedSectionKey.Units | UnitEntry
everything else | HedSchemaEntry
Use this class directly when you need to:
Iterate over all entries in a specific schema section.
Build schema comparison or diff tools.
Access valid_attributes to determine which attributes are legal for a given section.
- all_names¶
Map from lower-cased name to entry.
- Type:
- all_entries¶
Entries in insertion order.
- Type:
- valid_attributes¶
Attribute entries that are declared valid for this section.
- Type:
- property duplicate_names¶
Returns a dict of entries that share the same name within this section.
- Returns:
Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.
- Return type:
dict
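How a section might keep a lower-cased name map, an insertion-ordered entry list, and a record of name clashes can be sketched with a small class. `SectionSketch` and its `add` method are hypothetical; only the attribute names all_names, all_entries, and duplicate_names are taken from the documentation above.

```python
class SectionSketch:
    """Toy section keyed by lower-cased entry name."""

    def __init__(self):
        self.all_names = {}        # lower-cased name -> first entry seen
        self.all_entries = []      # every entry, in insertion order
        self.duplicate_names = {}  # lower-cased name -> conflicting entries

    def add(self, name, entry):
        key = name.lower()
        self.all_entries.append(entry)
        if key in self.all_names:
            # Start the conflict list with the original entry, then append.
            self.duplicate_names.setdefault(key, [self.all_names[key]]).append(entry)
        else:
            self.all_names[key] = entry


section = SectionSketch()
section.add("Event", "entry-1")
section.add("Agent", "entry-2")
section.add("event", "entry-3")  # clashes with "Event" once lower-cased
```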
- get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') list[HedSchemaEntry | str][source]¶
Return entries or names with given attribute.
- Parameters:
- Returns:
List of HedSchemaEntry or strings representing the names.
- Return type:
list[Union[HedSchemaEntry, str]]
- property section_key¶
Returns the HedSectionKey identifying this section.
- Returns:
The key for this schema section.
- Return type:
HedSectionKey
HedSchemaUnitSection¶
- class HedSchemaUnitSection(section_key, case_sensitive=True)[source]¶
Bases:
HedSchemaSection
The schema section containing units.
- property duplicate_names¶
Returns a dict of entries that share the same name within this section.
- Returns:
Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.
- Return type:
dict
- get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') list[HedSchemaEntry | str]¶
Return entries or names with given attribute.
- Parameters:
- Returns:
List of HedSchemaEntry or strings representing the names.
- Return type:
list[Union[HedSchemaEntry, str]]
- items()¶
Return the items.
- keys()¶
The names of the keys.
- property section_key¶
Returns the HedSectionKey identifying this section.
- Returns:
The key for this schema section.
- Return type:
HedSectionKey
- values()¶
All names of the sections.
HedSchemaUnitClassSection¶
- class HedSchemaUnitClassSection(section_key, case_sensitive=True)[source]¶
Bases:
HedSchemaSection
The schema section containing unit classes.
- property duplicate_names¶
Returns a dict of entries that share the same name within this section.
- Returns:
Mapping of lowercased name to a list of conflicting HedSchemaEntry objects.
- Return type:
dict
- get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') list[HedSchemaEntry | str]¶
Return entries or names with given attribute.
- Parameters:
- Returns:
List of HedSchemaEntry or strings representing the names.
- Return type:
list[Union[HedSchemaEntry, str]]
- items()¶
Return the items.
- keys()¶
The names of the keys.
- property section_key¶
Returns the HedSectionKey identifying this section.
- Returns:
The key for this schema section.
- Return type:
HedSectionKey
- values()¶
All names of the sections.
Schema IO and caching¶
Schema loading functions¶
Utilities for loading and outputting HED schema.
- from_dataframes(schema_data, schema_namespace=None, name=None) HedSchema[source]¶
Create a schema from the given dataframes.
- Parameters:
- Returns:
The loaded schema.
- Return type:
HedSchema
- Raises:
HedFileError – If empty/invalid parameters.
Exception – Other fatal I/O or formatting issues.
Notes
The loading is determined by file type.
- from_string(schema_string, schema_format='.xml', schema_namespace=None, schema=None, name=None) HedSchema[source]¶
Create a schema from the given string.
- Parameters:
schema_string (str) – An XML or MEDIAWIKI file as a single long string
schema_format (str) – The schema format of the source schema string. Allowed normal values: .mediawiki, .xml, .json
schema_namespace (str, None) – The name_prefix all tags in this schema will accept.
schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.
name (str or None) – User supplied identifier for this schema.
- Returns:
The loaded schema.
- Return type:
HedSchema
- Raises:
If empty string or invalid extension is passed.
Other fatal formatting issues with file
Notes
The loading is determined by file type.
- get_hed_xml_version(xml_file_path) str[source]¶
Get the version number from a HED XML file.
- Parameters:
xml_file_path (str) – The path to a HED XML file.
- Returns:
The version number of the HED XML file.
- Return type:
str
- Raises:
There is an issue loading the schema
- load_schema(hed_path, schema_namespace=None, schema=None, name=None) HedSchema[source]¶
Load a schema from the given file or URL path.
- Parameters:
hed_path (str) – A filepath or url to open a schema from. If loading a TSV file, this should be a single filename where: Template: basename.tsv, where files are named basename_Struct.tsv, basename_Tag.tsv, etc. Alternatively, you can point to a directory containing the .tsv files.
schema_namespace (str or None) – The name_prefix all tags in this schema will accept.
schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.
name (str or None) – User supplied identifier for this schema.
- Returns:
The loaded schema.
- Return type:
HedSchema
- Raises:
HedFileError – Empty path passed.
HedFileError – Unknown extension.
HedFileError – Any fatal issues when loading the schema.
- load_schema_version(xml_version=None, xml_folder=None) HedSchema | HedSchemaGroup[source]¶
Return a HedSchema or HedSchemaGroup extracted from xml_version.
- Parameters:
- Returns:
The schema or schema group extracted.
- Return type:
Union[HedSchema, HedSchemaGroup]
- Raises:
HedFileError – The xml_version is not valid.
HedFileError – The specified version cannot be found or loaded.
HedFileError – Other fatal errors loading the schema (These are unlikely if you are not editing them locally).
HedFileError – The prefix is invalid.
Cache management¶
Infrastructure for caching HED schema from remote repositories.
- cache_local_versions(cache_folder) int | None[source]¶
Cache all schemas included with the HED installation.
- cache_xml_versions(hed_base_urls=('https://api.github.com/repos/hed-standard/hed-schemas/contents/standard_schema',), hed_library_urls=('https://api.github.com/repos/hed-standard/hed-schemas/contents/library_schemas',), skip_folders=('deprecated',), cache_folder=None) float[source]¶
Cache all schemas at the given URLs.
- Parameters:
hed_base_urls (str or list) – Path or list of paths. These should point to a single folder.
hed_library_urls (str or list) – Path or list of paths. These should point to folder containing library folders.
skip_folders (list) – A list of subfolders to skip over when downloading.
cache_folder (str) – The folder holding the cache.
- Returns:
- Returns -1 if cache failed for any reason, including having been cached too recently.
Returns 0 if it successfully cached this time.
- Return type:
float
Notes
The default skip_folders is ‘deprecated’.
The HED cache folder defaults to HED_CACHE_DIRECTORY.
- The directories on GitHub are of the form:
https://api.github.com/repos/hed-standard/hed-schemas/contents/standard_schema
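The -1 / 0 return convention, including the "cached too recently" case, can be sketched with a simple time guard. Everything here is hypothetical: `refresh_cache`, the one-hour `min_interval`, and the injected `download` callable are illustrative choices, and the actual interval used by the library is not stated in this documentation.

```python
import time


def refresh_cache(last_cached, download, min_interval=3600.0, now=None):
    """Return 0 if the download ran, -1 if it was skipped or failed."""
    now = time.time() if now is None else now
    if now - last_cached < min_interval:
        return -1  # cached too recently, so nothing is downloaded
    try:
        download()
    except OSError:
        return -1  # any download failure is reported the same way
    return 0


def failing_download():
    raise OSError("network unavailable")


too_soon = refresh_cache(last_cached=1000.0, download=lambda: None, now=1500.0)
succeeded = refresh_cache(last_cached=1000.0, download=lambda: None, now=10_000.0)
failed = refresh_cache(last_cached=1000.0, download=failing_download, now=10_000.0)
```

A caller that only sees -1 cannot tell a skipped refresh from a failed one, so the return value is best treated as "cache not updated this time".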
- get_cache_directory(cache_folder=None) str[source]¶
Return the current value of HED_CACHE_DIRECTORY.
- get_hed_version_path(xml_version, library_name=None, local_hed_directory=None) str | None[source]¶
Get the HED XML file path for a given version.
Searches the local cache first. If the version is not found and local_hed_directory is the default HED cache, the cache is refreshed from GitHub before a second lookup. No network call is made for custom directories.
- Parameters:
- Returns:
The path to the requested HED XML file, or None.
- Return type:
Union[str, None]
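The lookup order described above (local lookup, optional refresh, second lookup) can be modelled without touching the network or the file system. In this sketch a dict stands in for the cache directory, `find_version_path` and `fake_refresh` are hypothetical, and the path string is illustrative only.

```python
def find_version_path(xml_version, cache, is_default_cache, refresh):
    """Look up a version; refresh and retry only for the default cache."""
    path = cache.get(xml_version)
    if path is None and is_default_cache:
        refresh(cache)  # stands in for the refresh from GitHub
        path = cache.get(xml_version)
    return path


refresh_calls = []


def fake_refresh(cache):
    # Pretend the refresh downloaded the missing schema file.
    refresh_calls.append("refresh")
    cache["8.3.0"] = "cache/HED8.3.0.xml"


custom_result = find_version_path("8.3.0", {}, is_default_cache=False, refresh=fake_refresh)
default_result = find_version_path("8.3.0", {}, is_default_cache=True, refresh=fake_refresh)
```

Passing the refresh step in as a callable makes the "no network call for custom directories" rule easy to verify: the callable is simply never invoked on that path.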
- get_hed_versions(local_hed_directory=None, library_name=None, check_prerelease=False) list | dict[source]¶
Get the HED versions in the HED directory.
- Parameters:
local_hed_directory (str) – Directory to check for versions which defaults to hed_cache.
library_name (str or None) – An optional schema library name. None retrieves the standard schema only. Pass “all” to retrieve all standard and library schemas as a dict.
check_prerelease (bool) – If True, results can include prerelease schemas. Default is False, returning only released versions.
- Returns:
List of version numbers or dictionary {library_name: [versions]}.
- Return type:
list | dict
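The two result shapes (a list for one library, a dict for "all") can be illustrated on in-memory data. `select_versions`, the `available` pairs, the `testlib` name, and the use of `None` as the key for the standard schema are assumptions for this sketch; no ordering is applied beyond the input order.

```python
def select_versions(available, library_name=None):
    """Return a version list for one library, or a dict for "all"."""
    if library_name == "all":
        grouped = {}
        for library, version in available:
            grouped.setdefault(library, []).append(version)
        return grouped
    # None selects the standard schema only.
    return [version for library, version in available if library == library_name]


# (library name or None for the standard schema, version) pairs.
available = [(None, "8.3.0"), (None, "8.2.0"), ("testlib", "1.0.0")]

standard_only = select_versions(available)
one_library = select_versions(available, "testlib")
everything = select_versions(available, "all")
```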
Schema loader base class¶
- class SchemaLoader(filename, schema_as_string=None, schema=None, file_format=None, name='')[source]¶
Bases:
ABC
Base class for schema loading, to handle basic errors and partnered schemas.
Expected usage is SchemaLoaderXML.load(filename).
SchemaLoaderXML(filename) will load just the header_attributes.
- static find_rooted_entry(tag_entry, schema, loading_merged)[source]¶
This semi-validates rooted tags, raising an exception on major errors.
- Parameters:
tag_entry (HedTagEntry) – The possibly rooted tag.
schema (HedSchema) – The schema being loaded.
loading_merged (bool) – If this schema was already merged before loading.
- Returns:
- The base tag entry from the standard schema
Returns None if this tag isn’t rooted
- Return type:
Union[HedTagEntry, None]
- Raises:
A rooted attribute is found in a non-paired schema
A rooted attribute is not a string
A rooted attribute was found on a non-root node in an unmerged schema.
A rooted attribute is found on a root node in a merged schema.
A rooted attribute indicates a tag that doesn’t exist in the base schema.
- fix_extra(key)[source]¶
Normalize an extras dataframe by ensuring required columns are present and in canonical order.
- Parameters:
key (str) – The extras dict key identifying which extra dataframe to fix.
- Returns:
The normalized dataframe with required columns added and sorted.
- Return type:
pd.DataFrame
- fix_extras()[source]¶
Fixes the extras after loading the schema, to ensure they are in the correct format.
- classmethod load(filename=None, schema_as_string=None, schema=None, file_format=None, name='')[source]¶
Loads and returns the schema, including partnered schema if applicable.
- Parameters:
filename (str or None) – A valid filepath or None
schema_as_string (str or None) – A full schema as text or None
schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.
file_format (str or None) – If this is an owl file being loaded, this is the format. Allowed values include: turtle, json-ld, and owl (xml).
name (str or None) – Optional user supplied identifier, by default uses filename.
- Returns:
The new schema
- Return type:
- property schema¶
The partially loaded schema if you are after just header attributes.
Schema serializers¶
- class Schema2DF[source]¶
Bases:
Schema2Base
Converts a HedSchema to a set of pandas DataFrames, one per schema section.
- process_schema(hed_schema, save_merged=False)¶
Convert a HedSchema object to the subclass’s output format (mediawiki, XML, JSON, or TSV).
This method owns the canonical section-traversal order for all serializers. Each _output_* call delegates to the format-specific subclass hook.
Warning
If a new HedSectionKey is added to the schema, a new _output_* call must be inserted here and the matching hook must be implemented in each of the four serializer subclasses (schema2wiki, schema2xml, schema2json, schema2df).
- Parameters:
- Returns:
Format-dependent output object (string, ElementTree, dict, or DataFrame dict depending on the subclass).
- Return type:
Any
- Raises:
HedFileError – If the schema cannot be saved (e.g., merged multi-library schema).
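The arrangement described above, where one base method fixes the traversal order and subclasses supply per-format hooks, is an instance of the template method pattern. The sketch below shows the pattern only: `SerializerSketch`, its two sections, and the hook names `_start` and `_output_section` are hypothetical and do not mirror the real `_output_*` hooks.

```python
from abc import ABC, abstractmethod


class SerializerSketch(ABC):
    # The base class alone decides the order in which sections are emitted.
    SECTION_ORDER = ("tags", "units")

    def process_schema(self, schema):
        self._start()
        for section in self.SECTION_ORDER:
            self._output_section(section, schema.get(section, []))
        return self.output

    @abstractmethod
    def _start(self):
        """Create the empty, format-specific output object."""

    @abstractmethod
    def _output_section(self, section, names):
        """Append one section to the output in the subclass's format."""


class TextSerializer(SerializerSketch):
    def _start(self):
        self.output = ""

    def _output_section(self, section, names):
        self.output += f"[{section}] " + ", ".join(names) + "\n"


class DictSerializer(SerializerSketch):
    def _start(self):
        self.output = {}

    def _output_section(self, section, names):
        self.output[section] = list(names)


schema = {"units": ["second"], "tags": ["Event", "Agent"]}
as_text = TextSerializer().process_schema(schema)
as_dict = DictSerializer().process_schema(schema)
```

Both outputs list tags before units even though the input dict has them reversed, which is the point of keeping the traversal in one place; it also shows why adding a section requires touching every subclass.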
- class Schema2Wiki[source]¶
Bases:
Schema2Base
Converts a HedSchema to MediaWiki-format text output.
- process_schema(hed_schema, save_merged=False)¶
Convert a HedSchema object to the subclass’s output format (mediawiki, XML, JSON, or TSV).
This method owns the canonical section-traversal order for all serializers. Each _output_* call delegates to the format-specific subclass hook.
Warning
If a new HedSectionKey is added to the schema, a new _output_* call must be inserted here and the matching hook must be implemented in each of the four serializer subclasses (schema2wiki, schema2xml, schema2json, schema2df).
- Parameters:
- Returns:
Format-dependent output object (string, ElementTree, dict, or DataFrame dict depending on the subclass).
- Return type:
Any
- Raises:
HedFileError – If the schema cannot be saved (e.g., merged multi-library schema).
- class Schema2XML[source]¶
Bases: Schema2Base
Converts a HedSchema to an XML ElementTree representation.
- process_schema(hed_schema, save_merged=False)¶
Convert a HedSchema object to the subclass’s output format (mediawiki, XML, JSON, or TSV).
This method owns the canonical section-traversal order for all serializers. Each _output_* call delegates to the format-specific subclass hook.
Warning
If a new HedSectionKey is added to the schema, a new _output_* call must be inserted here and the matching hook must be implemented in each of the four serializer subclasses (schema2wiki, schema2xml, schema2json, schema2df).
- Parameters:
- Returns:
Format-dependent output object (string, ElementTree, dict, or DataFrame dict, depending on the subclass).
- Return type:
Any
- Raises:
HedFileError – If the schema cannot be saved (e.g., merged multi-library schema).
Schema comparison¶
SchemaComparer¶
- class SchemaComparer(schema1, schema2)[source]¶
Bases: object
Class for comparing HED schemas and generating change logs.
- ANNOTATION_PROPERTY_EXTERNAL = 'AnnotationPropertyExternal'¶
- EXTRAS_SECTION = 'Extras changes'¶
- HED_ID_SECTION = 'HedId changes'¶
- MISC_SECTION = 'misc'¶
- PREFIXES = 'Prefixes'¶
- SECTION_ENTRY_NAMES = {'AnnotationPropertyExternal': 'AnnotationPropertyExternal', 'HedId changes': 'Modified Hed Ids', 'Prefixes': 'Prefixes', 'Sources': 'Sources', 'misc': 'Misc Metadata', HedSectionKey.Attributes: 'Attribute', HedSectionKey.Properties: 'Property', HedSectionKey.Tags: 'Tag', HedSectionKey.UnitClasses: 'Unit Class', HedSectionKey.UnitModifiers: 'Unit Modifier', HedSectionKey.Units: 'Unit', HedSectionKey.ValueClasses: 'Value Class'}¶
- SECTION_ENTRY_NAMES_PLURAL = {'AnnotationPropertyExternal': 'AnnotationPropertyExternal', 'Extras changes': 'Extras', 'HedId changes': 'Modified Hed Ids', 'Prefixes': 'Prefixes', 'Sources': 'Sources', 'misc': 'Misc Metadata', HedSectionKey.Attributes: 'Attributes', HedSectionKey.Properties: 'Properties', HedSectionKey.Tags: 'Tags', HedSectionKey.UnitClasses: 'Unit Classes', HedSectionKey.UnitModifiers: 'Unit Modifiers', HedSectionKey.Units: 'Units', HedSectionKey.ValueClasses: 'Value Classes'}¶
- SOURCES = 'Sources'¶
- compare_differences(attribute_filter=None, title='')[source]¶
Compare two schemas and return a formatted report of all differences.
This is a convenience method that combines gather_schema_changes and pretty_print_change_dict to produce a complete, human-readable comparison report.
- Parameters:
- Returns:
Formatted markdown string describing all differences between the schemas.
- Return type:
- compare_schemas(attribute_filter='inLibrary', sections=(HedSectionKey.Tags,))[source]¶
Compare two schemas section by section, categorizing entries by match status.
This is the core comparison method that categorizes all schema entries into four groups: matches (identical entries), entries only in schema1, entries only in schema2, and entries with the same name but different attributes.
- Parameters:
attribute_filter (HedKey or None) – If provided, only entries with this attribute are compared. Set to None to compare all entries. Default is HedKey.InLibrary.
sections (tuple or None) – Tuple of HedSectionKey values to compare. If None, compares all sections including miscellaneous metadata. Default is (HedSectionKey.Tags,).
- Returns:
- Four dictionaries (matches, not_in_schema1, not_in_schema2, unequal_entries):
matches: Entries identical in both schemas
not_in_schema1: Entries only in schema2
not_in_schema2: Entries only in schema1
unequal_entries: Entries with same name but different attributes
- Return type:
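The four-way categorization returned by compare_schemas can be illustrated without the library. This is a sketch only: it uses plain dictionaries mapping entry names to attribute dictionaries in place of real schema entries, and `categorize_entries` is a hypothetical helper, not part of the SchemaComparer API.

```python
def categorize_entries(entries1, entries2):
    """Split two name->attributes mappings into the four comparison groups."""
    matches, not_in_schema1, not_in_schema2, unequal_entries = {}, {}, {}, {}
    for name, attrs in entries1.items():
        if name not in entries2:
            # Present only in the first schema.
            not_in_schema2[name] = attrs
        elif attrs == entries2[name]:
            matches[name] = (attrs, entries2[name])
        else:
            # Same name, different attributes.
            unequal_entries[name] = (attrs, entries2[name])
    for name, attrs in entries2.items():
        if name not in entries1:
            # Present only in the second schema.
            not_in_schema1[name] = attrs
    return matches, not_in_schema1, not_in_schema2, unequal_entries


# Toy data standing in for the Tags section of two schema versions.
old_tags = {"Event": {"extensionAllowed": True}, "Action": {}, "Old-tag": {}}
new_tags = {
    "Event": {"extensionAllowed": True},
    "Action": {"suggestedTag": "Event"},
    "New-tag": {},
}
matches, not_in_schema1, not_in_schema2, unequal_entries = categorize_entries(
    old_tags, new_tags
)
```

Note the naming convention: not_in_schema1 holds what only the second schema has, and not_in_schema2 holds what only the first has.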
- find_matching_tags(sections=(HedSectionKey.Tags,), return_string=True)[source]¶
Find tags with matching names in both schemas.
This method identifies all entries that exist in both schemas with the same name, regardless of whether their attributes differ.
- Parameters:
- Returns:
If return_string is True, a formatted string listing the matching tags. If False, a dictionary mapping section keys to dictionaries of matching tag entries.
- Return type:
- gather_schema_changes(attribute_filter=None)[source]¶
Generate a structured changelog by comparing the two schemas.
This method performs a comprehensive comparison and produces a categorized change dictionary suitable for version control and documentation. Changes are classified by severity (Major, Minor, Patch, Unknown) and organized by schema section.
- Parameters:
attribute_filter (HedKey or None) – If provided, only entries with this attribute are compared. Set to None to compare all entries. Default is None.
- Returns:
Dictionary mapping section keys to lists of change dictionaries. Each change dictionary contains ‘change_type’, ‘change’ (description), and ‘tag’ (affected entry).
- Return type:
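A change dictionary of the shape described above can be post-processed with ordinary Python. The sketch below builds one by hand rather than calling gather_schema_changes. The string section keys, the change descriptions, and both helper functions are illustrative assumptions; only the three per-change keys ('change_type', 'change', 'tag') come from the description above.

```python
from collections import Counter

# Hand-written stand-in for the structure described above.
# Real section keys would be HedSectionKey values or the class constants.
change_dict = {
    "Tags": [
        {"change_type": "Major", "change": "Tag removed", "tag": "Old-tag"},
        {"change_type": "Minor", "change": "Tag added", "tag": "New-tag"},
    ],
    "Units": [
        {"change_type": "Patch", "change": "Description changed", "tag": "m"},
    ],
}


def tally_change_types(changes):
    """Count changes per severity across all sections."""
    return Counter(
        item["change_type"] for items in changes.values() for item in items
    )


def tags_with_type(changes, change_type):
    """List (section, tag) pairs whose change has the given severity."""
    return [
        (section, item["tag"])
        for section, items in changes.items()
        for item in items
        if item["change_type"] == change_type
    ]


severity_counts = tally_change_types(change_dict)
major_changes = tags_with_type(change_dict, "Major")
```

A tally like this is one way to decide whether a release needs a major, minor, or patch version bump.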
- pretty_print_change_dict(change_dict, title='Schema changes', use_markdown=True)[source]¶
Format a change dictionary into a human-readable string.
Converts the structured change dictionary from gather_schema_changes into a formatted text report suitable for display or documentation.
- Parameters:
- Returns:
Formatted string representation of the changes.
- Return type:
Comparison utilities¶
Functions supporting comparison of schemas.
Schema validation¶
Compliance checking¶
- check_compliance(hed_schema, check_for_warnings=True, name=None, error_handler=None)[source]¶
Check a HED schema for compliance.
- Parameters:
hed_schema (HedSchema) – HedSchema object to check for HED3 compliance.
check_for_warnings (bool) – If True, check for formatting issues like invalid characters, capitalization, etc.
name (str) – If present, used as the filename for context.
error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.
- Returns:
A list of all warnings and errors found. Each issue is a dict. The returned list has an additional compliance_summary attribute (ComplianceSummary) providing a structured report.
- Return type:
- Raises:
ValueError – If hed_schema is not a HedSchema instance.
- class SchemaValidator(hed_schema, error_handler)[source]¶
Bases: object
Validates a loaded HedSchema for compliance.
The five content sections (Tags, UnitClasses, Units, UnitModifiers, ValueClasses) are validated using range and domain metadata that the schema itself provides in its Attributes and Properties sections.
Typical usage is through check_compliance().
- check_annotation_attribute_values()[source]¶
Validate that annotation attribute values reference valid prefixes, external annotations, and sources.
For each entry that has an annotation attribute, checks that:
The value starts with prefix:id, where prefix: is defined in the Prefixes extras section and prefix: + id is a row in the ExternalAnnotations extras section.
If the annotation references dc:source, the remaining text after "dc:source " must start with a name from the Sources extras section.
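The two annotation rules above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the validator's code: the three extras sections are reduced to plain sets, the issue labels are made up, and `check_annotation_value` is a hypothetical name.

```python
def check_annotation_value(value, prefixes, external_annotations, sources):
    """Return a list of problem labels for one annotation value."""
    # Split "prefix:id rest of text" into its leading token and remainder.
    head, _, rest = value.partition(" ")
    prefix, colon, _identifier = head.partition(":")
    if not colon or prefix + ":" not in prefixes:
        return ["unknown prefix"]
    if head not in external_annotations:
        return ["unknown external annotation"]
    # dc:source additionally requires a known source name right after it.
    if head == "dc:source" and not any(rest.startswith(s) for s in sources):
        return ["unknown source"]
    return []


# Toy stand-ins for the Prefixes, ExternalAnnotations, and Sources sections.
prefixes = {"dc:"}
external_annotations = {"dc:source", "dc:title"}
sources = {"Wikipedia"}

ok = check_annotation_value(
    "dc:source Wikipedia Event", prefixes, external_annotations, sources
)
bad_prefix = check_annotation_value(
    "foo:bar text", prefixes, external_annotations, sources
)
bad_annotation = check_annotation_value(
    "dc:creator Jane", prefixes, external_annotations, sources
)
bad_source = check_annotation_value(
    "dc:source Unlisted", prefixes, external_annotations, sources
)
```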
- check_attributes()[source]¶
Validate every attribute on every entry in every section.
For each attribute this performs three layers of checking:
Domain: the attribute is valid for the entry’s section. Any attribute not in the section’s valid_attributes was already flagged as _unknown_attributes during loading; those are reported here.
Range: the attribute value matches the range type declared on the attribute’s own definition (boolRange, tagRange, etc.).
Semantic: extra attribute-specific rules (e.g. takesValue requires a placeholder entry, deprecatedFrom version must exist).
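The three layers run in order, and a failure at an earlier layer makes the later ones moot. The sketch below shows that ordering on toy data. The `ATTRIBUTE_DEFS` table, the `KNOWN_VERSIONS` set, the domain assignments, and the convention that a boolean attribute is stored as `True` are all assumptions made for illustration and do not reflect the real schema definitions.

```python
# Hypothetical attribute definitions: which sections an attribute may appear
# in (domain) and what kind of value it takes (range).
ATTRIBUTE_DEFS = {
    "extensionAllowed": {"domain": {"tags"}, "range": "boolRange"},
    "deprecatedFrom": {"domain": {"tags", "units"}, "range": "stringRange"},
}
# Hypothetical set of released versions used by the semantic rule.
KNOWN_VERSIONS = {"8.0.0", "8.1.0"}


def check_attribute(section, name, value):
    """Return the first failing layer as a one-item list, or [] if valid."""
    definition = ATTRIBUTE_DEFS.get(name)
    # Layer 1, domain: is the attribute allowed in this section at all?
    if definition is None or section not in definition["domain"]:
        return ["domain"]
    # Layer 2, range: does the value have the declared type?
    if definition["range"] == "boolRange" and value is not True:
        return ["range"]
    if definition["range"] == "stringRange" and not isinstance(value, str):
        return ["range"]
    # Layer 3, semantic: attribute-specific rules.
    if name == "deprecatedFrom" and value not in KNOWN_VERSIONS:
        return ["semantic"]
    return []


valid = check_attribute("tags", "extensionAllowed", True)
wrong_section = check_attribute("units", "extensionAllowed", True)
wrong_type = check_attribute("tags", "extensionAllowed", "yes")
missing_version = check_attribute("tags", "deprecatedFrom", "9.9.9")
```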
- check_extras_columns()[source]¶
Validate that all extras DataFrames have non-empty values in required columns.
For each extras section (Sources, Prefixes, ExternalAnnotations), checks that every cell in the required columns defined in df_constants.extras_column_dict has a non-empty value.
Note
Missing columns are automatically added with empty strings during schema loading (see base2schema.fix_extra), so only value presence needs to be checked here.
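The required-cell check can be sketched with plain Python rows in place of DataFrames, which keeps the example dependency-free. The `REQUIRED_COLUMNS` mapping and its column names are hypothetical placeholders for df_constants.extras_column_dict, and `find_empty_required_cells` is not a library function.

```python
# Hypothetical required columns per extras section.
REQUIRED_COLUMNS = {
    "Sources": ["source", "link"],
    "Prefixes": ["prefix", "namespace"],
}


def find_empty_required_cells(extras):
    """Return (section, row_number, column) for every empty required cell."""
    issues = []
    for section, columns in REQUIRED_COLUMNS.items():
        for row_number, row in enumerate(extras.get(section, [])):
            for column in columns:
                # Whitespace-only cells count as empty.
                if not str(row.get(column, "")).strip():
                    issues.append((section, row_number, column))
    return issues


# Each section is a list of row dictionaries; columns are assumed present,
# mirroring the note above that loading fills in missing columns.
toy_extras = {
    "Sources": [
        {"source": "Wikipedia", "link": "https://en.wikipedia.org"},
        {"source": "Dictionary", "link": "  "},
    ],
    "Prefixes": [{"prefix": "dc:", "namespace": ""}],
}
empty_cells = find_empty_required_cells(toy_extras)
```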
Compliance summary¶
- class ComplianceSummary(schema_name='', schema_version='')[source]¶
Bases: object
Tracks what was checked during schema compliance validation and the results.
This provides a structured report of all checks performed, how many entries were examined, and how many issues were found per check category.
Use get_summary() for a human-readable text report, or access check_results directly for programmatic use.
- add_sub_check(sub_check_name)[source]¶
Record a named sub-check within the current check.
- Parameters:
sub_check_name (str) – Name of the sub-check (e.g. an attribute validator name).
- record_issues(issue_count)[source]¶
Record issues found during the current check.
- Parameters:
issue_count (int) – Number of issues found.
- record_section(section_key, entries_checked, entries_skipped=0)[source]¶
Record that a section was examined during the current check.
- Parameters:
section_key (HedSectionKey or str) – The section that was checked.
entries_checked (int) – Number of entries examined in this section.
entries_skipped (int) – Number of entries skipped (e.g. deprecated).
- property total_entries_checked¶
Return total entries checked across all checks.
- Returns:
Total number of entries examined.
- Return type:
HED ID validator¶
- class HedIDValidator(hed_schema)[source]¶
Bases: object
Support class to validate hedIds in schemas.
- verify_tag_id(hed_schema, tag_entry, attribute_name)[source]¶
Validate the hedId attribute values.
This follows the template from schema_attribute_validators.py.
- Parameters:
hed_schema (HedSchema) – The schema to use for validation
tag_entry (HedSchemaEntry) – The schema entry for this tag.
attribute_name (str) – The name of this attribute.
- Returns:
A list of issues from validating this attribute.
- Return type:
Schema constants¶
HedKey¶
- class HedKey[source]¶
Bases: object
Known property and attribute names.
Notes
These names should match the attribute values in the XML/wiki.
- AllowedCharacter = 'allowedCharacter'¶
- AnnotationProperty = 'annotationProperty'¶
- BoolRange = 'boolRange'¶
- ConversionFactor = 'conversionFactor'¶
- DefaultUnits = 'defaultUnits'¶
- DeprecatedFrom = 'deprecatedFrom'¶
- ElementDomain = 'elementDomain'¶
- ExtensionAllowed = 'extensionAllowed'¶
- HedID = 'hedId'¶
- InLibrary = 'inLibrary'¶
- NumericRange = 'numericRange'¶
- Recommended = 'recommended'¶
- RelatedTag = 'relatedTag'¶
- RequireChild = 'requireChild'¶
- Required = 'required'¶
- Reserved = 'reserved'¶
- Rooted = 'rooted'¶
- SIUnit = 'SIUnit'¶
- SIUnitModifier = 'SIUnitModifier'¶
- SIUnitSymbolModifier = 'SIUnitSymbolModifier'¶
- StringRange = 'stringRange'¶
- SuggestedTag = 'suggestedTag'¶
- TagDomain = 'tagDomain'¶
- TagGroup = 'tagGroup'¶
- TagRange = 'tagRange'¶
- TakesValue = 'takesValue'¶
- TopLevelTagGroup = 'topLevelTagGroup'¶
- Unique = 'unique'¶
- UnitClass = 'unitClass'¶
- UnitClassDomain = 'unitClassDomain'¶
- UnitClassRange = 'unitClassRange'¶
- UnitDomain = 'unitDomain'¶
- UnitModifierDomain = 'unitModifierDomain'¶
- UnitPrefix = 'unitPrefix'¶
- UnitRange = 'unitRange'¶
- UnitSymbol = 'unitSymbol'¶
- ValueClass = 'valueClass'¶
- ValueClassDomain = 'valueClassDomain'¶
- ValueClassRange = 'valueClassRange'¶
HedSectionKey¶
HedKeyOld¶
- class HedKeyOld[source]¶
Bases: object
Fully deprecated schema attribute key constants retained for backwards compatibility.
- BoolProperty = 'boolProperty'¶
- ElementProperty = 'elementProperty'¶
- IsInheritedProperty = 'isInheritedProperty'¶
- NodeProperty = 'nodeProperty'¶
- UnitClassProperty = 'unitClassProperty'¶
- UnitModifierProperty = 'unitModifierProperty'¶
- UnitProperty = 'unitProperty'¶
- ValueClassProperty = 'valueClassProperty'¶