Schema¶
HED schema management and validation tools.
Core Schema Classes¶
HedSchema¶
- class hed.schema.hed_schema.HedSchema[source]¶
Bases:
HedSchemaBase
A HED schema suitable for processing.
- __init__()[source]¶
Constructor for the HedSchema class.
A HedSchema can be used for validation, checking tag attributes, parsing tags, etc.
- property version_number: str¶
The HED version of this schema.
- Returns:
The version of this schema.
- Return type:
str
- property version: str¶
The complete schema version, including prefix and library name (if applicable).
- Returns:
The complete schema version including library name and namespace.
- Return type:
str
- property library: str¶
The name of this library schema if one exists.
- Returns:
Library name if any.
- Return type:
str
- property schema_namespace: str¶
Returns the schema namespace prefix.
- Returns:
The schema namespace prefix.
- Return type:
str
- can_save() bool [source]¶
Return whether this schema can legally be saved.
You cannot save schemas loaded as merged from multiple library schemas.
- Returns:
True if this can be saved.
- Return type:
bool
- property with_standard: str¶
The version of the base schema this is extended from, if it exists.
- Returns:
HED version or empty string.
- Return type:
str
- property merged: bool¶
Return whether this schema was loaded from a merged file.
- Returns:
True if the schema was loaded from a merged file.
- Return type:
bool
- property tags: HedSchemaTagSection¶
Return the tag schema section.
- Returns:
The tag section.
- Return type:
HedSchemaTagSection
- property unit_classes: HedSchemaUnitClassSection¶
Return the unit classes schema section.
- Returns:
The unit classes section.
- Return type:
HedSchemaUnitClassSection
- property units: HedSchemaUnitSection¶
Return the unit schema section.
- Returns:
The unit section.
- Return type:
HedSchemaUnitSection
- property unit_modifiers: HedSchemaSection¶
Return the unit modifiers schema section.
- Returns:
The unit modifiers section.
- Return type:
HedSchemaSection
- property value_classes: HedSchemaSection¶
Return the value classes schema section.
- Returns:
The value classes section.
- Return type:
HedSchemaSection
- property attributes: HedSchemaSection¶
Return the attributes schema section.
- Returns:
The attributes section.
- Return type:
HedSchemaSection
- property properties: HedSchemaSection¶
Return the properties schema section.
- Returns:
The properties section.
- Return type:
HedSchemaSection
- get_schema_versions() list[str] [source]¶
Return a list of HED version strings for this schema, including namespace and library name if any.
- get_formatted_version() str [source]¶
Return the HED version string for this schema, including namespace and library name if any.
- Returns:
A JSON-formatted string of the complete version of this schema, including library name and namespace.
- Return type:
str
- get_save_header_attributes(save_merged: bool = False) dict [source]¶
Returns the attributes that should be saved.
- schema_for_namespace(namespace: str) HedSchema | None [source]¶
Return HedSchema object for this namespace.
- property valid_prefixes: list[str]¶
Return a list of all prefixes this schema will accept.
Notes
The return value is always length 1 if using a HedSchema.
- get_extras(extras_key) DataFrame | None [source]¶
Get the extras corresponding to the given key.
- Parameters:
extras_key (str) – The key to check for in the extras dictionary.
- Returns:
The DataFrame for this extras key, or None if it doesn’t exist or is empty.
- Return type:
Union[pd.DataFrame, None]
- get_as_dataframes(save_merged=False) dict[DataFrame] [source]¶
Get a dict of dataframes representing this file.
- save_as_dataframes(base_filename, save_merged=False)[source]¶
Save as dataframes to a folder of files.
If base_filename has a .tsv suffix, save directly to the indicated location. If base_filename is a directory (does NOT have a .tsv suffix), save the contents into a directory of that name, with the subfiles prefixed by the same name, e.g., HED8.3.0/HED8.3.0_Tag.tsv.
- Parameters:
- Raises:
OSError – File cannot be saved for some reason.
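The naming rule described for base_filename can be illustrated with a small, library-independent sketch. The helper `tsv_paths` and the `SUFFIXES` tuple are hypothetical and are not part of the hed package; only the `Struct` and `Tag` suffixes are taken from names mentioned in this reference, and the handling of the .tsv case is an assumption based on the basename.tsv convention described for load_schema.

```python
import posixpath

# Hypothetical subset of file suffixes; the real method may write more files.
SUFFIXES = ("Struct", "Tag")


def tsv_paths(base_filename):
    """Sketch of where each TSV subfile would land for a given base_filename."""
    if base_filename.endswith(".tsv"):
        # Assumed: "out/HED8.3.0.tsv" -> "out/HED8.3.0_Tag.tsv"
        stem = base_filename[: -len(".tsv")]
    else:
        # Treated as a directory: "HED8.3.0" -> "HED8.3.0/HED8.3.0_Tag.tsv"
        folder = base_filename.rstrip("/")
        stem = posixpath.join(folder, posixpath.basename(folder))
    return {suffix: f"{stem}_{suffix}.tsv" for suffix in SUFFIXES}


print(tsv_paths("HED8.3.0")["Tag"])
```

The sketch only computes names; it does not write files or reproduce the OSError behavior of the real method.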
- set_schema_prefix(schema_namespace)[source]¶
Set the library namespace associated with this schema.
- Parameters:
schema_namespace (str) – Should be empty or end with a colon. (A colon is added automatically if missing.)
- Raises:
HedFileError – The prefix is invalid.
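The colon rule for schema_namespace can be sketched as follows. The function `normalize_namespace` is a hypothetical illustration, not the library's implementation, and it does not model whatever validity checks cause the real method to raise HedFileError.

```python
def normalize_namespace(schema_namespace):
    """Return an empty namespace unchanged; otherwise ensure a trailing colon."""
    if schema_namespace and not schema_namespace.endswith(":"):
        return schema_namespace + ":"
    return schema_namespace


for value in ("", "sc", "sc:"):
    print(repr(value), "->", repr(normalize_namespace(value)))
```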
- __eq__(other)[source]¶
Return True if these schemas match exactly.
- Parameters:
other (HedSchema) – The schema to be compared.
- Returns:
True if other exactly matches this schema.
- Return type:
bool
Notes
Matches must include attributes, tag names, etc.
- check_compliance(check_for_warnings=True, name=None, error_handler=None) list[dict] [source]¶
Check for HED3 compliance of this schema.
- Parameters:
check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.
name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.
error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.
- Returns:
A list of all warnings and errors found in the file. Each issue is a dictionary.
- Return type:
list[dict]
- get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags) list[HedSchemaEntry] [source]¶
Return tag entries with the given attribute.
- Parameters:
attribute (str) – A tag attribute, e.g., HedKey.ExtensionAllowed.
key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.
- Returns:
A list of all tags with this attribute.
- Return type:
list[HedSchemaEntry]
Notes
The result is cached, so calls after the first are fast.
- get_tag_entry(name: str, key_class=HedSectionKey.Tags, schema_namespace: str = '') HedSchemaEntry | None [source]¶
Return the schema entry for this tag, if one exists.
- Parameters:
name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but one is not required.
key_class (HedSectionKey or str) – The type of entry to return.
schema_namespace (str) – Only used on Tags. If incorrect, will return None.
- Returns:
The schema entry for the given tag, or None if not found.
- Return type:
Union[HedSchemaEntry, None]
- find_tag_entry(tag, schema_namespace='') tuple['HedTagEntry' | None, str | None, list[dict]] [source]¶
Find the schema entry for a given source tag.
- Parameters:
- Returns:
The located tag entry for this tag.
The remainder of the tag that isn’t part of the base tag.
A list of errors while converting.
- Return type:
tuple[Union[“HedTagEntry”, None], Union[str, None], list[dict]]
Notes
Works left to right (which is mostly relevant for errors).
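The three-part return value (entry, remainder, errors) and the left-to-right scan can be pictured with a toy lookup. Everything here is hypothetical: `find_base`, the `known_tags` set, and the shape of the error dictionary are illustrative stand-ins, and the real method resolves tags against a loaded schema rather than a set of strings.

```python
def find_base(tag, known_tags):
    """Return (base, remainder, errors) for a slash-separated tag.

    Toy sketch: scans terms left to right and keeps the longest leading
    path that is present in known_tags (compared in lower case).
    """
    terms = tag.split("/")
    base = None
    for i in range(1, len(terms) + 1):
        candidate = "/".join(terms[:i])
        if candidate.lower() in known_tags:
            base = candidate
        else:
            break  # stop at the first term that extends past a known path
    if base is None:
        return None, None, [{"message": f"No known tag at start of '{tag}'"}]
    return base, tag[len(base):], []


known = {"item", "item/object"}
print(find_base("Item/Object/Chair", known))
print(find_base("Unknown/Thing", known))
```

In this sketch the remainder keeps its leading slash; whether the library does the same is not stated in this reference.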
- has_duplicates()[source]¶
Return the first duplicate tag/unit/etc. if any section has a duplicate name.
- get_tag_attribute_names_old() dict[str, HedSchemaEntry] [source]¶
Return a dict of all allowed tag attributes.
- Returns:
A dictionary whose keys are attribute names and whose values are HedSchemaEntry objects.
- Return type:
dict[str, HedSchemaEntry]
HedSchemaEntry¶
- class hed.schema.hed_schema_entry.HedSchemaEntry(name, section)[source]¶
Bases:
object
A single node in a HedSchema.
The structure contains all the node information including attributes and properties.
- __init__(name, section)[source]¶
Constructor for HedSchemaEntry.
- Parameters:
name (str) – The name of the entry.
section (HedSchemaSection) – The section to which it belongs.
- finalize_entry(schema)[source]¶
Called once after loading to set internal state.
- Parameters:
schema (HedSchema) – The schema that holds the rules.
- has_attribute(attribute, return_value=False) bool | Any [source]¶
Checks for the existence of an attribute in this entry.
- Parameters:
- Returns:
If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.
- Return type:
Union[bool, Any]
Notes
The existence of an attribute does not guarantee its validity.
- attribute_has_property(attribute, property_name) bool [source]¶
Return True if attribute has property.
- property section_key¶
HedTagEntry¶
- class hed.schema.hed_schema_entry.HedTagEntry(*args, **kwargs)[source]¶
Bases:
HedSchemaEntry
A single tag entry in the HedSchema.
- __init__(*args, **kwargs)[source]¶
Constructor for HedSchemaEntry.
- Parameters:
name (str) – The name of the entry.
section (HedSchemaSection) – The section to which it belongs.
- has_attribute(attribute, return_value=False)[source]¶
Return the existence or value of an attribute in this entry.
This also checks parent tags for inheritable attributes like ExtensionAllowed.
- Parameters:
- Returns:
If return_value is False, returns True if the attribute exists and False otherwise. If return_value is True, returns the value of the attribute if it exists, else returns None.
- Return type:
Union[bool, Any]
Notes
The existence of an attribute does not guarantee its validity.
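The two return modes of has_attribute, together with the parent lookup for inheritable attributes, can be illustrated with a toy class. `ToyEntry` is a hypothetical stand-in and not the library's HedTagEntry; the set of inheritable attributes is an assumption that contains only the one example named above, and the tag names are purely illustrative.

```python
class ToyEntry:
    """Hypothetical stand-in for a tag entry; not the library's HedTagEntry."""

    # Assumption for this sketch: only these attributes are inherited.
    INHERITABLE = {"extensionAllowed"}

    def __init__(self, name, attributes=None, parent=None):
        self.name = name
        self.attributes = dict(attributes or {})
        self.parent = parent

    def has_attribute(self, attribute, return_value=False):
        entry = self
        while entry is not None:
            if attribute in entry.attributes:
                value = entry.attributes[attribute]
                return value if return_value else True
            # Climb to the parent only for inheritable attributes.
            entry = entry.parent if attribute in self.INHERITABLE else None
        return None if return_value else False


item = ToyEntry("Item", {"extensionAllowed": True})
obj = ToyEntry("Object", {"suggestedTag": "Sensory-event"}, parent=item)

print(obj.has_attribute("extensionAllowed"))
print(obj.has_attribute("suggestedTag", return_value=True))
```

With return_value=True, a missing attribute yields None rather than False, so callers that need to distinguish "absent" from a falsy value should test for None explicitly.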
- base_tag_has_attribute(tag_attribute)[source]¶
Check if the base tag has a specific attribute.
- Parameters:
tag_attribute (str) – A tag attribute.
- Returns:
True if the tag has the specified attribute, False otherwise.
- Return type:
bool
Notes
This is mostly relevant for takes-value tags.
- property parent¶
Get the parent entry of this tag.
- property parent_name¶
Get the name of the parent tag entry.
UnitClassEntry¶
- class hed.schema.hed_schema_entry.UnitClassEntry(*args, **kwargs)[source]¶
Bases:
HedSchemaEntry
A single unit class entry in the HedSchema.
- __init__(*args, **kwargs)[source]¶
Constructor for HedSchemaEntry.
- Parameters:
name (str) – The name of the entry.
section (HedSchemaSection) – The section to which it belongs.
- property children¶
Alias to get the units for this class.
- Returns:
The unit list for this class.
- Return type:
list
- add_unit(unit_entry)[source]¶
Add the given unit entry to this unit class.
- Parameters:
unit_entry (HedSchemaEntry) – Unit entry to add.
UnitEntry¶
- class hed.schema.hed_schema_entry.UnitEntry(*args, **kwargs)[source]¶
Bases:
HedSchemaEntry
A single unit entry with modifiers in the HedSchema.
- __init__(*args, **kwargs)[source]¶
Constructor for HedSchemaEntry.
- Parameters:
name (str) – The name of the entry.
section (HedSchemaSection) – The section to which it belongs.
HedSchemaGroup¶
- class hed.schema.hed_schema_group.HedSchemaGroup(schema_list, name='')[source]¶
Bases:
HedSchemaBase
Container for multiple HedSchema objects.
Notes
The container class is useful when library schemas are included.
You cannot save/load/etc. the combined schema object directly.
- __init__(schema_list, name='')[source]¶
Combine multiple HedSchema objects from a list.
- Parameters:
schema_list (list) – A list of HedSchema for the container.
- Returns:
The container created.
- Return type:
HedSchemaGroup
- Raises:
HedFileError – If multiple schemas have the same library prefixes or an empty list is passed.
- get_schema_versions() list[str] [source]¶
Return a list of HED version strings for these schemas, including namespace and library name if any.
- get_formatted_version() str [source]¶
Return the HED version string for these schemas, including namespace and library name if any.
- Returns:
The complete version of this schema including library name and namespace.
- Return type:
str
- schema_for_namespace(namespace) HedSchema | None [source]¶
Return the HedSchema for the library namespace.
- check_compliance(check_for_warnings=True, name=None, error_handler=None) list[dict] [source]¶
Check for HED3 compliance of this schema.
- Parameters:
check_for_warnings (bool) – If True, checks for formatting issues like invalid characters, capitalization.
name (str) – If present, use as the filename for context, rather than using the actual filename. Useful for temp filenames when supporting web services.
error_handler (ErrorHandler or None) – Used to report errors. Uses a default one if none passed in.
- Returns:
A list of all warnings and errors found in the file. Each issue is a dictionary.
- Return type:
list[dict]
- get_tags_with_attribute(attribute, key_class=HedSectionKey.Tags) list [source]¶
Return tag entries with the given attribute.
- Parameters:
attribute (str) – A tag attribute, e.g., HedKey.ExtensionAllowed.
key_class (HedSectionKey) – The HedSectionKey for the section to retrieve from.
- Returns:
A list of all tags with this attribute.
- Return type:
list
Notes
The result is cached, so calls after the first are fast.
- get_tag_entry(name, key_class=HedSectionKey.Tags, schema_namespace='') 'HedSchemaEntry' | None [source]¶
Return the schema entry for this tag, if one exists.
- Parameters:
name (str) – Any form of basic tag (or other section entry) to look up. This will not handle extensions or similar. If this is a tag, it can have a schema namespace, but one is not required.
key_class (HedSectionKey or str) – The type of entry to return.
schema_namespace (str) – Only used on Tags. If incorrect, will return None.
- Returns:
The schema entry for the given tag, or None if not found.
- Return type:
Union[HedSchemaEntry, None]
- find_tag_entry(tag, schema_namespace='') tuple['HedTagEntry' | None, str | None, list] [source]¶
Find the schema entry for a given source tag.
- Parameters:
- Returns:
The located tag entry for this tag.
The remainder of the tag that isn’t part of the base tag.
A list of errors while converting.
- Return type:
tuple[Union[HedTagEntry, None], Union[str, None], list]
Notes
Works left to right (which is mostly relevant for errors).
HedSchemaSection¶
- class hed.schema.hed_schema_section.HedSchemaSection(section_key, case_sensitive=True)[source]¶
Bases:
object
Container with entries in one section of the schema.
- __init__(section_key, case_sensitive=True)[source]¶
Construct schema section.
- Parameters:
section_key (HedSectionKey) – Name of the schema section.
case_sensitive (bool) – If True, names are case-sensitive.
- property section_key¶
- property duplicate_names¶
- get_entries_with_attribute(attribute_name, return_name_only=False, schema_namespace='') list[HedSchemaEntry | str] [source]¶
Return entries or names with given attribute.
- Parameters:
- Returns:
List of HedSchemaEntry or strings representing the names.
- Return type:
list[Union[HedSchemaEntry, str]]
Schema IO and Caching¶
Schema Loading Functions¶
Utilities for loading and outputting HED schema.
- hed.schema.hed_schema_io.load_schema_version(xml_version=None, xml_folder=None) HedSchema | HedSchemaGroup [source]¶
Return a HedSchema or HedSchemaGroup extracted from xml_version.
- Parameters:
- Returns:
The schema or schema group extracted.
- Return type:
Union[HedSchema, HedSchemaGroup]
- Raises:
HedFileError – The xml_version is not valid.
HedFileError – The specified version cannot be found or loaded.
HedFileError – Other fatal errors loading the schema (These are unlikely if you are not editing them locally).
HedFileError – The prefix is invalid.
- hed.schema.hed_schema_io.load_schema(hed_path, schema_namespace=None, schema=None, name=None) HedSchema [source]¶
Load a schema from the given file or URL path.
- Parameters:
hed_path (str) – A file path or URL to open a schema from. If loading TSV files, this should be a single filename of the form basename.tsv, where the actual files are named basename_Struct.tsv, basename_Tag.tsv, etc. Alternatively, you can point to a directory containing the .tsv files.
schema_namespace (str or None) – The name_prefix all tags in this schema will accept.
schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.
name (str or None) – User supplied identifier for this schema
- Returns:
The loaded schema.
- Return type:
HedSchema
- Raises:
HedFileError – Empty path passed.
HedFileError – Unknown extension.
HedFileError – Any fatal issues when loading the schema.
- hed.schema.hed_schema_io.from_string(schema_string, schema_format='.xml', schema_namespace=None, schema=None, name=None) HedSchema [source]¶
Create a schema from the given string.
- Parameters:
schema_string (str) – An XML or mediawiki file as a single long string.
schema_format (str) – The schema format of the source schema string. Allowed normal values: .mediawiki, .xml
schema_namespace (str or None) – The name_prefix all tags in this schema will accept.
schema (HedSchema or None) – A HED schema to merge this new file into. It must be a with-standard schema with the same value.
name (str or None) – User supplied identifier for this schema
- Returns:
The loaded schema.
- Return type:
HedSchema
- Raises:
If empty string or invalid extension is passed.
Other fatal formatting issues with file
Notes
The loading is determined by file type.
- hed.schema.hed_schema_io.from_dataframes(schema_data, schema_namespace=None, name=None) HedSchema [source]¶
Create a schema from the given dataframes.
- Parameters:
- Returns:
The loaded schema.
- Return type:
HedSchema
- Raises:
HedFileError – If empty/invalid parameters.
Exception – Other fatal I/O or formatting issues.
Notes
The loading is determined by file type.
- hed.schema.hed_schema_io.get_hed_xml_version(xml_file_path) str [source]¶
Get the version number from a HED XML file.
- Parameters:
xml_file_path (str) – The path to a HED XML file.
- Returns:
The version number of the HED XML file.
- Return type:
str
- Raises:
There is an issue loading the schema
- hed.schema.hed_schema_io.parse_version_list(xml_version_list) dict [source]¶
Take a list of XML versions and return a dictionary split by prefix.
e.g. ["score", "testlib"] will return {"": "score, testlib"}
e.g. ["score", "testlib", "ol:otherlib"] will return {"": "score, testlib", "ol:": "otherlib"}
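The grouping shown in the two examples can be reproduced with a short, library-independent sketch. `group_versions_by_prefix` is a hypothetical helper written only to match those documented examples; the library's exact separator, key format, and error handling (for instance, for duplicate entries) may differ.

```python
def group_versions_by_prefix(version_list):
    """Group version strings by their 'prefix:' part, preserving input order."""
    grouped = {}
    for version in version_list:
        prefix, sep, name = version.partition(":")
        if sep:
            key = prefix + ":"
        else:
            # No colon present: the whole string belongs to the empty prefix.
            key, name = "", version
        grouped.setdefault(key, []).append(name)
    return {key: ", ".join(names) for key, names in grouped.items()}


print(group_versions_by_prefix(["score", "testlib", "ol:otherlib"]))
```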
Cache Management¶
Infrastructure for caching HED schema from remote repositories.
- hed.schema.hed_cache.set_cache_directory(new_cache_dir)[source]¶
Set default global HED cache directory.
- Parameters:
new_cache_dir (str) – Directory to check for versions.
- hed.schema.hed_cache.get_cache_directory(cache_folder=None) str [source]¶
Return the current value of HED_CACHE_DIRECTORY.
- hed.schema.hed_cache.get_hed_versions(local_hed_directory=None, library_name=None, check_prerelease=False) list | dict [source]¶
Get the HED versions in the HED directory.
- Parameters:
local_hed_directory (str) – Directory to check for versions which defaults to hed_cache.
library_name (str or None) – An optional schema library name. None retrieves the standard schema only. Pass “all” to retrieve all standard and library schemas as a dict.
check_prerelease (bool) – If True, results can include prerelease schemas
- Returns:
List of version numbers or dictionary {library_name: [versions]}.
- Return type:
Union[list, dict]
- hed.schema.hed_cache.get_hed_version_path(xml_version, library_name=None, local_hed_directory=None, check_prerelease=False) str | None [source]¶
Get HED XML file path in a directory. Only returns filenames that exist.
- Parameters:
- Returns:
The path to the latest HED version in the HED directory.
- Return type:
Union[str, None]
- hed.schema.hed_cache.cache_local_versions(cache_folder) int | None [source]¶
Cache all schemas included with the HED installation.
- hed.schema.hed_cache.cache_xml_versions(hed_base_urls=('https://api.github.com/repos/hed-standard/hed-schemas/contents/standard_schema',), hed_library_urls=('https://api.github.com/repos/hed-standard/hed-schemas/contents/library_schemas',), skip_folders=('deprecated',), cache_folder=None) float [source]¶
Cache all schemas at the given URLs.
- Parameters:
hed_base_urls (str or list) – Path or list of paths. These should point to a single folder.
hed_library_urls (str or list) – Path or list of paths. These should point to a folder containing library folders.
skip_folders (list) – A list of subfolders to skip over when downloading.
cache_folder (str) – The folder holding the cache.
- Returns:
-1 if caching failed for any reason, including having been cached too recently.
0 if caching succeeded this time.
- Return type:
float
Notes
The default skip_folders is ‘deprecated’.
The HED cache folder defaults to HED_CACHE_DIRECTORY.
- The directories on GitHub are of the form:
https://api.github.com/repos/hed-standard/hed-schemas/contents/standard_schema
Schema Constants¶
HedKey¶
- class hed.schema.hed_schema_constants.HedKey[source]¶
Bases:
object
Known property and attribute names.
Notes
These names should match the attribute values in the XML/wiki.
- ExtensionAllowed = 'extensionAllowed'¶
- Recommended = 'recommended'¶
- Required = 'required'¶
- RequireChild = 'requireChild'¶
- TagGroup = 'tagGroup'¶
- TakesValue = 'takesValue'¶
- TopLevelTagGroup = 'topLevelTagGroup'¶
- Unique = 'unique'¶
- UnitClass = 'unitClass'¶
- ValueClass = 'valueClass'¶
- RelatedTag = 'relatedTag'¶
- SuggestedTag = 'suggestedTag'¶
- Rooted = 'rooted'¶
- DeprecatedFrom = 'deprecatedFrom'¶
- ConversionFactor = 'conversionFactor'¶
- Reserved = 'reserved'¶
- SIUnit = 'SIUnit'¶
- UnitSymbol = 'unitSymbol'¶
- DefaultUnits = 'defaultUnits'¶
- UnitPrefix = 'unitPrefix'¶
- SIUnitModifier = 'SIUnitModifier'¶
- SIUnitSymbolModifier = 'SIUnitSymbolModifier'¶
- AllowedCharacter = 'allowedCharacter'¶
- InLibrary = 'inLibrary'¶
- HedID = 'hedId'¶
- UnitClassDomain = 'unitClassDomain'¶
- UnitDomain = 'unitDomain'¶
- UnitModifierDomain = 'unitModifierDomain'¶
- ValueClassDomain = 'valueClassDomain'¶
- ElementDomain = 'elementDomain'¶
- TagDomain = 'tagDomain'¶
- AnnotationProperty = 'annotationProperty'¶
- BoolRange = 'boolRange'¶
- TagRange = 'tagRange'¶
- NumericRange = 'numericRange'¶
- StringRange = 'stringRange'¶
- UnitClassRange = 'unitClassRange'¶
- UnitRange = 'unitRange'¶
- ValueClassRange = 'valueClassRange'¶
HedSectionKey¶
- class hed.schema.hed_schema_constants.HedSectionKey(value)[source]¶
Bases:
Enum
Keys designating specific sections in a HedSchema object.
- Tags = 'tags'¶
- UnitClasses = 'unitClasses'¶
- Units = 'units'¶
- UnitModifiers = 'unitModifiers'¶
- ValueClasses = 'valueClasses'¶
- Attributes = 'attributes'¶
- Properties = 'properties'¶
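Because HedSectionKey is an Enum with string values, members can be looked up by value using standard Enum behavior. The sketch below uses a local stand-in, `SectionKey`, that mirrors the members listed above so it runs without the hed package installed; it does not show how the library itself converts a str key_class argument.

```python
from enum import Enum


class SectionKey(Enum):
    """Local stand-in mirroring the members listed for HedSectionKey."""

    Tags = "tags"
    UnitClasses = "unitClasses"
    Units = "units"
    UnitModifiers = "unitModifiers"
    ValueClasses = "valueClasses"
    Attributes = "attributes"
    Properties = "properties"


# Look up a member by its string value, then list all section values in order.
key = SectionKey("unitClasses")
print(key.name, key.value)
print([member.value for member in SectionKey])
```

Looking up a value that is not defined, such as `SectionKey("tag")`, raises ValueError, which is the standard Enum behavior.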