Validator
Validation tools for HED data structures and annotations.
Core validator classes
HedValidator
- class hed.validator.hed_validator.HedValidator(hed_schema, def_dicts=None, definitions_allowed=False)[source]¶
Bases: object
Top level validation of HED strings.
This module contains the HedValidator class, which is used to validate the tags in a HED string or a file. The supported file types are .tsv, .txt, and .xlsx. To get the validation issues after creating a HedValidator instance, call the get_validation_issues() function.
- __init__(hed_schema, def_dicts=None, definitions_allowed=False)[source]¶
Constructor for the HedValidator class.
- Parameters:
hed_schema (HedSchema or HedSchemaGroup) – HedSchema object to use for validation.
def_dicts (DefinitionDict or list or dict) – the def dicts to use for validation
definitions_allowed (bool) – If False, flag definitions found as errors
- validate(hed_string, allow_placeholders, error_handler=None) list[dict][source]¶
Validate the HED string object using the schema.
- Parameters:
hed_string (HedString) – the string to validate.
allow_placeholders (bool) – allow placeholders in the string.
error_handler (ErrorHandler or None) – the error handler to use, creates a default one if none passed.
- Returns:
A list of issues for the HED string.
- Return type:
list[dict]
- run_basic_checks(hed_string, allow_placeholders) list[dict][source]¶
Run basic validation checks on a HED string.
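- Parameters:
hed_string (HedString) – The HED string to validate.
allow_placeholders (bool) – Whether placeholders are allowed in the string.
- Returns:
A list of issues found during validation. Each issue is represented as a dictionary.
- Return type:
list[dict]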
Notes
This method performs initial validation checks on the HED string, including character validation and tag validation.
It checks for invalid characters, calculates canonical forms, and validates individual tags.
If any issues are found during these checks, the method stops and returns the issues immediately.
The method also validates definition tags if applicable.
- run_full_string_checks(hed_string) list[dict][source]¶
Run all full-string validation checks on a HED string.
- Parameters:
hed_string (HedString) – The HED string to validate.
- Returns:
A list of issues found during validation. Each issue is represented as a dictionary.
- Return type:
list[dict]
Notes
This method iterates through a series of validation checks defined in the checks list.
Each check is a callable function that takes hed_string as input and returns a list of issues.
If any check returns issues, the method stops and returns those issues immediately.
If no issues are found, an empty list is returned.
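The early-exit behaviour described in these notes can be sketched as follows. This is a minimal illustration, not the library's implementation: `run_checks`, `check_parentheses`, and `check_empty_tags` are hypothetical names, plain strings stand in for HedString objects, and the issue dictionaries use made-up keys.

```python
def check_parentheses(hed_string):
    """Hypothetical check: report unequal parenthesis counts."""
    if hed_string.count("(") != hed_string.count(")"):
        return [{"code": "PARENTHESES_MISMATCH"}]
    return []


def check_empty_tags(hed_string):
    """Hypothetical check: report two commas with nothing between them."""
    if ",," in hed_string.replace(" ", ""):
        return [{"code": "EMPTY_TAG"}]
    return []


def run_checks(hed_string, checks):
    """Run each check in order and stop at the first one that reports issues."""
    for check in checks:
        issues = check(hed_string)
        if issues:
            return issues  # early exit: later checks are skipped
    return []


checks = [check_parentheses, check_empty_tags]
first_failure_only = run_checks("(Red,, Blue", checks)
clean = run_checks("Red, Blue", checks)
```

In this sketch the string `"(Red,, Blue"` fails both checks, but only the parentheses issue is returned because the loop stops at the first failing check.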
- pattern_doubleslash = re.compile('([ \\t/]{2,}|^/|/$)')¶
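To see what this pattern accepts and rejects, the snippet below recompiles the same expression with the standard `re` module and probes a few tag strings. The helper `has_slash_problem` is a hypothetical name, and applying the pattern with `search` to a single tag is an assumption about usage, not something stated here.

```python
import re

# Same expression as the class attribute, written as a raw string.
pattern_doubleslash = re.compile(r"([ \t/]{2,}|^/|/$)")


def has_slash_problem(tag_text):
    """Return True if the text contains a run flagged by the pattern."""
    return pattern_doubleslash.search(tag_text) is not None


samples = ["Item/Object", "Item//Object", "/Item", "Item/", "Item /Object"]
flagged = [text for text in samples if has_slash_problem(text)]
```

Despite its name, the character class also covers spaces and tabs, so a run of two or more spaces, or a space next to a slash, is matched as well.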
- validate_units(original_tag, validate_text=None, report_as=None, error_code=None, index_offset=0, allow_placeholders=True) list[dict][source]¶
Validate units and value classes
- Parameters:
original_tag (HedTag) – The source tag
validate_text (str) – the text we want to validate, if not the full extension.
report_as (HedTag) – Report the error tag as coming from a different one. Mostly for definitions that expand.
error_code (str) – The code to override the error as. Again mostly for def/def-expand tags.
index_offset (int) – Offset into the extension validate_text starts at
allow_placeholders (bool) – Whether placeholders are allowed (affects value class validation for “#”)
- Returns:
Issues found from units
- Return type:
list[dict]
Specialized validators
SidecarValidator
- class hed.validator.sidecar_validator.SidecarValidator(hed_schema)[source]¶
Bases: object
Validates HED annotations in a BIDS JSON sidecar against a HED schema.
- reserved_column_names = ['HED']¶
- reserved_category_values = ['n/a']¶
- __init__(hed_schema)[source]¶
Constructor for the SidecarValidator class.
- Parameters:
hed_schema (HedSchema) – HED schema object to use for validation.
- validate(sidecar, extra_def_dicts=None, name=None, error_handler=None) list[dict][source]¶
Validate the input data using the schema
- Parameters:
sidecar (Sidecar) – Input data to be validated.
extra_def_dicts (list or DefinitionDict) – extra def dicts in addition to sidecar
name (str) – The name to report this sidecar as
error_handler (ErrorHandler) – Error context to use. Creates a new one if None.
- Returns:
A list of issues associated with each level in the HED string.
- Return type:
list[dict]
- validate_structure(sidecar, error_handler) list[dict][source]¶
Validate the raw structure of this sidecar.
- Parameters:
sidecar (Sidecar) – the sidecar to validate
error_handler (ErrorHandler) – The error handler to use for error context.
- Returns:
A list of issues found with the structure.
- Return type:
list[dict]
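As a rough picture of what a structural check involving reserved_column_names and reserved_category_values might look like, the sketch below scans a sidecar-shaped dictionary. It is an illustration under stated assumptions, not the library's code: `find_reserved_usage` is a hypothetical function, a plain dict stands in for a Sidecar object, and the issue keys are invented.

```python
reserved_column_names = ["HED"]
reserved_category_values = ["n/a"]


def find_reserved_usage(sidecar):
    """Report reserved column names and reserved category keys in a sidecar dict."""
    issues = []
    for column_name, column in sidecar.items():
        if column_name in reserved_column_names:
            issues.append({"column": column_name, "problem": "reserved column name"})
        # A categorical column maps each category value to a HED string.
        hed_entry = column.get("HED") if isinstance(column, dict) else None
        if isinstance(hed_entry, dict):
            for category in hed_entry:
                if category in reserved_category_values:
                    issues.append({"column": column_name, "problem": "reserved category value"})
    return issues


example_sidecar = {
    "event_type": {"HED": {"go": "Sensory-event", "n/a": "Label/None"}},
    "HED": {"Description": "A column that should not be annotated"},
}
structure_issues = find_reserved_usage(example_sidecar)
```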
SpreadsheetValidator
- class hed.validator.spreadsheet_validator.SpreadsheetValidator(hed_schema)[source]¶
Bases: object
Validates HED annotations in a tabular (TSV/Excel) spreadsheet against a HED schema.
- ONSET_TOLERANCE = 1e-07¶
- TEMPORAL_ANCHORS = re.compile('onset|inset|offset|delay')¶
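The two constants above can be exercised on their own with the standard library. How the validator applies them is not stated here, so the helpers below (`mentions_temporal_tag`, `same_onset`) are hypothetical, and lowercasing the text before searching is an assumption.

```python
import re

ONSET_TOLERANCE = 1e-07
TEMPORAL_ANCHORS = re.compile("onset|inset|offset|delay")


def mentions_temporal_tag(hed_text):
    """Return True if the lowercased text contains any temporal anchor word."""
    return TEMPORAL_ANCHORS.search(hed_text.lower()) is not None


def same_onset(first, second):
    """Treat two onset times as equal when they differ by at most the tolerance."""
    return abs(first - second) <= ONSET_TOLERANCE


has_anchor = mentions_temporal_tag("(Def/Blink, Onset)")
no_anchor = mentions_temporal_tag("Red, Blue")
```

Because the pattern is an unanchored alternation, it matches these words anywhere in the text, including inside longer words.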
- __init__(hed_schema)[source]¶
Constructor for the SpreadsheetValidator class.
- Parameters:
hed_schema (HedSchema) – HED schema object to use for validation.
- validate(data, def_dicts=None, name=None, error_handler=None) list[dict][source]¶
Validate the input data using the schema
- Parameters:
data (BaseInput) – Input data to be validated.
def_dicts (list of DefDict or DefDict) – all definitions to use for validation
name (str) – The name to report errors from this file as
error_handler (ErrorHandler) – Error context to use. Creates a new one if None.
- Returns:
A list of issues for HED string
- Return type:
list[dict]
DefValidator
- class hed.validator.def_validator.DefValidator(def_dicts=None, hed_schema=None)[source]¶
Bases: DefinitionDict
Validates Def/ and Def-expand/, as well as temporal groups: Onset, Inset, and Offset.
- __init__(def_dicts=None, hed_schema=None)[source]¶
Initialize for definitions in HED strings.
- Parameters:
def_dicts (list or DefinitionDict or str) – DefinitionDicts containing the definitions to pass to baseclass
hed_schema (HedSchema or None) – Required if passing strings or lists of strings, unused otherwise.
- validate_def_tags(hed_string_obj) list[dict][source]¶
Validate Def/Def-Expand tags.
- Parameters:
hed_string_obj (HedString) – The HED string to process.
hed_validator (HedValidator) – Used to validate the placeholder replacement.
- Returns:
Issues found related to validating defs. Each issue is a dictionary.
- Return type:
list[dict]
OnsetValidator
- class hed.validator.onset_validator.OnsetValidator[source]¶
Bases: object
Validates onset/offset pairs.
- validate_temporal_relations(hed_string_obj) list[dict][source]¶
Validate onset/offset/inset tag relations
ReservedChecker
- class hed.validator.reserved_checker.ReservedChecker[source]¶
Bases: object
Thread-safe singleton that loads reserved tag rules and checks groups for compliance.
- reserved_reqs_path = '/home/runner/work/hed-python/hed-python/hed/validator/data/reservedTags.json'¶
- static get_instance()[source]¶
Return the singleton ReservedChecker instance, creating it on first call.
- Returns:
The shared singleton instance.
- Return type:
ReservedChecker
- get_reserved(group)[source]¶
Return the list of reserved tags found directly within the given HED group.
- check_reserved_compatibility(group, reserved_tags)[source]¶
Check that the reserved tags can be used together and that there are no duplicates.
- check_tag_requirements(group, reserved_tags)[source]¶
Check the tag requirements within the group.
- Parameters:
Notes: This is only called when there are some reserved incompatible tags.
- get_group_requirements(reserved_tags) tuple[float, float][source]¶
Returns the maximum and minimum number of groups required for these reserved tags.
Validator utilities
CharValidator
- class hed.validator.util.char_util.CharValidator(modern_allowed_char_rules=False)[source]¶
Bases: object
Class responsible for basic character level validation of a string or tag.
- DEFAULT_ALLOWED_PLACEHOLDER_CHARS = '.+-^ _#'¶
- TAG_ALLOWED_CHARS = '-_/'¶
- INVALID_STRING_CHARS = '[]{}~'¶
- INVALID_STRING_CHARS_PLACEHOLDERS = '[]~'¶
- __init__(modern_allowed_char_rules=False)[source]¶
Does basic character validation for HED strings/tags
- Parameters:
modern_allowed_char_rules (bool) – If True, use 8.3 style rules for Unicode characters.
- check_invalid_character_issues(hed_string, allow_placeholders) list[dict][source]¶
Report invalid characters.
- Parameters:
- Returns:
Validation issues. Each issue is a dictionary.
- Return type:
list[dict]
Notes
- Invalid tag characters are defined by self.INVALID_STRING_CHARS or self.INVALID_STRING_CHARS_PLACEHOLDERS.
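The choice between the two character sets can be sketched with plain strings. This is a minimal illustration: `find_invalid_chars` is a hypothetical function, and selecting the placeholder set when allow_placeholders is True is an assumption drawn from the constant names, not a statement of the library's logic.

```python
INVALID_STRING_CHARS = "[]{}~"
INVALID_STRING_CHARS_PLACEHOLDERS = "[]~"


def find_invalid_chars(hed_string, allow_placeholders):
    """Return (index, char) pairs for characters in the active invalid set."""
    if allow_placeholders:
        invalid = INVALID_STRING_CHARS_PLACEHOLDERS  # curly braces are tolerated
    else:
        invalid = INVALID_STRING_CHARS
    return [(index, char) for index, char in enumerate(hed_string) if char in invalid]


strict = find_invalid_chars("Red, {color}", allow_placeholders=False)
lenient = find_invalid_chars("Red, {color}", allow_placeholders=True)
```

The only difference between the two sets is the pair of curly braces, so a string such as `"Red, {color}"` is flagged only in the strict case.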
- check_tag_invalid_chars(original_tag, allow_placeholders) list[dict][source]¶
Report invalid characters in the given tag.
- check_for_invalid_extension_chars(original_tag, validate_text, error_code=None, index_offset=0) list[dict][source]¶
Report invalid characters in extension/value.
- Parameters:
original_tag (HedTag) – The original tag that is used to report the error.
validate_text (str) – the text we want to validate, if not the full extension.
error_code (str) – The code to override the error as. Again mostly for def/def-expand tags.
index_offset (int) – Offset into the extension validate_text starts at.
- Returns:
Validation issues. Each issue is a dictionary.
- Return type:
list[dict]
CharRexValidator
- class hed.validator.util.char_util.CharRexValidator(modern_allowed_char_rules=False)[source]¶
Bases: CharValidator
Class responsible for basic character level validation of a string or tag.
- __init__(modern_allowed_char_rules=False)[source]¶
Does basic character validation for HED strings/tags
- Parameters:
modern_allowed_char_rules (bool) – If True, use 8.3 style rules for Unicode characters.
- get_problem_chars(in_str, cname)[source]¶
Return a list of (index, char) pairs for characters in in_str not allowed by the value class cname.
- is_valid_value(in_string, cname)[source]¶
Check whether in_string is a valid whole-word value for class cname.
- Parameters:
- Returns:
- True if no word-level regex is defined for cname (class imposes no constraint).
- A re.Match object if in_string matches the word-level regex (valid value).
- False if in_string does not match the word-level regex (invalid value).
- Return type:
True | re.Match | False
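The three-way return convention can be mimicked with a small stand-in to show how a caller might consume it. Everything here is hypothetical: `WORD_PATTERNS`, its single regex, and `is_valid_value_sketch` are invented for illustration and do not reflect the patterns the library actually uses.

```python
import re

# Hypothetical word-level patterns keyed by value class name.
WORD_PATTERNS = {"numericClass": re.compile(r"-?\d+(\.\d+)?")}


def is_valid_value_sketch(in_string, cname):
    """Return True (no constraint), a re.Match (valid), or False (invalid)."""
    pattern = WORD_PATTERNS.get(cname)
    if pattern is None:
        return True
    return pattern.fullmatch(in_string) or False


unconstrained = is_valid_value_sketch("anything", "textClass")
valid = is_valid_value_sketch("3.5", "numericClass")
invalid = is_valid_value_sketch("abc", "numericClass")
```

Since both True and a match object are truthy, a caller that only needs a yes/no answer can test the result directly; `result is True` distinguishes the "no constraint" case when that matters.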
UnitValueValidator
- class hed.validator.util.class_util.UnitValueValidator(modern_allowed_char_rules=False, value_validators=None)[source]¶
Bases: object
Validates units.
- DATE_TIME_VALUE_CLASS = 'dateTimeClass'¶
- NUMERIC_VALUE_CLASS = 'numericClass'¶
- TEXT_VALUE_CLASS = 'textClass'¶
- NAME_VALUE_CLASS = 'nameClass'¶
- DIGIT_OR_POUND_EXPRESSION = '^(-?[\\d.]+(?:e-?\\d+)?|#)$'¶
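The expression above can be tried directly with the `re` module to see which strings it accepts. The helper `is_digit_or_pound` is a hypothetical name, and using `re.search` is an assumption about how the expression is applied.

```python
import re

# Same expression as the class attribute, written as a raw string.
DIGIT_OR_POUND_EXPRESSION = r"^(-?[\d.]+(?:e-?\d+)?|#)$"


def is_digit_or_pound(text):
    """Return True if the text is a number-like string or the '#' placeholder."""
    return re.search(DIGIT_OR_POUND_EXPRESSION, text) is not None


accepted = [text for text in ["3.5", "-2e-3", "#", "1e+5", "abc"] if is_digit_or_pound(text)]
```

Two details follow from the expression itself: an exponent may carry a minus sign but not a plus sign, and the character class `[\d.]+` admits more than one dot, so a string like `1.2.3` also matches.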
- __init__(modern_allowed_char_rules=False, value_validators=None)[source]¶
Validates the unit and value classes on a given tag.
- Parameters:
value_validators (dict or None) – Override or add value class validators
- check_tag_unit_class_units_are_valid(original_tag, validate_text, report_as=None, error_code=None, allow_placeholders=True) list[dict][source]¶
Report incorrect unit class or units.
- Parameters:
original_tag (HedTag) – The original tag that is used to report the error.
validate_text (str) – The text to validate.
report_as (HedTag) – Report errors as coming from this tag, rather than original_tag.
error_code (str) – Override error codes.
allow_placeholders (bool) – Whether placeholders are allowed (affects value class validation for “#”)
- Returns:
Validation issues. Each issue is a dictionary.
- Return type:
list[dict]
- check_tag_value_class_valid(original_tag, validate_text, report_as=None) list[dict][source]¶
Report an invalid value portion.
- static report_value_errors(error_dict, class_valid, report_as)[source]¶
Build validation issues from per-class character error and validity dicts.
- Parameters:
error_dict (dict) – Mapping of class name to list of (char, index) problem tuples.
class_valid (dict) – Mapping of class name to a validity result (True, re.Match, or False) indicating whether the full value passed word-level format validation for that class.
report_as (HedTag) – The tag object used as context in error reporting.
- Returns:
Validation issue dictionaries.
- Return type:
- static report_value_char_errors(class_name, errors, report_as)[source]¶
Build validation issues for specific invalid characters within a value class string.
DuplicateChecker
- class hed.validator.util.dup_util.DuplicateChecker[source]¶
Bases: object
Detects duplicate tags and groups within a HED annotation.
- __init__()[source]¶
Checker for duplications in HED groups.
Notes
This checker has an early out strategy – it returns when it finds an error.
- check_for_duplicates(group) list[dict][source]¶
Find duplicates in a HED group and return the errors found.
- get_hash(group) int | None[source]¶
Return the unique hash for the group, provided it contains no duplicates.
- Parameters:
group (HedGroup) – The HED group to be checked.
- Returns:
Unique hash or None if duplicates were detected within the group.
- Return type:
Union[int, None]
Note: As a side effect, this method will clear the issues list if no duplicates are found.
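One way to picture the contract of get_hash (a hash when the group is duplicate-free, None otherwise) is the sketch below. It is not the library's algorithm: nested Python lists stand in for HedGroup objects, `group_hash` is a hypothetical name, and comparing tags case-insensitively is an assumption.

```python
def group_hash(group):
    """Return an order-insensitive hash of a nested group, or None on duplicates."""
    seen = set()
    for item in group:
        if isinstance(item, list):
            item_hash = group_hash(item)  # recurse into the subgroup
            if item_hash is None:
                return None  # early out: a duplicate was found deeper down
        else:
            item_hash = hash(item.casefold())
        if item_hash in seen:
            return None  # early out: duplicate tag or duplicate subgroup
        seen.add(item_hash)
    # A frozenset ignores order, so reordered groups hash identically.
    return hash(frozenset(seen))


same_content = group_hash(["Red", "Blue"]) == group_hash(["Blue", "Red"])
duplicate_tag = group_hash(["Red", "red"])
duplicate_subgroup = group_hash(["Red", ["Blue", "Green"], ["Green", "Blue"]])
```

Hashing each subgroup as an unordered set lets two subgroups with the same tags in a different order be recognized as duplicates without comparing them element by element.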
GroupValidator
- class hed.validator.util.group_util.GroupValidator(hed_schema)[source]¶
Bases: object
Validation for attributes across groups of HED tags.
This covers things like Required, Unique, top-level tags, etc.
- __init__(hed_schema)[source]¶
Constructor for GroupValidator
- Parameters:
hed_schema (HedSchema) – A HedSchema object.
- run_tag_level_validators(hed_string_obj) list[dict][source]¶
Report invalid groups at each level.
- Parameters:
hed_string_obj (HedString) – A HedString object.
- Returns:
Issues associated with each level in the HED string. Each issue is a dictionary.
- Return type:
list[dict]
Notes
This pertains to the top-level, all groups, and nested groups.
- run_all_tags_validators(hed_string_obj) list[dict][source]¶
Report invalid multi-tag properties in a HED string, e.g. required tags.
- static check_tag_level_issue(original_tag_list, is_top_level, is_group) list[source]¶
Report tags incorrectly positioned in hierarchy.
StringValidator
- class hed.validator.util.string_util.StringValidator[source]¶
Bases: object
Runs checks on the raw string that depend on multiple characters, e.g. mismatched parentheses.
- OPENING_GROUP_CHARACTER = '('¶
- CLOSING_GROUP_CHARACTER = ')'¶
- COMMA = ','¶
- run_string_validator(hed_string_obj)[source]¶
Run all string-level structural checks on a HED string object.
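A multi-character structural check of the kind this class performs, such as detecting mismatched parentheses, might look like the sketch below. The function `parentheses_balanced` is hypothetical and works on a plain string; it only illustrates the idea and is not the library's check.

```python
OPENING_GROUP_CHARACTER = "("
CLOSING_GROUP_CHARACTER = ")"


def parentheses_balanced(hed_string):
    """Return True if every closing parenthesis has an earlier matching opener."""
    depth = 0
    for char in hed_string:
        if char == OPENING_GROUP_CHARACTER:
            depth += 1
        elif char == CLOSING_GROUP_CHARACTER:
            depth -= 1
            if depth < 0:
                return False  # a group was closed before it was opened
    return depth == 0  # anything left open is also a mismatch


balanced = parentheses_balanced("Red, (Blue, (Green))")
unbalanced = parentheses_balanced("Red, (Blue, Green")
```

Tracking depth rather than comparing total counts also catches strings such as `")("`, where the counts agree but the order is wrong.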
TagValidator
- class hed.validator.util.tag_util.TagValidator[source]¶
Bases: object
Validation for individual HED tags.
- CAMEL_CASE_EXPRESSION = '([A-Z]+\\s*[a-z-]*)+'¶
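The expression above can be probed on its own with the `re` module. The helper `looks_camel_case` is a hypothetical name, and requiring a full match of the whole name is an assumption made for this sketch; the library may apply the expression differently.

```python
import re

# Same expression as the class attribute, written as a raw string.
CAMEL_CASE_EXPRESSION = r"([A-Z]+\s*[a-z-]*)+"


def looks_camel_case(name):
    """Return True if the whole name is made of capital-led segments."""
    return re.fullmatch(CAMEL_CASE_EXPRESSION, name) is not None


names = ["Red", "RedBlue", "Sensory-event", "red", "Red Blue"]
capitalized = [name for name in names if looks_camel_case(name)]
```

Each repetition of the group must begin with at least one uppercase letter, which is why a name starting in lowercase, or one with a space before a capital, does not fully match.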
- run_individual_tag_validators(original_tag, allow_placeholders=False, is_definition=False) list[dict][source]¶
Runs the validators on the individual tags.
This ignores most illegal characters except in extensions.
- Parameters:
- Returns:
The validation issues associated with the tags. Each issue is a dictionary.
- Return type:
list[dict]
- static check_tag_exists_in_schema(original_tag) list[dict][source]¶
Report a tag that is invalid or that doesn’t take a value.
- static check_tag_requires_child(original_tag) list[dict][source]¶
Report if tag is a leaf with ‘requireChild’ attribute.
- check_capitalization(original_tag) list[dict][source]¶
Report warning if incorrect tag capitalization.
- check_tag_is_deprecated(original_tag) list[dict][source]¶
Return a validation issue if the tag carries the DeprecatedFrom attribute.