Introduction to the HED test suite

What is HED?

HED (Hierarchical Event Descriptors) is a framework for systematically describing events and experimental metadata in machine-actionable form. HED provides:

  • Controlled vocabulary for annotating experimental data and events

  • Standardized infrastructure enabling automated analysis and interpretation

  • Integration with major neuroimaging standards (BIDS and NWB)

For more information, visit the HED project homepage and the resources page.

What is the HED test suite?

The HED test suite (hed-tests repository) is the official collection of JSON test cases for validating HED validator implementations. It provides:

  • Comprehensive test coverage: 136 test cases covering 33 error codes

  • Multiple test types: String, sidecar, event, and combo tests

  • AI-friendly metadata: Explanations, common causes, and correction strategies

  • Cross-platform consistency: Single source of truth for all validators

  • Machine-readable specification: Tests document expected validation behavior

Purpose

The test suite serves three primary purposes:

  1. Validator validation: Ensure Python, JavaScript, and future implementations produce consistent results

  2. Specification documentation: Provide executable examples of HED validation rules

  3. AI training: Enable AI systems to understand HED validation through structured examples

Getting started

Clone the repository

Get the test suite from GitHub:

git clone https://github.com/hed-standard/hed-tests.git
cd hed-tests
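If you are embedding the suite inside another project (for example, a validator repository), it can also be vendored as a git submodule. A typical invocation, with the destination directory name chosen here for illustration:

git submodule add https://github.com/hed-standard/hed-tests.git hed-tests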

Repository structure

hed-tests/
├── json_test_data/                     # All test data
│   ├── validation_tests/               # 25 validation error test files
│   ├── schema_tests/                   # 17 schema error test files
│   ├── validation_tests.json           # Consolidated validation tests
│   ├── validation_code_dict.json       # Maps error codes to test names
│   ├── validation_testname_dict.json   # Maps test names to error codes
│   ├── schema_tests.json               # Consolidated schema tests
│   ├── schema_code_dict.json           # Maps error codes to test names
│   └── schema_testname_dict.json       # Maps test names to error codes
├── src/
│   ├── scripts/                        # Utility scripts
│   └── schemas/                        # JSON schema for test validation
├── docs/                               # Documentation (this site)
└── tests/                              # Test utilities

Test files are organized by error code in the json_test_data directory. Tests relevant to validation of HED annotations are in the validation_tests subdirectory, while tests relevant only to HED schema development are in the schema_tests subdirectory.
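For example, a short Python snippet (a minimal sketch based on the directory layout shown above) can enumerate the per-code test files in each subdirectory:

from pathlib import Path

# Each JSON file in these directories is named after the error code it tests,
# e.g. TAG_INVALID.json (see the repository tree above).
root = Path("json_test_data")
for subdir in ("validation_tests", "schema_tests"):
    files = sorted((root / subdir).glob("*.json"))
    print(f"{subdir}: {len(files)} test files")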

Test structure

Tests for a specific error code are collected in a single file named after the most likely HED error code, and each file must conform to the JSON schema in src/schemas/test_schema.json.

A validator might give a different error code

Because the exact error code that a validator assigns to an error depends heavily on the order in which it checks for different types of errors, a given test may legitimately produce a code other than the primary one listed.

Each test therefore has an alt_codes key that lists acceptable alternative error codes.
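A consumer of the tests should treat any of the alternative codes as acceptable. A minimal sketch of that acceptance check (the error_code and alt_codes fields are documented in the test format below):

def code_is_acceptable(reported_code, test_case):
    """Return True if a reported code matches the test's primary error_code
    or any of its alt_codes."""
    accepted = {test_case["error_code"], *test_case.get("alt_codes", [])}
    return reported_code in accepted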

Validating the tests

Ensure test files conform to the JSON schema:

Validate a single test file

python src/scripts/validate_test_structure.py json_test_data/validation_tests/TAG_INVALID.json

Validate all tests

python src/scripts/validate_test_structure.py json_test_data/validation_tests

python src/scripts/validate_test_structure.py json_test_data/schema_tests
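Conceptually, this structural check amounts to validating each test file against src/schemas/test_schema.json. The following is a sketch of that idea using the jsonschema package, not a description of the script itself:

import json
from pathlib import Path

from jsonschema import validate, ValidationError  # pip install jsonschema

schema = json.loads(Path("src/schemas/test_schema.json").read_text())

for test_file in sorted(Path("json_test_data/validation_tests").glob("*.json")):
    try:
        validate(instance=json.loads(test_file.read_text()), schema=schema)
        print(f"PASS {test_file.name}")
    except ValidationError as err:
        print(f"FAIL {test_file.name}: {err.message}")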


Consolidate tests

Generate consolidated test files and lookup dictionaries:

python src/scripts/consolidate_tests.py

# Creates:
#   - validation_tests.json (all validation tests)
#   - validation_code_dict.json (error codes to test names)
#   - validation_testname_dict.json (test names to error codes)
#   - schema_tests.json (all schema tests)
#   - schema_code_dict.json (error codes to test names)
#   - schema_testname_dict.json (test names to error codes)

The consolidation process creates both combined test files and lookup dictionaries for efficient test discovery.
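A rough sketch of what consolidation produces (the actual script may differ in details): all per-code files are merged into one list, and the two dictionaries are built from the error_code, alt_codes, and name fields.

import json
from pathlib import Path

all_tests, code_dict, name_dict = [], {}, {}

# Merge every per-code file and build the two lookup dictionaries.
for test_file in sorted(Path("json_test_data/validation_tests").glob("*.json")):
    for case in json.loads(test_file.read_text()):
        all_tests.append(case)
        code_dict.setdefault(case["error_code"], []).append(case["name"])
        # The test-name dictionary lists the primary code plus any alternates.
        name_dict[case["name"]] = [case["error_code"], *case.get("alt_codes", [])]

Path("json_test_data/validation_tests.json").write_text(json.dumps(all_tests, indent=4))
Path("json_test_data/validation_code_dict.json").write_text(json.dumps(code_dict, indent=4))
Path("json_test_data/validation_testname_dict.json").write_text(json.dumps(name_dict, indent=4))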

Check test coverage

Analyze test coverage statistics:

python src/scripts/check_coverage.py

# Output:
# HED Test Suite Coverage Report
# =====================================
# Total test files: 42
# Total test cases: 136
# Error codes covered: 33
# ...
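The same figures can be derived directly from the consolidated files; a minimal sketch, assuming consolidate_tests.py has already been run:

import json
from pathlib import Path

data_dir = Path("json_test_data")
cases = (json.loads((data_dir / "validation_tests.json").read_text())
         + json.loads((data_dir / "schema_tests.json").read_text()))

covered_codes = {case["error_code"] for case in cases}
print(f"Total test cases:    {len(cases)}")
print(f"Error codes covered: {len(covered_codes)}")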

Generate test index

Create a searchable test index:

python src/scripts/generate_test_index.py

# Creates: docs/test_index.md
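A simplified sketch of how such an index could be generated from the code dictionary (the real script's output format may differ):

import json
from pathlib import Path

code_dict = json.loads(Path("json_test_data/validation_code_dict.json").read_text())

lines = ["# Test index", ""]
for code in sorted(code_dict):
    lines.append(f"## {code}")
    lines.extend(f"- {name}" for name in code_dict[code])
    lines.append("")

Path("docs/test_index.md").write_text("\n".join(lines))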

Test file format

Each test file contains an array of test case objects in structured JSON format. Below is a complete example showing all available fields:

[
    {
        "error_code": "TAG_INVALID",
        "alt_codes": ["PLACEHOLDER_INVALID"],
        "name": "tag-invalid-in-schema",
        "description": "The tag is not valid in the schema it is associated with.",
        "warning": false,
        "schema": "8.4.0",
        "error_category": "semantic",
        "common_causes": [
            "Misspelling tag names",
            "Using tags that don't exist in the specified schema version",
            "Creating extensions without proper parent tags"
        ],
        "explanation": "HED tags must exist in the specified schema or be valid extensions of existing tags.",
        "correction_strategy": "Use valid schema tags or create proper extensions",
        "correction_examples": [
            {
                "wrong": "ReallyInvalid/Extension",
                "correct": "Item/Object/Man-made-object/Device",
                "explanation": "Replaced non-existent tag with valid schema tag"
            }
        ],
        "definitions": [
            "(Definition/Acc/#, (Acceleration/# m-per-s^2, Red))"
        ],
        "tests": {
            "string_tests": {
                "fails": ["ReallyInvalid", "Label #"],
                "passes": ["Brown-color/Brown"]
            },
            "sidecar_tests": {
                "fails": [{
                    "event_code": {
                        "HED": {
                            "face": "ReallyInvalid"
                        }
                    }
                }],
                "passes": [{
                    "event_code": {
                        "HED": {
                            "face": "Brown-color/Brown"
                        }
                    }
                }]
            },
            "event_tests": {
                "fails": [[
                    ["onset", "duration", "HED"],
                    [4.5, 0, "Label #"]
                ]],
                "passes": [[
                    ["onset", "duration", "HED"],
                    [4.5, 0, "Brown-color/Brown"]
                ]]
            },
            "combo_tests": {
                "fails": [{
                    "sidecar": {
                        "event_code": {
                            "HED": {"face": "ReallyInvalid"}
                        }
                    },
                    "events": [
                        ["onset", "duration", "event_code", "HED"],
                        [4.5, 0, "face", "Red"]
                    ]
                }],
                "passes": [{
                    "sidecar": {
                        "event_code": {
                            "HED": {"face": "Acceleration/5 m-per-s^2"}
                        }
                    },
                    "events": [
                        ["onset", "duration", "event_code", "HED"],
                        [4.5, 0, "face", "Blue"]
                    ]
                }]
            }
        }
    }
]

Field descriptions

Core identification fields:

  • error_code: The primary error code being tested (required)

  • alt_codes: Alternative error codes that may apply (optional)

  • name: Unique identifier for this test case (required)

  • description: Human-readable description of what is being tested (required)

  • warning: Whether this is a warning (true) or error (false) (required)

  • schema: HED schema version(s) used - string or array (required)

AI-friendly metadata fields (for machine learning and automated correction):

  • error_category: Classification like “semantic”, “syntax”, “temporal_logic” (optional)

  • common_causes: Array of common reasons for this error (optional)

  • explanation: Detailed explanation of the error for AI systems (optional)

  • correction_strategy: High-level approach to fixing the error (optional)

  • correction_examples: Array of wrong/correct/explanation objects (optional)

Context fields:

  • definitions: Array of HED definition strings required for test validation (optional)

Test data (the tests object contains four test types):

  • string_tests: Raw HED strings to validate

    • fails: Array of strings that should produce the error

    • passes: Array of strings that should validate successfully

  • sidecar_tests: BIDS JSON sidecar objects

    • fails: Array of sidecar objects that should produce the error

    • passes: Array of sidecar objects that should validate successfully

  • event_tests: Tabular event data with HED columns (no sidecar)

    • fails: Array of event arrays (first row is headers, subsequent rows are data)

    • passes: Array of event arrays that should validate successfully

  • combo_tests: Combined sidecar+events (realistic BIDS scenarios)

    • fails: Array of sidecar+events combinations that should fail validation

    • passes: Array of sidecar+events combinations that should validate successfully

See Test Format Specification for complete documentation and additional optional fields.
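As an illustration of how this structure is navigated, the sketch below loads a test file and summarizes the pass/fail cases for each of the four test types (the file path matches the TAG_INVALID example above):

import json
from pathlib import Path

cases = json.loads(Path("json_test_data/validation_tests/TAG_INVALID.json").read_text())

for case in cases:
    print(f"{case['name']} (expects {case['error_code']})")
    for test_type, groups in case["tests"].items():
        print(f"  {test_type}: {len(groups.get('fails', []))} failing,"
              f" {len(groups.get('passes', []))} passing")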

What this tests

Using the example above, here’s what each test type validates:

string_tests: Direct HED string validation

  • fails: ["ReallyInvalid"] - This raw HED string should trigger TAG_INVALID error

  • passes: ["Brown-color/Brown"] - This raw HED string should validate successfully

sidecar_tests: BIDS JSON sidecar validation (metadata files)

  • Tests that sidecar HED annotations properly flag invalid tags

  • Validators should detect errors in the sidecar structure before events are processed

event_tests: Tabular event data validation (without sidecar context)

  • First array in each test is the column headers

  • Subsequent arrays are data rows with onset, duration, and HED values

  • Tests standalone event file validation

combo_tests: Combined sidecar + events validation (realistic BIDS scenarios)

  • Most realistic test case - mirrors actual BIDS dataset structure

  • Sidecar provides HED annotations for categorical columns

  • Events reference sidecar entries plus inline HED

  • Validators must properly merge sidecar and event-level HED

AI metadata usage (a brief usage sketch follows this list):

  • common_causes helps AI systems understand why users make this error

  • explanation provides context for automated correction suggestions

  • correction_examples show concrete before/after examples for learning
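As a sketch of how that metadata might be consumed, the snippet below turns a test case's explanation, correction_strategy, and correction_examples fields (documented above) into a human-readable correction hint:

def correction_hint(test_case):
    """Build a short correction hint from a test case's AI metadata fields."""
    lines = [test_case.get("explanation", ""),
             "Strategy: " + test_case.get("correction_strategy", "")]
    for example in test_case.get("correction_examples", []):
        lines.append(f"  {example['wrong']} -> {example['correct']}"
                     f" ({example['explanation']})")
    return "\n".join(lines)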

Lookup dictionaries

In addition to the test files, consolidated lookup dictionaries are provided for efficient test discovery and organization:

Code dictionaries

Map error codes to lists of test case names that validate that code:

validation_code_dict.json - Validation test lookup:

{
    "TAG_INVALID": [
        "tag-invalid-in-schema",
        "tag-invalid-extension",
        "placeholder-invalid-context"
    ],
    "UNITS_INVALID": [
        "units-invalid-missing",
        "units-invalid-incorrect"
    ]
}

schema_code_dict.json - Schema test lookup (similar structure)

Test name dictionaries

Map test case names to all error codes they validate (including alternates):

validation_testname_dict.json - Test case lookup:

{
    "tag-invalid-in-schema": [
        "TAG_INVALID",
        "PLACEHOLDER_INVALID"
    ],
    "character-invalid-non-printing": [
        "CHARACTER_INVALID",
        "TAG_INVALID",
        "VALUE_INVALID"
    ]
}

schema_testname_dict.json - Schema test case lookup (similar structure)

Using the dictionaries

The dictionaries enable efficient test queries:

import json

# Find all tests for a specific error code
with open('json_test_data/validation_code_dict.json') as f:
    code_dict = json.load(f)
    
tests_for_tag_invalid = code_dict['TAG_INVALID']
print(f"TAG_INVALID is tested by: {tests_for_tag_invalid}")

# Find all error codes covered by a test
with open('json_test_data/validation_testname_dict.json') as f:
    name_dict = json.load(f)
    
codes = name_dict['tag-invalid-in-schema']
print(f"Test covers error codes: {codes}")

The dictionaries are automatically generated by src/scripts/consolidate_tests.py along with the consolidated test files.

For validator developers

If you’re building a HED validator:

  1. Clone this repository or add as a submodule

  2. Parse test JSON files from json_test_data/

  3. Execute tests against your validation implementation (see the harness sketch below)

  4. Report discrepancies as issues

See Validator integration guide for detailed integration instructions.
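A minimal harness for step 3 might look like the following sketch, where validate_hed_string is a hypothetical stand-in for your implementation's string-validation entry point. Only string_tests are shown; sidecar, event, and combo tests follow the same pattern.

import json
from pathlib import Path

def validate_hed_string(hed_string, schema_version, definitions):
    """Stand-in for your validator's string-validation entry point.
    It should return a list of error codes (empty if the string is valid)."""
    raise NotImplementedError

def run_string_tests(test_file):
    for case in json.loads(Path(test_file).read_text()):
        accepted = {case["error_code"], *case.get("alt_codes", [])}
        string_tests = case["tests"].get("string_tests", {})
        for hed in string_tests.get("fails", []):
            codes = validate_hed_string(hed, case["schema"], case.get("definitions", []))
            assert accepted & set(codes), f"{case['name']}: expected one of {accepted}, got {codes}"
        for hed in string_tests.get("passes", []):
            codes = validate_hed_string(hed, case["schema"], case.get("definitions", []))
            assert not codes, f"{case['name']}: unexpected errors {codes}"

run_string_tests("json_test_data/validation_tests/TAG_INVALID.json")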

For test contributors

Want to add new tests or improve existing ones?

  1. Follow the format: Use the JSON schema in src/schemas/test_schema.json

  2. Include AI metadata: Add explanations and correction examples

  3. Validate your changes: Run validate_test_structure.py

  4. Submit a PR: See CONTRIBUTING.md

Test statistics

Current test suite coverage:

  • 42 test files: 25 validation tests + 17 schema tests

  • 136 test cases: Comprehensive error code coverage

  • 33 error codes: All major validation errors

  • 100% AI metadata: Every test includes explanations and corrections

See Test coverage report for detailed statistics.

Error code categories

Tests are organized into categories:

Syntax errors

  • CHARACTER_INVALID - Invalid characters in tags

  • COMMA_MISSING - Missing required commas

  • PARENTHESES_MISMATCH - Unmatched parentheses

  • TAG_EMPTY - Empty tag elements

Semantic errors

  • TAG_INVALID - Tags not in schema

  • TAG_EXTENDED - Invalid tag extensions

  • VALUE_INVALID - Invalid tag values

  • UNITS_INVALID - Invalid or missing units

Definition errors

  • DEFINITION_INVALID - Malformed definitions

  • DEF_INVALID - Invalid definition usage

  • DEF_EXPAND_INVALID - Definition expansion errors

Sidecar errors

  • SIDECAR_INVALID - Invalid sidecar structure

  • SIDECAR_BRACES_INVALID - Curly brace errors

  • SIDECAR_KEY_MISSING - Missing required keys

Schema errors

  • SCHEMA_ATTRIBUTE_INVALID - Invalid schema attributes

  • SCHEMA_DUPLICATE_NODE - Duplicate schema nodes

  • SCHEMA_HEADER_INVALID - Invalid schema headers

Temporal errors

  • TEMPORAL_TAG_ERROR - Temporal tag issues

  • TEMPORAL_TAG_ERROR_DELAY - Delay tag errors

See Test Index for complete error code listing.

Getting help

Documentation resources

Support

HED resources

Next steps

  • Contribute: Read CONTRIBUTING.md to add new tests

  • View coverage: Check test_coverage.md for statistics