HRTFSpec

class hrtfpykit.datasets.HRTFSpec(domain='time', signal='ir', positions='all', plane=None, frequencies=None, frequency_bands=None, ears='both', index_by=('subject',), position_one_hot=False, position_index=False, ear_one_hot=False, ear_index=False, frequency_one_hot=False, frequency_index=False, sample_one_hot=False, sample_index=False, transform=None, name=None)

Define one HRTF or HRIR array returned by a dataset sample.

HRTFSpec is the primary acoustic value spec for dataset construction. It tells a dataset which HRTF representation should be extracted, which source positions and ears should be kept, which axes should create dataset rows, and which row encodings should be added to sample inputs. The spec is a declarative configuration object; it does not scan resources or load files by itself.

During dataset construction, the spec marks the HRTF resource family as required and contributes to the row layout. During indexing, the dataset loads the subject's HRTF object, applies the optional dataset-level HRTF transform, then applies this spec's transform when one is provided, and finally extracts the requested IR or TF value.
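The indexing order described above can be sketched in plain Python. This is only an illustration of the sequence, not the library's implementation: resolve_value and its four arguments are hypothetical names, and the "HRTF object" is a toy list that records which step touched it.

```python
from typing import Any, Callable, Optional

Transform = Optional[Callable[[Any], Any]]


def resolve_value(load, dataset_transform: Transform, spec_transform: Transform, extract):
    """Apply the documented order: load, dataset-level transform, spec transform, extract."""
    hrtf = load()
    if dataset_transform is not None:
        hrtf = dataset_transform(hrtf)
    if spec_transform is not None:
        hrtf = spec_transform(hrtf)
    return extract(hrtf)


# Toy stand-ins (assumptions): each step appends its own name to a list.
steps = resolve_value(
    load=lambda: ["load"],
    dataset_transform=lambda h: h + ["dataset_transform"],
    spec_transform=lambda h: h + ["spec_transform"],
    extract=lambda h: h + ["extract"],
)
print(steps)  # ['load', 'dataset_transform', 'spec_transform', 'extract']
```

Because extraction runs last, both transforms see a full HRTF object rather than the extracted array.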

The transform callable is used when this spec should read a modified HRTF version before extracting values. It receives the loaded HRTF object and must return the HRTF object that should be used by this spec. It does not receive the final NumPy array.
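A minimal sketch of a callable that satisfies this contract is shown here. StandInHRTF is a hypothetical stand-in, not the library's HRTF class, and its ir and sampling_rate fields are assumptions made only so the example runs on its own. The point is the shape of the contract: an HRTF-like object goes in and an HRTF-like object comes out.

```python
from dataclasses import dataclass, replace

import numpy as np


@dataclass(frozen=True)
class StandInHRTF:
    """Hypothetical stand-in for a loaded HRTF object (not the library class)."""

    ir: np.ndarray  # assumed layout: (positions, ears, samples)
    sampling_rate: float


def truncate_ir(hrtf: StandInHRTF, n_samples: int = 128) -> StandInHRTF:
    """Return a new HRTF-like object whose impulse responses are shortened."""
    # The input object is left untouched; a modified copy is returned.
    return replace(hrtf, ir=hrtf.ir[..., :n_samples])


original = StandInHRTF(ir=np.zeros((440, 2, 256)), sampling_rate=44100.0)
shortened = truncate_ir(original)
print(type(shortened).__name__, shortened.ir.shape)
```

A callable of this form would be passed as transform=truncate_ir. Whether the real HRTF object supports copying in this way is not stated here, so check the HRTF class documentation before adapting the sketch.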

If the spec is passed to inputs, its value appears under dataset[0]["inputs"][name]. If it is passed to target, its value appears under dataset[0]["target"][name]. When name is None, the default key is "hrtf". The returned value is a numpy.ndarray.

The output axes follow the selected HRTF state. Axes named in index_by are selected from the current dataset row and removed from the returned value. Axes not named in index_by remain in the output. The natural acoustic order is source positions, ears, and then samples for domain="time" or frequency bins for domain="frequency". Selecting one ear with ears="left" or ears="right" squeezes the ear axis unless ear is part of index_by. For frequency-domain specs, frequencies selects sparse nearest bins and frequency_bands selects inclusive native-grid intervals after HRTF transforms are applied. These selectors affect returned arrays and frequency-indexed rows without modifying the underlying HRTF object.
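The two frequency selectors can be approximated in NumPy to show how they differ. This is a sketch under assumptions, not the library's code: nearest_bins and band_bins are hypothetical helper names, the frequency grid is invented, and ties between two equally close bins fall to the lower bin here, which the library may handle differently.

```python
import numpy as np


def nearest_bins(grid, queries):
    """Map each query (Hz) to its nearest bin index, dropping repeats in query order."""
    grid = np.asarray(grid, dtype=float)
    queries = np.atleast_1d(np.asarray(queries, dtype=float))
    idx = np.abs(grid[None, :] - queries[:, None]).argmin(axis=1)
    seen, kept = set(), []
    for i in idx.tolist():
        if i not in seen:
            seen.add(i)
            kept.append(i)
    return np.array(kept, dtype=int)


def band_bins(grid, bands):
    """Return indices of bins inside any inclusive [low, high] band, in grid order."""
    grid = np.asarray(grid, dtype=float)
    bands = np.asarray(bands, dtype=float).reshape(-1, 2)
    mask = np.zeros(grid.shape, dtype=bool)
    for low, high in bands:
        mask |= (grid >= low) & (grid <= high)
    return np.flatnonzero(mask)


grid = np.arange(0.0, 1001.0, 250.0)  # invented grid: 0, 250, 500, 750, 1000 Hz

# 760 -> bin 3; 240 and 260 both resolve to bin 1, so the duplicate is dropped.
sparse = nearest_bins(grid, [760.0, 240.0, 260.0])
print(sparse)  # [3 1]

# Bands are given out of order, but the result follows the native grid order.
banded = band_bins(grid, [(700.0, 1000.0), (0.0, 250.0)])
print(banded)  # [0 1 3 4]
```

The sparse selector follows the order of the queries, while the band selector always follows the native grid order, regardless of the order in which the bands are listed.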

Parameters:

domain ({"time", "frequency"}, default="time") – Acoustic domain to return. "time" returns HRIR sample data; "frequency" returns HRTF frequency data.
signal ({"ir", "tf_complex", "tf_real", "tf_imag", "tf_magnitude", "tf_magnitude_db", "tf_phase"}, default="ir") – Signal component to extract from the loaded HRTF object. "ir" is valid only when domain="time". Frequency-domain specs must use one of the tf_* signal names.
positions ("all" or sequence of int, default="all") – Source position indices to include.
plane (str, tuple, dict, or None, default=None) – Optional horizontal, median, or frontal plane selector.
frequencies (float, sequence of float, numpy.ndarray, or None, default=None) – Optional explicit frequency queries in hertz for frequency-domain values. Each query resolves to the nearest available TF bin after HRTF transforms are applied. Duplicate resolved bins are removed while preserving query order. Mutually exclusive with frequency_bands.
frequency_bands ((float, float), sequence of (float, float), numpy.ndarray, or None, default=None) – Optional inclusive frequency bands in hertz for frequency-domain values. Every available TF bin inside any requested band is kept in native grid order after HRTF transforms are applied. Mutually exclusive with frequencies.
ears ({"both", "left", "right"} or sequence of str, default="both") – Ears to keep. Selecting a single ear squeezes the ear axis from the returned array unless "ear" is part of index_by.
index_by (str or tuple of str, default=("subject",)) – Dataset row axes for this spec. For domain="time", the supported canonical combinations are ("subject",), ("subject", "position"), ("subject", "ear"), ("subject", "samples"), ("subject", "position", "ear"), ("subject", "position", "samples"), ("subject", "ear", "samples"), and ("subject", "position", "ear", "samples"). For domain="frequency", replace "samples" with "frequency". "source", "sources", and "positions" are accepted as aliases and normalize to "position".
position_one_hot (bool, default=False) – Whether the row's source position is exposed as a one-hot encoding in the sample inputs.
position_index (bool, default=False) – Whether the row's source position is exposed as an index in the sample inputs.
ear_one_hot (bool, default=False) – Whether the row's ear is exposed as a one-hot encoding in the sample inputs.
ear_index (bool, default=False) – Whether the row's ear is exposed as an index in the sample inputs.
frequency_one_hot (bool, default=False) – Whether the row's frequency bin is exposed as a one-hot encoding in the sample inputs.
frequency_index (bool, default=False) – Whether the row's frequency bin is exposed as an index in the sample inputs.
sample_one_hot (bool, default=False) – Whether the row's time sample is exposed as a one-hot encoding in the sample inputs.
sample_index (bool, default=False) – Whether the row's time sample is exposed as an index in the sample inputs.
transform (callable or None, default=None) – Optional HRTF transform applied after the dataset-level dataset_hrtf_transform and before IR or TF values are extracted.
name (str or None, default=None) – Optional public key used in sample inputs or sample targets.
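
The effect of index_by on the returned shape can be illustrated with plain NumPy indexing. This is a sketch of the documented behavior rather than the library's internals: row_value, AXES, ALIASES, and the array dimensions are assumptions chosen for the example, and the "subject" axis is treated as already resolved because the array belongs to a single subject.

```python
import numpy as np

# Natural axis order for domain="time": positions, ears, samples.
AXES = ("position", "ear", "samples")
# Documented aliases that normalize to "position".
ALIASES = {"source": "position", "sources": "position", "positions": "position"}


def row_value(subject_array, index_by, row):
    """Pick the row's entry along each indexed axis; unindexed axes stay in the output."""
    names = [ALIASES.get(name, name) for name in index_by if name != "subject"]
    selector = tuple(row[axis] if axis in names else slice(None) for axis in AXES)
    return subject_array[selector]


hrir = np.zeros((440, 2, 256))  # hypothetical (positions, ears, samples) array for one subject
row = {"position": 10, "ear": 1, "samples": 0}  # hypothetical current dataset row

per_subject = row_value(hrir, ("subject",), row)
per_position = row_value(hrir, ("subject", "sources"), row)
per_position_ear = row_value(hrir, ("subject", "position", "ear"), row)

print(per_subject.shape)       # (440, 2, 256)
print(per_position.shape)      # (2, 256)
print(per_position_ear.shape)  # (256,)
```

Each axis named in index_by disappears from the returned array and instead multiplies the number of dataset rows.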

Returns:

Specification object consumed by dataset construction.

Return type:

HRTFSpec

Examples

>>> import numpy as np
>>> from hrtfpykit.datasets import HRTFSpec, HUTUBS
>>> dataset = HUTUBS(
...     root="datasets/hutubs",
...     inputs=HRTFSpec(
...         domain="frequency",
...         signal="tf_magnitude_db",
...         frequency_bands=[(0.0, 16000.0)],
...         ears="left",
...         index_by=("subject",),
...         name="hrtf",
...     ),
... )
>>> sample = dataset[0]
>>> hrtf = sample["inputs"]["hrtf"]
>>> print(type(hrtf).__name__)
ndarray
>>> print(hrtf.shape)
(440, 93)
>>> print(np.round(hrtf[:2, :3], 2))
[[...]]