SHSpec

class hrtfpykit.datasets.SHSpec(sh_order, ears='both', index_by=('subject',), ear_one_hot=False, ear_index=False, frequency_one_hot=False, frequency_index=False, epsilon=1e-06, transform=None, name=None)

Define spherical harmonic HRTF coefficients returned by a sample.

SHSpec defines a spherical harmonic HRTF representation derived from each selected subject HRTF. During indexing, the dataset loads the subject HRTF, applies the optional dataset-level HRTF transform, then applies this spec's transform when one is provided, and finally runs the spherical harmonic transform on the resulting HRTF state.

The transform callable is used when SH coefficients should be calculated from a modified HRTF version. It receives the loaded HRTF object before the spherical harmonic transform and must return the HRTF object that should be used. It does not receive the calculated coefficient array.
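One way to build such a callable is to chain several HRTF-to-HRTF steps into a single function. The sketch below is illustrative only: compose_hrtf_transforms is a hypothetical helper, not part of hrtfpykit, and it treats the HRTF object as opaque. It also guards against a step that forgets to return the HRTF object.

```python
def compose_hrtf_transforms(*steps):
    """Chain HRTF-to-HRTF callables into one callable.

    Hypothetical helper for illustration; not part of hrtfpykit.
    Each step receives the HRTF object and must return the HRTF
    object that the next step (or the SH transform) should use.
    """

    def transform(hrtf):
        for step in steps:
            result = step(hrtf)
            if result is None:
                # A step that modifies in place but returns nothing would
                # leave the SH transform without an HRTF object to work on.
                raise ValueError(
                    f"{step!r} returned None; it must return the HRTF object"
                )
            hrtf = result
        return hrtf

    return transform
```

The resulting callable could then be passed as transform=... when constructing the spec. With no steps, it returns the HRTF object unchanged.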

If the spec is passed to inputs, its value appears under dataset[0]["inputs"][name]. If it is passed to target, its value appears under dataset[0]["target"][name]. When name is None, the default key is "sh". The returned value is a numpy.ndarray.

The first output axis is always the coefficient axis and has (sh_order + 1) ** 2 values. With ears="both", the output keeps an ear axis between the coefficient and frequency axes. With ears="left" or ears="right", the ear axis is squeezed unless "ear" is included in index_by. Including "frequency" in index_by selects one frequency bin per row.
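The shape rules above can be sketched as a small calculation. This is a hedged illustration, not library code: expected_sh_shape is a hypothetical helper, it assumes the default index_by=("subject",), it assumes the ear axis has length 2 when ears="both", and it does not cover a sequence of ear names.

```python
def expected_sh_shape(sh_order, n_frequency_bins, ears="both"):
    """Return the expected coefficient array shape for one subject row.

    Hypothetical helper; assumes index_by=("subject",), so neither
    "ear" nor "frequency" is used as a row axis.
    """
    n_coefficients = (sh_order + 1) ** 2
    if ears == "both":
        # Assumed layout: (coefficients, ears, frequency bins).
        return (n_coefficients, 2, n_frequency_bins)
    if ears in ("left", "right"):
        # Single ear: the ear axis is squeezed.
        return (n_coefficients, n_frequency_bins)
    raise ValueError(f"unsupported ears value: {ears!r}")


print(expected_sh_shape(9, 129, ears="left"))  # (100, 129)
```

For sh_order=9 and 129 frequency bins, the single-ear result agrees with the shape printed in the Examples section.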

Parameters:
  • sh_order (int) – Spherical harmonic order used for the decomposition.

  • ears ({"both", "left", "right"} or sequence of str, default="both") – Ear selection when the dataset is indexed by ear.

  • index_by (str or tuple of str, default=("subject",)) – Dataset row axes. SH specs support "ear" and "frequency" indexing.

  • ear_one_hot (bool, default=False) – Whether the ear context is exposed in sample inputs as a one-hot encoding.

  • ear_index (bool, default=False) – Whether the ear context is exposed in sample inputs as an index.

  • frequency_one_hot (bool, default=False) – Whether the frequency context is exposed in sample inputs as a one-hot encoding.

  • frequency_index (bool, default=False) – Whether the frequency context is exposed in sample inputs as an index.

  • epsilon (float, default=1e-6) – Numerical regularization used by the SH transform.

  • transform (callable or None, default=None) – Optional HRTF transform applied before spherical harmonic decomposition. This transform receives the loaded HRTF object, not the calculated coefficient array.

  • name (str or None, default=None) – Optional public key used in sample dictionaries.

Returns:

Specification object consumed by dataset construction.

Return type:

SHSpec

Examples

>>> import numpy as np
>>> from hrtfpykit.datasets import HUTUBS, SHSpec
>>> dataset = HUTUBS(
...     root="datasets/hutubs",
...     inputs=SHSpec(
...         sh_order=9,
...         ears="left",
...         index_by=("subject",),
...         name="sh",
...     ),
... )
>>> sh_value = dataset[0]["inputs"]["sh"]
>>> print(type(sh_value).__name__)
ndarray
>>> print(sh_value.shape)
(100, 129)
>>> print(np.round(sh_value[:2, :3], 3))
[[...]]