ILDSpec

class hrtfpykit.datasets.ILDSpec(positions='all', plane=None, index_by=('subject',), position_one_hot=False, position_index=False, frequency_one_hot=False, frequency_index=False, mode='broad-band', output='db', fft_length=None, epsilon=1e-12, transform=None, name=None)

Define interaural level difference values returned by a sample.

ILDSpec defines an ILD feature derived from each selected subject HRTF. It does not store ILD arrays itself. During indexing, the dataset loads the subject HRTF, applies the optional dataset-level HRTF transform, applies this spec's transform when one is provided, computes ILD from the resulting HRIR data, and returns the selected value.

The transform callable is used when ILD should be calculated from a modified HRTF version. It receives the loaded HRTF object before ILD calculation and must return the HRTF object that should be used for the metric. It does not receive the calculated ILD array.
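The order of operations described above can be sketched with plain NumPy. Everything in this sketch is a stand-in: FakeHRTF, dataset_transform, spec_transform, and broadband_ild_db are hypothetical names that are not part of hrtfpykit, and the ILD formula is a common energy-ratio definition assumed here for illustration only.

```python
from dataclasses import dataclass, replace

import numpy as np


@dataclass(frozen=True)
class FakeHRTF:
    """Hypothetical stand-in for a loaded HRTF object (not the hrtfpykit class)."""

    hrir: np.ndarray  # assumed layout: (positions, ears, samples), ear 0 = left


def dataset_transform(hrtf):
    # Hypothetical dataset-level transform: keep the first 128 samples.
    return replace(hrtf, hrir=hrtf.hrir[..., :128])


def spec_transform(hrtf):
    # Hypothetical spec-level transform: halve the right-ear amplitude.
    gains = np.array([1.0, 0.5]).reshape(1, 2, 1)
    return replace(hrtf, hrir=hrtf.hrir * gains)


def broadband_ild_db(hrtf, epsilon=1e-12):
    # Assumed definition: left/right energy ratio in dB, one value per position.
    energy = np.sum(hrtf.hrir ** 2, axis=-1)
    return 10.0 * np.log10((energy[:, 0] + epsilon) / (energy[:, 1] + epsilon))


rng = np.random.default_rng(0)
loaded = FakeHRTF(hrir=rng.standard_normal((4, 2, 256)))

# Both transforms receive and return an HRTF-like object.
# Only the final step turns that object into ILD values.
ild_plain = broadband_ild_db(dataset_transform(loaded))
ild_spec = broadband_ild_db(spec_transform(dataset_transform(loaded)))

# Halving one ear's amplitude shifts the ILD by about 6.02 dB.
shift_db = ild_spec - ild_plain
```

The point of the sketch is that the spec-level callable never sees ild_plain or ild_spec. It only changes the HRTF-like object that the metric is later computed from.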

If the spec is passed to inputs, its value appears under dataset[0]["inputs"][name]. If it is passed to target, its value appears under dataset[0]["target"][name]. When name is None, the default key is "ild". The returned value is a numpy.ndarray.

In mode="broad-band", subject-only rows return one value per selected source position, while position-indexed rows return a single 0D array. In mode="frequency-dependent", a frequency axis is kept unless "frequency" is included in index_by.
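The shape rules above can be illustrated with a NumPy-only sketch. The HRIR layout, the ILD formulas, and the variable names below are assumptions made for illustration. They are not taken from the hrtfpykit implementation, whose exact metric may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n_positions, n_samples = 5, 64
epsilon = 1e-12

# Assumed HRIR layout for one subject: (positions, ears, samples), ear 0 = left.
hrir = rng.standard_normal((n_positions, 2, n_samples))

# Broad-band sketch: one level difference per source position.
energy = np.sum(hrir ** 2, axis=-1)  # (positions, ears)
ild_broadband = 10.0 * np.log10((energy[:, 0] + epsilon) / (energy[:, 1] + epsilon))

subject_row = ild_broadband                   # subject-only row
position_row = np.asarray(ild_broadband[2])   # subject + position row, 0-D

# Frequency-dependent sketch: a frequency axis is kept for each position.
magnitude = np.abs(np.fft.rfft(hrir, axis=-1))  # (positions, ears, bins)
ild_per_bin = 20.0 * np.log10((magnitude[:, 0] + epsilon) / (magnitude[:, 1] + epsilon))

freq_kept = ild_per_bin[2]                     # position indexed, frequency axis kept
freq_indexed = np.asarray(ild_per_bin[2, 7])   # assumed: frequency indexed too, 0-D

print(subject_row.shape, position_row.shape, freq_kept.shape, freq_indexed.shape)
```

With 64 samples, the real FFT yields 33 bins, so the frequency axis in this sketch has length 33.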

Parameters:
  • positions ('all' or sequence of int, default='all') – Source position indices selected before ILD calculation.

  • plane (str, tuple, dict, or None, default=None) – Optional plane selector used instead of explicit position indices.

  • index_by (str or tuple of str, default=('subject',)) – Dataset row axes. Frequency indexing requires mode='frequency-dependent'.

  • position_one_hot (bool, default=False) – Whether a one-hot encoding of the source position is exposed in sample inputs.

  • position_index (bool, default=False) – Whether the integer source position index is exposed in sample inputs.

  • frequency_one_hot (bool, default=False) – Whether a one-hot encoding of the frequency bin is exposed in sample inputs.

  • frequency_index (bool, default=False) – Whether the integer frequency index is exposed in sample inputs.

  • mode (str, default='broad-band') – ILD mode forwarded to the DSP metric, such as 'broad-band' or 'frequency-dependent'.

  • output (str, default='db') – Output scale or representation requested from the ILD metric.

  • fft_length (int or None, default=None) – FFT length used for frequency-dependent ILD calculation.

  • epsilon (float, default=1e-12) – Numerical floor used by level-ratio calculations.

  • transform (callable or None, default=None) – Optional HRTF transform applied before ILD calculation. This transform receives the loaded HRTF object, not the calculated ILD value.

  • name (str or None, default=None) – Optional public key used in sample dictionaries.
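The roles of epsilon and fft_length can be sketched with NumPy alone. The helper level_ratio_db below is hypothetical and assumes an additive floor on per-bin magnitudes. The library may apply its floor differently, so treat this as an illustration of why a floor is needed, not as the actual metric.

```python
import numpy as np

# Hypothetical HRIR pair for one position: the right ear is completely silent.
left_hrir = np.zeros(32)
left_hrir[0] = 1.0
right_hrir = np.zeros(32)


def level_ratio_db(left, right, epsilon, fft_length=None):
    # Assumed form: per-bin magnitude ratio in dB with an additive floor.
    # fft_length=None uses the signal length; larger values zero-pad.
    mag_left = np.abs(np.fft.rfft(left, n=fft_length))
    mag_right = np.abs(np.fft.rfft(right, n=fft_length))
    return 20.0 * np.log10((mag_left + epsilon) / (mag_right + epsilon))


# With a floor, a silent ear gives a large but finite level difference.
floored = level_ratio_db(left_hrir, right_hrir, epsilon=1e-12)

# A longer FFT changes the number of frequency bins to fft_length // 2 + 1.
padded = level_ratio_db(left_hrir, right_hrir, epsilon=1e-12, fft_length=128)

# Without a floor, the ratio divides by zero and the result is infinite.
with np.errstate(divide="ignore"):
    unfloored = level_ratio_db(left_hrir, right_hrir, epsilon=0.0)

print(floored.shape, padded.shape, np.isfinite(floored).all(), np.isinf(unfloored).all())
```

In this sketch the floor caps the result near 240 dB instead of infinity, which keeps downstream arithmetic well defined.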

Returns:

Specification object consumed by dataset construction.

Return type:

ILDSpec

Examples

>>> import numpy as np
>>> from hrtfpykit.datasets import HUTUBS, ILDSpec
>>> dataset = HUTUBS(
...     root="datasets/hutubs",
...     inputs=ILDSpec(
...         mode="broad-band",
...         index_by=("subject", "position"),
...         position_index=True,
...         name="ild",
...     ),
... )
>>> sample = dataset[0]
>>> ild_value = sample["inputs"]["ild"]
>>> print(type(ild_value).__name__)
ndarray
>>> print(ild_value.shape)
()
>>> print(sample["inputs"]["position_index"])
0
>>> np.asarray(np.round(ild_value, 3))
array(...)