A NEW APPROACH FOR HEAD RELATED TRANSFER FUNCTION INTERPOLATION/CUSTOMIZATION
Motivation: A head related impulse response (HRIR) is a function of time, elevation, azimuth, and the individual. Measuring HRIRs experimentally on a fine sampling grid of elevation and azimuth is a time consuming and laborious process, so most spatial audio systems need some form of interpolation of the HRIR with respect to elevation and azimuth. Interpolation is a deceptive word here: a trivial interpolation of the HRIRs that ignores the physical significance of their various features, and the dependence of those features on elevation, azimuth and the individual, would not make sense. Approaches used for interpolation with respect to elevation and azimuth include direct time domain interpolation of HRIRs, direct frequency domain interpolation of head related transfer functions (HRTFs), and interpolation of principal components in the PCA domain. These methods do not take into account the perceptual significance of the different features in the HRIR or the HRTF. For example, plain linear interpolation in the frequency domain simply averages the magnitude spectra of the closest sampling points, which may distort the perceptually significant features. Ideally, interpolation should be done in the domain of the perceptually important features. A set of generic HRIRs would also not work satisfactorily: it has been shown that HRIRs are specific to an individual, and elevation perception is very poor when the HRIRs of another person are used. Because human anatomy, and in particular the shape of the pinna, differs from person to person, the HRIRs of different persons are quite different. To demonstrate this point, Figure 1 shows the HRTF as a function of elevation at zero azimuth for different subjects, along with their ear shapes. Significant differences between the HRTFs can be observed.
Psychophysical experiments have also confirmed a large deterioration in the ability of subjects to localize sounds vertically when they listen to recordings made with microphones placed in another subject's ear canals. Several approaches have been studied for customization of HRIRs. We use the term customization of HRIRs to cover both the interpolation of the HRIR with respect to elevation and azimuth and the adaptation to individual differences, that is, customization for any elevation, azimuth and person. The goal of our project is therefore: given a set of HRIRs experimentally measured for a certain number of individuals at a fixed sampling of elevation and azimuth, customize this database for any new elevation, azimuth and individual.
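The distortion caused by direct frequency domain interpolation can be illustrated with a toy example. The sketch below is only an illustration, not a model of measured data: it assumes a flat magnitude response with a single Gaussian-shaped notch whose centre frequency differs between two neighbouring measurement directions. The 6 kHz and 8 kHz notch positions, the notch shape, and all function names are made up for this example.

```python
import math

def notch_spectrum_db(freqs_hz, notch_hz, depth_db=20.0, width_hz=500.0):
    """Toy magnitude response in dB: flat except for one Gaussian-shaped notch."""
    return [-depth_db * math.exp(-0.5 * ((f - notch_hz) / width_hz) ** 2)
            for f in freqs_hz]

def lerp(a, b, t):
    """Point-by-point linear interpolation between two spectra."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def deepest(freqs_hz, spectrum_db):
    """Return (frequency, level) of the deepest point of a spectrum."""
    i = min(range(len(spectrum_db)), key=spectrum_db.__getitem__)
    return freqs_hz[i], spectrum_db[i]

freqs = [100.0 * k for k in range(1, 161)]  # 100 Hz to 16 kHz
notch_lo, notch_hi = 6000.0, 8000.0         # hypothetical notch positions

h_lo = notch_spectrum_db(freqs, notch_lo)   # "measured" direction 1
h_hi = notch_spectrum_db(freqs, notch_hi)   # "measured" direction 2

# (a) Direct frequency domain interpolation: average the two spectra.
direct = lerp(h_lo, h_hi, 0.5)

# (b) Feature domain interpolation: interpolate the notch frequency itself,
#     then rebuild the spectrum from the interpolated feature.
feature = notch_spectrum_db(freqs, 0.5 * (notch_lo + notch_hi))

print("direct :", deepest(freqs, direct))
print("feature:", deepest(freqs, feature))
```

With these toy values the direct average keeps two notches of about half the original depth at the original frequencies, whereas interpolating the notch frequency gives a single full-depth notch midway between them.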
Methodology:
We solve this problem in the following steps: feature identification, analysis
(or feature extraction), relating the features to physical measurements, and
synthesis. In the feature identification phase we study the database, which
covers different elevations, azimuths and individuals, and establish that there
are indeed features in the HRIR (in any domain) which are responsible for source
localization and whose physical significance can be conjectured, if not fully
explained, at this stage. These features should also vary with elevation, azimuth
and the individual. Once the dominant features are identified, in the analysis
phase we devise signal processing techniques to extract these dominant features
from the HRIR automatically. If the dominant features alone completely modelled
our hearing system, the residual HRIR (what remains once the dominant features
are removed) would be random noise. However, since there will always be many
features which are difficult to explain, we retain the residual HRIRs for all
individuals and all elevations and azimuths. The residual HRIRs are used later
in the synthesis phase. Once the dominant set of features is extracted from the
HRIR, we need to correlate these features with elevation, azimuth and the
individual, and quantify the dependencies. In the synthesis phase, given a set
of anthropometric measurements for a new person and a request for the HRIR at
any elevation and azimuth, we obtain the dominant features from the elevation,
azimuth and anthropometric measurements and form an impulse response which
contains only the dominant features. The closest residual HRIR is then selected
from the database and superimposed on the synthesized impulse response to give
the final HRIR. Figure 2 shows the analysis and synthesis phases in a block
diagram.
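The synthesis phase described above can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the actual method: the text does not specify the features, the feature/measurement relation, or the distance used to pick the "closest" residual. Here the features are assumed to be (delay, gain) pairs, `predict_features` is a made-up stand-in for the fitted relation, and the closest residual is taken by unweighted squared distance over elevation, azimuth and anthropometry. All names and numbers are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Entry:
    """One database record; the values used below are made-up toy data."""
    elevation: float      # degrees
    azimuth: float        # degrees
    anthropometry: tuple  # e.g. (pinna height,), arbitrary units
    residual: list        # residual HRIR left after feature extraction

def predict_features(elevation, azimuth, anthropometry):
    """Hypothetical feature model: a direct pulse plus one delayed echo."""
    pinna_height = anthropometry[0]
    scale = 1.0 + math.sin(math.radians(elevation))
    echo_delay = 2 + int(round(pinna_height * scale))
    echo_gain = -0.5 * math.cos(math.radians(azimuth))
    return [(0, 1.0), (echo_delay, echo_gain)]

def features_to_ir(features, length):
    """Impulse response containing only the dominant features."""
    ir = [0.0] * length
    for delay, gain in features:
        if 0 <= delay < length:
            ir[delay] += gain
    return ir

def closest_residual(db, elevation, azimuth, anthropometry):
    """Nearest record by squared distance (an assumed, unweighted metric)."""
    def dist(e):
        d = (e.elevation - elevation) ** 2 + (e.azimuth - azimuth) ** 2
        return d + sum((a - b) ** 2
                       for a, b in zip(e.anthropometry, anthropometry))
    return min(db, key=dist).residual

def synthesize_hrir(db, elevation, azimuth, anthropometry):
    """Superimpose the closest residual on the feature-only response."""
    residual = closest_residual(db, elevation, azimuth, anthropometry)
    features = predict_features(elevation, azimuth, anthropometry)
    ir = features_to_ir(features, len(residual))
    return [x + r for x, r in zip(ir, residual)]

db = [
    Entry(0.0, 0.0, (3.0,), [0.0, 0.01, -0.02, 0.0, 0.01, 0.0, 0.0, 0.0]),
    Entry(30.0, 0.0, (3.0,), [0.0, -0.01, 0.0, 0.02, 0.0, 0.0, -0.01, 0.0]),
]
hrir = synthesize_hrir(db, elevation=20.0, azimuth=0.0, anthropometry=(3.2,))
print(hrir)
```

In practice the distance would need sensible weighting, since angles and anthropometric measurements have different units; the unweighted sum here is purely for brevity.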
Applications: Psychophysical experiments have confirmed that the features
we extracted are perceptually significant for source localization. So instead of
convolving with the complete HRIR we can build simplified models based on the
features extracted. This is different from the approaches in the literature
which use geometrical models to build simplified models. In our case we extract
the features from the experimentally measured HRIR. Interpolation can also be
done in the feature domain. Finally, the extracted features can be related to
the physical dimensions of the human anatomy and the pinna, which makes
customization of the HRIR possible.