A NEW APPROACH FOR HEAD RELATED TRANSFER FUNCTION INTERPOLATION/CUSTOMIZATION


Motivation: A head related impulse response (HRIR) is a function of time, elevation, azimuth, and the individual. Measuring HRIRs experimentally on a fine sampling grid of elevation and azimuth is time consuming and laborious, so most spatial audio systems need some form of interpolation of the HRIR with respect to elevation and azimuth. Interpolation is a deceptive word here: a trivial interpolation of HRIRs that ignores the physical significance of their various features, and the dependence of those features on elevation, azimuth and the individual, would not make sense. Approaches used for interpolation with respect to elevation and azimuth include direct time domain interpolation of HRIRs, direct frequency domain interpolation of head related transfer functions (HRTFs), and interpolation of principal components in the PCA domain. These methods do not take into account the perceptual significance of the different features in the HRIR or the HRTF. For example, linear interpolation in the frequency domain simply averages the spectral magnitudes of the closest sampling points, which may distort perceptually significant features. Ideally, interpolation should be done in the domain of the perceptually important features. A set of generic HRIRs would also not work satisfactorily: it has been shown that HRIRs are specific to an individual, and elevation perception is very poor when the HRIRs of another person are used. Because of differences in human anatomy, and also in the shape of the pinna, the HRIRs of different persons are quite different. To demonstrate this point, Figure 1 shows the HRTF as a function of elevation at zero azimuth for different subjects, along with their ear shapes. Significant differences in the HRTFs can be observed.
Psychophysical experiments have also confirmed a large deterioration in the ability of subjects to localize sounds vertically when they listen to recordings made with microphones placed in another subject's ear canals. Several approaches to the customization of HRIRs have been studied. We use the term customization of HRIRs to cover both the interpolation of HRIRs with respect to elevation and azimuth and the adaptation to individual differences, that is, customization for any elevation, azimuth and person. The goal of our project is therefore the following: given a set of HRIRs experimentally measured for a certain number of individuals at a fixed sampling of elevation and azimuth, customize this database for any new elevation, azimuth and individual.
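
The distortion caused by plain frequency domain averaging can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration: the two "measured" spectra are flat apart from a single Gaussian-shaped notch, the notch frequencies (6 kHz and 8 kHz), depth and width are arbitrary, and the function names are hypothetical. It is not derived from measured HRTFs. The sketch compares averaging the two magnitude spectra with interpolating the notch frequency itself.

```python
import numpy as np

def notch_magnitude(freqs_hz, notch_hz, depth_db=20.0, width_hz=400.0):
    """Linear magnitude of a flat spectrum with one Gaussian-shaped notch (synthetic)."""
    level_db = -depth_db * np.exp(-0.5 * ((freqs_hz - notch_hz) / width_hz) ** 2)
    return 10.0 ** (level_db / 20.0)

def to_db(magnitude):
    """Convert linear magnitude to decibels."""
    return 20.0 * np.log10(magnitude)

# Frequency axis on a 10 Hz grid (assumed, not from the source).
freqs = np.linspace(0.0, 20000.0, 2001)

# Two hypothetical neighbouring sampling points whose notch sits at different frequencies.
mag_a = notch_magnitude(freqs, 6000.0)
mag_b = notch_magnitude(freqs, 8000.0)

# Naive interpolation: average the magnitude spectra of the two neighbours.
naive_db = to_db(0.5 * (mag_a + mag_b))

# Feature domain interpolation: interpolate the notch frequency, then rebuild the spectrum.
feature_db = to_db(notch_magnitude(freqs, 0.5 * (6000.0 + 8000.0)))

idx_7k = int(np.argmin(np.abs(freqs - 7000.0)))
print("naive level at 7 kHz (dB):  ", naive_db[idx_7k])
print("feature level at 7 kHz (dB):", feature_db[idx_7k])
print("deepest naive notch (dB):   ", naive_db.min())
```

In this toy setting the averaged spectrum shows two shallow notches at the original frequencies and almost no attenuation at the midpoint, whereas interpolating the notch frequency keeps a single notch of full depth at 7 kHz.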

Methodology: We solve this problem in four steps: feature identification, analysis (feature extraction), relating the features to physical measurements, and synthesis. In the feature identification phase we study the available database across elevations, azimuths and individuals, and identify features in the HRIR (in any domain) that are responsible for source localization and whose physical significance can be conjectured, if not fully explained, at this stage. These features should also vary with elevation, azimuth and the individual. Once the dominant features are identified, in the analysis phase we devise signal processing techniques to extract them from the HRIR automatically. If the dominant features alone completely modelled our hearing system, the residual HRIR would be random noise. However, since there will always be many features that are difficult to explain, we retain the residual HRIRs for all individuals, elevations and azimuths. These residual HRIRs are used later in the synthesis phase. Once the dominant features have been extracted from the HRIR, we need to correlate them with elevation, azimuth and the different persons, and quantify the dependencies. In the synthesis phase, given a set of anthropometric measurements for a new person and a request for the HRIR at any elevation and azimuth, we obtain the dominant features from the elevation, azimuth and anthropometric measurements and form an impulse response that contains only the dominant features. The closest residual HRIR is then selected from the database and superimposed on the synthesized impulse response to obtain the final HRIR. Figure 2 shows the analysis and synthesis phases in a block diagram.
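
The synthesis step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the project: the dominant features are assumed to be a few impulses given by (delay, gain) pairs, "closest" is assumed to mean smallest Euclidean distance between unnormalized (elevation, azimuth, anthropometric measurement) vectors, and all function names and the toy database are hypothetical. The mapping from anthropometry to features is assumed to have been computed already.

```python
import numpy as np

def build_feature_ir(delays, gains, length):
    """Impulse response containing only the dominant features.
    Assumption: each feature is an impulse at an integer sample delay."""
    ir = np.zeros(length)
    for delay, gain in zip(delays, gains):
        ir[int(delay)] += gain
    return ir

def closest_residual(query, keys, residuals):
    """Pick the stored residual HRIR whose key is nearest to the query.
    Assumption: plain Euclidean distance on unnormalized key vectors."""
    distances = np.linalg.norm(keys - query, axis=1)
    return residuals[int(np.argmin(distances))]

def synthesize_hrir(query, delays, gains, keys, residuals):
    """Superimpose the nearest residual on the feature-only impulse response."""
    feature_ir = build_feature_ir(delays, gains, residuals.shape[1])
    return feature_ir + closest_residual(query, keys, residuals)

# Toy database (hypothetical): key = (elevation deg, azimuth deg, pinna height cm).
keys = np.array([[0.0, 0.0, 6.0],
                 [30.0, 0.0, 6.0],
                 [0.0, 0.0, 7.0]])
rng = np.random.default_rng(0)
residuals = 0.01 * rng.standard_normal((3, 8))  # one 8-sample residual per key

# Hypothetical predicted features for a new person at 25 degrees elevation.
query = np.array([25.0, 0.0, 6.1])
delays, gains = [0, 3], [1.0, -0.4]

hrir = synthesize_hrir(query, delays, gains, keys, residuals)
print(hrir)
```

A practical system would need to normalize the key dimensions before measuring distance, since degrees and centimetres are mixed here, and would replace the fixed `delays` and `gains` with the quantified dependence of the features on elevation, azimuth and anthropometry.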

Applications: Psychophysical experiments have confirmed that the features we extract are perceptually significant for source localization. Instead of convolving with the complete HRIR, we can therefore build simplified models based on the extracted features. This differs from approaches in the literature that build simplified models from geometrical models: in our case the features are extracted from the experimentally measured HRIRs. Interpolation can also be done in the feature domain. Finally, the extracted features can be related to the physical dimensions of the human anatomy and the pinna, which makes customization of the HRIR possible.