Single-Channel Speaker Distance Estimation in Reverberant Environments
Neri, Michael; Politis, Archontis; Krause, Daniel; Carli, Marco; Virtanen, Tuomas (2023)
https://urn.fi/URN:NBN:fi:tuni-2023121310802
Abstract
We introduce the novel task of continuous-valued speaker distance estimation, which focuses on estimating non-discrete distances between a sound source and a microphone based on the audio captured by that microphone. A novel learning-based approach is proposed for estimating speaker distance in reverberant environments from a single omnidirectional microphone. Using common acoustic features, such as the magnitude and phase of the audio spectrogram, with a convolutional recurrent neural network results in errors on the order of centimeters on noise-free audio. Experiments are carried out with an image-source room simulator, using speech from a public dataset convolved with the simulated room responses. An ablation study is performed to demonstrate the effectiveness of the proposed feature set. Finally, a study of the impact of real background noise, extracted from the WHAM! dataset and added at different signal-to-noise ratios, highlights the discrepancy between noisy and noise-free scenarios, underlining the difficulty of the problem.
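The two signal-level operations named in the abstract, adding background noise at a chosen signal-to-noise ratio and taking the magnitude and phase of a spectrogram frame, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the Hann window, the frame length, and the naive DFT (used here in place of a library STFT) are all assumptions made for illustration.

```python
import cmath
import math


def mean_power(signal):
    """Average power of a real-valued signal given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def scale_noise_to_snr(speech, noise, snr_db):
    """Rescale `noise` so that speech power / noise power matches `snr_db` (hypothetical helper)."""
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [gain * n for n in noise]


def measured_snr_db(speech, noise):
    """SNR in dB between a speech signal and a noise signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


def frame_magnitude_phase(frame):
    """Hann-window one frame and return (magnitudes, phases) of its one-sided DFT.

    Assumption: a naive O(n^2) DFT stands in for a real STFT routine.
    """
    n = len(frame)
    windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * i / n)) for i, x in enumerate(frame)]
    magnitudes, phases = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        magnitudes.append(abs(coeff))
        phases.append(cmath.phase(coeff))
    return magnitudes, phases


# Toy signals (assumed): a sinusoid stands in for reverberant speech,
# and a second sinusoid stands in for recorded background noise.
FRAME_LEN = 64
speech = [math.sin(2 * math.pi * 4 * i / FRAME_LEN) for i in range(FRAME_LEN)]
noise = [math.sin(2 * math.pi * 7.3 * i / FRAME_LEN + 1.0) for i in range(FRAME_LEN)]

scaled_noise = scale_noise_to_snr(speech, noise, snr_db=10.0)
mixture = [s + n for s, n in zip(speech, scaled_noise)]
magnitudes, phases = frame_magnitude_phase(mixture)
```

In a full pipeline the per-frame magnitudes and phases would be stacked over time into two feature maps and fed to the network; that step is omitted here because the abstract does not specify the frame length, hop size, or any further feature processing.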
Collections
- TUNICRIS publications [19292]