Multiconditional modeling is widely used to build noise-robust speaker recognition systems, but the approach is computationally intensive. An alternative is to optimize the set of training conditions so that maximum noise robustness is achieved with the smallest possible number of noise conditions during training. This paper establishes the optimal conditions for a noise-robust training model by considering audio material at different sampling rates and with different coding methods. Our results demonstrate that approximately four training noise conditions are sufficient to obtain robust models across the signal-to-noise ratio (SNR) range from 60 dB down to 10 dB.
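As a minimal sketch of what a small set of training noise conditions can look like in practice, the Python snippet below mixes a clean signal with noise at a few target SNRs. Everything in it is an illustrative assumption rather than the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the Gaussian noise, and in particular the four SNR values in `TRAIN_SNRS_DB`, which are simply spread across the 60 dB to 10 dB range and are not taken from the paper.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR, then add it to `speech`."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # From SNR_dB = 10 * log10(P_speech / P_noise), solve for the noise power.
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_p_noise / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy`, treating (noisy - clean) as the noise component."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Assumption: four illustrative SNR levels spanning 60 dB to 10 dB.
# These are NOT the conditions reported in the paper.
TRAIN_SNRS_DB = [60, 40, 25, 10]

rng = random.Random(0)
# Synthetic stand-ins: a 220 Hz tone at 8 kHz sampling and white Gaussian noise.
speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

# One noisy copy of the training material per condition.
conditions = {snr: mix_at_snr(speech, noise, snr) for snr in TRAIN_SNRS_DB}
```

Each entry of `conditions` would then feed the feature extraction and model training for one noise condition; training cost grows with the number of entries, which is why keeping the set small matters.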
Keywords: automatic speaker recognition, Gaussian mixture models, multiconditional models, noise robustness, computing power.
THE INTERNATIONAL JOURNAL OF FORENSIC COMPUTER SCIENCE - IJoFCS
Volume 3, Number 1, pp 60-69, DOI: 10.5769/J200801006 or http://dx.doi.org/10.5769/J200801006
Noise Robust Speaker Recognition using Reduced Multiconditional Gaussian Mixture Models
By Frederico D’Almeida, Francisco Assis Nascimento, Pedro Berger, and Lúcio Silva