
Title Speaker Identification - Features, Models and Robustness
Author Ghassemian, Mehrdad
Strange, Kasper
Supervisor Hansen, Lars Kai (Intelligent Signal Processing, Department of Informatics and Mathematical Modeling, Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark)
Institution Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark
Thesis level Master's thesis
Year 2009
Abstract This Master's thesis investigates the features and models used to construct a robust speaker identification system, using the TIMIT speaker database. A comparison of the k-means clustering algorithm and Gaussian mixture models (GMMs) for speaker modelling shows a higher identification rate with the GMM speaker models. Features for speaker identification should emphasize the individual differences in the speech while suppressing the phonetic information; the exact opposite holds for the features used in speech recognition. Nevertheless, the same features, the MFCCs, have been used for both tasks. Using Fisher's F-ratio to measure which frequency regions contain the most discriminative speaker information, we present a new set of features, the FRFCCs. They emphasize the regions with speaker-discriminative information and suppress the phonetic information in the speech. Fisher's F-ratio shows that the regions around the fundamental frequency (100 Hz) and the third (2500 Hz) and fourth (3500 Hz) formants carry substantial speaker information, while the region around the first formant (500 Hz) contains only phonetic information. By adding noise to the TIMIT database, we show that using the FRFCC features yields a better and more robust automatic speaker identification system. Finally, testing on speech from Danish TV, we show that using the FRFCCs instead of the MFCCs gives an improvement of 91%.
Series IMM-M.Sc.-2009-14
Original PDF ep09_14_web.pdf (1.28 MB)
Admin Creation date: 2009-03-11    Update date: 2009-08-06    Source: dtu    ID: 239906    Original MXD