Title Speech Reconstruction from Binary Masked Spectrograms Using Vector Quantized Speaker Models
Author Jensen, Michael K.
Nielsen, Søren Skou
Supervisor Hansen, Lars Kai (Department of Informatics and Mathematical Modeling, Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark)
Institution Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark
Thesis level Master's thesis
Year 2006
Abstract Several source separation techniques use binary masking on spectrograms to separate two or more speakers from each other. In this thesis, the possibilities for obtaining the best quality signal, reconstructed from masked spectrograms through vector quantized models of speakers, are investigated. The advantages and disadvantages of such an approach are examined. Additionally, the task of signal reestimation from a spectrogram is investigated using several algorithms. Vector quantization of speakers can be used to improve on binary masked spectrograms, but the approach is not shown to produce high quality speech. It is also concluded that phase information is very important for high quality speech reconstruction, and parameters for optimal phase reestimation are suggested.
Imprint Department of Informatics and Mathematical Modeling, Technical University of Denmark, DTU : DK-2800 Kgs. Lyngby, Denmark
Pages 214
Keywords signal processing; data clustering; mel filtering; voiced unvoiced detection; k-means; vector quantization; signal estimation; phase reconstruction; spectrogram reconstruction.
Fulltext
Original Postscript imm4725.ps (12.80 MB)
Derived PDF imm4725.pdf (6.73 MB)
Admin Creation date: 2006-10-06    Update date: 2012-12-18    Source: dtu    ID: 191641