Text Dependent Speaker Recognition System Using Vector Quantization

Isha Dhawan; Dr. Neelu Jain

Abstract

Speech is the primary mode of communication among human beings and the most natural and efficient way for them to exchange information. Research fields in speech processing include speech recognition, speaker recognition, speech synthesis, and speech coding. This paper presents a detailed study of a text-dependent speaker recognition system used to identify an unknown speaker. The system uses vector quantization (VQ) as the modeling technique. Features of the speech signal are extracted as Mel Frequency Cepstrum Coefficients (MFCC) and then modeled with VQ. The K-means clustering algorithm is used to obtain the vector-quantized codebook. The highest accuracy is obtained with a Hanning window and mel perceptual feature extraction realized with a bank of 35 filters. Accuracy also improves as the number of vectors in the VQ codebook is increased from 64 to 100.
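The codebook training and matching steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, squared Euclidean distance is assumed as the distortion measure (the abstract does not name one), and `toy_features` produces synthetic vectors that merely stand in for real MFCC frames.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def train_codebook(vectors, k, iters=20, seed=0):
    """Plain K-means: returns k centroids used as the VQ codebook."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def avg_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    total = sum(sq_dist(v, codebook[nearest(v, codebook)]) for v in vectors)
    return total / len(vectors)


def identify(test_vectors, codebooks):
    """Pick the enrolled speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: avg_distortion(test_vectors, codebooks[name]))


def toy_features(center, n, dim=12, seed=0):
    """Synthetic stand-in for MFCC frames: Gaussian noise around `center`."""
    rng = random.Random(seed)
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    # Two hypothetical enrolled speakers with well-separated toy features.
    codebooks = {
        "speaker_a": train_codebook(toy_features(0.0, 200, seed=1), k=8),
        "speaker_b": train_codebook(toy_features(5.0, 200, seed=2), k=8),
    }
    print(identify(toy_features(5.0, 50, seed=3), codebooks))
```

In a real system each speaker's training vectors would be MFCC frames from enrollment utterances, and `k` would be the codebook size (the abstract reports results for 64 to 100 codewords).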

Keywords

Feature extraction, K-means clustering, Mel Frequency Cepstrum Coefficients, Vector Quantization

This work is licensed under a Creative Commons Attribution 3.0 License.
