Text Dependent Speaker Recognition Using Linear Prediction Coefficients-Dynamic Time Warping

Santosh S. Rokade, Srija Unnikrishnan, Ranjushree Pal

Abstract


This project proposes a text-dependent speaker identification system. The spoken text consists of the isolated digits 0-9 and their concatenations. For each speech signal, Linear Prediction Coefficients (LPC) are extracted and assembled into feature vectors. Dynamic Time Warping (DTW) measures the distance between reference and test vectors. These distances, which indicate how close an unknown vector is to each reference, are combined with a K-Nearest Neighbor (KNN) decision rule for identification. We experimented with LPC features in combination with both DTW (using KNN as the decision rule) and an artificial neural network (ANN), namely the well-known Multilayer Perceptron (MLP) trained with the back-propagation learning algorithm. Both systems were tested on the isolated digits 0-9. DTW with KNN gave the better performance; the ANN result is affected by the network's attempt to recognize all training patterns, including low-quality voice samples. This paper extends that work by examining DTW with KNN on concatenated digits, which are the form of spoken text intended for the final application. The study also aims to select a set of acceptable digits for inclusion in our text-prompted speaker identification system.
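
The pipeline described above (LPC features, DTW distances, KNN decision) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `lpc`, `dtw_distance` and `knn_identify`, the autocorrelation (Levinson-Durbin) method for LPC, the Euclidean local cost and the default `k=3` are all assumptions chosen for illustration, and framing, windowing and endpoint detection are omitted.

```python
import math
from collections import Counter


def lpc(frame, order):
    """Estimate LPC predictor coefficients a[1..order] for one frame,
    so that x[n] is approximated by sum_k a[k] * x[n-k].
    Uses the autocorrelation method with the Levinson-Durbin recursion
    (an assumed choice; the abstract does not name the method)."""
    n = len(frame)
    # Autocorrelation values r[0..order].
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy
    if err == 0.0:
        return a[1:]         # silent frame: all-zero coefficients
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err        # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:
            break            # numerically degenerate; stop early
    return a[1:]


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors,
    with Euclidean local cost and the basic three-way step pattern."""
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def knn_identify(unknown, references, k=3):
    """Return the majority speaker label among the k references nearest
    to `unknown` under DTW. `references` is a list of
    (speaker_label, feature_sequence) pairs (hypothetical layout)."""
    ranked = sorted(references,
                    key=lambda ref: dtw_distance(unknown, ref[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, each utterance would be split into frames, each frame mapped to an LPC vector with `lpc`, and the resulting sequence of vectors passed to `knn_identify` together with the stored reference sequences for every enrolled speaker.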

Keywords


LPC, DTW


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.