Speech Identification using MFCC Algorithm on Arm Platform

Mandar N. Kakade, Dr.A.D. Jadhav

Abstract


Digital processing of the speech signal and the choice of recognition algorithm are central to fast and accurate automatic speech recognition. A speech signal carries an enormous amount of information, and direct analysis of such a complex signal is difficult because of how much information it contains. Digital signal processing steps such as feature extraction and feature matching are therefore introduced to represent the speech signal. Several methods, such as Linear Predictive Coding (LPC), Hidden Markov Models (HMM), and Dynamic Time Warping (DTW), are used to identify speech. Extraction and matching are carried out immediately after the signal has been pre-processed or filtered. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modelling the human auditory perception system, are used as the feature extraction technique. The non-linear sequence alignment known as Dynamic Time Warping (DTW) is used as the speech modelling technique. Because speech signals tend to differ in temporal rate, this alignment is important for good performance. This paper uses MFCC to extract features and DTW to compare test patterns for speech identification. The same algorithms are implemented on an ARM platform as well as in MATLAB.
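The matching step described in the abstract can be illustrated with a minimal Python sketch of DTW. This is not the paper's ARM or MATLAB implementation. It assumes each utterance has already been reduced to a sequence of equal-length feature vectors (for example, MFCC frames), uses Euclidean distance as the local cost, and applies no warping-window constraint. The names `dtw_distance` and `identify` are hypothetical and chosen only for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: every frame in both sequences has the same dimension.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of the first i frames of
    # seq_a with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two frames.
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame repeats
                cost[i][j - 1],      # seq_b advances, seq_a frame repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def identify(test, templates):
    """Return the label of the stored template closest to `test` under DTW."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

As a toy usage example, with one-dimensional frames standing in for real MFCC vectors, `identify([(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (1.0,)], {"yes": [(0.0,), (1.0,), (2.0,), (1.0,)], "no": [(2.0,), (2.0,), (0.0,), (0.0,)]})` selects `"yes"`: the test sequence is a slower rendition of that template, and the warping absorbs the difference in temporal rate.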


Keywords


Feature Extraction, Feature Matching, Mel Frequency Cepstral Coefficient (MFCC), Dynamic Time Warping (DTW)

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.