
Arabic Speech Recognition by MFCC and Bionic Wavelet Transform using a Multi Layer Perceptron for Voice Control

M. Ben Nasr, M. Talbi, A. Cherif

Abstract


In this paper, we propose a new technique for Arabic speech recognition with a single speaker and a reduced vocabulary. The first step of this technique consists in building our own speech database, containing Arabic words recorded by a single speaker for voice-command purposes. The second step consists in extracting features from the recorded words, and the third step in classifying the extracted features. Feature extraction is performed by first computing the Mel-Frequency Cepstral Coefficients (MFCCs) of each recorded word, then applying the Bionic Wavelet Transform (BWT) to the vector obtained by concatenating those MFCCs. The resulting bionic wavelet coefficients are then concatenated to form one input of a Multi-Layer Perceptron (MLP) used for feature classification. In the MLP training and test phases, we used eleven Arabic words, each repeated twenty-five times by the same speaker. A simulation program used to test the performance of the proposed technique showed a classification rate of 99.39%.
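
The following Python sketch illustrates the general shape of the pipeline described above; it is a minimal reconstruction under stated assumptions, not the authors' implementation. In particular, pywt's Morlet continuous wavelet transform stands in for the Bionic Wavelet Transform (which has no standard library implementation), scikit-learn's MLPClassifier plays the role of the MLP, and the file names, sampling rate, wavelet scales and network size are illustrative assumptions.

    # Sketch of the MFCC -> wavelet -> MLP pipeline (assumptions noted above).
    import numpy as np
    import librosa                     # MFCC extraction
    import pywt                        # continuous wavelet transform (stand-in for the BWT)
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split

    def extract_features(wav_path, n_mfcc=13, scales=np.arange(1, 9)):
        """Compute MFCCs of one recorded word, concatenate the frames,
        then apply a wavelet transform to the concatenated vector."""
        signal, sr = librosa.load(wav_path, sr=16000)
        mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
        vector = mfcc.flatten()                       # concatenated MFCC frames
        coeffs, _ = pywt.cwt(vector, scales, 'morl')  # stand-in for the BWT
        return coeffs.flatten()                       # concatenated wavelet coefficients

    # Hypothetical corpus layout: 11 Arabic command words, 25 repetitions each,
    # stored as word<i>_rep<j>.wav (file names are assumptions).
    X, y = [], []
    for word in range(11):
        for rep in range(25):
            X.append(extract_features(f"word{word}_rep{rep}.wav"))
            y.append(word)

    # Recordings differ in duration, so feature vectors are truncated
    # to a common length before classification.
    min_len = min(len(v) for v in X)
    X = np.array([v[:min_len] for v in X])

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    mlp.fit(X_train, y_train)
    print("classification rate:", mlp.score(X_test, y_test))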


Keywords


Arabic Speech Recognition, Bionic Wavelet Transform (BWT), Feature Extraction, Mel-Frequency Cepstral Coefficients (MFCC), Multi-Layer Perceptron (MLP).






This work is licensed under a Creative Commons Attribution 3.0 License.