A New Approach for Coding of Speech Signals using Auto Associative Neural Networks

B. Kirubagari; S. Palanivel

Abstract

Digital speech coding represents a digitized speech signal using as few bits as possible while maintaining speech quality and intelligibility. This paper discusses a new direction in speech coding research based on auto associative neural networks (AANN). The AANN acts as a combined encoder and decoder. A feature extractor extracts the necessary features from the input speech. Instead of coding the speech signal directly, the linear predictive coefficients (LPC) and discrete cosine transform (DCT) features of the signal, which act as the compressed representation of the speech, are passed to the neural network. The signal reconstructor reconstructs the signal from the decompressed features and the weight matrix. Different features are extracted and the results are compared. The signal-to-noise ratio (SNR) is used to assess the efficiency of the algorithm. Applications for which this coder is suitable include videoconferencing, streaming audio, archival, and messaging.
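The following is a minimal sketch of the feature-compression and SNR-evaluation steps that the abstract describes. It is not the authors' implementation. It uses a plain truncated DCT as the compressed representation and omits both the AANN stage and the LPC features. The function names, the frame length, and the number of retained coefficients are assumptions made for illustration only.

```python
import math

def dct(frame):
    """Orthonormal DCT-II of one speech frame (list of floats)."""
    n = len(frame)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                    for t, x in enumerate(frame))
        coeffs.append(scale * total)
    return coeffs

def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    frame = []
    for t in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
        frame.append(total)
    return frame

def compress_frame(frame, keep):
    """Keep only the first `keep` DCT coefficients as the feature vector."""
    return dct(frame)[:keep]

def reconstruct_frame(features, frame_len):
    """Zero-pad the retained coefficients and invert the transform."""
    padded = list(features) + [0.0] * (frame_len - len(features))
    return idct(padded)

def snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB between a frame and its reconstruction."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    if noise_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical test frame: 20 ms at an assumed 8 kHz sampling rate.
FRAME_LEN = 160
frame = [math.sin(2 * math.pi * 200 * t / 8000)
         + 0.5 * math.sin(2 * math.pi * 900 * t / 8000)
         for t in range(FRAME_LEN)]

features = compress_frame(frame, keep=40)          # 160 samples -> 40 values
restored = reconstruct_frame(features, FRAME_LEN)
print(f"SNR with 40 of {FRAME_LEN} coefficients: {snr_db(frame, restored):.1f} dB")
```

Because the transform is orthonormal, retaining more coefficients can only lower the reconstruction error, so the SNR gives a direct way to compare feature sets of different sizes.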

Keywords

Auto Associative Neural Networks, Discrete Cosine Transform, Linear Predictive Coefficients, Speech Coding.

This work is licensed under a Creative Commons Attribution 3.0 License.
