Environment and Sensor Robustness in Automatic Speech Recognition
Utpal Bhattacharjee, Department of Computer Science and Engineering, Rajiv Gandhi University, Rono Hills, Doimukh, Arunachal Pradesh, India.
Manuscript received on January 06, 2013. | Revised Manuscript received on January 12, 2013. | Manuscript published on January 15, 2013. | PP: 31-37 | Volume-1, Issue-2, January 2013. | Retrieval Number: B0128011213/2013©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Most currently available speech recognition systems work well only under ideal conditions, because they are built on assumptions about the operating conditions. A system performs well when the actual operating environment matches the environment for which it was built, and its performance degrades considerably when the training and testing environments are mismatched. The present study considers mismatch due to sensor variability and environment, and investigates Cepstral Mean Normalization (CMN) and spectral subtraction as front-end methods for noise reduction. A Hidden Markov Model (HMM) based speech recognition system has been built with Mel-Frequency Cepstral Coefficients (MFCC) as the feature vector. When CMN and spectral subtraction are applied for noise reduction, system performance under channel- and environment-mismatched conditions improves by 15% compared with the baseline.
Keywords: Robust Speech Recognition, MFCC, CMN, Spectral Subtraction
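The two front-end methods named in the abstract can be illustrated with a minimal sketch. The abstract does not specify which variants were used, so the code below shows only the textbook forms: per-utterance CMN, and magnitude-domain spectral subtraction with a spectral floor. The function names, the `floor` parameter, and the array layouts are assumptions made for illustration, not the author's implementation.

```python
import numpy as np


def cepstral_mean_normalization(features):
    """Subtract the per-utterance mean from every cepstral coefficient.

    features: array of shape (num_frames, num_coeffs), e.g. MFCC vectors.
    A stationary channel (sensor) effect appears as a constant additive
    offset in the cepstral domain, so removing the mean removes it.
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


def spectral_subtraction(noisy_mag, noise_mag, floor=0.01):
    """Basic magnitude-domain spectral subtraction (illustrative variant).

    noisy_mag: array of shape (num_frames, num_bins), magnitude spectra.
    noise_mag: array of shape (num_bins,), estimated noise magnitude,
               e.g. averaged over frames assumed to contain no speech.
    floor:     hypothetical flooring factor that keeps the result positive.
    """
    noisy_mag = np.asarray(noisy_mag, dtype=float)
    cleaned = noisy_mag - np.asarray(noise_mag, dtype=float)
    # Clamp each bin to a small fraction of the noisy magnitude so that
    # over-subtraction never produces negative magnitudes.
    return np.maximum(cleaned, floor * noisy_mag)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mfcc = rng.normal(size=(50, 13))       # synthetic stand-in for MFCCs
    channel_offset = rng.normal(size=13)   # simulated sensor mismatch
    same = np.allclose(
        cepstral_mean_normalization(mfcc),
        cepstral_mean_normalization(mfcc + channel_offset),
    )
    print("CMN output unchanged by a constant channel offset:", same)
```

In this sketch, adding a constant offset to every frame leaves the CMN output unchanged, which is the property that makes CMN useful against sensor mismatch. Spectral subtraction instead targets additive background noise and would be applied to the magnitude spectrum before the mel filterbank and cepstral steps.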