Parameters Optimization for Improving ASR Performance in Adverse Real World Noisy Environmental Conditions
Urmila Shrawankar, Vilas Thakare
Pages - 58 - 70 | Revised - 15-09-2012 | Published - 25-10-2012
KEYWORDS
ASR Performance, ASR Parameters Optimization, Multi-Environmental Training, Fuzzy Inference System for ASR, ubiquitous ASR system, Human Computer Interaction
ABSTRACT
Existing research offers many techniques and methodologies for every step of an Automatic Speech Recognition (ASR) system, but the performance of a method (minimizing the Word Error Rate, WER, and maximizing the Word Accuracy Rate, WAR) does not depend only on the technique applied in it. This research indicates that performance mainly depends on the category of the noise, the level of the noise, and the sizes chosen in existing methods for parameters such as the window, the frame, and the frame overlap.
The main aim of the work presented in this paper is to vary parameters such as window size, frame size, and frame overlap percentage, to observe how the algorithms perform for various categories of noise at different levels, and to train the system for all parameter sizes and categories of real-world noisy environments in order to improve the performance of the speech recognition system.
This paper presents the results of Signal-to-Noise Ratio (SNR) and accuracy tests obtained with variable parameter sizes. It is observed that evaluating these test results and deciding which parameter sizes optimize ASR performance is very hard.
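As a rough illustration of the parameters named above, the following Python sketch splits a signal into frames for a given frame size and overlap percentage, builds a Hamming window, and computes an SNR in dB against a clean reference. It is not the authors' implementation: the function names, the choice of a Hamming window, and the reference-based SNR definition are assumptions made only for this example.

```python
import math


def frame_signal(signal, frame_size, overlap_pct):
    """Split a sequence into fixed-size frames with the given overlap percentage.

    Hypothetical helper: trailing samples that do not fill a frame are dropped.
    """
    if frame_size < 1:
        raise ValueError("frame_size must be positive")
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in [0, 100)")
    # Hop size is the number of samples between the starts of adjacent frames.
    hop = max(1, int(round(frame_size * (1 - overlap_pct / 100.0))))
    return [signal[start:start + frame_size]
            for start in range(0, len(signal) - frame_size + 1, hop)]


def hamming(n):
    """Return a symmetric Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


# Example sweep over a small, purely illustrative parameter grid.
samples = list(range(10))
for frame_size in (4, 5):
    for overlap_pct in (0, 50):
        frames = frame_signal(samples, frame_size, overlap_pct)
        print(frame_size, overlap_pct, len(frames))
```

A real evaluation would sweep such a grid per noise category and noise level and record the recognition accuracy alongside the SNR for each setting.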
Hence, this study further suggests feasible and optimum parameter sizes, obtained with a Fuzzy Inference System (FIS), for enhancing the resulting accuracy in adverse real-world noisy environmental conditions.
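To make the idea of a fuzzy inference step concrete, here is a minimal Python sketch of a zero-order Sugeno-style system that maps an SNR value to a suggested frame size. Everything specific in it is hypothetical: the membership ranges, the three rules, and the output frame sizes are invented for illustration and are not taken from the paper's rule base or results.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical fuzzy sets over SNR in dB (illustrative ranges only).
SNR_SETS = {
    "low": (-100.0, -99.0, 0.0, 10.0),
    "medium": (0.0, 10.0, 10.0, 20.0),
    "high": (10.0, 20.0, 99.0, 100.0),
}

# Hypothetical rule consequents: suggested frame size in milliseconds.
RULE_OUTPUT_MS = {"low": 32.0, "medium": 25.0, "high": 20.0}


def suggest_frame_ms(snr_db_value):
    """Weighted average of rule outputs, weighted by each rule's firing strength."""
    # Clamp so that at least one set always has non-zero membership.
    x = max(-99.0, min(99.0, snr_db_value))
    weights = {name: trapezoid(x, *params) for name, params in SNR_SETS.items()}
    total = sum(weights.values())
    return sum(weights[name] * RULE_OUTPUT_MS[name] for name in weights) / total


for snr in (-5, 5, 10, 30):
    print(snr, suggest_frame_ms(snr))
```

A fuller system would take several inputs (for example noise category and noise level) and produce window size and overlap percentage as well, but the inference pattern of fuzzify, fire rules, and combine outputs stays the same.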
This work will be helpful for the discriminative training of ubiquitous ASR systems for better Human Computer Interaction (HCI).
1. Smruti, S., Sahoo, J., Dash, M., & Mohanty, M. N. (2015, January). An Approach to Design an Intelligent Parametric Synthesizer for Emotional Speech. In Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014 (pp. 367-374). Springer International Publishing.
2. Lanzola, G., Parimbelli, E., Micieli, G., Cavallini, A., & Quaglini, S. (2014). Data quality and completeness in a web stroke registry as the basis for data and process mining. Journal of Healthcare Engineering, 5(2), 163-184.
3. Shrawankar, U., & Thakare, V. (2013). An Adaptive Methodology for Ubiquitous ASR System. arXiv preprint arXiv:1303.3948.
4. Sunitha, K. V., & Sharada, A. (2012). Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor. International Journal of Human Computer Interaction (IJHCI), 3(4), 83.
Miss Urmila Shrawankar
G H Raisoni College of Engg., Nagpur - India
urmilas@rediffmail.com
Dr. Vilas Thakare
SGB Amravati University, Amravati - India