A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
Prateek Srivastava, Reena Panda, Sankarsan Rauta
Pages - 128 - 139     |    Revised - 15-09-2012     |    Published - 24-10-2012
Volume - 6   Issue - 4    |    Publication Date - October 2012
KEYWORDS
Speaker Recognition, Gaussian Mixture Model, Cepstral Mean Subtraction, Mel Frequency Cepstral Coefficients, Gender Classification
ABSTRACT
An automatic speaker recognition system identifies an unknown speaker among several reference speakers by using speaker-specific information in their speech. In this paper, we introduce a novel, hierarchical, text-independent speaker recognition technique. Our baseline speaker recognition system, built using statistical modeling techniques, achieves an accuracy of 81% on the standard MIT database, and our baseline gender recognition system achieves an accuracy of 93.795%. We then propose and implement a novel state-space pruning technique that performs gender recognition before speaker recognition, so as to improve both the accuracy and the timeliness of the baseline speaker recognition system. Based on experiments conducted on the MIT database, we demonstrate that the proposed system improves accuracy over the baseline system by approximately 2% while reducing computation time by more than 30%.
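The pruning idea in the abstract (decide gender first, then score only the speakers of that gender) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single diagonal Gaussian per speaker as a stand-in for a Gaussian mixture model, synthetic two-dimensional features as a stand-in for mel frequency cepstral coefficients, and hypothetical names throughout (`fit_gaussian`, `identify_flat`, `identify_hierarchical`, the `f0`..`m4` speaker labels).

```python
import math
import random

def fit_gaussian(frames):
    """Fit a diagonal Gaussian (per-dimension mean and variance) to feature frames.
    Assumption: one Gaussian stands in for a full Gaussian mixture model."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def best_match(frames, models):
    """Return the label whose model gives the frames the highest score."""
    return max(models, key=lambda label: avg_log_likelihood(frames, models[label]))

def identify_flat(frames, speaker_models):
    """Baseline: score every enrolled speaker. Returns (label, models scored)."""
    return best_match(frames, speaker_models), len(speaker_models)

def identify_hierarchical(frames, gender_models, speakers_by_gender):
    """Pruned search: pick a gender, then score only that gender's speakers."""
    gender = best_match(frames, gender_models)
    candidates = speakers_by_gender[gender]
    return best_match(frames, candidates), len(gender_models) + len(candidates)

rng = random.Random(0)

def synth_frames(center, n=100, spread=0.5):
    """Draw synthetic feature frames around a speaker's center (toy data)."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]

# Hypothetical toy enrollment: 5 "f" and 5 "m" speakers, well separated.
centers = {f"{g}{i}": [x0, 2.0 * i]
           for g, x0 in (("f", 6.0), ("m", -6.0)) for i in range(5)}
train = {name: synth_frames(c) for name, c in centers.items()}

speaker_models = {name: fit_gaussian(frames) for name, frames in train.items()}
speakers_by_gender = {
    g: {name: m for name, m in speaker_models.items() if name.startswith(g)}
    for g in ("f", "m")
}
# Each gender model is fitted on the pooled frames of that gender's speakers.
gender_models = {
    g: fit_gaussian([fr for name, frames in train.items()
                     if name.startswith(g) for fr in frames])
    for g in ("f", "m")
}

test_frames = synth_frames(centers["m3"])
flat_label, flat_evals = identify_flat(test_frames, speaker_models)
hier_label, hier_evals = identify_hierarchical(
    test_frames, gender_models, speakers_by_gender)
print(flat_label, flat_evals)
print(hier_label, hier_evals)
```

With ten enrolled speakers split evenly, the flat search scores 10 models while the hierarchical search scores 2 + 5 = 7. These counts describe only this toy setup and are not the paper's timing measurements. The sketch also shows the trade-off of the approach: a wrong gender decision removes the true speaker from the candidate set, so the gender stage must be accurate for the pruning to help.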
CITED BY (1)  
1 Sunitha, K. V., & Sharada, A. (2012). Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor. International Journal of Human Computer Interaction (IJHCI), 3(4), 83.
Mr. Prateek Srivastava
Advanced Micro Devices - India
prateek.k.srivastava@gmail.com
Miss Reena Panda
National Institute of Technology - India
Mr. Sankarsan Rauta
National Institute of Technology - India

