SMATalk: Standard Malay Text to Speech Talk System
Othman O. Khalifa, Zakiah Hanim Ahmad, Aisha-Hassan A. Hashim, Teddy Suya Gunawan
Pages - 1 - 16     |    Revised - 15-10-2008     |    Published - 15-11-2008
Volume - 2   Issue - 5    |    Publication Date - October 2008
KEYWORDS
Phones prosody, speech synthesis, Standard Malay, DSP, Natural Language Processing
ABSTRACT
This paper presents a rule-based text-to-speech (TTS) synthesis system for Standard Malay (SM), namely SMaTTS. The proposed system uses a sinusoidal method and some pre-recorded wave files to generate speech. The use of a phone database significantly decreases the amount of computer memory space used, thus making the system very light and embeddable. The overall system comprises two phases. The first is Natural Language Processing (NLP), which consists of the high-level processing of text analysis, phonetic analysis, text normalization and a morphophonemic module. This module was designed specially for SM to overcome a few problems in defining the rules for the SM orthography system before the text is passed to the DSP module. The second phase is Digital Signal Processing (DSP), which operates on the low-level process of speech waveform generation. An intelligible and adequately natural-sounding formant-based speech synthesis system with a light and user-friendly Graphical User Interface (GUI) is introduced. An SM phoneme set and an inclusive phone database have been carefully constructed for this phone-based speech synthesizer. By applying generative phonology, a comprehensive set of letter-to-sound (LTS) rules and a pronunciation lexicon have been developed for SMaTTS. For the evaluation tests, a Diagnostic Rhyme Test (DRT) word list was compiled, and several experiments were performed to evaluate the quality of the synthesized speech by analyzing the Mean Opinion Score (MOS) obtained. The overall performance of the system, as well as the room for improvement, is thoroughly discussed.
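The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the SMaTTS implementation: the digraph table `TOY_DIGRAPHS`, the ASCII phone labels, and the function names `letters_to_phones` and `synthesize_sinusoids` are all hypothetical, and the real LTS rules, lexicon and phone database are far more extensive. The first function stands in for a letter-to-sound step in the NLP phase, and the second stands in for sinusoidal waveform generation in the DSP phase.

```python
import math

# Hypothetical toy table (an assumption, not the paper's rule set):
# a few Malay digraphs mapped to single ASCII phone labels.
TOY_DIGRAPHS = {"ng": "NG", "ny": "NY", "sy": "SY", "kh": "KH"}


def letters_to_phones(word):
    """NLP-phase sketch: greedy letter-to-sound conversion.

    Digraphs listed in TOY_DIGRAPHS are matched before single letters;
    every other letter is passed through unchanged as its own phone.
    """
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in TOY_DIGRAPHS:
            phones.append(TOY_DIGRAPHS[pair])
            i += 2
        else:
            phones.append(word[i])
            i += 1
    return phones


def synthesize_sinusoids(partials, duration_s, sample_rate=16000):
    """DSP-phase sketch: sum of fixed-frequency sinusoids.

    `partials` is a list of (frequency_hz, amplitude) pairs. The values
    passed in are placeholders chosen by the caller, not measured
    parameters from the paper.
    """
    n_samples = round(duration_s * sample_rate)
    samples = []
    for t in range(n_samples):
        value = sum(
            amp * math.sin(2.0 * math.pi * freq * t / sample_rate)
            for freq, amp in partials
        )
        samples.append(value)
    return samples


if __name__ == "__main__":
    print(letters_to_phones("bunga"))
    wave = synthesize_sinusoids([(100.0, 0.5), (200.0, 0.25)], 0.01)
    print(len(wave))
```

In a fuller system, each phone produced by the first step would select synthesis parameters or a pre-recorded wave file from the phone database; that lookup is omitted here because the abstract does not specify it.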
CITED BY (4)  
1 Anoop, V., & Rao, P. V. (2013). Speech Signal Quality Improvement Using Cuckoo Search Algorithm. International Journal of Engineering Innovations and Research, 2(6), 519.
2 Acharjee, P. B., Talukdar, J., Das, A., & Talukdar, P. H. (2013, November). Dialect variation and associated G2P rules with reference to Bodo language. In Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013 International Conference (pp. 1-4). IEEE.
3 Ng, K. M., & Khairi, N. M. (2011, November). Decision rules for allophone synthesis of Malay text-to-speech system. In Control System, Computing and Engineering (ICCSCE), 2011 IEEE International Conference on (pp. 109-113). IEEE.
4 Azam, M. K. S. (2010). Transhumanism: natural language and innovative concepts in communication.
Mr. Othman O. Khalifa
- Malaysia
khalifa@iium.edu.my
Mr. Zakiah Hanim Ahmad
- Malaysia
Dr. Aisha-Hassan A. Hashim
- Malaysia
Mr. Teddy Suya Gunawan
- Malaysia

