Design of Educational Software for Automatic Speech Recognition (ASR) Techniques

Speech recognition is an important research subject, and it has matured to the point where it is actively applied in many industrial and consumer applications overseas. In Malaysia, however, speech recognition research is still in its infancy. The main reason is that speech recognition systems are highly complex, and teaching the subject, particularly its underlying technologies, is a challenging task. Currently, some instructors teach such courses using slide presentations and a whiteboard. By the end of the course, students have not been able to observe the actual output of the algorithms presented or to test the system in real time. As a result, students are not exposed to real technical systems and easily become bored. This research addresses the limitations and problems of traditional teaching methods in speech recognition by developing a set of interactive and practical educational software to guide and assist students in studying, performing experiments, and developing speech recognition applications.

Keywords: speech recognition, pattern recognition, object-oriented programming, graphical user interface, human-computer interaction
