Speech and Signal Processing


Advances in computing devices and information technology now make it possible to use speech, our most convenient means of communication, in human-computer interaction. The natural speech chain consists of a speaker, a listener and a channel between them. Automatic speech processing means that one of the participants in the speech chain is replaced by a machine.

A device that takes over the role of the speaker is called a speech synthesizer. Basically, two types of speech synthesizers are in use. The first type converts predefined text to speech. You can find such devices, for example, at railway stations, where the announcements coming from the loudspeaker are in fact produced by a speech synthesizer with a limited vocabulary.

The other group of speech synthesizers includes text-to-speech converters that can transform free text to speech. These are used, for instance, in applications that read books aloud.

Earlier, the sound in speech synthesizers was generated entirely by machine. Consequently, the resulting speech sounded artificial, although it could be easily understood. Nowadays, speech synthesizers form speech by combining small parts of a human speaker's voice. Today's research in this field aims at improving the naturalness of the generated speech. With sophisticated speech synthesizers, you may not even realize that you are not hearing human speech.
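The idea of forming speech by combining small recorded parts can be illustrated with a minimal Python sketch. It is not the method taught in the course: the unit names, the sample rate, the sine bursts standing in for recorded voice segments, and the linear crossfade at the joins are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000   # Hz; assumed value for this sketch
UNIT_LENGTH = 400    # samples per unit (50 ms at 8 kHz); assumed


def make_unit(freq_hz, length=UNIT_LENGTH):
    """Stand-in for a recorded speech unit: a short sine burst."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(length)]


def concatenate(units, overlap=40):
    """Join units, blending each join with a linear crossfade.

    The last `overlap` samples of the signal so far are mixed with the
    first `overlap` samples of the next unit to soften the transition.
    """
    out = list(units[0])
    for unit in units[1:]:
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)  # weight of the incoming unit
            out[-overlap + i] = (1 - w) * out[-overlap + i] + w * unit[i]
        out.extend(unit[overlap:])
    return out


# Hypothetical unit inventory; a real system would store recorded speech.
INVENTORY = {
    "a": make_unit(220.0),
    "b": make_unit(330.0),
    "c": make_unit(440.0),
}


def synthesize(unit_names, overlap=40):
    """Look up each named unit and concatenate them in order."""
    return concatenate([INVENTORY[name] for name in unit_names], overlap)


signal = synthesize(["a", "b", "c"])
```

In a real concatenative synthesizer the inventory would hold segments cut from a human speaker's recordings, and choosing and smoothing the units is considerably more involved than this crossfade.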

In this course you will get to know the basics of speech acoustics and phonetics, focusing on Hungarian. Hungarian is an agglutinating language, which makes it harder to recognize than English. You will also get an introduction to the types of speech synthesizers and the methods of synthesis. Then we will show you how to program a speech recognizer that recognizes the Hungarian words for numbers, ideally in a speaker-independent manner.

We believe that you will find this course useful because the signal processing methods you will get to know are not used only in speech processing. They can also be applied to other types of signals, such as noise or vibration.

On completion of this course you will be able to use speech processing devices and also to build them into new applications. After you learn the steps of training a speech recognizer, you will be able to program a speech recognizer for your own purposes.



Course Staff

Erika Baksáné Varga, Ph.D.

Assistant professor at the Institute of Information Technology at the University of Miskolc, Hungary. Has a special interest in databases and knowledge bases, knowledge-intensive systems, natural language processing and semantic modeling, and has 10 years of teaching experience in database management and SQL programming, data warehousing and BIS systems, and procedural programming.

Judit Mária Pintér, Ph.D.

Assistant lecturer at the Institute of Electrical Engineering at the University of Miskolc. Main field of research is speech processing and recognition. Involved in teaching communication theory and industrial wired and wireless communication systems.

László Czap, Ph.D.

Director of the Institute of Electrical Engineering and head of the Department of Automation and Infocommunication at the University of Miskolc. Main area of research is speech processing: audio-visual speech synthesis and recognition. Has 30 years of teaching experience in communication technology and image processing.

Recommended reading list

Németh, G., Olaszy, G.: A Magyar Beszéd, Akadémiai Kiadó, Budapest, 2010.

Young, S. et al.: The HTK Book (For Version 3.3), Cambridge University, 2005.

  1. Course Number

  2. Classes Start

    Nov 19, 2015
  3. Classes End

    Jan 20, 2016
  4. Estimated Effort