ABOUT THIS COURSE
Advances in computing devices and information technology now make it possible to use speech, the most convenient way for us to communicate, in human-computer interaction. The natural speech chain consists of a speaker, a listener and a channel between them.
Automatic speech processing means that one of the participants in the speech chain is replaced by a machine.
The device that can take up the role of the speaker is called a speech synthesizer. Basically, there are two types of speech synthesizers in use.
The first group consists of devices that convert predefined text to speech. You can find them, for example, at railway stations, where the announcements coming from the loudspeaker are in fact produced by a speech synthesizer with a limited vocabulary.
The other group of speech synthesizers includes text-to-speech converters that can transform free text to speech. These are used, for instance, in applications that read books aloud.
In earlier speech synthesizers, the sound was generated entirely by the machine. Consequently, the speech sounded artificial, although it could be easily understood. Nowadays, speech synthesizers form speech by combining small parts of a human speaker's voice. Today's research in this field aims at improving the naturalness of the generated speech. With sophisticated speech synthesizers, you may not even realize that what you hear is not human speech.
In this course you will get to know the basics of speech acoustics and phonetics, focusing on Hungarian. Hungarian is an agglutinating language, which makes it harder to recognize than English. You will also get an introduction to the types of speech synthesizers and the methods of synthesis. Then we will show you how to program a speech recognizer that recognizes the Hungarian words for numbers, hopefully in a speaker-independent manner.
We believe that you will find this course useful because the signal processing methods you will get to know are not limited to speech processing. They can also be applied to any type of signal, such as noise or vibration.
On completion of this course you will be able to use speech processing devices and also to build them into new applications. After you learn the steps of training a speech recognizer, you will be able to program a speech recognizer for your own purposes.