PshyMorphic.com

     
 

M.Sc.Eng (Electronic)

I completed my master's degree, M.Sc.Eng (Electronic), under the guidance of Professor J.A. du Preez.
During this time I was part of the digital signal processing (DSP) group at the University of Stellenbosch in South Africa.

I selected the long thesis option, which means you do research for two years and then submit a thesis to present your results. (The other option is to do subjects for a year and research for a year.)
I also took two subjects, pattern recognition and digital signal processing, since they were directly related to my research field.

Postgraduate Courses

Here are the reports and Matlab code for the tasks we had to complete as part of the courses.
Looking at these again, I feel somewhat embarrassed about their quality. It just goes to show how much you learn in three years. Still, I hope they might be useful to someone.

Pattern Recognition 813

Digital Signal Processing 813

Thesis

Abstract

Speaker recognition systems have evolved to a point where near-perfect performance can be obtained under ideal conditions, even if the system must distinguish between a large number of speakers. Under adverse conditions, such as when high noise levels are present or when the transmission channel deforms the speech, the performance is often less than satisfactory.

This project investigated the performance of a popular speaker recognition system that uses Gaussian mixture models on speech transmitted over a high frequency channel. Initial experiments demonstrated very unsatisfactory results for the baseline system.

We investigated a number of robust techniques. We implemented and applied some of them in an attempt to improve the performance of the speaker recognition systems. The techniques we tested showed only slight improvements.

We also investigated the effects of a high frequency channel and single sideband modulation on the speech features of speech processing systems. The effects that can deform the features, and therefore reduce the performance of speech systems, were identified.

One of the effects that can greatly affect the performance of a speech processing system is noise. We investigated some speech enhancement techniques and, as a result, developed a new statistically based speech enhancement technique that employs hidden Markov models to represent the clean speech process.

What can you learn?

As with all things in life, you should ask yourself what you can learn by reading this thesis.
What you will learn obviously depends on what you already know, but here is a list of what is offered:
  • Review of speech enhancement techniques, speech feature analysis and speaker modeling and classification.
  • Speech analysis: Analogue to digital conversion, preprocessing of speech (speech enhancement, framing, windowing and pre-emphasis) and speech features (linear prediction and cepstral features).
  • Speaker modeling and classification using Gaussian mixture models.
  • Robust feature analysis techniques, such as cepstral weighting, adaptive component weighting, cepstral mean subtraction, RASTA, PLP and many more.
  • Short introduction to high frequency (HF) channels.
  • Short introduction to single sideband (SSB) modulation.
  • A description of the effects that Gaussian noise, SSB and an HF channel have on linear prediction coefficient cepstral features.
  • Introduction to Markov chains (MC) and hidden Markov models (HMM).
  • Description of a new statistically based speech enhancement method using HMMs.
  • Short introduction to Hilbert transforms, vector quantisation and the expectation maximisation algorithm.
  • Many references to the literature on the above material.
Even if you are not interested in the core of the thesis, there is a lot of other material you might find useful or informative.
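
To give a flavour of a few items in the list above, here is a minimal Python sketch of three of the preprocessing steps (pre-emphasis, framing and windowing) and of cepstral mean subtraction. It is not taken from the thesis. The function names, the pre-emphasis coefficient of 0.97 and the choice of a Hamming window are assumptions made purely for illustration, and the cepstral features are assumed to have been computed already by some other routine.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1].

    The first sample is passed through unchanged. alpha=0.97 is a
    common textbook value, assumed here for illustration.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples.

    Trailing samples that do not fill a complete frame are dropped.
    """
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(frame_len):
    """Return a symmetric Hamming window of the given length."""
    if frame_len == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


def cepstral_mean_subtraction(features):
    """Subtract the per-coefficient mean, taken over all frames.

    `features` is a list of equal-length cepstral vectors, one per frame.
    """
    n_frames = len(features)
    dims = len(features[0])
    means = [sum(f[d] for f in features) / n_frames for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in features]
```

In a typical front end these steps run in the order shown: the signal is pre-emphasised, cut into frames, each frame is windowed, features are extracted per frame, and the mean is then removed from each coefficient across the utterance. A convolutive channel shows up as an additive offset in the cepstral domain, which is why subtracting the mean can reduce its influence.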

Download

The final version of the thesis is available in the following formats:
     

©2003-2008 Jan Pool

 

Last Modified: 21 January 2003