|
I did my master's degree (M.Sc.Eng (Electronic)) under the
guidance of Professor
J.A. du Preez.
During this time I was part of the digital
signal processing (DSP) group at the
University of Stellenbosch
in South Africa.
I selected the long thesis option, which means you do research for two
years and then submit a thesis to present your results. (The other
option is to do subjects for a year and research for a year.)
I also took two subjects, pattern recognition and digital signal
processing, since they were very directly related to my research field.
Here are the reports and Matlab code for the tasks we had to do as part
of the courses.
Looking at these again, I feel somewhat embarrassed about their
quality. Just goes to show how much you learn in three years. I hope
this might be useful to someone.
Pattern Recognition 813
Digital Signal Processing 813
Speaker recognition systems have evolved to a point where near-perfect
performance can be obtained under ideal conditions, even if the system
must distinguish between a large number of speakers. Under adverse
conditions, such as when high noise levels are present or when the
transmission channel deforms the speech, the performance is often less than
satisfactory.
This project investigated the performance of a popular speaker recognition
system that uses Gaussian mixture models on speech transmitted over a high
frequency channel. Initial experiments demonstrated very unsatisfactory
results for the baseline system.
We investigated a number of robust techniques. We implemented and
applied some of them in an attempt to improve the performance of the speaker
recognition systems. The techniques we tested showed only slight
improvements.
We also investigated the effects of a high frequency channel and
single sideband modulation on the speech features of speech processing
systems. The effects that can deform the features, and therefore reduce the
performance of speech systems, were identified.
One of the effects that can greatly affect the performance of a speech
processing system is noise. We investigated some speech enhancement
techniques and, as a result, developed a new statistically based speech
enhancement technique that employs hidden Markov models to represent the
clean speech process.
As with all things in life, you should ask yourself what you can learn by
reading this thesis.
What you will learn obviously depends on what you already know, but here is
a list of what is offered:
- Review of speech enhancement techniques, speech feature analysis
and speaker modeling and classification.
- Speech analysis: Analogue to digital conversion, preprocessing of
speech (speech enhancement, framing, windowing and pre-emphasis) and
speech features (linear prediction and cepstral features).
- Speaker modeling and classification using Gaussian mixture
models.
- Robust feature analysis techniques, such as cepstral weighting,
adaptive component weighting, cepstral mean subtraction, RASTA, PLP
and many more.
- Short introduction to high frequency (HF) channels.
- Short introduction to single sideband (SSB) modulation.
- A description of the effects that Gaussian noise, SSB and an HF
channel have on linear prediction coefficient cepstral features.
- Introduction to Markov chains (MC) and hidden Markov models
(HMM).
- Description of a new statistically based speech enhancement method
using HMMs.
- Short introduction to Hilbert transforms, vector quantisation and
the expectation maximisation algorithm.
- Many references to the literature on the above material.
Even if you are not interested in the core of the thesis, there is a
lot of other material you might find useful or informative.
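To give a flavour of the preprocessing steps listed above (pre-emphasis, framing and windowing) and of cepstral mean subtraction, here is a minimal Python sketch. It is not the code from the thesis. The function names, the pre-emphasis coefficient of 0.95, the Hamming window and the frame sizes are all assumptions chosen for illustration. Cepstral mean subtraction is included because a stationary channel shows up as a roughly constant offset in the cepstral domain, so removing the per-dimension mean over an utterance is a common way to reduce channel effects.

```python
import math


def pre_emphasis(signal, alpha=0.95):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through.

    alpha = 0.95 is an assumed, commonly used value.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(length):
    """Symmetric Hamming window of the given length."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


def cepstral_mean_subtraction(features):
    """Subtract the per-dimension mean, taken over all frames, from each vector."""
    if not features:
        return []
    count = len(features)
    dims = len(features[0])
    means = [sum(vec[d] for vec in features) / count for d in range(dims)]
    return [[vec[d] - means[d] for d in range(dims)] for vec in features]


if __name__ == "__main__":
    # Hypothetical test signal: a sinusoid standing in for sampled speech.
    speech = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(400)]
    window = hamming(160)
    frames = [
        apply_window(frame, window)
        for frame in frame_signal(pre_emphasis(speech), frame_len=160, hop=80)
    ]
    print(len(frames), "windowed frames of", len(frames[0]), "samples each")
```

In a full system, each windowed frame would be turned into a feature vector (for example, linear prediction cepstral coefficients) before cepstral_mean_subtraction is applied across the frames of an utterance.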
The final version of the thesis is available in the following formats:
|