Undergrad Research Project - Robust Physiologically-Based Speech Recognition

Fall 2014

Michael Kellman
Richard Stern

Project description

Over the years, work on robust speech recognition has explored several approaches based on physiological models of hearing as a way to improve recognition accuracy. Improvement is quantified as higher recognition accuracy under increasing background noise, degraded signal quality, and reverberation in the source environment. A common physiologically-motivated model of the auditory periphery is a bank of linear filters with varying characteristic frequencies and bandwidths. A more novel approach represents the cycle-by-cycle synchrony in the response of low-frequency auditory-nerve fibers. These measures of cycle-by-cycle timing have been shown to provide more robust features than the more common Mel-frequency cepstral coefficients (MFCCs). Combining this synchrony technique at low frequencies with mean-rate analysis at higher frequencies will yield a set of features similar to cepstra.
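The feature idea described above can be illustrated with a minimal sketch. It is not the project's actual front end: the two-pole resonators, the ERB bandwidth rule, the use of vector strength as the synchrony measure, and the 1000 Hz split between synchrony and mean-rate channels are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math

def erb_bandwidth(fc):
    # Assumption: bandwidths follow the Glasberg-Moore ERB rule (in Hz).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def resonator(signal, fs, fc, bw):
    # Two-pole linear bandpass filter centered near fc with bandwidth bw.
    r = math.exp(-math.pi * bw / fs)      # pole radius sets the bandwidth
    theta = 2.0 * math.pi * fc / fs       # pole angle sets the center frequency
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    gain = 1.0 - r                        # rough level normalization only
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def half_wave_rectify(x):
    # Crude stand-in for the one-sided response of an auditory-nerve fiber.
    return [max(v, 0.0) for v in x]

def mean_rate(rect):
    # Average rectified output over the whole analysis window.
    return sum(rect) / len(rect)

def synchrony(rect, fs, fc):
    # Vector strength: 0 means no phase locking to fc, 1 means perfect locking.
    total = sum(rect)
    if total == 0.0:
        return 0.0
    c = sum(v * math.cos(2.0 * math.pi * fc * n / fs) for n, v in enumerate(rect))
    s = sum(v * math.sin(2.0 * math.pi * fc * n / fs) for n, v in enumerate(rect))
    return math.hypot(c, s) / total

def features(signal, fs, center_freqs, split_hz=1000.0):
    # Synchrony below split_hz (an assumed cutoff), mean rate at or above it.
    feats = []
    for fc in center_freqs:
        rect = half_wave_rectify(resonator(signal, fs, fc, erb_bandwidth(fc)))
        feats.append(synchrony(rect, fs, fc) if fc < split_hz else mean_rate(rect))
    return feats

# Example: one 0.2 s analysis window of a 200 Hz tone sampled at 16 kHz.
fs = 16000
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(3200)]
print(features(tone, fs, [200.0, 400.0, 3000.0]))
```

In a real recognizer, one such feature vector would be computed per short frame rather than per utterance, and further processing would be needed before the features resemble cepstra.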

Implementations of these techniques, and improvements to those implementations, will be researched, tested, and validated using the Sphinx speech recognition system, with the goal of increasing recognition accuracy.
