A hierarchical approach for better estimation of unseen event likelihood in speech recognition
26 October 2003
Backoff hierarchical class n-gram language models (LMs) are a generalization of the common backoff word n-gram LMs. Whereas a traditional backoff word n-gram LM falls back on the (n-1)-gram to estimate the likelihood of an unseen n-gram event, a backoff hierarchical class n-gram LM uses a class hierarchy to define an appropriate context. We study the impact of the hierarchy depth on the performance of the approach. Performance is evaluated on several databases, including Switchboard, CallHome, and Wall Street Journal (WSJ). Results show that the largest improvement is achieved with a shallow word tree (few levels). Experiments show up to 26% improvement in perplexity on unseen events and up to 12% improvement in word error rate (WER).
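To make the backoff scheme concrete, the following is a minimal sketch of class-hierarchy backoff in a toy bigram setting. All names (pair_counts, node_counts, parent, BACKOFF_WEIGHT, FLOOR) and the counts are illustrative assumptions, not the paper's implementation: the point is that an unseen event backs off to the parent class of the context and climbs the tree, rather than truncating the context to the (n-1)-gram.

```python
from collections import defaultdict

# Hypothetical counts: keys are (context_node, word) pairs, where a context
# node is either a word or one of its ancestor classes in the hierarchy.
# In a real system, class-level counts come from summing word-level counts.
pair_counts = defaultdict(int, {
    ("stock", "market"): 8,          # bigram seen at the word level
    ("FINANCE_TERM", "rally"): 5,    # event seen only at the class level
})
node_counts = defaultdict(int, {"stock": 10, "FINANCE_TERM": 25, "NOUN": 100})

# Hypothetical shallow class tree: word -> class -> root class.
parent = {"stock": "FINANCE_TERM", "share": "FINANCE_TERM",
          "FINANCE_TERM": "NOUN"}

BACKOFF_WEIGHT = 0.4  # placeholder discount mass; a real LM estimates this
FLOOR = 1e-6          # floor probability once the root is reached

def prob(word, context):
    """P(word | context): use the word-level count if the event was seen;
    otherwise back off to the parent class of the context and climb the
    hierarchy, instead of dropping to the (n-1)-gram."""
    if pair_counts[(context, word)] > 0 and node_counts[context] > 0:
        return pair_counts[(context, word)] / node_counts[context]
    cls = parent.get(context)
    if cls is None:
        return FLOOR  # no ancestor left: out-of-hierarchy fallback
    return BACKOFF_WEIGHT * prob(word, cls)

print(prob("market", "stock"))  # seen bigram: 8/10
print(prob("rally", "stock"))   # unseen word bigram: 0.4 * 5/25 via the class
print(prob("crash", "share"))   # unseen at every level: climbs to the floor
```

In this sketch the recursion terminates at the root with a small floor probability; a properly normalized LM would replace the fixed BACKOFF_WEIGHT with discount mass estimated from the data. The depth of the parent map corresponds to the hierarchy depth whose effect the paper studies.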