A high-fidelity speech and audio codec with low delay and low complexity
01 January 2000
This paper presents a high-fidelity speech and audio codec operating at a sampling rate of 32 kHz and a bit rate of 64 kbit/s. Designed primarily for real-time speech communication systems with high port densities, this MDCT-based transform codec has a very low coding delay (8 ms frame size) and low complexity (less than 10 MIPS on a 16-bit fixed-point DSP). The codec achieves essentially transparent quality for speech and very close to transparent quality for music. A novel frame erasure concealment algorithm makes the codec robust to frame erasures for both speech and music. Another novel feature allows the decoder to decode the bit stream directly into a signal sampled at 16 kHz or 8 kHz, without first decoding a 32 kHz signal and then down-sampling it to the target rate. Other novel features include speed-memory trade-off techniques that reduce computational complexity.
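
The figures quoted in the abstract (32 kHz sampling, 64 kbit/s, 8 ms frames) imply a fixed per-frame budget, which the following minimal Python sketch works out. The function name `frame_budget` and its parameters are hypothetical, chosen only for illustration. The sketch assumes a constant bit rate with no side-channel overhead, and it says nothing about how the codec actually allocates bits or implements reduced-rate decoding.

```python
def frame_budget(sample_rate_hz: int, bit_rate_bps: int, frame_ms: float) -> dict:
    """Return samples and bits available per frame (illustrative helper)."""
    frame_s = frame_ms / 1000.0
    samples = round(sample_rate_hz * frame_s)  # samples covered by one frame
    bits = round(bit_rate_bps * frame_s)       # bits available for one frame
    return {
        "samples_per_frame": samples,
        "bits_per_frame": bits,
        "bits_per_sample": bits / samples,
    }


# Figures taken from the abstract: 32 kHz, 64 kbit/s, 8 ms frames.
full_rate = frame_budget(32_000, 64_000, 8)

# Output samples per 8 ms frame at the lower decoder rates named in the
# abstract (the bit stream, and hence the bit budget, is unchanged).
samples_16k = round(16_000 * 8 / 1000)
samples_8k = round(8_000 * 8 / 1000)

if __name__ == "__main__":
    print(full_rate)
    print(samples_16k, samples_8k)
```

Under these assumptions each 8 ms frame spans 256 samples at 32 kHz and carries 512 bits, an average of 2 bits per sample. A decoder producing 16 kHz or 8 kHz output would emit 128 or 64 samples per frame, respectively.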