A Dynamical Systems Approach to Speech Processing
The standard model of speech assumes that the speech signal is generated by a stochastic or periodic excitation fed into a linear all-pole filter, $y_t = \sum_{k=1}^{q} a_k y_{t-k} + x_t$. It is clear, however, that the excitation of this linear filter, $x_t$, is the output of a nonlinear dynamical system, namely the vocal cords. The relevance of the nonlinearities can be examined by directly measuring the dimensionality of the attractor of the speech LPC residual. By embedding the signal samples in high-dimensional Euclidean spaces, the LPC residual signal is shown to lie on a relatively low-dimensional manifold. The correlation dimension of this manifold (the attractor of the dynamics) varies between 2 and 5 for voiced speech and between 4 and 9 for unvoiced speech. This suggests the possibility of modeling the excitation using a low-dimensional, time-dependent, nonlinear dynamical system. Thus, every $p+1$ consecutive residual samples are approximately related by a nonlinear functional relation $F(x_{t-p}, x_{t-p+1}, \ldots, x_t; t) = 0$.
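
As an illustration of the measurement described above, the following is a minimal sketch rather than the authors' procedure. It assumes a least-squares fit of the all-pole coefficients, a unit-lag delay embedding, and a Grassberger-Procaccia style estimate of the correlation dimension (the slope of log C(r) against log r). The function names, the lag, the radii, and the synthetic test signal are all assumptions made for the example; no speech data is involved.

```python
import numpy as np

def lpc_residual(y, q):
    """Least-squares fit of y_t = sum_k a_k y_{t-k} + x_t; returns (a, x)."""
    y = np.asarray(y, dtype=float)
    # Column k-1 holds y_{t-k} for t = q, ..., len(y)-1
    past = np.column_stack([y[q - k : len(y) - k] for k in range(1, q + 1)])
    target = y[q:]
    a, *_ = np.linalg.lstsq(past, target, rcond=None)
    return a, target - past @ a

def delay_embed(x, dim, lag=1):
    """Rows are (x_t, x_{t+lag}, ..., x_{t+(dim-1)*lag})."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])

def correlation_dimension(points, radii):
    """Slope of log C(r) vs log r, where C(r) is the fraction of
    distinct point pairs closer than r (O(N^2) memory, small N only)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    dist = dist[np.triu_indices(len(points), k=1)]
    radii = np.asarray(radii, dtype=float)
    c = np.array([(dist < r).mean() for r in radii])
    ok = c > 0  # skip radii that contain no pairs, to avoid log(0)
    slope, _ = np.polyfit(np.log(radii[ok]), np.log(c[ok]), 1)
    return slope

# Synthetic stand-in for a speech frame: an AR(2) process driven by noise.
rng = np.random.default_rng(0)
noise = rng.standard_normal(600)
signal = np.zeros(600)
for t in range(2, 600):
    signal[t] = 1.2 * signal[t - 1] - 0.5 * signal[t - 2] + noise[t]

coeffs, residual = lpc_residual(signal, q=2)
embedded = delay_embed(residual, dim=3)
radii = np.geomspace(0.2, 1.0, 6) * residual.std()
print("estimated correlation dimension:", correlation_dimension(embedded, radii))
```

In practice the estimate would be repeated for increasing embedding dimensions and read off where it stops growing. A noise-driven residual like the synthetic one here is expected to keep filling the embedding space, whereas a residual lying on a low-dimensional attractor would saturate.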