Advances in Non-Linear Modeling for Speech Processing by Raghunath S. Holambe

By Raghunath S. Holambe

Advances in Non-Linear Modeling for Speech Processing covers advanced topics in non-linear estimation and modeling techniques, along with their applications to speaker recognition.

A non-linear aeroacoustic modeling approach is used to estimate the important fine-structure speech events, which are not revealed by the short-time Fourier transform (STFT). This aeroacoustic modeling approach provides the impetus for the high-resolution Teager energy operator (TEO). This operator is characterized by a time resolution that can track rapid signal energy changes within a glottal cycle.
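As a rough illustration of why the operator can follow fast energy changes, the sketch below uses the commonly cited three-sample discrete form of the TEO, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The description above does not state this formula, so it is an assumption here, as are the function name `teager_energy` and the test tone. For a pure tone A*cos(omega*n + phi), this form evaluates to A^2 * sin(omega)^2 at every interior sample.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples.

    Only three neighbouring samples are needed per output value, which is
    where the fine time resolution comes from.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Hypothetical test tone: amplitude A and digital frequency omega (rad/sample).
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n + 0.5) for n in range(50)]
psi = teager_energy(tone)

# For a pure tone the operator output is constant and depends on both
# amplitude and frequency: A^2 * sin(omega)^2.
expected = (A * math.sin(omega)) ** 2
print(max(abs(p - expected) for p in psi))
```

Because each output depends on only three samples, a change in amplitude or frequency shows up in the operator output almost immediately, unlike a frame-based STFT estimate.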

Cepstral features such as linear prediction cepstral coefficients (LPCC) and mel frequency cepstral coefficients (MFCC) are computed from the magnitude spectrum of the speech frame, and the phase spectrum is neglected. To overcome the problem of neglecting the phase spectrum, the speech production system can be represented as an amplitude modulation-frequency modulation (AM-FM) model. To demodulate the speech signal, that is, to estimate the amplitude envelope and instantaneous frequency components, the energy separation algorithm (ESA) and the Hilbert transform demodulation (HTD) algorithm are discussed.
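To make the demodulation step concrete, here is a minimal sketch of one discrete energy separation variant, usually called DESA-2, which estimates amplitude and frequency from Teager energies of the signal and of its symmetric difference. The description above does not say which ESA variant the book uses, so DESA-2 is an assumption, and the names `teo`, `desa2` and the test tone are hypothetical. The Hilbert transform approach is not shown.

```python
import math

def teo(x, n):
    """Teager energy of sequence x at index n."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]

def desa2(x):
    """Return (amplitude, frequency in rad/sample) estimates per sample.

    DESA-2 formulas (assumed variant):
        omega ~= 0.5 * arccos(1 - psi[y] / (2 * psi[x]))
        |a|   ~= 2 * psi[x] / sqrt(psi[y])
    where y[n] = x[n+1] - x[n-1].
    """
    # Symmetric difference, padded so that y shares indices with x.
    y = [0.0] + [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)] + [0.0]
    estimates = []
    for n in range(2, len(x) - 2):
        psi_x, psi_y = teo(x, n), teo(y, n)
        if psi_x <= 0.0 or psi_y <= 0.0:
            continue  # estimates are undefined here; skip the sample
        arg = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
        estimates.append((2.0 * psi_x / math.sqrt(psi_y), 0.5 * math.acos(arg)))
    return estimates

# Hypothetical test tone with known amplitude and frequency.
amp, omega = 1.5, 0.4
tone = [amp * math.cos(omega * n + 0.2) for n in range(64)]
est = desa2(tone)
print(est[0])
```

On a constant-amplitude, constant-frequency tone the estimates recover the true values up to rounding error; on real speech the signal is normally band-pass filtered around one formant first, since the method assumes a single AM-FM component.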

Different features derived using the above non-linear modeling techniques are used to develop a speaker identification system. Finally, it is shown that the fusion of speech production and speech perception mechanisms can lead to a robust feature set.

Linear prediction). The noise or error term w(k) represents the degree of inaccuracy in using the linear state equation to describe the true state dynamics. The observation equation Eq. 20, on the other hand, is static in nature. It contains no dynamics since the time indices of o and x are the same. It represents the noisy relationship between the state vector and the observation vector. Like the noise term in the state equation, the noise term v in the observation equation represents the degree of inaccuracy in using the linear mapping o(k) = Cx(k) to describe the true relationship between the state and observation vectors.
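The two equations described above can be sketched as a small simulation. The excerpt does not show the state equation itself, so the form x(k+1) = A x(k) + w(k) is an assumption, as are the Gaussian noise terms, the function name `simulate`, and all numeric values; only the observation mapping o(k) = C x(k) plus a noise term v(k) comes from the text.

```python
import random

def simulate(A, C, x0, steps, state_std, obs_std, seed=0):
    """Simulate a linear state-space model.

    Assumed state equation:   x(k+1) = A x(k) + w(k)   (dynamic)
    Observation equation:     o(k)   = C x(k) + v(k)   (static, same index k)
    w and v are drawn as zero-mean Gaussian noise (an assumption).
    """
    rng = random.Random(seed)
    x = list(x0)
    states, observations = [], []
    for _ in range(steps):
        # Observation uses the state at the same time index: no dynamics.
        o = [sum(c * xi for c, xi in zip(row, x)) + rng.gauss(0.0, obs_std)
             for row in C]
        states.append(x)
        observations.append(o)
        # State update carries the dynamics from k to k+1.
        x = [sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, state_std)
             for row in A]
    return states, observations

# Hypothetical 2-dimensional state observed through its first component.
A = [[0.9, 0.1], [0.0, 0.8]]
C = [[1.0, 0.0]]
states, observations = simulate(A, C, [1.0, 1.0], steps=5,
                                state_std=0.05, obs_std=0.1)
print(observations)
```

Setting both noise levels to zero reduces the model to the exact linear mapping, which is a convenient way to check that the two noise terms are the only source of mismatch between the model and the simulated data.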

For example, measurements by Teager reveal the presence of separated flow within the vocal tract [49]. Separated flow occurs when a region of fast-moving fluid (a jet) detaches from regions of relatively stagnant fluid. When this occurs, viscous forces (neglected by linear models) create a tendency for the fluid to 'roll up' into rotational fluid structures commonly referred to as vortices, as shown in Fig. 3b. Teager suggested that the presence of traveling vortices ('smoke rings') could result in additional acoustic sources throughout the vocal tract.

