Journal of Biomechanics

Volume 74, 6 June 2018, Pages 126-133

Stochastic mechanical model of vocal folds for producing jitter and for identifying pathologies through real voices

https://doi.org/10.1016/j.jbiomech.2018.04.031

Abstract

Jitter, in voice production applications, is a random phenomenon characterized by the deviation of the glottal cycle length from a mean value. Its study can help in identifying pathologies of the vocal folds, according to the values obtained with the different ways of measuring it. This paper proposes a stochastic model with three control parameters that generates jitter, based on a deterministic one-mass model of the vocal-fold dynamics, and identifies the parameters of the stochastic model from experimentally obtained real voice signals. To solve the corresponding stochastic inverse problem, the cost function is based on the distance between the probability density functions of the random variables associated with the fundamental frequencies of the experimental and simulated voices, and on the distance between the features, extracted from the simulated and experimental voice signals, that are used to calculate jitter. The results show that the proposed model is valid, and some voice samples are synthesized with the identified parameters for normal and pathological cases. The strategy adopted is also a novelty, mainly because a solution was obtained. A further novelty, in addition to the use of three parameters to construct the jitter model, is the discussion of a parameter related to the bandwidth of the power spectral density function of the stochastic process, used to measure the quality of the generated signal. A study of the influence of all the main parameters is also performed. Of all the novelties introduced by the paper, the identification of the model parameters for pathological cases is perhaps the most interesting.

Introduction

The production of a voiced sound starts when the airflow coming from the lungs passes through the glottis, where the vocal folds are located, and is converted into the glottal signal, a quasi-periodic signal. The main examples of voiced sounds are the vowels, and this paper is based on their production.

The acoustic pressure signal, after passing the vocal folds, is filtered and amplified by the vocal tract and then radiated from the mouth, producing the voice signal. As the displacements of the vocal folds are not exactly symmetric, the time intervals corresponding to the air pulses of the glottal signal have random fluctuations, called jitter.

There are different ways to measure jitter, and its study is important for identifying irregularities in phonation. For a normal voice, jitter lies between 0.1% and, at most, 1% of the mean glottal cycle duration. Other acoustic measures, such as shimmer and HNR (harmonics-to-noise ratio), can also be used to help identify pathologies of the vocal folds or vocal aging, or to help with speaker recognition and the detection of stress in the voice. However, the main feature that should be considered is jitter (Wong et al., 1991, Jiang et al., 2009, Dejonckere et al., 2012, Mongia and Sharma, 2014, Silva et al., 2016), and this paper focuses on its generation.
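As an illustration of the percentage figures quoted above, the following minimal sketch computes one widely used jitter measure: the mean absolute difference between consecutive glottal cycle durations, divided by the mean cycle duration. This excerpt does not specify which jitter measures the paper uses, so the choice of this "local" measure, the function name `local_jitter_percent`, and the synthetic cycle durations are all assumptions made for illustration.

```python
import numpy as np

def local_jitter_percent(periods):
    """Mean absolute difference between consecutive cycle durations,
    divided by the mean cycle duration, expressed in percent.

    This is one common jitter definition, assumed here for illustration.
    """
    periods = np.asarray(periods, dtype=float)
    if periods.size < 2:
        raise ValueError("need at least two glottal cycles")
    mean_abs_diff = np.mean(np.abs(np.diff(periods)))
    return 100.0 * mean_abs_diff / np.mean(periods)

# Hypothetical cycle durations (seconds): about 8 ms per cycle (roughly 125 Hz)
# with small random fluctuations. These are synthetic, not measured, values.
rng = np.random.default_rng(0)
periods = 0.008 * (1.0 + 0.004 * rng.standard_normal(200))

jitter = local_jitter_percent(periods)
print(f"local jitter = {jitter:.2f} %")
```

A perfectly periodic signal gives 0% under this definition, and larger cycle-to-cycle fluctuations give larger values.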

Some models of jitter have been proposed but, in general, they are not mechanical models; instead, they are built directly on the voice signals, using perturbations such as a controlled noise (Schoentgen and De Guchteneere, 1997).

Some mechanical models of jitter have been proposed by the authors of this paper (Cataldo et al., 2012, Cataldo and Soize, 2016, Cataldo and Soize, 2017). Here, a new mechanical stochastic model with three control parameters is proposed, which gives more possibilities for generating jitter, including a way to change the quality of the generated voice. A new parameter, related to the bandwidth of the power spectral density function, is introduced to discuss this quality and, most importantly, an inverse stochastic problem is solved to identify the parameters and, consequently, to validate the proposed model. With these new possibilities, specific pathologies of the vocal folds, such as paralysis of the vocal folds, can be created and identified.

The stochastic model proposed here originates from the deterministic model created by Flanagan and Landgraf (1968), known as the first model to generate voice using a nonlinear one-mass mechanical model. More complete deterministic models have since been created (Ishizaka and Flanagan, 1972, Avanzini, 2008; Zhang and Jiang, 2008, Pickup and Thomson, 2009; Cveticanin, 2012, Erath et al., 2013, Pinheiro and Kerschen, 2013), some of them considering pathological cases of the vocal folds (Gunter, 2004) or stress situations (Luzan et al., 2015). The idea here, however, is to show that jitter and a good-quality voice signal can be generated from the primary model by treating the stiffness as a stochastic process and, most importantly, to validate the proposed model by identifying its parameters through a statistical inverse problem that uses experimental voices, both normal and with pathological characteristics.

Primary deterministic model

Fig. 1 illustrates a sketch of the model.

Each vocal fold is represented by a nonlinear mass-stiffness-damper system, and the complete model is composed of the subsystem of the vocal folds (source) coupled by the glottal flow to the subsystem of the vocal tract (filter). To generate jitter, the stiffness is treated as a stochastic process, for which a model is proposed.
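To make the idea of a random, strictly positive stiffness concrete, here is a minimal sketch. It is not the stochastic model of the paper, whose three control parameters are not defined in this excerpt. As a stand-in, it assumes a lognormal transform of an Ornstein-Uhlenbeck process, with a hypothetical mean stiffness, mass, coefficient of variation `delta`, and correlation time `corr_time`. The natural frequency of the corresponding undamped linear oscillator is used only as a rough proxy for how stiffness fluctuations perturb the oscillation frequency.

```python
import numpy as np

def simulate_stiffness(k_mean, delta, corr_time, dt, n_steps, seed=0):
    """Sample a positive stiffness path K(t) with mean k_mean.

    Assumed stand-in model: K = k_mean * exp(s*Z - s**2/2), where Z is a
    stationary, unit-variance Ornstein-Uhlenbeck process with correlation
    time corr_time, and s is chosen so that K has coefficient of
    variation delta.
    """
    rng = np.random.default_rng(seed)
    s = np.sqrt(np.log(1.0 + delta**2))
    rho = np.exp(-dt / corr_time)           # one-step autocorrelation of Z
    noise = rng.standard_normal(n_steps - 1)
    z = np.empty(n_steps)
    z[0] = rng.standard_normal()            # start in the stationary law
    for i in range(1, n_steps):
        z[i] = rho * z[i - 1] + np.sqrt(1.0 - rho**2) * noise[i - 1]
    return k_mean * np.exp(s * z - 0.5 * s**2)

# Hypothetical values chosen only for illustration.
MASS = 2.4e-4      # kg
K_MEAN = 40.0      # N/m

stiffness = simulate_stiffness(K_MEAN, delta=0.05, corr_time=0.01,
                               dt=1e-4, n_steps=20000)
# Natural frequency of the undamped linear oscillator, as a rough proxy.
natural_freq = np.sqrt(stiffness / MASS) / (2.0 * np.pi)

print(f"min stiffness      = {stiffness.min():.2f} N/m")
print(f"frequency mean/std = {natural_freq.mean():.2f} / {natural_freq.std():.2f} Hz")
```

In this stand-in, a shorter `corr_time` widens the power spectral density of the stiffness process, while `delta` sets the size of the fluctuations.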

Stochastic modeling of jitter

The stiffness k is modeled by a stochastic process $\{K(t), t \in \mathbb{R}\}$ with values in $\mathbb{R}^+$. Consequently, the dynamical position of each vocal fold will be given by a stochastic process, named $X(t)$, coupled with the stochastic process associated with the glottal flow (volume flow velocity), noted $U_g(t)$. The stochastic dynamics of the vocal folds is described by Eq. (1):

$$m\,\frac{d^2X(t)}{dt^2} + \{c + c(X(t))\}\,\frac{dX(t)}{dt} + K(t)\,X(t) + a_1\,p_B(X(t),U_g(t)) = a_2\,p_s(t), \tag{1}$$

where $a_1 = 1.87\,\ell d/2$ and $a_2 = \ell d/2$, with $\ell$ the length of each vocal fold and $d$

General ideas

The objective of this section is to generate voice signals with jitter using the proposed stochastic model and to analyze the sensitivity of the model with respect to parameters a, b, and ξ. As the main idea is to generate jitter, a way to measure it is also discussed. There are different ways to analyze jitter effects (Mongia, 2012). At first, it is important to define the random variable associated with the duration of the glottal cycle, which is defined as the duration between

Statistical inverse problem

To validate the proposed model, parameters a, b, and ξ are identified using experimental voice signals. This identification is carried out by introducing a cost function constructed by requiring that the probability density function associated with the simulated voice be close to that of the experimental voice and that the jitter obtained for the simulated voice be close to the jitter of the experimental voice. The four measures of jitter are used. The
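The structure of such a cost function can be sketched as follows. The exact distances and weighting used in the paper are not given in this excerpt, so this sketch assumes a squared L2 distance between histogram estimates of the two fundamental-frequency densities, plus a sum of squared relative errors over the jitter measures. The function name `cost`, the `weight` parameter, the bin count, and the synthetic data are all hypothetical.

```python
import numpy as np

def cost(f0_sim, f0_exp, jitter_sim, jitter_exp, weight=1.0, n_bins=30):
    """Assumed cost: PDF mismatch plus weighted jitter mismatch.

    f0_sim, f0_exp         : samples of fundamental frequency (Hz)
    jitter_sim, jitter_exp : one value per jitter measure (same order)
    """
    f0_sim = np.asarray(f0_sim, dtype=float)
    f0_exp = np.asarray(f0_exp, dtype=float)
    lo = min(f0_sim.min(), f0_exp.min())
    hi = max(f0_sim.max(), f0_exp.max())
    if hi == lo:                      # degenerate case: all samples equal
        hi = lo + 1.0
    edges = np.linspace(lo, hi, n_bins + 1)
    width = edges[1] - edges[0]
    # Histogram estimates of the two densities on a common grid.
    p_sim, _ = np.histogram(f0_sim, bins=edges, density=True)
    p_exp, _ = np.histogram(f0_exp, bins=edges, density=True)
    pdf_term = np.sum((p_sim - p_exp) ** 2) * width

    j_sim = np.atleast_1d(np.asarray(jitter_sim, dtype=float))
    j_exp = np.atleast_1d(np.asarray(jitter_exp, dtype=float))
    jitter_term = np.sum(((j_sim - j_exp) / j_exp) ** 2)
    return pdf_term + weight * jitter_term

# Synthetic stand-ins for experimental and simulated data (not real voices).
rng = np.random.default_rng(1)
f0_exp = 125.0 + 0.8 * rng.standard_normal(500)
f0_close = 125.0 + 0.8 * rng.standard_normal(500)
f0_far = 135.0 + 3.0 * rng.standard_normal(500)

cost_close = cost(f0_close, f0_exp, [0.50, 0.30], [0.52, 0.31])
cost_far = cost(f0_far, f0_exp, [2.10, 1.40], [0.52, 0.31])
print(f"cost (close candidate) = {cost_close:.4f}")
print(f"cost (far candidate)   = {cost_far:.4f}")
```

In an identification loop, each candidate parameter set would be used to simulate a voice signal, and the set with the lowest cost would be retained. The search strategy itself is not described in this excerpt.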

Conclusions

A stochastic model with three control parameters has been proposed for generating jitter, based on a mechanical model for producing voiced sounds. Some pathological cases have been generated, and the model has been validated by solving an inverse stochastic problem to identify the parameters. With three control parameters, a wider range of sounds can be obtained, including different levels of jitter, and, most importantly, the quality of the synthesized voice can be controlled. The

Conflict of interest statement

The authors disclose any financial and personal relationships with other people or organisations that could inappropriately influence their work.

Acknowledgments

This work was supported by CNPq (Conselho Nacional de Pesquisa e Desenvolvimento) – Brazil.

References (25)

  • Cveticanin, L., 2012. Review on mathematical and mechanical models of the vocal cord....
  • Dejonckere, P.H., et al., 2012. To what degree of voice perturbation are jitter measurements valid? A novel approach with synthesized vowels and visuo-perceptual pattern recognition. Biomed. Signal Process. Control.