Research Article | Volume 37, Issue 3, P314-321, May 2023

Automatic Classification of Healthy Subjects and Patients With Essential Vocal Tremor Using Probabilistic Source-Filter Model Based Noise Robust Pitch Estimation

Published: February 10, 2021 · DOI: https://doi.org/10.1016/j.jvoice.2021.01.009

      Abstract

      Essential voice tremor (EVT) is a voice disorder resulting from dyscoordination within the laryngeal musculature. Its main consequence is a low-frequency fluctuation of the fundamental voice frequency or of the strength of the excitation amplitude. Automatic classification of healthy controls and EVT patients is a useful tool for clinicians. A typical automatic EVT classification involves three steps: computing the pitch contour from the speech, computing features from the pitch contour, and using a classifier to label the features as healthy or EVT. It has been shown that a high-resolution pitch contour (HRPC) estimated from the glottal closure instants (GCIs) is useful for EVT classification. However, HRPC estimation can be very poor in the presence of noise. Hence, a probabilistic source-filter model based noise-robust GCI detection is used for HRPC estimation. Empirical mode decomposition based feature extraction is then used, followed by a support vector machine classifier. The EVT classification performance is evaluated using recordings from 45 subjects. The proposed method is found to perform better than the baseline techniques in eight different additive noise conditions with six SNR levels.
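As a rough illustration of two ideas in the abstract, the sketch below derives a per-cycle pitch contour from GCI times (one pitch value per glottal cycle, taken as the reciprocal of the interval between consecutive GCIs) and mixes noise into a signal at a chosen SNR. It is not the authors' implementation: the function names, the synthetic tremor generator, and all parameter values (150 Hz base pitch, 5 Hz tremor, 5% depth) are assumptions made for illustration, and the probabilistic source-filter model GCI detector itself is not reproduced here.

```python
import numpy as np


def pitch_contour_from_gcis(gci_times):
    """Return (times, f0): one pitch value per glottal cycle.

    gci_times: strictly increasing GCI instants in seconds.
    Each f0 value is 1 / (interval between consecutive GCIs).
    """
    gci_times = np.asarray(gci_times, dtype=float)
    periods = np.diff(gci_times)
    if np.any(periods <= 0):
        raise ValueError("GCI times must be strictly increasing")
    f0 = 1.0 / periods
    # Place each pitch value at the midpoint of its glottal cycle.
    times = gci_times[:-1] + periods / 2.0
    return times, f0


def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal power / noise power matches snr_db, then add it."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as signal")
    noise = noise[: len(signal)]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))
    return signal + scale * noise


def synthetic_gcis(duration=2.0, base_f0=150.0, tremor_hz=5.0, depth=0.05):
    """Hypothetical GCI train whose pitch is modulated by a slow sinusoid.

    All defaults are illustrative assumptions, not values from the article.
    """
    t, gcis = 0.0, []
    while t < duration:
        gcis.append(t)
        f0 = base_f0 * (1.0 + depth * np.sin(2.0 * np.pi * tremor_hz * t))
        t += 1.0 / f0
    return np.array(gcis)


if __name__ == "__main__":
    times, f0 = pitch_contour_from_gcis(synthetic_gcis())
    print(f"{len(f0)} cycles, f0 range {f0.min():.1f} to {f0.max():.1f} Hz")
```

In a full pipeline along the lines the abstract describes, the contour returned by `pitch_contour_from_gcis` would feed a feature extractor and then a classifier; those stages are left out here because the abstract does not specify their details.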
