Automated Assessment of Glottal Dysfunction Through Unified Acoustic Voice Analysis

Published: September 23, 2020. DOI: https://doi.org/10.1016/j.jvoice.2020.08.032

      Abstract

      This paper applies a recently proposed glottal flow model for iterative adaptive inverse filtering to recordings from speakers with vocal dysfunction, namely those with larynx-related impairments such as laryngectomy. The analytical model allows extraction of the voice source spectrum, which is described by a compact set of parameters. This single model is used to visualize and better understand speech production characteristics across impaired and nonimpaired voices. The analysis reveals discriminative aspects of the source model that map to a physiological class description of those impairments. Furthermore, because it is based on source parameters only, the analysis is complementary to existing techniques of vocal-tract or phonetic analysis. The results indicate potential for future automated speech reconstruction systems that adapt to the method of reconstruction required. They are also relevant to mainstream speech systems, such as automatic speech recognition (ASR), in which front-end analysis can direct back-end models to suit the characteristics of impaired speech.
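To make the idea of inverse filtering concrete, the following is a minimal sketch of a single linear-prediction inverse-filtering pass, which is the building block that iterative adaptive inverse filtering applies repeatedly. It is not the glottal flow model or the procedure used in the paper. The synthetic signal, the sampling rate, the 700 Hz resonance, and the helper names (`lpc_autocorr`, `allpole_filter`, `fir_filter`) are all assumptions chosen for illustration.

```python
import numpy as np

def lpc_autocorr(x, order):
    """Estimate all-pole coefficients [1, a1, ..., ap] by the autocorrelation method."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    acf = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -acf[1:]
    R = np.array([[acf[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -acf[1:])
    return np.concatenate(([1.0], a))

def allpole_filter(a, x):
    """Filter x through 1/A(z); a stand-in for a (hypothetical) vocal tract."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def fir_filter(b, x):
    """Filter x through A(z); this is the inverse filter."""
    return np.convolve(x, b)[: len(x)]

# Illustrative synthetic "speech": an impulse train (crude source)
# shaped by one resonance (crude vocal tract). All values are assumptions.
fs = 8000                      # sampling rate in Hz
period = fs // 100             # 100 Hz fundamental frequency
excitation = np.zeros(1600)
excitation[::period] = 1.0

pole_radius = 0.95
theta = 2 * np.pi * 700 / fs   # 700 Hz resonance
a_true = np.array([1.0, -2 * pole_radius * np.cos(theta), pole_radius ** 2])
speech = allpole_filter(a_true, excitation)

# Estimate the resonance from the signal alone, then remove it.
a_est = lpc_autocorr(speech, order=2)
residual = fir_filter(a_est, speech)

# The residual approximates the source, so it carries far less energy
# than the resonant signal it was extracted from.
energy_ratio = np.sum(residual ** 2) / np.sum(speech ** 2)
print("estimated coefficients:", a_est)
print("residual / signal energy:", energy_ratio)
```

In this toy setting the residual is a rough estimate of the excitation. Real glottal source estimation is considerably harder, because the source itself has spectral shape that a single linear-prediction pass cannot separate from the vocal tract, which is the problem iterative and model-based schemes aim to address.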

