Fundamental Frequency Estimation of Low-quality Electroglottographic Signals

  • Christian T. Herbst
    Correspondence
    Address correspondence and reprint requests to Christian T. Herbst, Bioacoustics Laboratory, Department of Cognitive Biology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria.
    Affiliations
    Bioacoustics Laboratory, Department of Cognitive Biology, University of Vienna, Vienna, Austria
  • Jacob C. Dunn
    Affiliations
    Behavioural Ecology Research Group, Department of Biology, Faculty of Science & Technology, Anglia Ruskin University, Cambridge, UK

    Division of Biological Anthropology, University of Cambridge, Cambridge, UK

      Summary

      Fundamental frequency (fo) is often estimated from electroglottographic (EGG) signals. Because of the nature of the method, the quality of EGG signals may be impaired by artifacts such as amplitude or baseline drifts, mains hum, or noise. The potential adverse effects of these factors on fo estimation have not previously been investigated.
      Here, the performance of 13 algorithms for estimating fo was tested, based on 147 synthesized EGG signals with varying degrees of signal quality deterioration. Algorithm performance was assessed through the standard deviation σfo of the difference between known and estimated fo data, expressed in octaves.
      With very few exceptions, simulated mains hum and amplitude and baseline drifts did not influence the fo results, although some algorithms consistently outperformed others. When either cycle-to-cycle fo variation or the degree of subharmonics was increased, the SIGMA algorithm performed best (max. σfo = 0.04). That algorithm was, however, more easily disturbed by typical EGG equipment noise, whereas the NDF algorithm and Praat's auto-correlation algorithm performed best in this category (σfo = 0.01).
      These results suggest that the algorithm for fo estimation of EGG signals needs to be selected specifically for each particular data set. Overall, estimated fo data should be interpreted with care.
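
      The performance measure described in the summary, σfo, can be illustrated with a short sketch. It is not the authors' implementation: it assumes that the difference between estimated and known fo is expressed in octaves as log2(estimated / known), that the population standard deviation is used, and that frames without a valid positive fo are skipped. The function name sigma_fo_octaves and its parameters are hypothetical.

```python
import numpy as np


def sigma_fo_octaves(fo_known_hz, fo_estimated_hz):
    """Standard deviation of the octave-scaled fo estimation error.

    Assumptions (not taken from the article): the error per frame is
    log2(estimated / known), and the population standard deviation
    (ddof=0) is reported.
    """
    known = np.asarray(fo_known_hz, dtype=float)
    estimated = np.asarray(fo_estimated_hz, dtype=float)
    if known.shape != estimated.shape:
        raise ValueError("fo tracks must have the same shape")

    # Keep only frames where both tracks hold a finite, positive frequency.
    valid = (
        np.isfinite(known) & np.isfinite(estimated)
        & (known > 0) & (estimated > 0)
    )
    if not valid.any():
        raise ValueError("no frames with valid fo in both tracks")

    # A ratio of 2 corresponds to +1 octave, a ratio of 0.5 to -1 octave.
    error_octaves = np.log2(estimated[valid] / known[valid])
    return float(np.std(error_octaves))


if __name__ == "__main__":
    known = [100.0, 100.0, 100.0, 100.0]
    estimated = [100.0, 200.0, 100.0, 50.0]  # one octave jump up, one down
    print(sigma_fo_octaves(known, estimated))
```

      Under these assumptions, a track that is wrong by a constant factor in every frame (for example, consistently one octave too high) yields a value of zero, because a standard deviation measures the spread of the error rather than its mean offset.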

