A Comparative Analysis of Pitch Detection Methods Under the Influence of Different Noise Conditions

  • Lyudmila Sukhostat
    Correspondence
    Address correspondence and reprint requests to Lyudmila Sukhostat, Institute of Information Technology, Azerbaijan National Academy of Sciences, 9, B. Vahabzade Street, Baku AZ1141, Azerbaijan.
    Affiliations
    Department of Information Security, Institute of Information Technology, Azerbaijan National Academy of Sciences, Baku, Azerbaijan
  • Yadigar Imamverdiyev
    Affiliations
    Department of Information Security, Institute of Information Technology, Azerbaijan National Academy of Sciences, Baku, Azerbaijan
Published: February 20, 2015
DOI: https://doi.org/10.1016/j.jvoice.2014.09.016

      Summary

      Objectives/Hypothesis

      Pitch is one of the most important components in various speech processing systems. The aim of this study was to evaluate different pitch detection methods under various noise conditions.

      Study Design

      Prospective study.

      Methods

      Time-domain, frequency-domain, and hybrid pitch detection algorithms were evaluated on the Keele and CSTR speech databases. Each method has its own advantages and disadvantages.
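
      As an illustration of the time-domain family named above, the following is a minimal sketch of an autocorrelation peak-picking estimator. It is not one of the specific algorithms evaluated in this study; the function name `estimate_pitch_autocorr`, the 50 to 500 Hz search range, and the single-frame interface are assumptions made for the example.

```python
import math


def estimate_pitch_autocorr(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate F0 in Hz for one frame by picking the autocorrelation peak.

    Hypothetical helper for illustration only. Returns 0.0 when no
    positive autocorrelation peak is found (for example, a silent frame).
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]                    # remove DC offset
    lag_min = max(1, int(sample_rate / f_max))       # shortest period searched
    lag_max = min(int(sample_rate / f_min), n - 1)   # longest period searched
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        value = sum(x[i] * x[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag == 0:
        return 0.0
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (50 ms frame).
tone = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(400)]
f0 = estimate_pitch_autocorr(tone, 8000)
```

      A plain autocorrelation peak is sensitive to noise and to octave errors, which is one reason frequency-domain and hybrid methods are compared alongside time-domain ones.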

      Results

      Experiments showed that the BaNa method achieved the highest pitch detection accuracy.

      Conclusions

      The development of pitch detection methods that are robust to additive noise at different signal-to-noise ratios is an important field of research, with many opportunities for enhancing modern methods.
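
      One common way to build such noise conditions is to scale a noise recording so that the speech-to-noise power ratio matches a target value before adding it to clean speech. The sketch below assumes this approach; the summary does not describe the mixing procedure actually used, and the name `mix_at_snr` is hypothetical.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB.

    Hypothetical helper for illustration only. Both inputs are sequences
    of float samples; `noise` must be at least as long as `speech`.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)   # mean speech power
    p_noise = sum(v * v for v in noise) / len(noise)      # mean noise power
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Choose the gain so that p_speech / (gain**2 * p_noise) == 10**(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]


# Usage: corrupt a synthetic 200 Hz tone with white Gaussian noise at 10 dB SNR.
random.seed(0)
clean = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(800)]
white = [random.gauss(0.0, 1.0) for _ in range(800)]
noisy = mix_at_snr(clean, white, 10.0)
```

      Sweeping `snr_db` over a range of values and re-running a pitch detector on each mixture gives an accuracy-versus-SNR curve for that detector.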

      Key Words

