Cepstral Peak Sensitivity: A Theoretic Analysis and Comparison of Several Implementations

      Summary

      Objective

      The aim of this study was to develop a theoretic analysis of the cepstral peak (CP), to compare several CP software programs, and to propose methods for reducing variability in CP estimation.

      Study Design

      Descriptive, experimental study.

      Methods

      The theoretic CP value of a pulse train was derived and compared with estimates computed for pulse train WAV files using available CP software programs: (1) Hillenbrand's CP prominence (CPP) software (Western Michigan University, Kalamazoo, MI), (2) the KayPENTAX (Montvale, NJ) Multi-Speech implementation of CPP, and (3) a MATLAB (The MathWorks, Natick, MA, version R2014a) implementation using cepstral interpolation. The CP variation was also investigated for synthetic breathy vowels.

      Results

      For pulse trains with period T samples, the theoretic CP is 1/2 + ε/T, where |ε| < 0.1 for all pulse trains (ε = 0 for integer T). For fundamental frequencies between 70 and 230 Hz, the CP mean ± standard deviation was 0.496 ± 0.002 using cepstral interpolation and 0.29 ± 0.03 using Hillenbrand's software, whereas CPP was 35.0 ± 3.8 dB using Hillenbrand's software and 20.5 ± 2.7 dB using KayPENTAX's software. The CP and CPP versus signal-to-noise ratio for synthetic breathy vowels were fit to a logistic model for the Hillenbrand (R² = 0.92) and KayPENTAX (R² = 0.82) estimators as well as an ideal estimator (R² = 0.98), which used a period-synchronous analysis.
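
      The value 1/2 for integer T can be checked numerically with a short sketch. An undamped pulse train has spectral zeros between its harmonics, so its log spectrum is undefined; the sketch therefore uses a damped pulse train with a hypothetical per-period decay factor a, for which the real cepstrum (natural log assumed) at quefrency T equals a/2 and approaches 1/2 as a approaches 1. This is one textbook route to the number and is not necessarily the derivation used in the study; the names T, M, and a are illustrative choices.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse DFT of the natural-log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(x)))
    return np.fft.ifft(log_mag).real

T = 100   # period in samples (integer, so epsilon = 0)
M = 400   # number of periods in the frame (assumed; long enough that a**M is negligible)
a = 0.9   # hypothetical per-period damping factor; a -> 1 approaches an ideal pulse train

# Damped pulse train: amplitude a**j at sample j*T, zero elsewhere.
x = np.zeros(M * T)
x[::T] = a ** np.arange(M)

c = real_cepstrum(x)
cp = c[T]                                              # cepstral value at the first rahmonic
peak_index = T // 2 + int(np.argmax(c[T // 2:3 * T // 2]))

print(f"peak at quefrency {peak_index}, CP = {cp:.6f}, a/2 = {a / 2:.6f}")
```

      Raising a toward 1 moves the printed CP toward 1/2, at the cost of a longer frame (larger M) to keep the truncated tail negligible.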

      Conclusions

      The findings indicate that several variables unrelated to the signal itself impact CP values, with some factors introducing large variability in CP values that would otherwise be attributed to the signal (eg, voice quality). Variability may be reduced by using a period-synchronous analysis with Hann windows.
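
      A period-synchronous analysis with a Hann window can be sketched as follows. The frame length is chosen as an integer number of periods, a periodic Hann window is applied, and the cepstral peak is searched over a quefrency range. The test signal (a pulse train through a one-pole filter plus weak noise), the number of periods, the search range, and the small floor inside the logarithm are all assumptions made for illustration, not parameters reported by the study.

```python
import numpy as np

def cepstral_peak(frame, q_min, q_max):
    """Return (quefrency index, value) of the largest real-cepstrum sample in [q_min, q_max)."""
    n = len(frame)
    window = np.hanning(n + 1)[:-1]                    # periodic Hann window of length n
    spectrum = np.fft.fft(frame * window)
    log_mag = np.log(np.abs(spectrum) + 1e-12)         # small floor guards against log(0)
    cepstrum = np.fft.ifft(log_mag).real
    q = q_min + int(np.argmax(cepstrum[q_min:q_max]))
    return q, cepstrum[q]

# Hypothetical test signal: pulse train with period T, shaped by a one-pole
# filter, with weak additive noise.
rng = np.random.default_rng(0)
T = 100
pulses = np.zeros(4000)
pulses[::T] = 1.0
impulse_response = 0.95 ** np.arange(200)
signal = np.convolve(pulses, impulse_response)[:len(pulses)]
signal += 0.01 * rng.standard_normal(len(signal))

# Period-synchronous frame: exactly n_periods * T samples, arbitrary start offset.
n_periods = 8
start = 1037
frame = signal[start:start + n_periods * T]

q, value = cepstral_peak(frame, q_min=50, q_max=400)
print(f"frame length {len(frame)}, peak at quefrency {q}, value {value:.3f}")
```

      Because the frame spans a whole number of periods, each harmonic falls on a DFT bin and the start offset changes only the phase, which the log magnitude discards.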

