Clinical Use of the CAPE-V Scales: Agreement, Reliability and Notes on Voice Quality

  • Kathleen F. Nagle
    Address correspondence and reprint requests to Kathleen F. Nagle, Department of Speech-Language Pathology, School of Health & Medical Science, Seton Hall University, 123 Metro Boulevard, Room 0440, Nutley, NJ, 07110
    Department of Speech-Language Pathology, School of Health & Medical Science, Seton Hall University, Nutley, New Jersey
Published: December 19, 2022



      The CAPE-V is a widely used protocol developed to help standardize the evaluation of voice. Variability of voice quality ratings has prevented development of training protocols that might themselves improve interrater agreement among new clinicians. As part of a larger mixed methods project, this study examines agreement and reliability for experienced clinicians using the CAPE-V scales.

      Study Design



      Experienced voice clinicians (N=20) provided ratings of recordings from 12 speakers representing a range of overall voice quality. Participants were instructed to rate the voices as they normally would, using the CAPE-V scales. Descriptive data were recorded and two levels of agreement were calculated. Single rater reliability was calculated using a 2-way random model of absolute agreement for intraclass correlations (ICC [2,1]).
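As a rough illustration of the single-rater reliability statistic named above, the sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rater) from the standard ANOVA mean squares. The function name `icc_2_1` and the toy ratings matrix are assumptions for illustration only; they are not the study's data or the author's code, and a complete design with no missing ratings is assumed.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per speaker and one column per
    rater, with no missing values (an assumption of this sketch).
    """
    n = len(ratings)       # number of speakers (rated targets)
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into speakers, raters, and residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)              # between-speaker mean square
    jms = ss_cols / (k - 1)              # between-rater mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical toy data: 6 speakers (rows) rated by 4 raters (columns).
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(toy), 2))  # about 0.29
```

Because the absolute-agreement form keeps the between-rater mean square in the denominator, systematic differences in how severely raters score will lower the coefficient even when raters rank the voices similarly.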


      Participants' use of the CAPE-V scales varied considerably, although most rated overall severity, breathiness, roughness and strain. Data from one participant did not meet a priori agreement criteria. Because outcomes were significantly different without their data, agreement and reliability were analyzed based on the reduced data set from 19 participants. Interrater agreement and reliability were comparable to previous research; the mean range of ratings was at least 47 mm for all dimensions of voice quality.
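To make the "mean range of ratings" figure concrete, the following minimal sketch assumes it is computed by taking, for each speaker, the difference between the highest and lowest rating given by any rater on the 100 mm CAPE-V line, and then averaging those differences across speakers. That reading, the function name `mean_rating_range`, and the example values are assumptions; the abstract does not spell out the calculation.

```python
def mean_rating_range(ratings_mm):
    """Average, over speakers, of the spread (max - min) across raters.

    `ratings_mm` holds one row per speaker and one column per rater,
    with ratings in millimeters on a 0-100 mm visual analog scale.
    """
    ranges = [max(row) - min(row) for row in ratings_mm]
    return sum(ranges) / len(ranges)


# Hypothetical ratings for one dimension: 3 speakers, 3 raters each.
example = [
    [12, 40, 55],   # spread of 43 mm
    [60, 85, 98],   # spread of 38 mm
    [5, 20, 31],    # spread of 26 mm
]
print(round(mean_rating_range(example), 1))  # 35.7
```

Under this reading, a mean range of 47 mm would mean that, on average, the most and least severe ratings of the same voice sat almost half the scale apart.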


      Results indicated that clinicians used the components of the CAPE-V form and scales differently when evaluating voice quality and severity of dysphonia. Ratings of all of the primary CAPE-V dimensions of voice quality varied across severity categories, which may complicate the clinical description of a voice as mildly, moderately or severely dysphonic.


