
Habitual Use of Vocal Fry in Young Adult Female Speakers

Published: September 15, 2011. DOI: https://doi.org/10.1016/j.jvoice.2011.04.007

      Summary

      The purpose of this study was to examine the use of vocal fry in young adult Standard American-English (SAE) speakers. This was a preliminary attempt (1) to determine the prevalence of the use of this register in young adult college-aged American speakers and (2) to describe the acoustic characteristics of vocal fry in these speakers. Subjects were 34 female college students. They were native SAE speakers aged 18–25 years. Data collection procedures included high-quality recordings of two speaking conditions: (1) a sustained isolated vowel /a/ and (2) a sentence-reading task. Data analyses included both perceptual and acoustic evaluations. Results showed that approximately two-thirds of this population used vocal fry and that it was most likely to occur at the end of sentences. In addition, statistically significant differences between vocal fry and normal register were found for the means of F0 minimum, F0 maximum, F0 range, and jitter local. Preliminary findings were taken to suggest that use of the vocal fry register may be common in some adult SAE speakers.

      Introduction

      Vocal fry is a form of phonation, characterized by a distinct laryngeal vibratory pattern, distinct acoustic features, and a distinct vocal quality. Vocal fry has been referred to as pulse register, creaky voice, stiff voice, or glottal fry.
      • Hollien H.
      On vocal register.
      For Hollien,
      these terms are synonymous.

      Ewender T, Hoffmann S, Pfister B. Nearly perfect detection of continuous F0 contour and frame classification for TTS synthesis. In: Proceedings of Interspeech; 2009:100–103.

      However, in psycholinguistic research, the terms “glottalization” or “irregular phonation” are the preferred terms to describe this mode of phonation.
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.

      Bohm T, Shattuck-Hufnagel S. Utterance final glottalization as cue for familiar speaking recognition. In: INTERSPEECH-2007; 2007:2657–2660.

      • Slifka J.
      Some physiological correlates to regular and irregular phonation at the end of an utterance.
      Historically, vocal fry has been classified as part of a clinical voice disorder because it has frequently been associated with abnormal vocal laryngeal outputs.
      • Moore P.
      • Von Leden H.
      Dynamic variations in the vibratory pattern in the normal larynx.
      • Wolfe V.
      • Cornell R.
      • Palmer C.
      Acoustic correlates of pathologic voice types.
      • Eskenazi L.
      • Childers D.G.
      • Hicks D.M.
      Acoustic correlates of vocal quality.
      • Ylitalo Y.
      • Hammarberg B.
      Voice characteristics, effects of voice therapy, and long-term follow-up of contact granuloma patients.
      Vocal fry has often been associated with creaky, harsh, and rough voice qualities.
      • Moore P.
      • Von Leden H.
      Dynamic variations in the vibratory pattern in the normal larynx.
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Daniloff R.
      • Schuckers G.
      • Feth L.
      The Physiology of Speech and Hearing: An Introduction.
      • Childers D.G.
      • Lee C.K.
      Vocal quality factors: analysis, synthesis, and perception.
      In particular, Ylitalo & Hammarberg
      • Ylitalo Y.
      • Hammarberg B.
      Voice characteristics, effects of voice therapy, and long-term follow-up of contact granuloma patients.
      reported that vocal fry is one of the most salient perceptual voice characteristics in patients with contact granulomas.
      More recently, vocal fry has also been identified in normal populations. Gottliebson et al
      • Gottliebson R.O.
      • Lee L.
      • Weinrich B.
      • Sanders J.
      Voice problems of future speech-language pathologists.
      examined the evidence of voice disorders in 104 normally speaking (i.e., no previous diagnosis of voice disorders) speech-language pathology first-year graduate students, aged 21–48 years. Ninety-four percent of the participants were females and 6% were males. Perceptual judgments of the voice characteristics were made during conversational speech. Results showed that the voices of 86% of students had normal vocal quality (i.e., passed the Quick Screen for Voice). The remaining 14% exhibited two or more features of disordered voice. They were all females. The primary characteristics, which included hoarse voice, creaky voice, strained phonation, and abnormally low pitch, were present in various degrees. However, of striking importance was the evidence of continual glottal fry in all these 14% of cases. Furthermore, among those who passed the screening (i.e. 86%), 18% exhibited vocal fry. Thus, a large proportion of the subjects in their study used vocal fry in their conversational speech.
      The advent of more sophisticated instrumentation facilitated more in-depth examination of the physiological and acoustic characteristics of vocal fold behavior. This, combined with the assessments of normal laryngeal functioning in psycholinguistic and crosslinguistic studies, prompted a departure from the earlier viewpoint that vocal fry was essentially associated with vocal pathology.
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      • Chen Y.
      • Robb M.P.
      • Gilbert H.R.
      Electroglottographic evaluation of gender and vowel effects during modal and vocal fry phonation.
      In their seminal articles, Hollien et al
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      suggested that vocal fry be viewed as one of several physiologically normal modes of laryngeal vibration, alongside the modal and falsetto registers. In their view, registers are distributed along the frequency-pitch continuum, with the falsetto register lying at the upper end, above the frequency range of the modal register, and vocal fry occurring at the lower end, below that of the modal register. They suggested that speakers without vocal pathology could selectively use any of these registers. Based on anecdotal evidence, Hollien et al
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      reported that vocal fry may frequently be perceived as “normal,” especially when the fundamental frequency drops below frequencies typical of the modal register.
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Wendahl R.W.
      Perceptual study of vocal fry.
      Hollien et al
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      postulated that similar to falsetto and modal register, vocal fry is characterized by a unique laryngeal vibratory behavior, which results in specific physiological, acoustic, and perceptual characteristics. Several other researchers have supported this view.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      • Chen Y.
      • Robb M.P.
      • Gilbert H.R.
      Electroglottographic evaluation of gender and vowel effects during modal and vocal fry phonation.
      • Timcke R.
      • von Leden H.
      • Moore P.
      Laryngeal vibrations: measurements of the glottic wave.
      • McGlone R.E.
      • Shipp T.
      Some physiologic correlates of vocal-fry phonation.
      These assumptions have been empirically studied and supported in numerous research studies. For example, Moore and Von Leden,
      • Moore P.
      • Von Leden H.
      Dynamic variations in the vibratory pattern in the normal larynx.
      using slow motion films, were the first to report the distinct vibratory patterns of vocal fry in normal subjects. They labeled it “dicrotic dysphonia”. Others, Timcke et al,
      • Timcke R.
      • von Leden H.
      • Moore P.
      Laryngeal vibrations: measurements of the glottic wave.
      Moore and Von Leden,
      • Moore P.
      • Von Leden H.
      Dynamic variations in the vibratory pattern in the normal larynx.
      and Hollien et al,
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      using ultra-high speed-motion pictures, showed that during vocal fry, the vocal folds open and close twice rapidly per cycle and then remain adducted for a relatively long period of time. More recently, newer techniques such as high-speed photography of the larynx and electroglottography (EGG) have confirmed the existence of this dicrotic vibratory pattern.
      • Childers D.G.
      • Lee C.K.
      Vocal quality factors: analysis, synthesis, and perception.
      • Hollien H.
      • Girard G.T.
      • Coleman R.F.
      Vocal fold vibratory patterns of pulse register phonation.
      • Whitehead R.L.
      • Mets D.E.
      • Whitehead B.H.
      Vibratory patterns of the vocal folds during pulse register phonation.
      • Chen Y.
      • Ng M.
      • Jeng J.
      • Gilbert H.
      Aerodynamic, electroglottographic, and acoustic characteristics of modal and vocal fry registers.
      Childers and Lee,
      • Childers D.G.
      • Lee C.K.
      Vocal quality factors: analysis, synthesis, and perception.
      using EGG, measured the speed quotient (SQ), which is the ratio of opening-phase duration to the closing-phase duration. SQ provides an estimation of the symmetry of the glottal cycle. They reported that glottal fry productions are characterized by short glottal pulses, followed by a long period where the vocal folds are completely adducted.
      Thus, all of these findings support the notion that vocal fry in normal voice is characterized by a distinct laryngeal behavior.
      Additional acoustic characteristics of glottal fry have been described by Eskenazi et al.
      • Eskenazi L.
      • Childers D.G.
      • Hicks D.M.
      Acoustic correlates of vocal quality.
      They reported that measurements of jitter and shimmer were significantly higher and signal-to-noise ratios were significantly lower in vocal fry than in modal register for both females and males. These findings were further substantiated by Blomgren et al.
      • Blomgren M.
      • Chen Y.
      • Ng M.
      • Gilbert H.
      Acoustic, aerodynamic, and perceptual characteristics of modal and vocal fry registers.
      Acoustically, vocal fry lies in the lower end of the fundamental frequency (F0) spectrum, below the frequency range typical of the modal register.
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      • McGlone R.E.
      • Shipp T.
      Some physiologic correlates of vocal-fry phonation.
      • Murry T.
      Subglottal pressure and airflow measures during vocal fry phonation.
      It is characterized by a specific range of F0, which minimally overlaps with those of the modal register.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      • McGlone R.E.
      • Shipp T.
      Some physiologic correlates of vocal-fry phonation.
      Hollien and Michel
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      demonstrated their subjects’ ability to match samples of vocal fry produced by another speaker. The ranges for males (7–78 Hz) and females (2–78 Hz) were surprisingly similar. There was little overlap with the ranges of the modal register (71–561 Hz for males; 122–798 Hz for females). This is in agreement with a study conducted by Blomgren et al.
      • Blomgren M.
      • Chen Y.
      • Ng M.
      • Gilbert H.
      Acoustic, aerodynamic, and perceptual characteristics of modal and vocal fry registers.
      They asked their participants to sustain vowels for 6 seconds and seven contiguous syllables /pi/. They reported that significant gender differences in mean fundamental frequency existed for modal register but not for vocal fry.
      The recognition and systematic investigation of vocal fry in the voices of normal speakers have not been limited to the field of speech-language pathology. Hollien et al
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      suggested that a shift from modal register to vocal fry might be triggered by linguistic factors as well. In the field of psycholinguistics, glottal fry is most often referred to as glottalization or pulse phonation. Glottalization refers to a perceptual and acoustic translation of intermittent irregular vocal fold vibrations occurring during speech.
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.

      Bohm T, Shattuck-Hufnagel S. Utterance final glottalization as cue for familiar speaking recognition. In: INTERSPEECH-2007; 2007:2657–2660.

      They noted that sporadic glottalization should be contrasted with ongoing glottalization, which is reported to be most often associated with vocal disorders.

      Belotel-Grenié A, Grenié M. The creaky voice phonation and the organization of Chinese discourse, [on-line]. In: International Symposium on Tonal Aspects of Languages: Emphasis on Tone Languages; 2004:28–30.

      Glottalization has been found to serve as a boundary marker in several languages. In American English, Lehiste

      Lehiste I. Sentence boundaries and paragraph boundaries—perceptual evidence. In: The Elements: A Para-Session on Linguistic Units and Levels. Chicago Linguistics Society; 1979:99–109.

      and Kreiman
      • Kreiman J.
      Perception of sentence and paragraph boundaries in natural conversation.
      have shown that glottalization is one of several acoustic cues to indicate utterance endings. They showed that it is used to indicate the end of paragraphs and sentence endings. Several variables, such as word frequency, boundary level, pitch accent, speed, rhythm, prosody, pauses, and segmental and pragmatic variables, have been shown to influence the rate of glottalization in English
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.
      • Slifka J.
      Some physiological correlates to regular and irregular phonation at the end of an utterance.
      • Allen J.
      The glottal stop as a junctural correlate in English.
      • Umeda N.
      Occurrence of glottal stops in fluent speech.
      • Dilley L.
      • Shattuck Hufnagel S.
      • Ostendorf M.
      Glottalization of word-initial vowels as function of prosodic structure.
      • Laver J.
      The Phonetic Description of Voice Quality.
      and in Swedish, Czech, Finnish, Serbian/Croatian,

      Lehiste I. Juncture. In: Proceedings of the 5th International Congress of Phonetic Sciences; 1965:172–200.

      • Lehiste I.
      Suprasegmentals.
      • Ogden R.
      Turn-holding, turn-yielding, and laryngeal activity in Finnish talk-in-interaction.
      and Chinese.
      • Slifka J.
      Some physiological correlates to regular and irregular phonation at the end of an utterance.

      Belotel-Grenié A, Grenié M. The creaky voice phonation and the organization of Chinese discourse, [on-line]. In: International Symposium on Tonal Aspects of Languages: Emphasis on Tone Languages; 2004:28–30.

      Redi and Shattuck-Hufnagel
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.
      conducted a study to assess the linguistic markers of glottalization in American English. They used a sentence-reading task for which several factors were controlled (text, segmental context, position within utterances, and prosodic location). They found (1) a high rate of glottalization of words at the end of utterances and (2) a higher rate of glottalization at the boundaries of phrases using full intonation and at intermediate intonation of phrases. Surana and Slifka

      Surana K, Slifka J. Is irregular phonation a reliable cue towards the segmentation of continuous speech in American English? In: ICSA International Conference on Speech Prosody. Dresden, Germany; 2006.

      used a phonetically balanced database (Texas Instruments/Massachusetts Institute of Technology) comprising read and isolated utterances produced by American-English speakers of two different dialects. They found that 78% of the irregular phonations occurred at word boundaries and 5% occurred at syllable boundaries. Furthermore, of those found at the syllable boundaries, 72% were positioned at the junction of a compound word (e.g., “outcast”) or at the junction of a base word and a suffix. Finally, 70% of the glottalizations that did not occur at word boundaries occurred next to voiceless consonants, mostly in the utterance-final position.

      There are striking interspeaker differences that have been documented in American English as well as in other languages.
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.

      Hedelin P, Huber D. Pitch period determination of aperiodic speech signals. In: Proceedings of ICASSP. Albuquerque, New Mexico; 1990:361–364

      Redi and Shattuck-Hufnagel
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.
      showed that some American-English speakers glottalized frequently (88%) while others glottalized rarely (13%). Interestingly, Bohm and Shattuck-Hufnagel

      Bohm T, Shattuck-Hufnagel S. Utterance final glottalization as cue for familiar speaking recognition. In: INTERSPEECH-2007; 2007:2657–2660.

      showed that such speaker differences in rate of glottalization may be a cue to speaker identification. Furthermore, Slifka
      • Slifka J.
      Some physiological correlates to regular and irregular phonation at the end of an utterance.
      reported that some speakers tend to glottalize more often in utterance-final position than in any other utterance position. In summary, findings show that vocal fry or glottalized speech is useful in speaker identification and also serves as a marker of syntactic boundaries. It generally co-occurs with the end of a breath, phonation-ending laryngeal gestures, and a drop in alveolar pressure.
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.

      Bohm T, Shattuck-Hufnagel S. Utterance final glottalization as cue for familiar speaking recognition. In: INTERSPEECH-2007; 2007:2657–2660.

      • Slifka J.
      Some physiological correlates to regular and irregular phonation at the end of an utterance.
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      In conclusion, the research reviewed above shows that vocal fry is evident in abnormal voice and is commonly used by clinicians in diagnostic assessments to identify vocal pathology. Importantly, research has highlighted the evidence of vocal fry in normal voice with distinct vocal characteristics. Normal speakers are able to simulate or match the vocal fry register, analogous to the capability of switching into falsetto register. Furthermore, data in psycholinguistic research show that speakers of different languages use vocal fry (glottalizations) as a means to convey linguistic and paralinguistic information. Hence, vocal fry as a sign of vocal pathology and vocal fry as a normal phonational register are not mutually exclusive. Speakers with normal laryngeal functioning have the option to switch from the modal register to vocal fry at a given moment. Naturally, an individual with vocal pathology and disordered laryngeal function will lack this ability to control vocal fry.
      Although there is significant research describing the characteristics of vocal fry in both normal and abnormal speakers, there is little information on the prevalence of vocal fry in standard American-English speakers with no vocal pathology. The paucity of research systematically investigating the prevalence of vocal fry in normal populations, together with casual observations and anecdotal reports of its occurrence in young standard American-English speakers, provides the underlying rationale for the present study. Hence, the purpose of the present study is (1) to quantify the prevalence of vocal fry in a population of young female SAE college students, and (2) to describe the acoustic characteristics of vocal fry in these speakers using phonetically balanced material.

      Method

      Participants

      Thirty-four female speakers were recorded. All participants were between the ages of 18 and 25 years, and all were native English speakers. They had no known history of anatomical or physiological vocal pathology and no known hearing disorders.
      A power analysis indicated that 33 subjects were needed to have 80% power for detecting a medium sized effect when using the traditional 0.05 criterion of statistical significance.
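      The text does not say which test or effect-size convention the power analysis used. As a rough check only, the sketch below assumes a two-sided one-sample (or paired) comparison, a medium effect taken as Cohen's d = 0.5, and the normal approximation to the t distribution; the function name `required_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_n(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size for a two-sided one-sample (or paired) test.

    Uses the normal approximation n = ((z_{1-alpha/2} + z_{power}) / d)^2.
    This is an assumed design, not necessarily the one used in the study.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# Medium effect (assumed d = 0.5), alpha = 0.05, power = 0.80
n_medium = required_n(0.5)
print(n_medium)  # 32
```

      Under these assumptions the approximation gives 32 subjects. An exact calculation based on the t distribution typically adds one or two subjects, which is in line with the 33 reported above.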

      Apparatus

      The tokens were recorded on one channel of a CD recording system (Superscope, model PSD 300; Superscope Technologies, Geneva, IL), using a high definition microphone (Shure, Beta 58A; Shure Inc, Niles, IL) positioned on a stand, 30 cm from the speaker’s mouth. All files were then imported from the left channel of the recordings onto a PC at a sampling rate of >44,000 samples per second. Praat software

      Boersma P, Weenink D. Praat: doing phonetics by computer (Version 5.1.05) [Computer program]. Available at: http://www.praat.org/1. Last accessed May 1, 2009.

      was used for the acoustic measurements and for the listening tests. A MacBookPro (Apple Inc, Cupertino, CA) was used to run the perceptual tests.

      Design and material

      This was a within-subject design in which all participants completed both conditions (production of the vowel /a/ and reading of the first six sentences of the Rainbow passage). Although conversational speech would be the ideal speech task for the purpose of the study, a reading task was selected because it keeps the sentences identical across all subjects. The Rainbow passage was chosen for its clarity and appropriate length, and because it is a phonetically balanced passage commonly used in both research and clinical settings.

      Procedure

      The speakers were seated comfortably with legs uncrossed in a sound attenuated room. Participants were asked to maintain a good upright posture and to try to find a comfortable position. The speakers were asked to move minimally during the session. To keep the distance between the mouth of the speaker and the microphone somewhat constant, a 30-cm folder was affixed to the stand to which the microphone was attached. The speakers were asked to keep this distance constant throughout the recording.
      A licensed speech language pathologist recorded the demographic information and engaged in an informal discussion, during which participants were asked to use their natural speaking intensity and pitch level. At some point, participants were asked whether they felt they were using their natural speaking intensity and pitch level. When they responded positively, recordings commenced.
      Participants were asked to produce the vowel /a/ at their typical pitch and loudness level. They were asked to sustain its production for as long as they comfortably could, using only one breath. They were then asked to read the first six sentences of the Rainbow passage in their most comfortable and natural speaking voice and loudness level.
      The listening tasks consisted of blocked trials of an equal number (three) of sustained vowels and of selected sentences from the Rainbow passage. Three sentences from the Rainbow passage (#2, 4, and 6) were deliberately chosen to be separated by at least one sentence, so as to limit the possible carryover of the same vocal register or general mode of phonation, which is more likely between adjacent sentences. Sentence #2, “the rainbow is a division of white light into many beautiful colors,” ends with a bisyllabic word (“colors”); sentence #4, “there is, according to legend, a boiling pot of gold at one end,” ends with a monosyllabic word (“end”); and sentence #6, “when a man looks for something beyond his reach, his friends say he is looking for the pot of gold at the end of the rainbow,” ends with a bisyllabic word (“rainbow”). These sentences were chosen for the following reasons: (1) they are nonadjacent; (2) their final words are a mix of mono- and bisyllabic words; and (3) all of the sentence-final words end with a voiced sound, which was thought to counteract the drop in pitch and intensity that may be more likely with voiceless sounds.

      Perceptual tests

      The listeners were two speech-language pathologists with expertise in voice disorders. Before beginning the evaluations, they were trained on 40 samples of speech (i.e., sustained vowels and reading sentences), which were not used in the study. Samples of voice with and without vocal fry were included. They were instructed to evaluate the speech samples using two criteria to determine the presence or the absence of glottal fry: (1) reduced and distinctly lowered pitch and (2) rough, gravel-like quality. The training lasted until they reached agreement on 75% of the tokens. During the training, the tokens that were not agreed upon were discussed with respect to the criteria.
      Listeners were asked to judge whether tokens were produced with or without vocal fry and, when fry was perceived, whether it occurred at the beginning, middle, or end of the utterance. The listening tasks were conducted on a MacBookPro using a Praat script, and the judges were asked to use the two criteria described above. They rated a total of 204 instances ([three repetitions of the vowel /a/+three sentences from the Rainbow passage]×34 speakers). Using a mouse, they chose between two possible responses, “FRY” or “NORMAL,” displayed on the screen. If they selected “FRY,” they were prompted to indicate where in the sentence they perceived vocal fry to occur (“BEG,” “MID,” or “END”). The answers were automatically registered by the computer and stored in a text file.
      Interjudge reliability was calculated separately for the vowels and the sentences using the Kappa statistic. The obtained Kappa values were significant for the vowels (0.48) and for the sentences (0.67), indicating moderate agreement between the two listeners for the vowels and substantial agreement for the sentences.
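      The study does not describe how the Kappa statistic was computed. The sketch below shows the standard Cohen's kappa for two raters, which corrects the observed agreement for the agreement expected by chance. The function name and the ten example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Proportion of items on which the two raters gave the same label
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal label frequencies
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical judgments for ten tokens (illustration only)
listener_1 = ["FRY", "FRY", "NORMAL", "NORMAL", "FRY",
              "NORMAL", "FRY", "NORMAL", "NORMAL", "FRY"]
listener_2 = ["FRY", "NORMAL", "NORMAL", "NORMAL", "FRY",
              "NORMAL", "FRY", "FRY", "NORMAL", "FRY"]
kappa = cohens_kappa(listener_1, listener_2)
print(round(kappa, 2))  # 0.6
```

      In this made-up example the listeners agree on 8 of 10 tokens, chance agreement is 0.5, and kappa is therefore 0.6.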

      Dependent variables

      Dependent variables included perceptual and acoustic measures. The perceptual evaluations of three instances of the sustained vowels and three sentences of the Rainbow passage were performed by two experienced and licensed speech-language pathologists, trained on the same two criteria for judging the presence or absence of vocal fry. Judges were also asked to indicate the sentence location where they perceived vocal fry to occur (“BEG,” “MID,” or “END”).
      The acoustic measures included mean F0, F0 minimum (F0 min), F0 maximum (F0 max), F0 range, jitter local, shimmer local, and harmonic-to-noise ratio (HNR). These acoustic measures are indicators of vocal stability. After the listening tests, post hoc acoustic analyses were performed. They were done only on the portion of the instances in which vocal fry occurred. The goal of this procedure was to assess whether sentences could be differentiated based on these acoustic measures.
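      The perturbation measures are not defined in the text. As an illustration, the sketch below assumes the usual Praat-style definitions: jitter (local) is the mean absolute difference between consecutive glottal periods divided by the mean period, and shimmer (local) is the same ratio computed on consecutive peak amplitudes. The function name and the example values are hypothetical, and Praat's own validity checks on periods are not reproduced.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values,
    divided by the mean value, expressed in percent."""
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    diffs = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical glottal periods in seconds (about 100 Hz) and peak amplitudes
periods_s = [0.0100, 0.0104, 0.0098, 0.0102, 0.0100]
amplitudes = [0.50, 0.55, 0.48, 0.52, 0.50]

jitter_local = local_perturbation(periods_s)    # percent
shimmer_local = local_perturbation(amplitudes)  # percent
f0_range_hz = 1 / min(periods_s) - 1 / max(periods_s)  # F0 max minus F0 min

print(f"jitter local: {jitter_local:.2f}%")
print(f"shimmer local: {shimmer_local:.2f}%")
```

      Applying the same function to periods and to amplitudes makes explicit that the two measures differ only in the quantity being tracked from cycle to cycle.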

      Results

      Results showed the following four main findings: (1) vocal fry was used in sentence reading by more than two-thirds of this population of female college students, (2) vocal fry rarely occurred in sustained vowels, (3) vocal fry occurred most often at the end of utterances, and (4) statistically significant differences were found for several acoustic measures between vocal fry and normal register.

      Perceptual evaluations

      Data indicate that the percentages of tokens perceived as being produced with glottal fry vary with the speaking condition (sustained vowels, reading sentences) and with the listener. The percentages were lowest in the sustained vowel condition (listener 1 [L1]=6% and listener 2 [L2]=2%) and highest in the reading condition (L1=81%, L2=69%).
      Additional analyses explored whether the fry occurred most frequently at the beginning, middle, or end of the sentences. Data show that the two judges frequently perceived vocal fry at the end of the sentences (L1=92%, L2=88%), rarely in the middle of the sentences (L1=8%, L2=12%), and never in the beginning of the sentences.
      After perceptual evaluations, acoustic analyses were applied to the portion of the sentences in which vocal fry was evident and matched to sentences judged as normal. This was to assess whether sentences could be differentiated with respect to vocal fry based on acoustic measures. Only sentences on which the two listeners agreed were selected for analysis. The individual values were averaged across speakers. Unpaired t tests were conducted on the following measures: (1) mean F0, (2) F0 min, (3) F0 max, (4) F0 range, (5) jitter local, (6) shimmer local, and (7) HNR. Statistically significant differences were found for F0 min [t(33)=2.68, P<0.01], F0 max [t(33)=2.95, P<0.005], F0 range [t(33)=3.38, P<0.001], and for jitter local [t(33)=2.76, P<0.005]. Sentences judged to have vocal fry had significantly lower F0 min, higher F0 max, larger F0 range, and higher jitter local than sentences judged to be normal (Table 1). Note that the F0 means are lower than the typical F0 female average in modal register because F0 measurements were taken at the end of utterances when speakers drop their F0. There were no statistically significant differences in mean F0 [t(32)=1.06, P>0.05], shimmer local [t(33)=0.37, P>0.05], and HNR [t(33)=1.53, P>0.05].
      Table 1. Means (Standard Deviations in Parentheses) for Mean F0, F0 Min, F0 Max, F0 Range, Jitter Local, Shimmer Local, and HNR for Sentences Perceived by the Two Listeners to Have Vocal Fry (N=17) and Sentences Perceived by the Two Listeners to Have No Vocal Fry (N=17)
                    Mean F0 (Hz)  F0 Min (Hz)  F0 Max (Hz)  F0 Range (Hz)  Jitter Local (%)  Shimmer Local (%)  HNR (dB)
      Vocal fry     133 (33)      46 (9)       266 (93)     220 (90)       5.22 (2.37)       14.14 (4.36)       4.14 (2.21)
      Normal        122 (29)      55 (9)       195 (34)     141 (34)       3.14 (1.59)       13.60 (4.10)       5.35 (2.40)
      Significance  NS            P<0.05       P<0.05       P<0.05         P<0.05            NS                 NS
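      The unpaired comparisons can be approximated from the summary statistics in Table 1 alone. The sketch below assumes a pooled-variance Student's t test with 17 sentences per group; the helper name `t_from_summary` is hypothetical, and because the table values are rounded, the recomputed statistics will only approximate those reported in the text.

```python
from math import sqrt


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance Student's t and degrees of freedom from summary statistics."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, n1 + n2 - 2


# (mean, SD) pairs copied from Table 1: vocal fry first, then normal.
TABLE_1 = {
    "mean F0 (Hz)": ((133, 33), (122, 29)),
    "F0 min (Hz)": ((46, 9), (55, 9)),
    "F0 max (Hz)": ((266, 93), (195, 34)),
    "F0 range (Hz)": ((220, 90), (141, 34)),
    "jitter local (%)": ((5.22, 2.37), (3.14, 1.59)),
    "shimmer local (%)": ((14.14, 4.36), (13.60, 4.10)),
    "HNR (dB)": ((4.14, 2.21), (5.35, 2.40)),
}
N = 17  # sentences per group, as stated in the table caption

results = {}
for name, ((m_fry, sd_fry), (m_norm, sd_norm)) in TABLE_1.items():
    t, df = t_from_summary(m_fry, sd_fry, N, m_norm, sd_norm, N)
    results[name] = t
    print(f"{name:18s} t({df}) = {t:6.2f}")
```

      A positive t indicates a higher mean in the vocal fry sentences and a negative t a lower one, which matches the directions described above for F0 min, F0 max, F0 range, and jitter local.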

      Discussion

      The results of this study show that most of this population of SAE female college students uses vocal fry in their sentence reading. Furthermore, when vocal fry is used by these speakers, it is most likely to occur at the end of sentences. These preliminary findings suggest that the use of the vocal fry register is frequent in some adult SAE speakers. Hence, the present findings are in accordance with Hollien et al
      • Hollien H.
      • Moore P.
      • Wendahl R.W.
      • Michel J.F.
      On the nature of vocal fry.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      who postulated that speakers without vocal pathology can selectively use glottal fry in connected speech. The fact that vocal fry is frequently detected in the connected speech condition and rarely in the sustained vowel task further supports the notion that vocal fry is a normal register which speakers, without underlying vocal pathology or abnormal laryngeal physiological functioning, can selectively use. These findings also corroborate anecdotal reports in the study by Gottliebson et al.
      • Gottliebson R.O.
      • Lee L.
      • Weinrich B.
      • Sanders J.
      Voice problems of future speech-language pathologists.
      They noted that college students reported vocal fry to be a selected speaking pattern among many speakers of this age group. It is possible that these college students have either practiced or observed this vocal register and modeled it on popular figures. Further research investigating the occurrence of vocal fry in college students in other parts of the country is needed before these results can be generalized.
      The present findings are also in agreement with psycholinguistic research reporting the tendency to use vocal fry as a linguistic marker of paragraph and sentence boundaries.
      • Redi L.
      • Shattuck-Hufnagel S.
      Variations in the realization of glottalization in normal speakers.
      • Slifka J.
      Some physiological correlates to regular and irregular phonation at the end of an utterance.

      Lehiste I. Sentence boundaries and paragraph boundaries—perceptual evidence. In: The Elements: A Para-Session on Linguistic Units and Levels. Chicago Linguistics Society; 1979:99–109.

      • Kreiman J.
      Perception of sentence and paragraph boundaries in natural conversation.
      However, unlike what was reported in Gottliebson et al,
      • Gottliebson R.O.
      • Lee L.
      • Weinrich B.
      • Sanders J.
      Voice problems of future speech-language pathologists.
      glottal fry is not continually present in any of our speakers but rather is specifically prominent at the end of sentences. This distinction between continual and sporadic use of vocal fry needs to be further investigated. Hence, future epidemiological studies should examine the prevalence of those two types of vocal fry and monitor their long-term consequences.
      In addition, statistically significant differences between vocal fry and normal registers were found for F0 min, F0 max, F0 range, and jitter local in the directions noted in most studies.
      • Childers D.G.
      • Lee C.K.
      Vocal quality factors: analysis, synthesis, and perception.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      • Chen Y.
      • Robb M.P.
      • Gilbert H.R.
      Electroglottographic evaluation of gender and vowel effects during modal and vocal fry phonation.
      • Blomgren M.
      • Chen Y.
      • Ng M.
      • Gilbert H.
      Acoustic, aerodynamic, and perceptual characteristics of modal and vocal fry registers.
      As expected, the mean F0 was below the average F0 typical of the modal register for females, because the measures were made at the end of the sentences, where glottalization is most likely to occur.
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.

      Belotel-Grenié A, Grenié M. The creaky voice phonation and the organization of Chinese discourse, [on-line]. In: International Symposium on Tonal Aspects of Languages: Emphasis on Tone Languages; 2004:28–30.

      Lehiste I. Sentence boundaries and paragraph boundaries—perceptual evidence. In: The Elements: A Para-Session on Linguistic Units and Levels. Chicago Linguistics Society; 1979:99–109.

      • Kreiman J.
      Perception of sentence and paragraph boundaries in natural conversation.
      It should be noted that the findings of the present study are preliminary in nature because the study only assessed (1) female speakers, (2) SAE speakers, (3) college students at one university campus, (4) speech measures based on sentence reading and not conversational speech, and (5) a sample from one specific geographical area. Therefore, future research should use both reading tasks and conversational speech tasks to distinguish between a “social” use of vocal fry and a “linguistic” use of vocal fry. It may be the case that continual use of vocal fry is more prevalent in social settings, whereas a discrete use of vocal fry is more common in reading sentences. Future studies should include natural conversational speech with peers and with nonpeers and compare these findings with those of a reading task. The inclusion of peers and nonpeers is necessary to assess whether the continual use of glottal fry is more frequent among members of a group than between members of different groups. Furthermore, in a reading task, where phonetically balanced material can be used, one might examine the phonetic structure of the word or the phonetic variables that may be associated with the use of vocal fry. The relationship between the laryngeal gestures that end declarative sentences and the occurrence of vocal fry should also be examined.
      Empirical studies comparing males and females may also be valuable in drawing epidemiological conclusions. Given that Hollien and Michel
      • Hollien H.
      • Michel J.F.
      Vocal fry as a phonational register.
      reported a similar F0 range for males and females, we suspect that members of both groups are equally capable of using vocal fry. However, Gottliebson et al
      • Gottliebson R.O.
      • Lee L.
      • Weinrich B.
      • Sanders J.
      Voice problems of future speech-language pathologists.
      reported that all the college students who exhibited continuous vocal fry were females. If substantiated, these observations would suggest that the continual use of fry might be a social identifier, possibly placing women at higher risk of vocal abuse. It would be extremely valuable to conduct similar epidemiological studies investigating the use of vocal fry in college students of other cultures.
      Future knowledge of the extent of vocal fry usage in college students may have very important long-term consequences for vocal health. Researchers need to consider the long-term consequences of the prolonged use of vocal fry in SAE speakers, its possible contribution to voice misuse, and its implications for vocal hygiene. Colton et al
      • Colton R.H.
      • Casper J.K.
      • Leonard R.
      Understanding Voice Problems: A Physiological Perspective for Diagnosis and Treatment.
      acknowledge glottal fry as a normal mode of vibration but note that the habitual use of fry is atypical and possibly a form of vocal abuse. There are two reasons why vocal fry is referred to above as “misuse” and should be considered by the clinician as a potential hazard to voice health.
      • Colton R.H.
      • Casper J.K.
      • Leonard R.
      Understanding Voice Problems: A Physiological Perspective for Diagnosis and Treatment.
      (p81) First, because it is difficult to produce glottal fry with sufficient loudness, a person may be likely to show increased tension of the vocal mechanism when attempting to increase loudness. Second, individuals who use glottal fry are reported to experience vocal fatigue.
      • Colton R.H.
      • Casper J.K.
      • Leonard R.
      Understanding Voice Problems: A Physiological Perspective for Diagnosis and Treatment.
      Much future research is needed to understand the degree to which each one of these variables may contribute to vocal health.
      In conclusion, studies on vocal fry are necessary to clarify the distinction between the continuous use of vocal fry, which might lead to vocal abuse, and the selective use of vocal fry, which is not likely to lead to vocal pathology. The nature of vocal fry (continual or selective), its extent in college populations, and its possible contribution to voice misuse need to be investigated, and the implications for vocal hygiene must be considered. Such research would clarify the degree to which each of these variables contributes to vocal health and would enable clinicians to distinguish between the two types of use and take appropriate prophylactic measures. Future studies using varied speech elicitation tasks, different populations, and sample sizes appropriate to the effect size and desired power level may prove valuable for verifying the epidemiological prevalence of vocal fry in young speakers and for drawing attention to its potential hazards.
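      To make the point about sample size concrete, the sketch below estimates the per-group sample size needed to detect a given standardized effect in a two-group comparison. It rests on several assumptions that are not taken from the study: a two-sided test at alpha=0.05, power of 0.80, equal group sizes, and the normal approximation (which slightly underestimates the sample size relative to an exact t-based calculation). The function names are hypothetical, and the effect size is derived from the rounded F0 range values in Table 1 purely as an example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d for two equal-sized groups, using the root-mean-square SD."""
    return (mean1 - mean2) / sqrt((sd1**2 + sd2**2) / 2.0)


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)               # quantile for desired power
    return ceil(2.0 * ((z_alpha + z_beta) / d) ** 2)


# Example effect size from the F0 range row of Table 1 (vocal fry vs normal).
d_range = cohens_d(220, 90, 141, 34)
n_needed = n_per_group(abs(d_range))
print(f"d = {d_range:.2f}, approximate n per group = {n_needed}")

# A smaller, conventionally "medium" effect requires far more sentences per group.
n_medium = n_per_group(0.5)
print(f"d = 0.50, approximate n per group = {n_medium}")
```

      The calculation addresses comparisons of means between registers; estimating prevalence in a population would call for a different sample size calculation based on the desired precision of a proportion.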

      Acknowledgements

      All authors contributed extensively to the conception of the project presented in this article. In addition, all authors contributed equally to the development of the project.

      References

        • Hollien H.
        On vocal register.
        J Phon. 1974; 2: 125-144
        • Ewender T.
        • Hoffmann S.
        • Pfister B.
        Nearly perfect detection of continuous F0 contour and frame classification for TTS synthesis.
        In: Proceedings of Interspeech; 2009:100–103

        • Redi L.
        • Shattuck-Hufnagel S.
        Variations in the realization of glottalization in normal speakers.
        J Phon. 2001; 29: 407-429
        • Bohm T.
        • Shattuck-Hufnagel S.
        Utterance final glottalization as cue for familiar speaking recognition.
        In: INTERSPEECH-2007; 2007:2657–2660

        • Slifka J.
        Some physiological correlates to regular and irregular phonation at the end of an utterance.
        J Voice. 2006; 20: 171-186
        • Moore P.
        • Von Leden H.
        Dynamic variations in the vibratory pattern in the normal larynx.
        Folia Phoniatr (Basel). 1958; 10: 205-238
        • Wolfe V.
        • Cornell R.
        • Palmer C.
        Acoustic correlates of pathologic voice types.
        J Speech Hear Res. 1991; 34: 509-516
        • Eskenazi L.
        • Childers D.G.
        • Hicks D.M.
        Acoustic correlates of vocal quality.
        J Speech Hear Res. 1990; 33: 298-306
        • Ylitalo Y.
        • Hammarberg B.
        Voice characteristics, effects of voice therapy, and long-term follow-up of contact granuloma patients.
        J Voice. 1997; 14: 557-566
        • Hollien H.
        • Moore P.
        • Wendahl R.W.
        • Michel J.F.
        On the nature of vocal fry.
        J Speech Hear Res. 1966; 9: 245-247
        • Daniloff R.
        • Schuckers G.
        • Feth L.
        The Physiology of Speech and Hearing: An Introduction.
        Prentice-Hall, Englewood Cliffs, NJ; 1980
        • Childers D.G.
        • Lee C.K.
        Vocal quality factors: analysis, synthesis, and perception.
        J Acoust Soc Am. 1991; 90: 2394-2410
        • Gottliebson R.O.
        • Lee L.
        • Weinrich B.
        • Sanders J.
        Voice problems of future speech-language pathologists.
        J Voice. 2007; 21: 699-704
        • Hollien H.
        • Michel J.F.
        Vocal fry as a phonational register.
        J Speech Hear Res. 1968; 11: 600-604
        • Chen Y.
        • Robb M.P.
        • Gilbert H.R.
        Electroglottographic evaluation of gender and vowel effects during modal and vocal fry phonation.
        J Speech Lang Hear Res. 2002; 45: 821-829
        • Hollien H.
        • Wendahl R.W.
        Perceptual study of vocal fry.
        J Acoust Soc Am. 1968; 43: 506-509
        • Timcke R.
        • von Leden H.
        • Moore P.
        Laryngeal vibrations: measurements of the glottic wave.
        AMA Arch Otolaryngol. 1958; 68: 1-19
        • McGlone R.E.
        • Shipp T.
        Some physiologic correlates of vocal-fry phonation.
        J Speech Hear Res. 1971; 14: 769-775
        • Hollien H.
        • Girard G.T.
        • Coleman R.F.
        Vocal fold vibratory patterns of pulse register phonation.
        Folia Phoniatr (Basel). 1977; 9: 200-205
        • Whitehead R.L.
        • Mets D.E.
        • Whitehead B.H.
        Vibratory patterns of the vocal folds during pulse register phonation.
        J Acoust Soc Am. 1984; 75: 1293-1297
        • Chen Y.
        • Ng M.
        • Jeng J.
        • Gilbert H.
        Aerodynamic, electroglottographic, and acoustic characteristics of modal and vocal fry registers.
        ASHA. 1996; 38: 62
        • Blomgren M.
        • Chen Y.
        • Ng M.
        • Gilbert H.
        Acoustic, aerodynamic, and perceptual characteristics of modal and vocal fry registers.
        J Acoust Soc Am. 1998; 103: 2649-2658
        • Murry T.
        Subglottal pressure and airflow measures during vocal fry phonation.
        J Speech Hear Res. 1971; 14: 544-551
        • Belotel-Grenié A.
        • Grenié M.
        The creaky voice phonation and the organization of Chinese discourse, [on-line].
        In: International Symposium on Tonal Aspects of Languages: Emphasis on Tone Languages; 2004:28–30

        • Lehiste I.
        Sentence boundaries and paragraph boundaries—perceptual evidence.
        In: The Elements: A Para-Session on Linguistic Units and Levels. Chicago Linguistics Society; 1979:99–109

        • Kreiman J.
        Perception of sentence and paragraph boundaries in natural conversation.
        J Phon. 1982; 10: 163-175
        • Allen J.
        The glottal stop as a junctural correlate in English.
        J Acoust Soc Am. 1970; 40: 57-58
        • Umeda N.
        Occurrence of glottal stops in fluent speech.
        J Acoust Soc Am. 1978; 64: 88-94
        • Dilley L.
        • Shattuck-Hufnagel S.
        • Ostendorf M.
        Glottalization of word-initial vowels as a function of prosodic structure.
        J Phon. 1996; 24: 423-444
        • Laver J.
        The Phonetic Description of Voice Quality.
        Cambridge University Press, Cambridge, UK; 1980
        • Lehiste I.
        Juncture.
        In: Proceedings of the 5th International Congress of Phonetic Sciences; 1965:172–200

        • Lehiste I.
        Suprasegmentals.
        MIT Press, Cambridge, MA; 1970
        • Ogden R.
        Turn-holding, turn-yielding, and laryngeal activity in Finnish talk-in-interaction.
        J Int Phon Ass. 2001; 31: 139-152
        • Surana K.
        • Slifka J.
        Is irregular phonation a reliable cue towards the segmentation of continuous speech in American English?
        In: ISCA International Conference on Speech Prosody. Dresden, Germany; 2006

        • Hedelin P.
        • Huber D.
        Pitch period determination of aperiodic speech signals.
        In: Proceedings of ICASSP. Albuquerque, New Mexico; 1990:361–364

        • Boersma P.
        • Weenink D.
        Praat: doing phonetics by computer (Version 5.1.05) [Computer program].
        Available at: http://www.praat.org/. Last accessed May 1, 2009

        • Colton R.H.
        • Casper J.K.
        • Leonard R.
        Understanding Voice Problems: A Physiological Perspective for Diagnosis and Treatment.
        Lippincott Williams & Wilkins, Baltimore, MD; 2011