Perception of Emotional Valences and Activity Levels from Vowel Segments of Continuous Speech

Published: December 29, 2008


      This study investigated the role of voice source and formant frequencies in the perception of emotional valence and psychophysiological activity level from short vowel samples (∼150 milliseconds). Nine professional actors (five males and four females) read a prose passage simulating joy, tenderness, sadness, anger, and a neutral emotional state. The stress-carrying vowel [a:] was extracted from the Finnish word [ta:k:ahan] in continuous speech and analyzed for duration, fundamental frequency (F0), equivalent sound level (Leq), alpha ratio, and formant frequencies F1–F4. Alpha ratio was calculated by subtracting the Leq (dB) in the range 50 Hz–1 kHz from the Leq in the range 1–5 kHz. The samples were inverse filtered by Iterative Adaptive Inverse Filtering, and the resulting estimates of the glottal flow were parameterized with the normalized amplitude quotient (NAQ = f_AC/(d_peak · T)). Fifty listeners (mean age 28.5 years) identified the emotional valences from the randomized samples. Multinomial logistic regression analysis was used to study the interrelations of the parameters for perception. It appeared possible to identify valences from vowel samples of short duration (∼150 milliseconds). NAQ tended to differentiate between the valences and activity levels perceived in both genders. Voice source may not only reflect variations of F0 and Leq, but may also have an independent role in expression, reflecting phonation types. To some extent, formant frequencies appeared to be related to valence perception, but no clear patterns could be identified. Coding of valence tends to be a complicated multiparameter phenomenon with wide individual variation.
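
      The two derived measures named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes a plain FFT band-power estimate stands in for Leq (levels are in dB re an arbitrary reference), and it takes the glottal flow as already estimated, since inverse filtering is not reproduced here. In the NAQ formula, f_AC is read as the peak-to-peak flow amplitude, d_peak as the magnitude of the negative peak of the flow derivative, and T as the period length. The function names, the 16 kHz sampling rate, and the synthetic test signals are all hypothetical choices made for illustration.

```python
import numpy as np


def band_level_db(signal, fs, f_lo, f_hi):
    """Level in dB (arbitrary reference) of the signal power in [f_lo, f_hi)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    power = np.sum(np.abs(spectrum[in_band]) ** 2)
    return 10.0 * np.log10(power)


def alpha_ratio_db(signal, fs):
    """Level in 1-5 kHz minus level in 50 Hz-1 kHz, both in dB."""
    high = band_level_db(signal, fs, 1000.0, 5000.0)
    low = band_level_db(signal, fs, 50.0, 1000.0)
    return high - low


def naq(flow, fs, f0):
    """Normalized amplitude quotient, NAQ = f_AC / (d_peak * T).

    flow: estimated glottal flow samples (assumed given, e.g. from
          inverse filtering, which is outside the scope of this sketch).
    """
    f_ac = np.max(flow) - np.min(flow)       # peak-to-peak flow amplitude
    d_peak = -np.min(np.diff(flow)) * fs     # magnitude of the negative derivative peak
    period = 1.0 / f0                        # period length T in seconds
    return f_ac / (d_peak * period)


# Synthetic 150 ms signals at an assumed 16 kHz sampling rate.
fs = 16000
n_samples = 2400                             # 150 ms at 16 kHz
t = np.arange(n_samples) / fs

# Stand-in "vowel": a strong 200 Hz component plus a 2 kHz component at 1/10 amplitude.
vowel = np.sin(2 * np.pi * 200.0 * t) + 0.1 * np.sin(2 * np.pi * 2000.0 * t)
alpha = alpha_ratio_db(vowel, fs)

# Stand-in "glottal flow": a 100 Hz sinusoid, used only to exercise the formula.
flow = np.sin(2 * np.pi * 100.0 * t)
naq_value = naq(flow, fs, 100.0)

print(f"alpha ratio: {alpha:.2f} dB, NAQ: {naq_value:.4f}")
```

      With these synthetic inputs the alpha ratio should come out close to -20 dB (a tenfold amplitude difference between the bands), and the NAQ of a pure sinusoid should be close to 1/π. Real glottal flow pulses are not sinusoidal, so these values only check the arithmetic and say nothing about the values reported in the study.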


