Journal of Voice
Volume 24, Issue 1, Pages 30–38, January 2010

Perception of Emotional Valences and Activity Levels from Vowel Segments of Continuous Speech

  • Teija Waaramaa

      Affiliations

    • Department of Speech Communication and Voice Research, University of Tampere, Tampere, Finland
    • Corresponding author. Address correspondence and reprint requests to Teija Waaramaa, Department of Speech Communication and Voice Research, University of Tampere, FIN-33014 Tampere, Finland.
  • Anne-Maria Laukkanen

      Affiliations

    • Department of Speech Communication and Voice Research, University of Tampere, Tampere, Finland
  • Matti Airas

      Affiliations

    • Department of Signal Processing and Acoustics, Helsinki University of Technology, Espoo, Finland
  • Paavo Alku

      Affiliations

    • Department of Signal Processing and Acoustics, Helsinki University of Technology, Espoo, Finland

Accepted 14 April 2008. Published online 29 December 2008.

Summary 

This study aimed to investigate the role of voice source and formant frequencies in the perception of emotional valence and psychophysiological activity level from short vowel samples (∼150 milliseconds). Nine professional actors (five males and four females) read a prose passage simulating joy, tenderness, sadness, anger, and a neutral emotional state. The stress-carrying vowel [a:] was extracted from continuous speech from the Finnish word [ta:k:ahan] and analyzed for duration, fundamental frequency (F0), equivalent sound level (Leq), alpha ratio, and formant frequencies F1–F4. Alpha ratio was calculated by subtracting the Leq (dB) in the range 50 Hz–1 kHz from the Leq in the range 1–5 kHz. The samples were inverse filtered by Iterative Adaptive Inverse Filtering, and the resulting estimates of the glottal flow were parameterized with the normalized amplitude quotient (NAQ = f_AC/(d_peak · T)). Fifty listeners (mean age 28.5 years) identified the emotional valences from the randomized samples. Multinomial Logistic Regression Analysis was used to study the interrelations of the parameters for perception. It appeared to be possible to identify valences from vowel samples of short duration (∼150 milliseconds). NAQ tended to differentiate between the valences and activity levels perceived in both genders. Voice source may not only reflect variations of F0 and Leq, but may also have an independent role in expression, reflecting phonation types. To some extent, formant frequencies appeared to be related to valence perception, but no clear patterns could be identified. Coding of valence tends to be a complicated multiparameter phenomenon with wide individual variation.
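
The two derived measures named in the summary, alpha ratio and NAQ, can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the naive DFT used to estimate band levels, the treatment of the 1 kHz boundary, and the first-difference estimate of the flow derivative are all assumptions made for illustration. In the NAQ expression, f_AC is taken to be the peak-to-peak amplitude of the glottal flow pulse, d_peak the magnitude of the negative peak of the flow derivative, and T the length of the fundamental period.

```python
import math


def band_level_db(samples, sample_rate, f_lo, f_hi):
    """Level in dB (arbitrary reference) of the energy in [f_lo, f_hi).

    Uses a naive DFT for clarity; a real analysis would use an FFT and a
    calibrated reference level (assumption: only relative levels matter here).
    """
    n = len(samples)
    energy = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if f_lo <= freq < f_hi:
            re = sum(x * math.cos(2 * math.pi * k * i / n)
                     for i, x in enumerate(samples))
            im = sum(x * math.sin(2 * math.pi * k * i / n)
                     for i, x in enumerate(samples))
            energy += re * re + im * im
    return 10 * math.log10(energy) if energy > 0 else float("-inf")


def alpha_ratio(samples, sample_rate):
    """Level in 1-5 kHz minus level in 50 Hz-1 kHz, in dB."""
    high = band_level_db(samples, sample_rate, 1000, 5000)
    low = band_level_db(samples, sample_rate, 50, 1000)
    return high - low


def naq(f_ac, d_peak, period_s):
    """Normalized amplitude quotient: f_AC / (d_peak * T)."""
    return f_ac / (d_peak * period_s)


def naq_from_flow(flow, sample_rate, period_s):
    """NAQ from one period of an estimated glottal flow waveform.

    The derivative is approximated by a first difference (an assumption).
    """
    f_ac = max(flow) - min(flow)
    derivative = [(b - a) * sample_rate for a, b in zip(flow, flow[1:])]
    d_peak = -min(derivative)  # magnitude of the steepest flow decrease
    return f_ac / (d_peak * period_s)


if __name__ == "__main__":
    # Synthetic two-component signal: a strong 200 Hz tone plus a weak 2 kHz tone.
    fs = 16000
    signal = [math.sin(2 * math.pi * 200 * i / fs)
              + 0.1 * math.sin(2 * math.pi * 2000 * i / fs)
              for i in range(800)]
    print("alpha ratio (dB):", round(alpha_ratio(signal, fs), 2))

    # Synthetic triangular flow pulse: slow opening, fast closing, closed phase.
    pulse = [i / 60 for i in range(60)] + [1 - i / 20 for i in range(20)] + [0.0] * 20
    print("NAQ:", round(naq_from_flow(pulse, 10000, 0.01), 3))
```

With these definitions, a steeper closing phase gives a larger d_peak and therefore a smaller NAQ, and more energy above 1 kHz gives a higher (less negative) alpha ratio.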

Key Words: Voice quality, Perception of emotional valence, Inverse filtering


 Part of the paper was presented at The Voice Foundation's 35th Annual Symposium “Care of the Professional Voice,” May 31–June 4, 2006, Philadelphia, and part of it at the “3rd World Voice Conference 2006,” June 20–22, 2006, Istanbul.

PII: S0892-1997(08)00063-5

doi:10.1016/j.jvoice.2008.04.004
