Acoustic Characteristics of Cantonese Speech Through Protective Facial Coverings

  • Ting Zhang
    Department of Linguistics and Translation, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong S.A.R., China
  • Mosi He
    Department of Linguistics and Translation, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong S.A.R., China
  • Bin Li
    Address correspondence and reprint requests to Bin Li, Department of Linguistics and Translation, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong S.A.R., China.
    Department of Linguistics and Translation, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong S.A.R., China
  • Cuiling Zhang
    School of Criminal Investigation, Southwest University of Political Science & Law, Chongqing, China

    Chongqing Institutes of Higher Education Key Forensic Science Laboratory, Chongqing, China
  • Jinlian Hu
    Department of Biomedical Engineering, City University of Hong Kong, Hong Kong S.A.R., China
Published: October 18, 2022



      Protective facial coverings (PFCs) such as surgical masks attenuate speech transmission and reduce speech intelligibility, as reported for languages such as English and German. The present study examined whether PFCs, including face masks and face shields, have similarly detrimental effects on the production of a tonal language, Cantonese, by analyzing how acoustic correlates of speech are realized under these coverings.


      We recorded scripted speech in Hong Kong Cantonese produced by three adult speakers who wore various PFCs, including surgical masks, KF94 masks, and face shields (with and without surgical masks). Spectral and temporal parameters were measured, including mean intensity, speaking rate, long-term amplitude spectrum, formant frequencies of vowels, and duration and fundamental frequency (F0) of tone-bearing parts.
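
      Two of the measures listed above, mean intensity and the long-term amplitude spectrum, can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function names, the frame length, and the synthetic two-tone signals standing in for recordings without and with a covering are all assumptions made for illustration.

```python
import numpy as np

def mean_intensity_db(signal, ref=1.0):
    """Root-mean-square level of the signal in dB relative to `ref`."""
    rms = np.sqrt(np.mean(np.square(signal)))
    return 20.0 * np.log10(rms / ref)

def long_term_spectrum_db(signal, sample_rate, frame_len=1024):
    """Power spectrum averaged over non-overlapping Hann-windowed frames, in dB."""
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[: n_frames * frame_len], (n_frames, frame_len))
    window = np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, 10.0 * np.log10(np.mean(power, axis=0) + 1e-12)

# Synthetic stand-ins (assumption): a 500 Hz and a 4000 Hz component,
# with the high-frequency component halved in the "covered" signal.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
uncovered = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 4000 * t)
covered = np.sin(2 * np.pi * 500 * t) + 0.25 * np.sin(2 * np.pi * 4000 * t)

freqs, ltas_uncovered = long_term_spectrum_db(uncovered, sample_rate)
_, ltas_covered = long_term_spectrum_db(covered, sample_rate)

# Positive values mean the covering attenuated that frequency bin.
attenuation_db = ltas_uncovered - ltas_covered
intensity_drop_db = mean_intensity_db(uncovered) - mean_intensity_db(covered)
```

      With real recordings, `attenuation_db` would be inspected per frequency band to see whether a covering affects higher or lower frequencies more; the synthetic signals here only make the attenuation at 4000 Hz easy to verify.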


      Significant changes were observed in all acoustic correlates of Cantonese speech under PFCs. Face masks attenuated sound pressure levels more strongly at higher frequencies, whereas face shields affected sound transmission more at lower frequencies. Vowel spaces derived from formant frequencies shrank under all PFCs, with the vowel /aa/ showing the largest changes in the first two formants. All tone-bearing parts were shortened and had higher mean F0 in speech through PFCs. The decrease in tone duration was statistically significant for the High-level and Low-level tones, while the increase in mean F0 was significant for the High-level tone only.
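
      The shrinkage of a vowel space can be quantified as the area of the polygon spanned by the mean first and second formants of its corner vowels. The sketch below uses the shoelace formula for this; the choice of corner vowels and all formant values are hypothetical placeholders, not data or methods from the study.

```python
def polygon_area(points):
    """Shoelace formula: area of a polygon from its ordered (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def vowel_space_area(formants, order=("i", "aa", "u")):
    """Area in Hz^2 of the polygon whose vertices are the (F1, F2) means in `order`."""
    return polygon_area([formants[v] for v in order])

# Hypothetical mean (F1, F2) values in Hz, for illustration only.
without_pfc = {"i": (300, 2300), "aa": (900, 1400), "u": (350, 800)}
with_pfc = {"i": (320, 2200), "aa": (800, 1400), "u": (360, 900)}

area_without = vowel_space_area(without_pfc)
area_with = vowel_space_area(with_pfc)

# Fraction by which the vowel space shrank under the covering.
reduction = 1.0 - area_with / area_without
```

      A positive `reduction` indicates a smaller vowel space under the covering. The vertices must be passed in a consistent order around the polygon, which is why the vowel order is an explicit parameter.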


      A general filtering effect of PFCs was observed in the Cantonese speech data, confirming language-universal patterns of acoustic attenuation by PFCs. The various coverings lower the overall intensity of speech and degrade the speech signal in higher-frequency regions. Modification patterns specific to Hong Kong Cantonese were also identified. Vowel space area was reduced, and the reduction was associated with increased speaking rates. Tones were produced with higher F0 under PFCs, which may be attributed to vocal tension caused by a tightened vocal tract when speaking through facial coverings.

      Journal: Journal of Voice
