Research Article|Articles in Press

Smartphone Recordings are Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice

      Abstract

      Purpose

      The purpose of this study was to assess the relationship and comparability of cepstral and spectral measures of voice obtained from a high-cost “flat” microphone and precision sound level meter (SLM) vs. high-end and entry-level models of commonly used current smartphones (iPhone i12 and iSE; Samsung s21 and s9). Device comparisons were also conducted in different settings (sound-treated booth vs. typical “quiet” office room) and at different mouth-to-microphone distances (15 and 30 cm).

      Methods

      The SLM and smartphone devices were used to record a series of speech and vowel samples from a diverse set of 24 prerecorded speakers representing a wide range of sex, age, fundamental frequency (F0), and voice quality types. Recordings were analyzed for the following measures: smoothed cepstral peak prominence (CPP in dB); the low vs. high spectral ratio (L/H Ratio in dB); and the Cepstral Spectral Index of Dysphonia (CSID).

      Results

      A strong device effect was observed for L/H Ratio (dB) in both vowel and sentence contexts and for CSID in the sentence context. In contrast, device had a weak effect on CPP (dB), regardless of context. Recording distance had a small-to-moderate effect on measures of CPP and CSID but a negligible effect on L/H Ratio. With the exception of L/H Ratio in the vowel context, setting had a strong effect on all three measures. Although these effects resulted in significant differences between measures obtained with SLM vs. smartphone devices, the intercorrelations of the measurements were extremely strong (r's > 0.90), indicating that all devices were able to capture the range of voice characteristics represented in the voice sample corpus. Regression modeling showed that acoustic measurements obtained from smartphone recordings could be successfully converted to comparable measurements obtained by a “gold standard” (precision SLM recordings conducted in a sound-treated booth at 15 cm) with small degrees of error.
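      The conversion idea described above can be illustrated with a minimal sketch. It assumes a simple one-predictor linear regression; the abstract does not state the form of the authors' models, so this is not their method. The CPP values are invented for illustration only, and the function names (fit_linear, pearson_r, convert, rmse) are hypothetical.

```python
from math import sqrt


def fit_linear(x, y):
    """Ordinary least-squares fit of y ~ intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has zero variance; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


def convert(value, slope, intercept):
    """Map a smartphone measurement onto the reference scale."""
    return intercept + slope * value


def rmse(predicted, actual):
    """Root-mean-square error between predictions and reference values."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


# Invented CPP values (dB) for five hypothetical speakers: NOT study data.
phone_cpp = [2.0, 4.0, 6.0, 8.0, 10.0]        # smartphone recording
reference_cpp = [3.4, 6.6, 9.4, 12.6, 15.5]   # SLM, booth, 15 cm

if __name__ == "__main__":
    slope, intercept = fit_linear(phone_cpp, reference_cpp)
    converted = [convert(v, slope, intercept) for v in phone_cpp]
    print("r =", round(pearson_r(phone_cpp, reference_cpp), 4))
    print("slope =", round(slope, 3), "intercept =", round(intercept, 3))
    print("RMSE after conversion (dB) =", round(rmse(converted, reference_cpp), 3))
```

      In this sketch a high correlation means the smartphone values track the reference values closely even though the raw numbers differ, which is the condition under which a fitted slope and intercept can remove a systematic device offset. Separate fits per device, setting, and distance would be one plausible way to account for the effects reported above, but that choice is an assumption here.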

      Conclusions

      These findings indicate that a variety of commonly available modern smartphones can be used to collect high-quality voice recordings usable for informative acoustic analysis. While device, setting, and distance can have significant effects on acoustic measurements, these effects are predictable and can be accounted for using regression modeling.

