Summary
Objectives
Methods
Results
Conclusion
Key Words
Introduction
Previous work
Contribution of present work
Material and methods
Material

Inverse filtering
Granqvist S. Sopran [Computer program]. 2022. https://tolvan.com/index.php?page=/sopran/sopran.php.
Feature extraction
- Fundamental frequency (fo)
- AC flow (ACF): the difference between the maximum and the minimum of the glottal flow waveform.
- Maximum flow declination rate (MFDR): the maximum negative slope of the glottal flow waveform.
- Amplitude quotient (AQ): the ratio between peak-to-peak flow amplitude and maximum flow declination rate (ACF/MFDR).
- Normalized amplitude quotient (NAQ): the ratio between amplitude quotient and period (AQ/T).
- L1L2: the level of the fundamental relative to the level of the second harmonic.
- Closed quotient (CQ): the ratio between the closed phase duration and period.
- Skewing quotient (SQ): the ratio between the opening and closing phase durations.
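The amplitude-based definitions above reduce to a few lines of arithmetic. The following is a minimal sketch, not the procedure used in the study: it assumes that exactly one period of the inverse-filtered flow has already been isolated and sampled at a known rate, and it approximates the slope by a first difference. The function name `flow_parameters` and the triangular test pulse are hypothetical. CQ and SQ are left out because their values depend on how the phase boundaries are detected.

```python
from typing import Dict, Sequence


def flow_parameters(flow: Sequence[float], fs: float) -> Dict[str, float]:
    """Compute T, ACF, MFDR, AQ and NAQ for one period of glottal flow.

    Assumption of this sketch: `flow` holds exactly one fundamental
    period sampled at `fs` Hz; period segmentation is not shown.
    """
    period = len(flow) / fs                 # T, in seconds
    acf = max(flow) - min(flow)             # AC flow: maximum minus minimum
    # First difference scaled by fs approximates the time derivative.
    slopes = [(b - a) * fs for a, b in zip(flow, flow[1:])]
    mfdr = -min(slopes)                     # steepest negative slope, reported as a positive value
    if mfdr <= 0:
        raise ValueError("flow never decreases; no declination rate to measure")
    aq = acf / mfdr                         # amplitude quotient (ACF/MFDR), in seconds
    naq = aq / period                       # normalized amplitude quotient (AQ/T), dimensionless
    return {"T": period, "ACF": acf, "MFDR": mfdr, "AQ": aq, "NAQ": naq}


if __name__ == "__main__":
    # Hypothetical triangular pulse: slow opening, fast closing, then closed phase.
    pulse = [0, 1, 2, 3, 4, 2, 0, 0, 0, 0]
    result = flow_parameters(pulse, fs=1000.0)
    print({name: round(value, 6) for name, value in result.items()})
```

Because AQ is divided by the period, NAQ does not change when the same pulse shape is stretched or compressed in time, which is what makes it comparable across different fo values.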
Boersma P., Weenink D. Praat: doing phonetics by computer. 2021. Computer program, http://www.praat.org/.
- Smoothed cepstral peak prominence (CPPS): the amplitude of the first rahmonic relative to the regression line across the real cepstrum of the signal [20].
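- Alpha ratio: the ratio of the acoustic energy in the high (1-5 kHz) frequency band to that in the low (0-1 kHz) band: $10\log_{10}\left(E_{1\text{-}5\,\mathrm{kHz}} / E_{0\text{-}1\,\mathrm{kHz}}\right)$, in dB.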
- L1: the level of the fundamental.
- L1L2: the level of the fundamental relative to the level of the second harmonic: $L_1 - L_2$.
- Harmonic richness factor (HRF): the amplitude of the fundamental relative to the summed amplitudes of the harmonics above it, in dB [48].
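Two of the spectral measures in this list, the alpha ratio and L1L2, can be illustrated with a direct DFT evaluation. This is a sketch under stated assumptions rather than the Praat-based procedure of the study: the frame is assumed to contain an integer number of fundamental periods (so no window is applied), fo is assumed to be known, and a component lying exactly on 1 kHz is assigned to the high band. All function names are hypothetical. HRF and CPPS are not shown.

```python
import cmath
import math
from typing import Sequence


def dft_at(x: Sequence[float], fs: float, freq: float) -> complex:
    """Evaluate the DFT sum of `x` at an arbitrary frequency `freq` (Hz)."""
    return sum(v * cmath.exp(-2j * math.pi * freq * i / fs) for i, v in enumerate(x))


def l1_minus_l2_db(x: Sequence[float], fs: float, fo: float) -> float:
    """Level of the fundamental minus level of the second harmonic, in dB."""
    a1 = 2.0 * abs(dft_at(x, fs, fo)) / len(x)        # amplitude at fo
    a2 = 2.0 * abs(dft_at(x, fs, 2.0 * fo)) / len(x)  # amplitude at 2*fo
    return 20.0 * math.log10(a1 / a2)


def alpha_ratio_db(x: Sequence[float], fs: float,
                   split_hz: float = 1000.0, top_hz: float = 5000.0) -> float:
    """Energy in [split_hz, top_hz] relative to energy below split_hz, in dB."""
    n = len(x)
    low = high = 0.0
    for k in range(n // 2 + 1):
        f = k * fs / n                       # centre frequency of bin k
        if f > top_hz:
            break
        power = abs(dft_at(x, fs, f)) ** 2
        if f < split_hz:
            low += power
        else:                                # assumption: 1 kHz itself counts as "high"
            high += power
    if low == 0.0 or high == 0.0:
        raise ValueError("one of the bands contains no energy")
    return 10.0 * math.log10(high / low)


if __name__ == "__main__":
    fs, n = 16000.0, 320                     # 20 ms frame, 50 Hz bin spacing
    tone = [math.sin(2 * math.pi * 200 * i / fs)
            + 0.5 * math.sin(2 * math.pi * 400 * i / fs) for i in range(n)]
    print(round(l1_minus_l2_db(tone, fs, 200.0), 2))   # second harmonic at half amplitude: about 6.02 dB
    mix = [math.sin(2 * math.pi * 500 * i / fs)
           + 0.1 * math.sin(2 * math.pi * 2000 * i / fs) for i in range(n)]
    print(round(alpha_ratio_db(mix, fs), 2))           # high band 20 dB below low band: about -20.0
```

The direct DFT sum is quadratic in frame length and is used here only to keep the example dependency-free; on real recordings an FFT over a windowed frame would be the usual choice.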
Data analysis
Results
Classification experiments





Relationships between the voice source, accelerometer, and audio signals
Signal | Feature | fo | Psub | ACF | MFDR | AQ | NAQ | CQ | SQ
---|---|---|---|---|---|---|---|---|---
VS | Psub | | | | | | | | |
VS | ACF | | | | | | | | |
VS | MFDR | | | | | | | | |
VS | AQ | | | | | | | | |
VS | NAQ | | | | | | | | |
VS | CQ | | | | | | | | |
VS | SQ | | | | | | | | |
ACC | fo | | | | | | | | |
ACC | SL | | | | | | | | |
ACC | CPPS | | | | | | | | |
ACC | Alpha | | | | | | | | |
ACC | HRF | | | | | | | | |
MIC | fo | | | | | | | | |
MIC | SL | | | | | | | | |
MIC | CPPS | | | | | | | | |
MIC | Alpha | | | | | | | | |
MIC | HRF | | | | | | | | |



Discussion
Conclusions
Acknowledgements
References
- Vocal quality factors: Analysis, synthesis, and perception. Journal of the Acoustical Society of America. 1991; 90: 2394-2410
- Voice Quality: The Laryngeal Articulator Model. Cambridge University Press, 2019
- Objective characterization of phonation type using amplitude of flow glottogram pulse and of voice source fundamental. Journal of Voice. 2020; 36: 4-14
- Phonation types: A cross-linguistic overview. Journal of Phonetics. 2001; 29: 383-406
- Vocal fold vibratory patterns in tense versus lax phonation contrasts. Journal of the Acoustical Society of America. 2014; 136: 2784-2797
- Objective dysphonia measures in the program Praat: Smoothed cepstral peak prominence and acoustic voice quality index. Journal of Voice. 2015; 29: 35-43
- Predicting voice disorder status from smoothed measures of cepstral peak prominence using Praat and analysis of dysphonia in speech and voice (ADSV). Journal of Voice. 2017; 31: 557-566
- Emotions in vowel segments of continuous speech: Analysis of the glottal flow using the normalised amplitude quotient. Phonetica. 2006; 63: 26-46
- The role of voice quality in communicating emotion, mood and attitude. Speech Communication. 2003; 40: 189-212
- Comparing the acoustic expression of emotion in the speaking and the singing voice. Computer Speech & Language. 2015; 29: 218-235
- :619–622. Florence, Italy
- :3081–3084. Florence, Italy
- Glasgow, UK
- Cues to upcoming Swedish prosodic boundaries: Subjective judgment studies and acoustic correlates. Speech Communication. 2005; 46: 326-333
- Cue interaction in the perception of prosodic prominence: The role of voice quality. Proceedings of Interspeech 2021. 2021: 1006–1010
- :949–952. Pittsburgh, PA
- :177–180. Florence, Italy
- Detecting a targeted voice style in an audiobook using voice quality features. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2012). 2012: 4593–4596
- Wavelet maxima dispersion for breathy to tense voice discrimination. IEEE Transactions on Audio, Speech, and Language Processing. 2013; 21: 1170-1179
- Acoustic correlates of breathy vocal quality: Dysphonic voices and continuous speech. Journal of Speech Language and Hearing Research. 1996; 39
- The relationship between cepstral peak prominence and selected parameters of dysphonia. Journal of Voice. 2002; 16: 20-27
- Classification of voice modes using neck-surface accelerometer data. Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2017: 5060–5064. New Orleans, LA
- Data-driven detection and analysis of the patterns of creaky voice. Computer Speech & Language. 2014; 28: 1233-1253
- A method for automatic detection of vocal fry. IEEE Transactions on Audio, Speech, and Language Processing. 2008; 16: 47-56
- :3166–3170. San Francisco CA, USA
- Modeling the glottal volume-velocity waveform for three voice types. The Journal of the Acoustical Society of America. 1995; 97: 505-519
- Florence, Italy
- Glottal wave analysis with Pitch Synchronous Iterative Adaptive Inverse Filtering. Speech Communication. 1992; 11: 109-118
- Bonn, Germany
- Modal and nonmodal voice quality classification using acoustic and electroglottographic features. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2017; 25: 2281-2291
- COVAREP — A collaborative voice analysis repository for speech technologies. Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014). IEEE, 2014: 960-964
- A miniature accelerometer for detecting glottal waveforms and nasalization. Journal of Speech, Language, and Hearing Research. 1975; 18: 594-599
- Electroglottograph and contact microphone for measuring vocal pitch. Speech Transmission Laboratory, Quarterly Progress and Status Report. 1977; 4: 13-21
- Chest wall vibrations in singers. Journal of Speech and Hearing Research. 1983; 26: 329-340
- Comparison of microphone and neck-mounted accelerometer monitoring of the performing voice. Journal of Voice. 1988; 2: 200-205
- Estimation of sound pressure levels of voiced speech from skin vibration of the neck. The Journal of the Acoustical Society of America. 2005; 117: 1386-1394
- Estimating subglottal pressure from neck-surface acceleration during normal voice production. Journal of Speech, Language, and Hearing Research. 2016; 59: 1335-1345
- Magnitude of neck-surface vibration as an estimate of subglottal pressure during modulations of vocal effort and intensity in healthy speakers. Journal of Speech, Language, and Hearing Research. 2017; 60: 3404-3416
- An accelerometric measure as a physical correlate of perceived hypernasality in speech. Journal of Speech, Language, and Hearing Research. 1983; 26: 476-480
- Detecting nasalization using a low-cost miniature accelerometer. Journal of Speech, Language, and Hearing Research. 1981; 24: 314-317
- The difference between first and second harmonic amplitudes correlates between glottal airflow and neck-surface accelerometer signals during phonation. The Journal of the Acoustical Society of America. 2019; 145: EL386-EL392
- Using ambulatory voice monitoring to investigate common voice disorders: Research update. Frontiers in Bioengineering and Biotechnology. 2015; 3
- Real-time estimation of aerodynamic features for ambulatory voice biofeedback. Journal of the Acoustical Society of America. 2015; 138: EL14-9
- Learning to detect vocal hyperfunction from ambulatory neck-surface acceleration features: Initial results for vocal fold nodules. IEEE Transactions on Biomedical Engineering. 2014; 61: 1668-1675
- Exploring sustained phonation recorded with acoustic and contact microphones to screen for laryngeal disorders. Proceedings of the 2014 IEEE Symposium on Computational Intelligence in Healthcare and e-health (CICARE). IEEE, 2014: 125-132
Granqvist S. Sopran [Computer program]. 2022. https://tolvan.com/index.php?page=/sopran/sopran.php.
Boersma P., Weenink D. Praat: doing phonetics by computer. 2021. Computer program, http://www.praat.org/.
- Acoustic correlates of vocal quality. Journal of Speech, Language, and Hearing Research. 1990; 33: 298-306
- Random Forests. Machine Learning. 2001; 45: 5-32
- Classification and regression by randomForest. R News. 2002; 2: 18-22
- Flow glottogram characteristics and perceived degree of phonatory pressedness. Journal of Voice. 2016; 30: 287-292. https://doi.org/10.1016/j.jvoice.2015.03.014
- Flow glottogram and subglottal pressure relationship in singers and untrained voices. Journal of Voice. 2018; 32: 23-31
- Subglottal impedance-based inverse filtering of voiced sounds using neck surface acceleration. IEEE Transactions on Audio, Speech and Language Processing. 2013; 21: 1929-1939
- Normalized amplitude quotient for parametrization of the glottal flow. The Journal of the Acoustical Society of America. 2002; 112: 701-710. https://doi.org/10.1121/1.1490365
- Variability in the relationships among voice quality, harmonic amplitudes, open quotient, and glottal area waveform shape in sustained phonation. Journal of the Acoustical Society of America. 2012; 132: 2625-2632
- Spectral correlates of glottal voice source waveform characteristics. Journal of Speech, Language, and Hearing Research. 1989; 32: 556-565. https://doi.org/10.1044/jshr.3203.556
- Objective characterization of phonation type using amplitude of flow glottogram pulse and of voice source fundamental. Journal of Voice. 2022; 36: 4-14. https://doi.org/10.1016/j.jvoice.2020.03.018
- Estimating perceived phonatory pressedness in singing from flow glottograms. Journal of Voice. 2004; 18: 56-62. https://doi.org/10.1016/j.jvoice.2003.05.006
Article info
Publication history
Publication stage
In Press Corrected Proof
Footnotes
Declarations of interests: none.
User license
Creative Commons Attribution (CC BY 4.0)