Summary
Objective
Method
Results
Conclusions
Key words
Article info
Publication history
Publication stage
In Press Corrected Proof
Footnotes
This study was supported by the industry-academia cooperation project of APrevent Medical Inc. (109J052) and by the National Science and Technology Council of Taiwan (MOST 110-2218-E-A4S9A-501 and MOST 111-2221-E-A49-041-MY2).