Journal of Voice
Volume 25, Issue 1, Pages 38-43, January 2011

Discrimination Between Pathological and Normal Voices Using GMM-SVM Approach

Accepted 4 August 2009.

PII: S0892-1997(09)00119-2

doi: 10.1016/j.jvoice.2009.08.002
