SUMMARY
Objectives
The coronavirus disease 2019 (COVID-19) has caused a worldwide crisis. Considerable efforts have been made to prevent and control its transmission, from early screenings to vaccinations and treatments. Recently, with the emergence of many automatic disease recognition applications based on machine listening techniques, detecting COVID-19 from recordings of cough, a key symptom of COVID-19, promises to be fast and inexpensive. To date, knowledge of the acoustic characteristics of COVID-19 cough sounds is limited but would be essential for structuring effective and robust machine learning models. The present study aims to explore acoustic features for distinguishing COVID-19 positive individuals from COVID-19 negative ones based on their cough sounds.
Methods
By applying conventional inferential statistics, we analyze the acoustic correlates of COVID-19 cough sounds based on the ComParE feature set, i.e., a standardized set of 6,373 acoustic higher-level features. Furthermore, we train automatic COVID-19 detection models with machine learning methods and explore the latent features by evaluating the contribution of all features to the COVID-19 status predictions.
Results
The experimental results demonstrate that a set of acoustic parameters of cough sounds, e.g., statistical functionals of the root mean square energy and Mel-frequency cepstral coefficients, bear essential acoustic information in terms of effect sizes for the differentiation between COVID-19 positive and COVID-19 negative cough samples. Our general automatic COVID-19 detection model performs significantly above chance level, i.e., at an unweighted average recall (UAR) of 0.632, on a data set consisting of 1,411 cough samples (COVID-19 positive/negative: 210/1,201).
Conclusions
Based on the acoustic correlates analysis on the ComParE feature set and the feature analysis in the effective COVID-19 detection approach, we find that several acoustic features that show higher effects in conventional group difference testing are also weighted more highly in the machine learning models.
INTRODUCTION
A novel coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a disease that quickly spread worldwide at the end of 2019 and the beginning of 2020. In February 2020, the World Health Organization (WHO) named the disease COVID-19 and shortly after that declared the COVID-19 outbreak a global pandemic. Globally, as of February 2022, more than 434,150,000 confirmed cases of COVID-19, including more than 5,940,000 deaths, had been reported to the WHO [1].
Both the presenting symptoms and the symptom severity vary considerably from patient to patient, ranging from asymptomatic infections or a mild flu-like clinical picture to severe illness or even death. Commonly reported symptoms of COVID-19 include (1) respiratory and ear-nose-throat symptoms such as cough, shortness of breath, sore throat and headache, (2) systemic symptoms such as fever, muscle pain, and weakness, as well as (3) loss of smell and/or taste [2].
Less common ear-nose-throat symptoms associated with COVID-19 are pharyngeal erythema, nasal congestion, tonsil enlargement, rhinorrhea, and upper respiratory tract infection [3].
Diagnostic approaches
The early detection of a COVID-19 infection in a patient is essential to prevent the transmission of the virus to other hosts and provide the patient with appropriate and early treatment. A series of laboratory diagnosis instruments have been proposed to test for COVID-19, e.g., computed tomography (CT), real-time reverse transcription polymerase chain reaction (rRT-PCR) tests, and serological methods [4,5].
CT and X-ray detect COVID-19 based on chest images [6-9].
An rRT-PCR test focuses on analyzing the virus’ ribonucleic acid (RNA) and synthesized complementary deoxyribonucleic acid (cDNA) from a nasopharyngeal swab and/or an oropharyngeal swab [10].
Serological instruments measure antibody responses to the corresponding infection and confirm the COVID-19 status [5].
However, the instruments mentioned above are costly and/or not always available, since they can only be operated by professionals and require special equipment and a certain amount of analysis time. Even though rapid antigen and molecular tests are increasingly used by non-professionals or the test persons themselves to quickly detect COVID-19, e.g., in everyday life settings, they produce a large amount of waste from the testing kits and their packaging. Thus, it is essential to develop low-cost, real-time, easy-to-apply, and eco-friendly screening instruments that are ready to use every day and basically everywhere.
Disease detection based on bioacoustic signals
A promising approach for a screening tool fulfilling these requirements could be based on bioacoustic signals such as speech sounds or cough sounds [11-14].
Several studies have reported acoustic peculiarities in the speech of patients who have diseases associated with symptoms affecting anatomical correlates of speech production, such as bronchial asthma [15,16] or vocal cord disorders [17-19].
Differences in various acoustic parameters were also found in recent studies comparing speech samples of COVID-19 positive and COVID-19 negative individuals [20,21].
Motivated by acoustic voice peculiarities found for various diseases, machine learning has been increasingly applied to automatically detect medical conditions from voice, such as upper respiratory tract infection [22], Parkinson's disease [23], and depression [24].
Recent studies on the automatic detection of COVID-19 from speech signals achieved promising results through both traditional machine learning [13,25-27] and deep learning techniques [13,28,29].
Although research on the automatic detection of diseases based on speech is rapidly expanding, it faces a number of challenges in terms of algorithm generalizability and potential application in real-world scenarios. These challenges include differences in gender and age distribution, the presence of different mother tongues, dialects, and sociolects, and cognitive aspects such as individual speech-language and reading competence, all of which may affect various acoustic parameters [30-36]. Studies on COVID-19 face additional challenges related to the fact that COVID-19 is a relatively new and not yet well understood disease with a wide range of symptoms and divergent symptom severity [37,38].
Studies need to consider the symptom heterogeneity of COVID-19 positive patients and the fact that many symptoms may also occur in other diseases such as bronchial asthma or flu. Therefore, it is essential to consider the inclusion of patients with COVID-19-like symptoms but other diagnoses into COVID-19 negative study groups.
In contrast to speech, the acoustic parameters of cough sounds are less dependent on language-related aspects. Therefore, systems based on voluntarily produced cough sounds may be more easily applicable to a broader target group than speech-based systems. Cough is a promising bioacoustic signal not only because it reflects a body function performed by all people regardless of their culture or language competence, but also because it is one of the most prominent symptoms of COVID-19 and is closely related to the lungs, which are primarily affected by COVID-19.
Physiology of cough
Cough is an important defense mechanism of the respiratory system: through high-velocity airflow, it clears the airways of accidentally inhaled foreign materials or of materials produced internally in the course of infections. A cough is composed of an inspiratory, a compressive, and an expiratory phase [39].
It is initiated with the inspiration of air (about 50% of the vital capacity), followed by a prompt closure of the glottis and the contraction of abdominal and other expiratory muscles. This process allows the compression of the thorax and the increase of subglottic pressure. The next phase of a cough constitutes the rapid opening of the glottis, resulting in a high-velocity airflow (peak expiratory airflow phase), followed by a steady-state airflow (plateau phase) for a variable, voluntarily controllable, duration [40,41]. The optional final phase is the interruption of the airflow due to the closure of the glottis [42].
Cough can be classified into two broad categories: wet/productive cough, with sputum excreted, and dry/non-productive cough, without sputum [43]. Cough sounds were found to vary significantly depending on a person's body structure, sex, and the kind of sputum [43].
For example, the sound spectrograms of wet coughs contain clear vertical lines that appear where continuous sounds break off, which manifests in audible interruptions. Moreover, the duration of the second cough phase was longer for wet coughs compared to dry coughs, whereas the durations of the first and third cough phases did not differ significantly. Hashimoto and colleagues [44] also revealed that the ratio of the duration of the second phase to the total cough duration was significantly higher for wet coughs than for dry coughs. Chatrzarrin and colleagues [45] compared acoustic characteristics of wet and dry coughs and found that the number of peaks of the energy envelope of the cough signal and the power ratio of two frequency bands of the second expiratory phase of the cough signal significantly differentiated between the two cough types. Wet cough sounds presented with more peaks and a reduced frequency band power ratio, indicating more spectral variation as compared to dry cough sounds.
Disease detection from cough sounds
A number of researchers have been interested in potential acoustic differences between voluntarily produced cough sounds of patients with pulmonary diseases and healthy individuals. Knocikova and colleagues [46] compared the cough sounds of patients with chronic obstructive pulmonary disease (COPD), patients with bronchial asthma, and healthy controls. They found that patients with COPD had the longest cough duration and the highest power among the three groups. Higher frequencies were detected in the cough sounds of the bronchial asthma group compared with the COPD group. Furthermore, in the bronchial asthma group, the power of the cough sound was shifted to a higher frequency range compared with the control group [46].
Another study [47] found that cough duration, the first Mel-frequency cepstral coefficient (MFCC1), and MFCC9 were the most important acoustic features for the classification of pulmonary disease state (i.e., bronchial asthma, COPD, chronic cough, healthy) and disease severity, defined as a patient's forced expiratory volume in the first second (FEV1) divided by the forced vital capacity (FVC). Similar to the speech/voice domain, various automatic approaches have proved to be effective at detecting pulmonary diseases from cough sounds [47,48]; good performance was even achieved when differentiating between two obstructive pulmonary diseases, namely bronchial asthma and COPD [49].
Furthermore, using acoustic features extracted from cough sounds, Nemati and colleagues [47] automatically classified the symptom severity of patients with pulmonary diseases. In another study [50], cough sound analysis was used to predict spirometry results, i.e., FVC, FEV1, and FEV1/FVC, for patients with obstructive, restrictive, and combined obstructive-restrictive pulmonary diseases as well as healthy controls. Machine learning algorithms were also applied to distinguish pertussis coughs from croup and other coughs in children [51]. Nemati and colleagues [52] used a random forest algorithm to classify wet and dry coughs based on a comprehensive set of acoustic features and achieved an accuracy of 87%. Notably, this accuracy was calculated as the average of the sensitivity (88%) and specificity (86%) for the classification of wet and dry cough sounds. Based on improved reverse MFCCs, Zhu and colleagues [53] achieved an accuracy of 93.66% in the classification of wet and dry coughs using hidden Markov models.
COVID-19 detection based on cough sounds
A set of studies have investigated detecting COVID-19 from cough sounds. Alsabek and colleagues [54] compared MFCC features in cough, breathing, and voice samples of COVID-19 positive and COVID-19 negative individuals. They found a higher correlation between the COVID-19 positive group and the COVID-19 negative group for the voice samples than for the cough and breathing samples. Therefore, they concluded that a patient's cough and breathing may be more suitable for detecting a COVID-19 infection than his or her voice. Another study [55] collected cough sounds from public media interviews with COVID-19 positive patients and analyzed them for the number of peaks present in the energy spectrum and the power ratio between the first two phases of each cough event. The majority of cough sounds were found to have a low power ratio and a high number of peaks, a characteristic pattern previously reported for wet coughs [45].
Brown and colleagues [11] compared several hand-crafted features extracted from their collected crowd-sourced cough sounds of COVID-19 positive and COVID-19 negative individuals. They found that coughs from COVID-19 positive individuals are longer in total duration and have more pitch onsets, higher periods, and lower root mean square (RMS) energy. In contrast, their MFCC features have fewer outliers compared to those of COVID-19 negative individuals.
The reported differences in acoustic features extracted from cough sounds of COVID-19 positive and COVID-19 negative individuals are promising for the automatic detection of COVID-19. To process hand-crafted features, traditional machine learning methods such as support vector machines (SVMs) and extreme gradient boosting were utilized [11,13,56].
End-to-end deep learning models were developed to detect COVID-19 from the log spectrograms of cough sounds and performed better than the linear SVM baseline [57]. Similarly, deep learning was also successfully used to process MFCCs [28,58,59] or Mel spectrograms [60] of cough sounds. The studies above have demonstrated the potential and effectiveness of machine learning for a cough sound-based detection of COVID-19.
Contributions of this work
Due to advancements in signal processing and machine learning technology, today's computers are able to “listen to” sounds and identify acoustic patterns that often remain hidden to human listeners. The rapidly growing field of machine listening aims to teach computers to automatically process and evaluate audio content for a wide range of acoustic detection/classification tasks. In the present study, we analyze acoustic differences between cough sounds produced by COVID-19 positive and COVID-19 negative individuals and further explore the feasibility of machine listening techniques to automatically detect COVID-19. On the one hand, we include COVID-19 positive and COVID-19 negative individuals irrespective of the presence or absence of any symptoms associated with COVID-19. On the other hand, this study aims to address the above-mentioned challenges of symptom heterogeneity among COVID-19 positive patients, including asymptomatic COVID-19 infections, as well as the similarity of symptoms to those of other diseases. Thus, we also investigate the isolated scenarios in which all COVID-19 positive and COVID-19 negative individuals show COVID-19-associated symptoms, and in which none of them show any COVID-19-associated symptoms. Data for our experiments are taken from the open COUGHVID database [61], which provides 27,500 cough recordings in conjunction with information about present symptoms. We analyze the acoustic features of the Computational Paralinguistics challengE (ComParE) feature set, which recently achieved good performance for COVID-19 detection from cough sounds [13]. Furthermore, we train an effective (i.e., significantly better than chance level) machine learning classifier based on the extracted ComParE features. Finally, we investigate the contribution of the acoustic features extracted from the cough sounds to the COVID-19 status predictions of the machine learning classifier.
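As an illustration of the extraction step, the following minimal sketch uses the open-source openSMILE toolkit via its Python wrapper to obtain the 6,373 ComParE functionals per recording; the file path is hypothetical, and the exact extraction configuration of our experiments may differ in detail.

```python
# Minimal sketch, assuming the opensmile Python package
# (pip install opensmile); the ComParE_2016 functionals correspond to
# the 6,373 higher-level features referred to in the text.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)
features = smile.process_file("cough.wav")  # hypothetical input file
print(features.shape)                       # (1, 6373)
```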
DISCUSSION
This study considers the presence/absence of COVID-19-associated symptoms when comparing acoustic features extracted from cough sounds produced by COVID-19 positive and COVID-19 negative individuals and when applying machine listening technology to detect COVID-19 automatically. Although the classification performance of the SVM used in our study is significantly better than chance level, some studies report better performances for COVID-19 detection based on cough sounds [11,57].
Brown and colleagues [11] studied three classification tasks: COVID-19/non-COVID, COVID-19 with cough/non-COVID with cough, and COVID-19 with cough/non-COVID asthma cough. The second and the third tasks were based on cough sounds and breath sounds, respectively, whereas the first task was based on both cough and breath sounds. All three tasks achieved performances above 80%, higher than those in our three tasks, which is possibly due to two reasons. Firstly, the number of users in Brown and colleagues [11] is quite small and potentially less representative [75] compared to our study: the number of users for each task is 62/220, 23/29, and 23/18, respectively, whereas the task-wise numbers of samples in our study are 210/1,201, 99/205, and 111/996. Secondly, the first and third tasks utilized breath sounds, which perhaps provide additional discriminative features. Coppock and colleagues [57] trained deep neural networks on the log spectrograms of both cough and breath sounds from the same crowd-sourced dataset as in Brown and colleagues [11] and achieved better results on the three tasks compared with the baseline, in which SVMs processed the ComParE features. The areas under the curve (AUCs) in two of the three tasks are above 82%, and the UARs are above 76%. Similarly, the better performance of this work could be caused by the limited number of participants (26/245, 23/19, and 62/293) and the features from breathing sounds. In addition, an extra task of distinguishing COVID-19 positive and healthy participants without symptoms was set up in Coppock and colleagues [57]. Nevertheless, the performance on COVID-19 positive samples without symptoms was not reported independently. In contrast, such performance is evaluated in our Task 3, which is crucial for preventing COVID-19 transmission. The minor gender and age differences found in detection performance are most probably related to imbalances in the dataset: there are more than twice as many data samples from male as from female participants. With regard to age, it has to be considered that, with increasing age, the likelihood of chronic lung and voice diseases/problems also increases, which might to some extent mask symptoms caused by an acute respiratory disease.
Our study reveals several acoustic peculiarities in COVID-19. As shown in Table 2, a set of low-level descriptors (LLDs) could be helpful for differentiating COVID-19 positive individuals from COVID-19 negative ones. Across the three tasks, there are common LLDs of high relevance according to the effect size in the non-parametric group difference test, namely loudness, RMS energy, MFCCs, psychoacoustic harmonicity, spectral energy, spectral flux, spectral slope, RASTA-filtered auditory spectral bands, and the harmonics-to-noise ratio (HNR). Differences in RMS energy and MFCC-related features between the coughs of COVID-19 positive and COVID-19 negative individuals were also reported by Brown and colleagues [11]. In contrast to Tasks 1 and 3, i.e., the tasks including asymptomatic participants, in our Task 2 all LLD categories of the ComParE set are found to bear relevant acoustic information to distinguish between the two groups. This might be due to an increased acoustic variability of symptomatic coughs (as compared to asymptomatic coughs) being reflected in a wider range of acoustic parameters. In other words, the difference between coughs of symptomatic individuals with COVID-19 and individuals with symptoms caused by any other disease manifests acoustically in more manifold ways than the difference between COVID-19-related and non-COVID-related coughs in a sample that also or exclusively contains asymptomatic individuals. However, the lower number of available cough samples for Task 2 as compared to the other tasks might also cause a biased distribution of feature values. The analyzed feature weights within the linear classification models show consistency with the features’ effect sizes, i.e., most top features according to the effect size also have higher weights in the linear classification models. This is a relevant finding towards the explainability of the applied machine learning approach. As indicated in Table 2, there are fewer top LLDs for Task 3. That might be because LASSO tends to use fewer features due to the nature of L1 regularization. As a combination of LASSO and Ridge, ElasticNet also relies on fewer features compared with Ridge.
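To illustrate how effect sizes and linear model weights can be compared, the following minimal sketch computes a per-feature rank-biserial effect size from a Mann-Whitney U test and contrasts it with the coefficient magnitudes of a Ridge classifier; the toy data, the choice of effect size measure, and the top-10 cutoff are illustrative assumptions, not our exact procedure.

```python
# Minimal sketch with toy data; the rank-biserial correlation is one
# common non-parametric effect size, not necessarily the exact
# statistic used in this study.
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.linear_model import RidgeClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))  # toy stand-in for ComParE features
y = rng.integers(0, 2, 200)         # toy COVID-19 status labels
pos, neg = X[y == 1], X[y == 0]

# Per-feature effect size: rank-biserial correlation derived from U.
u, _ = mannwhitneyu(pos, neg, axis=0)
effect = np.abs(1 - 2 * u / (len(pos) * len(neg)))

# Per-feature weight magnitude of a linear classifier.
Xs = StandardScaler().fit_transform(X)
clf = RidgeClassifier().fit(Xs, y)
weights = np.abs(clf.coef_.ravel())

# Overlap between the two rankings of feature relevance.
top_effect = set(np.argsort(effect)[-10:])
top_weight = set(np.argsort(weights)[-10:])
print(len(top_effect & top_weight), "of the top 10 features overlap")
```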
As both speech and cough sounds are produced by the respiratory system, we herein compare and analyze peculiar acoustic parameters of patients’ speech and cough sounds. For diseases that affect the anatomical correlates of speech production, related studies reported that the peculiar acoustic parameters of the patients’ speech include the fundamental frequency (fo), vowel formants, jitter, shimmer, HNR, and the maximum phonation time (MPT) [15,16,18,19]. Additionally, the peculiar acoustic parameters of voice samples of COVID-19 positive and COVID-19 negative participants were reported to include the fo standard deviation, jitter, shimmer, HNR, the difference between the first two harmonic amplitudes (H1–H2), MPT, cepstral peak prominence [20], mean voiced segment length, and the number of voiced segments per second [21]. Hence, there are common acoustic peculiarities between the voice of COVID-19 patients and patients with some other diseases: fo-related features, jitter, shimmer, HNR, and MPT. In Table 2, several acoustic LLDs of cough sounds have shown potential for distinguishing COVID-19 positive and COVID-19 negative individuals. Particularly, fo, jitter, shimmer, and HNR have high effect sizes in Task 2, i.e., pos+ vs. neg+. The above findings indicate that there are similarities in the acoustic peculiarities of speech and cough sounds of COVID-19 patients.
Limitations
The classification performance reported in our study needs to be interpreted in the light of the well-known challenges of data collection via crowdsourcing, including data validity, data quality, and participant selection bias [76-78].
The COUGHVID database does not allow verification of the participants’ COVID-19 status, as the participants were not asked to provide a copy or confirmation of their positive or negative COVID-19 test. Another limitation is that the participants were not instructed to record the data within a defined time window after the positive or negative COVID-19 test. Therefore, it is possible that some participants recorded their cough at the beginning of their infection, whereas others did the recording towards the end of their infection. Interestingly, the disease stage of COVID-19 was found to influence the nature of the cough (shifting from dry at an early disease stage to more wet at a later disease stage), concomitantly affecting acoustic parameters of the cough [55].
Moreover, the participants were asked to report whether they had respiratory and/or muscle/pain symptoms, but no information on the severity of their symptoms is available. Although the safe recording instructions provided on the web page are reasonable with regard to the transmission of the virus, the suggestion to put the smartphone into a plastic zip bag while recording is suboptimal from an acoustic perspective. Another limitation of our study is that the participants did not receive clear instructions on how to cough, e.g., how often, or whether to take a breath between two coughs. The use of various audio recording devices and settings is inherent to crowdsourcing; however, we expect no bias towards either participant group concerning the recording devices used. We merged files with the same location, age, and gender into a single file to promote the inclusion of only one cough sound file per participant, but we cannot guarantee that our dataset contains only one sample per participant or that we have not mistakenly merged recordings from different individuals living in the same household.
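For illustration, such a reduction step can be expressed in a few lines; the following sketch assumes the metadata is available as a table, and the file name and column names are hypothetical.

```python
# Minimal sketch, assuming the metadata is a pandas DataFrame; file
# name and column names are hypothetical.
import pandas as pd

meta = pd.read_csv("coughvid_metadata.csv")
meta = meta.drop_duplicates(subset=["latitude", "longitude", "age", "gender"])
```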
Our target in this work was to explore the importance of hand-crafted features for automatic COVID-19 detection. Some classifiers, such as k-nearest neighbours, were not utilized, as it is difficult for them to output feature coefficients/importances. Other approaches, such as transfer learning and end-to-end deep learning, were not used, as their inputs are either the original audio waves or simple time-frequency representations; therefore, it is challenging to explain the features’ contribution with these methods. Additionally, we decided to apply a cross-validation scheme due to the small dataset size; thus, testing was not entirely independent of training, as hyper-parameters were optimized on the test partitions.
We decided to employ a single dataset rather than multiple datasets in this study. We selected the COUGHVID dataset because it contains enough data and sufficient meta-information to analyze the effects of COVID-19-related symptoms on acoustic parameters and automatic COVID-19 detection. Coswara [79] was also considered at the start of the experiments; however, its symptom information is not sufficiently complete and well-organized for analyzing the effect of symptoms on detecting COVID-19. We also considered other well-structured data, including the University of Cambridge dataset collected by the COVID-19 Sounds app [11], the diagnosis of COVID-19 using acoustics (DiCOVA) 2021 challenge data [64], and the INTERSPEECH ComParE 2021 challenge data [13]. However, these databases did not provide (sufficient) symptom information.
CONCLUSIONS
In this study, we acoustically analyzed cough sounds and applied machine listening methodology to automatically detect COVID-19 on a subset of the COUGHVID database (1,411 cough samples; COVID-19 positive/negative: 210/1,201). Firstly, the acoustic correlates of COVID-19 cough sounds were analyzed by means of conventional statistical tools based on the ComParE set containing 6,373 acoustic higher-level features. Secondly, machine learning models were trained to automatically detect COVID-19 and evaluate the features’ contribution to the COVID-19 status predictions. A number of acoustic parameters of cough sounds, e.g., statistical functionals of the root mean square energy and Mel-frequency cepstral coefficients, were found to be relevant for distinguishing between COVID-19 positive and COVID-19 negative cough samples. Among several linear and non-linear automatic COVID-19 detection models investigated in this work, Ridge linear regression achieved a UAR of 0.632 for distinguishing between COVID-19 positive and COVID-19 negative individuals irrespective of the presence or absence of any symptoms and, thus, performed significantly better than chance level. With regard to explainability, the best performing machine learning models were found to have put higher weight on acoustic features that yielded higher effects in conventional group difference testing.
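For clarity, the UAR reported above is the recall averaged over classes with equal weight, which is informative here because the two classes are heavily imbalanced (210 vs. 1,201 samples); a minimal sketch using scikit-learn with toy labels:

```python
# Minimal sketch of the evaluation metric, using scikit-learn: UAR is
# the unweighted mean of per-class recalls, so 0.5 is chance level for
# two classes regardless of class imbalance.
from sklearn.metrics import recall_score

y_true = [1, 1, 0, 0, 0, 0]  # toy labels (1 = COVID-19 positive)
y_pred = [1, 0, 0, 0, 1, 0]  # toy predictions
uar = recall_score(y_true, y_pred, average="macro")
print(uar)  # (0.5 + 0.75) / 2 = 0.625
```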
OUTLOOK
Automatic COVID-19 detection from cough sounds can be helpful for the early screening of COVID-19 infections, saving time and resources for clinics and test centers. Specifically, machine listening applications distinguishing between cough samples of symptomatic COVID-19 positive individuals and those of individuals with other diseases could advise patients to stay at home and contact their doctor by phone before entering clinics/hospitals to meet medical professionals. This would help to prevent the spread of the virus in an especially vulnerable population. By distinguishing between cough samples produced by asymptomatic COVID-19 positive and COVID-19 negative individuals, an easy-to-apply instrument, such as a mobile application or a hand-held testing device, could help to prevent the unconscious transmission of the virus by asymptomatic COVID-19 positive individuals.
From our point of view, it is highly important for future studies to specify the symptoms more clearly (e.g., severity estimates, onset time of symptoms), to include additional aspects potentially affecting the cough sound, such as smoking and vocal cord dysfunctions, and to differentiate in the COVID-19 negative group between participants with chronic respiratory diseases such as asthma or COPD and patients with a temporary infection such as the flu. Furthermore, it would be interesting for future studies to acoustically analyze the cough phases separately, as previous studies reported certain phase-specific acoustic peculiarities for wet and dry coughs [44,45].
Moreover, it would be promising to consider additional sound types (e.g., breathing and speech) and to evaluate the physical and/or mental status of COVID-19 positive patients (e.g., anxiety) from speech for comprehensive COVID-19 detection and status monitoring applications [80,81].
From the perspective of machine learning, feature selection methods will be investigated to extract only useful features. Deep learning models shall be explored for better performance due to their strong capability of extracting highly abstract representations. Particularly, when developing real-life applications for COVID-19 detection, it will be more efficient to skip the feature extraction procedure by training an end-to-end deep neural network on the input audio signals or time-frequency representations. In addition to explaining linear classification models by analyzing the weights of the acoustic features, as done in this study, explaining deep neural networks along the time or frequency dimension will need to be investigated to provide a detailed interpretation for each specific cough sound, i.e., when and in which frequency band a cough sound shows COVID-19-specific acoustic peculiarities. For this purpose, a set of approaches could be employed, e.g., local interpretable model-agnostic explanations (LIME) [82], Shapley additive explanations (SHAP) [83], and attention mechanisms [84,85].
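As an example of one of these approaches, the following minimal sketch applies SHAP to a linear classifier; the model and feature matrix names are hypothetical (e.g., carried over from the earlier Ridge sketch), and the shap package is an assumption rather than part of this study's toolchain.

```python
# Minimal sketch, assuming the shap package and a fitted linear model
# `clf` with standardized features `Xs` (e.g., from the earlier Ridge
# sketch); names are hypothetical.
import shap

explainer = shap.LinearExplainer(clf, Xs)    # Xs serves as background data
shap_values = explainer.shap_values(Xs[:5])  # per-feature contributions
print(shap_values.shape)                     # (5, number_of_features)
```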