Address correspondence and reprint requests to Secundino Fernandez, University of Navarra, Department of Otorhinolaryngology Head and Neck Surgery, Avda. Pio XII, 36, 31008 Pamplona, Spain.
Medical Engineering Laboratory, School of Medicine, University of Navarra, Spain
Voice Laboratory, Department of Otorhinolaryngology, School of Medicine, University of Navarra, Spain
Many physiological parameters are now recorded by devices that are becoming more affordable, precise and accurate. However, the lack of development in the recording of voice parameters from the physiological or medical point of view is striking, given that the voice is a fundamental working tool for many people and given the high incidence and prevalence of voice pathologies that affect people's communication. In this paper we present a complete literature review of the dosimeters used in voice research, together with a prototype dosimeter and a pilot study that shows its capabilities.
We conducted a literature review using the keywords [MONITORING], [PHONATION], [ACCUMULATOR], [PORTABLE], [DOSIMETRY], [VOICE] in the PubMed, Trip Database, HONcode, and SciELO search engines. Based on our review of dosimeter designs, we created our own prototype consisting of two main components: a Knowles Electronics BU-27135-0000 accelerometer mounted on a neck brace, and the ultra-low power MSP430FR5994 microcontroller. The selected sampling frequency was 2048 Hz. The device calculates the F0 every 250 ms and the amplitude and phonation activity every 31.25 ms. A pilot study was conducted with 2 subjects: one male for 11 days and one female for 14 days.
This work includes devices that have been created during the last 45 years as tools for the diagnosis and monitoring of the treatment of cases of vocal pathology and for the detection of phonatory patterns or risk situations for developing voice disorders or vocal pathologies. We also present recordings with our new device on the pattern of daily talk time, the fundamental frequency and the relative intensity of two subjects on different days.
Interesting work has been done in the development of voice dosimeters with different approaches. In our experience it is not possible to access them for research and they are not yet in clinical use. It is possible that a joint approach with voice and voice disorders professionals and engineers working closely together could take advantage of current technology to develop a fully portable, useful, and efficient system.
The moment subglottic pressure exceeds the interaction of glottal resistance and supraglottic pressures, a wave-like movement of the vocal fold mucosa from bottom to top occurs. Voiced speech is produced by a pressure exchange system that originates in the power system, moves airflow from the lungs through the vocal folds and creates a pressure wave.
These wave-like movements of the vocal folds generate oscillations that are transmitted by the surrounding tissue and reach the surface of the skin.
The objectification and quantification of the main acoustic parameters of voice is a very important method of analysis for laryngology specialists and voice researchers when evaluating disorders of phonation. The precise acoustic characterization of the voice is a highly useful tool both for diagnosing different pathological entities and for verifying the results of different treatments. In addition, acoustic characterization allows for the evaluation of risk patterns in order to prevent future disorders. The analysis of voice disorders using instruments entails the acoustic, aerodynamic, and electroglottographic study of the respiratory, valvular, vibratory, and resonance elements that contribute to creating voice. These analytical tests are carried out using specialized equipment that is usually available in voice laboratories that are part of laryngology clinics and voice research centers. This equipment evaluates fundamental frequency (F0 in Hz), intensity (I in dBSPL), the acoustic spectrogram, vocal fold surface contact, transglottal airflow, subglottic pressure, laryngeal resistance, and other variables of interest. These studies are commonly carried out in facilities at a specific time and with certain means and conditions. Therefore, although the results are very informative, they are specific to that moment and are obtained in specific laboratory conditions. Study and continuous monitoring over long periods of time in patients’ usual surroundings of the aforementioned parameters and other highly relevant measurements, such as the amount of time the voice is used, is restricted to specific cases for research purposes.
In addition to microphonic analysis of the voice's acoustic signal, another method of measurement can be used to evaluate vocal fold phonatory activity during phonation. This method is not habitually used in the clinical setting and is based on quantifying skin vibrations that come from the vocal folds in the laryngeal area using accelerometers.
This method of measurement offers several advantages, including the possibility of quantifying and monitoring different physiological parameters that are a source of highly relevant diagnostic and prognostic information while respecting patients’ privacy in regards to the content of their speech.
To provide an adequate response to patients’ disorders and diseases, it is becoming more and more important to be able to provide individualized diagnosis and treatment. This is now referred to as “personalized medicine.” Though it may commonly be associated only with a generic study of problems, this approach refers to the analysis and control of different parameters that define problems in real-life scenarios and in an individualized manner.
In the healthcare setting, personalized diagnostic studies and treatment as well as follow-up on the patient outside of the hospital setting has a positive impact on patients’ results in terms of diagnostic precision, therapeutic efficiency, and quality of life.
The aim of this work is to present a comprehensive analysis of all dosimetry devices. This review of the devices, their characteristics and their limitations gives an overview of the field and explains why we decided to develop the new equipment that we present here, together with preliminary data obtained with it.
There are three fundamental parameters of voice: fundamental frequency, intensity, and phonation time.
Fundamental frequency (F0)
For adults, the fundamental frequency of voice, or the frequency at which vocal folds vibrate, is between 180-250 Hz in women and 100-150 Hz in men.
Increases or decreases in fundamental frequency beyond the normal range for each individual, made without appropriate technique, can lead to organic lesions.
The literature also states that fundamental frequency varies merely by modifying the conditions of the environment where it is measured, for example by being inside or outside of the laboratory (Rantala et al., 1988). For this reason, it is important to conduct continuous monitoring for an ample period of time in order to obtain conclusive data.
The measurement of fundamental frequency via an accelerometer signal may be carried out in different manners (Wirebrand M, 2011). The fast Fourier transform (FFT)
Intensity (I)
Intensity is the measurement of sound pressure expressed in decibels of Sound Pressure Level (dBSPL), that is, the ratio on a logarithmic scale between the effective sound pressure (P1) and the reference pressure (P0, 20 µPa).
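As an illustration of the definition above, the following minimal sketch converts an effective (RMS) pressure into dBSPL using the standard 20·log10 form for pressure ratios. The function name `spl_db` and the constant `P_REF` are ours and do not come from any of the devices reviewed.

```python
import math

P_REF = 20e-6  # reference pressure P0 in pascals (20 µPa)


def spl_db(p_rms: float) -> float:
    """Return the sound pressure level in dB SPL for an effective pressure in Pa."""
    if p_rms <= 0:
        raise ValueError("effective pressure must be positive")
    # Pressure is a field quantity, so the level uses 20 * log10 of the ratio.
    return 20.0 * math.log10(p_rms / P_REF)
```

With this definition, a pressure equal to the reference gives 0 dBSPL, and every tenfold increase in pressure adds 20 dB.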
A sound level meter is usually used to measure intensity. However, pressure can be correlated with the vibration amplitude (the maximum extent of a vibration or oscillation, measured from the position of equilibrium) measured by the accelerometer and it is thus possible to calculate intensity.
Phonation time (t)
A distinction has been drawn between phonation time and speaking time: the first is the time during which the vocal folds vibrate, and the second includes unvoiced and voiced segments in addition to periods of silence. Phonation time, like fundamental frequency, can be measured from the signal received by the accelerometer (Wirebrand, 2011).
The most common voice disorders are chronic or recurrent as a result of abuse or poor use of the voice.
From these three parameters (F0, I, t), other voice parameters that may be of interest can be calculated, such as the number of vocal cycles or the distance traveled by the vocal folds as calculated by an ambulatory phonation monitor (APM).
Various articles have related mechanical stress of the vocal folds and tissue damage with the three study parameters. Titze introduced the relationship between tension stresses in the vocal ligament and a high risk of damage to this tissue.
That same year, the results of Jiang's work indicated that higher subglottic pressure (positively related to intensity), shorter distance between arytenoid cartilages (positively related to F0) and greater elongation of the vocal cord are independently and positively correlated with stress peaks during phonation.
Other factors related to the number of vocal cycles were described in detail by Titze, Svec and Popolo, calculating a safe limit of 17 minutes of continuous phonation before generating tissue damage under normal conditions.
The most interesting, however, are probably shimmer and jitter. These parameters measure the variability of amplitude (shimmer) and of frequency (jitter), although there are different formulas for calculating each of these parameters.
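Since several formulas exist, the sketch below shows only one common variant, often called "local" jitter and shimmer: the mean absolute difference between consecutive cycles divided by the mean value. It assumes that per-cycle periods and peak amplitudes have already been extracted from the signal; the function names are illustrative.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean.

    Returns a fraction (multiply by 100 for a percentage).
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are needed")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def jitter_local(periods_s):
    """Cycle-to-cycle variability of the period (frequency perturbation)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variability of the peak amplitude."""
    return local_perturbation(peak_amplitudes)
```

A perfectly periodic signal gives a value of 0 for both measures; larger values indicate greater cycle-to-cycle irregularity.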
Using this model, it has recently been verified that it is more precise to measure subglottic pressure than voice intensity. Therefore, measurements obtained from accelerometry provide valuable information.
The first step is to collect the information from all the voice dosimetry devices that have been developed. This work will discuss relevant aspects regarding the usefulness of these systems and will present an analysis of the technical possibilities and design of the different prototypes and equipment. After carrying out a comprehensive search in the various PubMed, Trip Database, HONcode, and SciELO search engines, a total of 23 publications have been evaluated. The search strategy included the words [MONITORING], [PHONATION], [ACCUMULATOR], [PORTABLE], [DOSIMETRY], [VOICE], and their possible combinations. The characteristics and possibilities of 19 different devices were discussed in the 23 works analyzed. The oldest work was published in 1974. Most of the works are from after 2000. The majority are research projects that involved the design and construction of different pieces of equipment and prototypes that have not been used in clinical or speech therapy settings or as an additional tool for the diagnosis and monitoring of different phonatory patterns and vocal disorders in patients with phonatory disorders or voice professionals.
After conducting a literature review of dosimetry, we designed our own version which we believe improves existing equipment because the processing is done on the device with low-power components that allow the possibility of developing wearable equipment measuring the main voice parameters.
The prototype device consists of two main components: an accelerometer mounted on a neck holder, which transforms the undulations of the vocal cords transmitted through the skin into an analog electrical signal; and the microcontroller, which performs signal acquisition and analysis.
We used the BU-27135-0000 accelerometer from Knowles Electronics because of its small form factor (7.92 mm × 5.59 mm × 2.24 mm), its unidirectional sensitivity and its flat frequency response between 20 and 2000 Hz.
The Texas Instruments ultra-low power MSP430FR5994 microcontroller was used. Its main features include a low-consumption math coprocessor, called LEA (Low Energy Accelerator), capable of performing the FFT and linear filtering that are essential in our design, as well as low base energy consumption and additional low-consumption modes.
The device was powered by a 5 V, 2000 mAh battery, and the processed data were stored on a micro SD card. A 3D-printed neck holder was used to place the sensor on the skin in the suprasternal region under the larynx.
The selected sampling rate was 2048 Hz. The device calculates the F0 every 250 ms and the amplitude and phonation activity every 31.25 ms.
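The timing figures above translate into analysis windows of 512 samples for F0 and 64 samples for amplitude and phonation activity, which gives a spectral resolution of 4 Hz per bin. The following sketch illustrates one possible spectral-peak F0 estimate under those figures. It is not the firmware of the prototype: the search band (60 to 500 Hz), the plain-Python DFT and the function names are assumptions made for the example, and a simple peak search of this kind can lock onto a harmonic instead of the fundamental.

```python
import cmath
import math

FS_HZ = 2048                 # sampling rate stated in the text
F0_WINDOW_S = 0.250          # F0 is computed every 250 ms
ACTIVITY_WINDOW_S = 0.03125  # amplitude and activity every 31.25 ms

F0_WINDOW_SAMPLES = int(FS_HZ * F0_WINDOW_S)              # 512 samples
ACTIVITY_WINDOW_SAMPLES = int(FS_HZ * ACTIVITY_WINDOW_S)  # 64 samples


def estimate_f0(frame, fs=FS_HZ, fmin=60.0, fmax=500.0):
    """Return the frequency (Hz) of the largest DFT bin between fmin and fmax.

    The band limits are illustrative assumptions, not values from the paper.
    """
    n = len(frame)
    mean = sum(frame) / n
    centered = [x - mean for x in frame]  # remove the DC offset
    k_min = max(1, math.ceil(fmin * n / fs))
    k_max = min(n // 2, math.floor(fmax * n / fs))
    best_k, best_mag = k_min, -1.0
    for k in range(k_min, k_max + 1):
        acc = 0j
        for i, x in enumerate(centered):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k * fs / n  # bin index converted to Hz (fs / n per bin)


def frame_rms(frame):
    """Root-mean-square amplitude of one short frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))
```

On an embedded target the inner loop would be replaced by a hardware-assisted FFT; the pure-Python DFT is used here only to keep the example free of dependencies.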
As a preliminary test, 2 subjects (a woman and a man) used the device for 14 and 11 days, respectively. The equipment was returned at the end of each recording period to verify that the recording had been correct. As a precaution, the data were downloaded and the battery recharged each day.
Subject 1 was a 46-year-old female and subject 2 a 26-year-old male.
Both subjects were asked to use their voices normally for as long as they wore the device. They were also asked to use their voice in an unusual way on one of the days: during a day of their choice, they spoke with a higher intensity and/or frequency than usual.
For each of these days, F0, amplitude and phonation time were analyzed.
Voice dosimetry devices
Devices utilized in past literature
The principal data drawn from the different works are summarized in Table 1. It was not always possible to establish all relevant data regarding technical characteristics or the operation of the different devices. In addition to indicating the authors’ names and the reference to the work, Table 1 indicates the name of the device, if any; whether it is a monitoring or response device; the type of sensor used (microphone or accelerometer); the parameters recorded (frequency, intensity, phonation time); the possible recording time; the sampling frequency; the interval or subinterval of analysis; the size; the weight; and the cost. The majority of the devices have been designed for monitoring phonation. It is notable that only three of the devices use an accelerometer alone as a sensor. A microphone or contact microphone was used as the sensor in thirteen, and the combination of a microphone and an accelerometer was used in three. This fact is relevant given that collecting a microphone signal for analysis, regardless of how it is processed, does not protect the privacy and confidentiality of the content of the speech and recordings made. Most of the devices have been designed to record the amplitude or intensity of phonation (17/19) and phonation time (16/19). Approximately half can record phonation frequency (10/19) and only seven can record all three parameters: amplitude, frequency, and phonation time.
Among the devices studied, we found only four commercial devices that have attempted to meet the most salient requirements. These devices are: “Ambulatory Phonation Monitor” (KayPENTAX, USA), VoxLog (Sonvox AB, Sweden), VocaLog (Griffin Laboratories, USA), and Voice-Care (PR.O. VOICE, Italy). Currently, only VoxLog, VocaLog, and Voice-Care appear to still be marketed, although they are very difficult to obtain. Perhaps the high cost (between 500 and 5,000 dollars) and problems of a practical nature (size, weight, sensor fixation system, etc.) have contributed to the fact that their use is aimed mainly at research and less at the clinical applications that these devices undoubtedly have. For this reason, we are working on the development of a new voice dosimeter.
Measures extracted and limitations
Frequency and phonation time can be extracted directly from sensor measurements whereas voice intensity must be estimated based on the amplitude of the vibrations recorded by the device and a calibration process.
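The calibration step can be pictured as fitting a mapping between the vibration level seen by the sensor and the sound level measured simultaneously with a reference sound level meter. The sketch below assumes, purely for illustration, that this mapping is linear in the decibel domain; the reviewed devices may use other procedures, and all names are hypothetical.

```python
import math


def amplitude_to_db(amplitude):
    """Express a vibration amplitude as a level in dB relative to 1 unit."""
    return 20.0 * math.log10(amplitude)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("calibration needs at least two distinct levels")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_spl(amplitude, slope, intercept):
    """Estimate dBSPL from a vibration amplitude using the fitted line."""
    return slope * amplitude_to_db(amplitude) + intercept
```

In use, the subject would phonate at several loudness levels while both signals are recorded; `fit_line` is applied once to those pairs, and `estimate_spl` is then applied to every later amplitude reading.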
Table 2 shows the most important characteristics of the four devices that have been marketed for voice dosimetry in regards to the three basic variables. In addition, based on the phonation time and fundamental frequency, the APM calculates the wave cycles. With them, and knowing the mean vocal fold path, the distance traveled by the vocal folds can be calculated.
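The arithmetic described above can be sketched as follows: each voiced analysis frame contributes F0 multiplied by its duration to the cycle count, and the distance is the number of cycles multiplied by the mean path covered per cycle. The frame duration and the path value in the usage note are placeholders rather than values taken from the APM documentation.

```python
def cycle_count(f0_hz_per_frame, frame_s):
    """Number of vocal fold cycles, summing F0 * duration over voiced frames.

    Unvoiced frames are represented by None and contribute nothing.
    """
    return sum(f0 * frame_s for f0 in f0_hz_per_frame if f0 is not None)


def distance_travelled_m(cycles, path_per_cycle_m):
    """Distance covered by the vocal folds for a given mean path per cycle."""
    return cycles * path_per_cycle_m
```

For example, three voiced 250 ms frames at 200, 100 and 200 Hz give 125 cycles; with a hypothetical mean path of 4 mm per cycle, that corresponds to 0.5 m.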
With regard to the response to frequency and intensity, the devices emit warnings when the parameter associated with the response exceeds pre-established limits. The purpose of this response is to keep the user from surpassing certain limits for a specific voice variable.
Notably, because it includes a built-in microphone, the VoxLog device is capable of measuring ambient sound and of autocalibrating intensity.
Voice dosimetry or monitoring devices are divided into monitoring and response devices, although some fit in both categories. The objective of monitoring devices is to record the measured parameters for later analysis. Response devices are those that, according to the real-time measurements and the limits set in the device, respond with some type of stimulus (acoustic or vibratory) in order to notify the user that the pre-established limit has been surpassed.
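The logic of a response device can be illustrated with a simple threshold rule. In the sketch below, a stimulus is triggered when the measured level stays above the limit for a number of consecutive frames; this persistence requirement is an assumption added to avoid reacting to isolated peaks and is not described in the reviewed works.

```python
def response_events(levels_db, limit_db, min_consecutive=3):
    """Return the frame indices at which a warning stimulus would be triggered.

    A warning fires once per run, when `min_consecutive` successive frames
    have exceeded `limit_db`.
    """
    events = []
    run = 0
    for i, level in enumerate(levels_db):
        if level > limit_db:
            run += 1
            if run == min_consecutive:
                events.append(i)
        else:
            run = 0  # the run is broken, start counting again
    return events
```

A monitoring device would simply store `levels_db`, whereas a response device would act on each returned index in real time.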
The majority of these devices, as stated above, use a microphone to record the signal. With the work of Cheyne et al., accelerometers started to be used to make these recordings.
Just as they differ in their data acquisition systems, they also differ in the localization of the sensor that records these data. Microphones can be suspended in front of the mouth or on the skin in the suprasternal region under the larynx (contact microphones). Accelerometers are all placed on the suprasternal area below the thyroid cartilage.
Another important factor that differentiates monitoring devices is the parameters they analyze. Frequency, intensity, and phonation time are the most commonly analyzed parameters, but every device focuses on different combinations of them; they may measure one, two, or three. Some devices calculate other parameters based on these measurements.
Most of the publications using voice dosimeters are related to teachers, given that the tool they use in their work is their voice. One article relates the problems teachers have with their voices to the vocal loading that their phonatory system undergoes. This same article reports that the differences between the two groups that were studied are greater at the end of every day and at the end of the week. Years later, a similar study came to the same conclusions.
Hunter & Titze documented voice use in 57 teachers for two weeks. With the information gathered, they verified that phonation time during working hours (29.9% of each hour) was more than double that of non-working hours, the intensity was 2.5 dBSPL greater, and natural frequency was between 1 and 1.5 tones higher.
Although teachers are the most affected group and thus the most studied, there are other professions that have increased vocal loading. They include speech therapists, trainers, nurses, telephone operators, and receptionists.
The use of a voice recording system with response developed by R. McGillivray et al. on a child population in 1993 shows that in just six 20 to 30 minute sessions, a child achieved the objective of maintaining an intensity of less than 65 dBA (McGillivray et al., 1994). Van Stan et al. described the use of response on alternate days. The results show that on days when the response device was active, intensity decreased. However, this decrease was not retained: on days without a response, the device recorded values similar to the initial ones.
The study reports that 11 of the 32 patients who wore the device experienced complete recovery (polyps, nodules, or contact ulcers) and that another eight reduced the size of their lesions, avoiding surgery.
The voice study carried out by Horii & Fuller (1990) found that following relatively short intubation (1.5-23.5 hours), shimmer and jitter values were greater in sustained vowels. Furthermore, following these intubations, the mean fundamental frequency and standard deviation of the reading increased. More recently, the voice has been studied before and after laryngeal surgery. Rest is recommended for patients who undergo surgery, and these voice dosimetry devices allow this rest to be monitored. Indeed, Misono et al. verified that study patients reduced phonation time from 29% to 12% and intensity from 66.9 to 64.5 dBSPL following surgery.
Subject 1 made 14 recordings of between 4.38 and 7.17 hours. The daily speaking time varied between 0.24 and 1.03 hours. The day on which subject 1 performed an unusual pattern was day 7 (Figure 1A).
The F0 was 231.39 ± 55.62 Hz over all days and 220.53 ± 23.78 Hz when the unusual-pattern day was excluded. The relative intensity was 42.50 ± 2.26 m/s² overall and 42.17 ± 1.87 m/s² excluding day 7. On the unusual-pattern day, the F0 was 320.81 ± 122.43 Hz and the relative intensity was 45.17 ± 3.12 m/s² (Figure 2A and C).
Subject 2 made 11 recordings of between 2.76 and 6.39 hours with daily speaking time between 0.1 and 0.73 hours. Subject 2 performed an unusual pattern on day 9 (Figure 1B).
The F0 was 108.10 ± 17.75 Hz over all days and 105.76 ± 11.36 Hz when the unusual-pattern day was excluded. The relative intensity was 43.68 ± 2.26 m/s² overall and 43.70 ± 2.30 m/s² excluding day 9. On the unusual-pattern day, the F0 was 148.91 ± 42.72 Hz and the relative intensity was 43.34 ± 1.41 m/s² (Figure 2B and D).
Data can also be displayed as the percentage of phonation time per interval to show voice usage over time (Figure 3).
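This kind of display can be derived from the per-frame phonation activity flags. The sketch below groups the frames into fixed intervals and reports the percentage of voiced frames in each; the one-hour default interval is an assumption for illustration, and the interval used for Figure 3 may differ.

```python
def phonation_percent_by_interval(voiced_flags, frame_s=0.03125, interval_s=3600.0):
    """Percentage of voiced frames in each consecutive interval.

    voiced_flags: one boolean per analysis frame (True when phonation is detected).
    The last interval may be shorter than the others.
    """
    frames_per_interval = round(interval_s / frame_s)
    percentages = []
    for start in range(0, len(voiced_flags), frames_per_interval):
        chunk = voiced_flags[start:start + frames_per_interval]
        voiced = sum(1 for flag in chunk if flag)
        percentages.append(100.0 * voiced / len(chunk))
    return percentages
```

Plotting the returned list against the interval start times gives a profile of voice use across the recording.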
It seems evident that studying the principal voice parameters and phonatory patterns of people with voice problems, those exposed to vocal overloading, and voice professionals (actors, announcers, singers, teachers, etc.) over days or even weeks, during both working and resting hours, would be a very powerful tool for guiding and refining diagnoses and treatments.
Despite the fact that several devices have been developed in the last 45 years for this purpose, nowadays, speech therapists, otorhinolaryngologists, singing teachers, and voice and voice disorder researchers do not use this type of device in their regular practice. There are multiple reasons for this, including technical limitations, size, patient confidentiality or privacy safeguarding, price, manufacturing, and more. Some of the devices presented in this work are mere prototypes within a research work or have been designed for a specific working group. The majority have been proposed from the engineering field and few from the point of view of voice experts (speech therapists, otolaryngologists, etc.).
The prototype that we present here has components that make it potentially wearable. It offers the possibility of objectifying and quantifying the phonation time (Figure 1) and allows identifying the days with different patterns than normal (Figure 2). In addition, it allows deepening the study of the use of the voice, being able to observe the percentage of phonation time throughout the record and assess the vocal load in the usual environment of the study subject (Figure 3).
CONCLUSION AND LIMITATIONS
Interesting work has been done in the development of voice dosimeters with different approaches. In our experience it is not possible to access them for research and they are not yet in clinical use.
It is possible that a joint approach with both engineers and professionals devoted to voice and voice disorders working closely together could take advantage of current technology in order to develop an entirely wearable system. This system would be portable, perfectly adapted to the patient, and respect the confidentiality and privacy of the speech by not recording acoustic signals with a microphone but rather vibrations. It would allow for longer recordings with immediate processing and storage of the signal. It would have low battery consumption and a very affordable price so that its use would become routine in different clinical settings, speech therapist offices, singing schools, occupational medicine offices, and more.
Unfortunately, we have not had access to the equipment that has been commercialized to be able to make a comparison and we have had to rely exclusively on the information found in the bibliography. On the other hand, although the case with two subjects is illustrative of the potential of the device, we hope to be able to expand these records in the future.
No ethical objections were observed by the Ethics Committee of the University of Navarra.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
The authors received no financial support for the research, authorship, and/or publication of this article.
This study has been supported by the “Asociación de Amigos de la Universidad de Navarra”.