New Methodologies for the Analysis of the Speech Signal on Mobile Devices

This research activity aims to study and develop new methodologies for the analysis of the speech signal on mobile devices. The goal is to build a fast, easy-to-use screening tool for detecting possible pathologies of the phonatory apparatus, namely “dysphonia”.

About 29% of the population, and professional voice users such as singers, actors and teachers in particular, has suffered from dysphonia at least once in a lifetime. Even so, voice disorders remain little known to most people, who therefore tend to underestimate the symptoms and delay seeking appropriate treatment, with negative effects on their health and their ability to work.

Dysphonia is an alteration of the sound structure of the voice, understood primarily as a difficulty in controlling the volume or quality of the voice. Its origin may be multifactorial or organic, as with congenital anomalies, lesions of the larynx or gastro-esophageal reflux. Voice disorders can also be associated with unhealthy lifestyles, such as incorrect use of the voice, alcohol abuse and smoking.

Ear, nose and throat (ENT) clinicians and speech therapists usually combine different techniques to evaluate voice pathologies: a laryngovideostroboscopic examination, an anamnestic evaluation, a subjective self-assessment of the voice and an acoustic analysis.

Estimating the fundamental frequency is an essential component of acoustic analysis, as it provides information about the probable presence of voice disorders. For this reason, we have developed a robust methodology for estimating the fundamental frequency of the voice signal by analysing a continuous, five-second vocalization of the vowel /a/. The fundamental steps of the methodology are shown in the figure.

This methodology can also classify the user’s voice as possibly pathological or healthy, and evaluate undesired noise. Such noise can introduce errors in the processing of the signal, compromising classification performance and increasing the potential number of false-positive diagnoses.
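
To make the idea of fundamental frequency estimation more concrete, the following is a minimal sketch of one common textbook approach: picking the peak of the autocorrelation of a short voiced frame. It is not the methodology developed in this research, whose steps are not detailed here. The function name estimate_f0, the 60-500 Hz search range, the 8 kHz sample rate and the synthetic test signal are all assumptions chosen for illustration.

```python
import math


def estimate_f0(samples, sample_rate, f_min=60.0, f_max=500.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    Illustrative autocorrelation peak picking; the search range
    f_min..f_max is an assumed range for adult voices.
    Returns None when the frame is silent or too short.
    """
    n = len(samples)
    if n == 0:
        return None
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove any DC offset

    lag_min = max(1, int(sample_rate / f_max))      # shortest period searched
    lag_max = min(int(sample_rate / f_min), n - 1)  # longest period searched
    energy = sum(v * v for v in x)
    if energy == 0.0 or lag_min >= lag_max:
        return None

    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalised by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_value:
            best_lag, best_value = lag, r

    if best_lag is None:
        return None
    return sample_rate / best_lag


# Synthetic stand-in for a sustained vowel: a 200 Hz tone plus one harmonic.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 200 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 400 * i / sample_rate)
    for i in range(1024)
]
print(estimate_f0(frame, sample_rate))  # expected to be close to 200 Hz
```

On a real five-second recording, such an estimator would typically be applied to successive short frames, and the resulting frame-by-frame values would then be examined for stability. A practical system would also need to handle unvoiced frames and background noise, which this sketch ignores.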