Computer-Aided Diagnosis of Speech Disorder Signal in Parkinson’s Disease

Computer-aided diagnosis (CAD) can serve as a decision support system for physicians in the diagnosis and treatment of disordered speech, especially those who specialize in neurophysiological diseases. Parkinson's disease (PD) is a progressive disorder of the nervous system that affects movement. It develops gradually, sometimes starting with a barely noticeable tremor in speech. About 80% of persons with PD have been found to report speech and voice disorders. Symptoms worsen as the condition progresses: speech may become soft or slurred, and these deficits in speech intelligibility affect health status and quality of life. Researchers are currently analyzing the speech signals of people with PD along several dimensions, including phonation, articulation, prosody, and intelligibility. Here, we present the characteristics and features of normal speech and of speech disorders in people with PD, and the types of classification used to evaluate the efficacy of treatment interventions. The results show that our classification algorithm using an artificial neural network (ANN) outperformed K-nearest neighbor (KNN) and support vector machine (SVM) classifiers. ANN is a practical and useful predictive tool for PD screening, with a correct detection rate of approximately 96.1% (sensitivity 94.7%, specificity 96.6%). Based on the high accuracy obtained, the proposed algorithm can be used to help discriminate PD patients from healthy people, and clinicians may use it as a tool to confirm their diagnosis.


INTRODUCTION
Many neurologic diseases [1] affect speech production, such as Parkinson's disease (PD), stroke [2], and multiple sclerosis [3]. People with Parkinson's develop problems with speech and communication that can include lack of volume, a hoarse voice, lack of expression, and indistinct speech [4]. Interestingly, perceptual speech ratings differ between male and female PD speakers, a difference previously reported for some distinctive measures of acoustic speech analysis [5].
Although Parkinson's disease cannot be cured, medications may markedly improve its symptoms. Occasionally, early classification of speech abnormalities is critical for a speech-language therapist (SLT) to regulate certain regions of the brain and improve symptoms [6]. Schulz and Grant [7] reviewed the treatment approaches available for people with PD up to the time of their manuscript and examined the effects of these treatments on speech. The treatment methods reviewed included speech therapy, pharmacological intervention, and surgical procedures. A well-studied technique for improving vocal volume is the Lee Silverman Voice Treatment (LSVT). LSVT is an intensive program of therapy that aims to improve intelligible oral communication by teaching speakers to increase their vocal loudness using tasks designed to maximize voice and breathing functions [8].
Many methods for pathological voice classification have been proposed using this database. In general, pattern recognition of speech signals is based on analyzing how features vary across data that include normal and pathological voices [9]. The choice of features is generally driven by the accuracy of classification, the time needed for classification, and the cost of performing classification.

Data Extraction
The present study used the PD database of the National Center for Voice and Speech (NCVS), which comprises 31 people, of whom 23 have PD, and 10 healthy controls, an extension of the database used in [15]. The 10 healthy controls, 4 males and 6 females, had an age range of 46 to 72 years (mean ± standard deviation: 61 ± 8.6 years). The dataset was created by Max Little [16] of the University of Oxford, in collaboration with the NCVS, Denver, Colorado, which recorded the speech signals. The original study published the feature extraction methods for general voice disorders. Each column in the dataset is a particular voice measure, and each row corresponds to one of 195 voice recordings from these individuals ("name" column).
The main aim of the data is to discriminate healthy people from those with PD, according to the "status" column, which is set to 0 for healthy and 1 for PD. The data are in ASCII CSV format. Each row of the CSV file contains an instance corresponding to one voice recording. There are around six recordings per patient; the name of the patient is given in the first column.
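The file layout described above can be illustrated with a minimal Python sketch. The rows, the feature column names (feature_a, feature_b), and the "<subject>_<recording>" naming pattern are invented for illustration and are assumptions, not the actual contents of the dataset; only the "name" and "status" columns come from the description above.

```python
import csv
import io
from collections import Counter

# Hypothetical miniature of the CSV layout: a "name" column, voice measures,
# and a "status" column (0 = healthy, 1 = PD). All values are invented.
SAMPLE_CSV = """name,feature_a,feature_b,status
subj01_rec1,0.0051,21.3,1
subj01_rec2,0.0049,20.9,1
subj02_rec1,0.0021,26.1,0
subj02_rec2,0.0023,25.7,0
subj02_rec3,0.0020,26.4,0
"""


def load_recordings(text):
    """Parse CSV text into recording names, feature rows, and 0/1 labels."""
    reader = csv.DictReader(io.StringIO(text))
    names, rows, labels = [], [], []
    for record in reader:
        names.append(record["name"])
        labels.append(int(record["status"]))
        # Every column other than "name" and "status" is treated as a feature.
        rows.append([float(value) for key, value in record.items()
                     if key not in ("name", "status")])
    return names, rows, labels


def recordings_per_subject(names):
    """Count recordings per subject, assuming names look like '<subject>_<recording>'."""
    return Counter(name.rsplit("_", 1)[0] for name in names)


names, rows, labels = load_recordings(SAMPLE_CSV)
counts = recordings_per_subject(names)
```

Grouping recordings by subject in this way also makes it easy to check how many recordings each person contributes before any split into training and testing data.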
We divided the dataset equally into training data and testing data. The training set is used to design the classifier on known data, while the testing set is used to assess performance on unseen data. We used three classification methods, artificial neural network (ANN) [17], K-nearest neighbor (K-NN) [18], and support vector machine (SVM) [19], implemented in Matlab, to distinguish between normal and abnormal speech.
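As a rough illustration of the half-and-half split followed by a nearest-neighbor classifier, the following Python sketch may help. It is not the Matlab implementation used in the study: the per-class (stratified) shuffling, the fixed seed, the squared Euclidean distance, and the toy data are all assumptions made for the example.

```python
import random


def split_half(rows, labels, seed=0):
    """Shuffle within each class and send half of each class to training.

    Stratifying by class is an assumption of this sketch.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        half = len(idx) // 2
        train_idx += idx[:half]
        test_idx += idx[half:]

    def pick(indices):
        return [rows[i] for i in indices], [labels[i] for i in indices]

    return pick(train_idx), pick(test_idx)


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = [y for _, y in ranked[:k]]
    return max(set(votes), key=votes.count)


# Invented, well-separated toy data: class 0 near the origin, class 1 near (5, 5).
rows = [[0.1 * i, 0.1 * i] for i in range(10)] + \
       [[5 + 0.1 * i, 5 + 0.1 * i] for i in range(10)]
labels = [0] * 10 + [1] * 10

(train_x, train_y), (test_x, test_y) = split_half(rows, labels)
predictions = [knn_predict(train_x, train_y, q, k=1) for q in test_x]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
```

With real voice measures, the features would normally be rescaled before computing distances, since they are recorded in very different units.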

Selection of features from the speech signals
A speech signal has several types of features: features that describe phonation [20], features that describe speech intensity, features that describe speech quality, and non-linear dynamic features. We selected features that describe phonation, which is the vibration of the vocal folds to create sound, such as the fundamental frequency (F0) and 5 types of jitter (e.g. Pitch Perturbation Quotient, PPQ). The features that describe speech intensity are 6 types of shimmer (e.g. Amplitude Perturbation Quotient, APQ). The features that describe speech quality are the harmonic-to-noise ratio (HNR) and the noise-to-harmonic ratio (NHR). The set of non-linear dynamic features includes correlation dimension (D2), Recurrence Period Density Entropy (RPDE), Detrended Fluctuation Analysis (DFA), and Pitch Period Entropy (PPE). DFA is a method for determining the statistical self-affinity of a signal.
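To make the jitter and shimmer families more concrete, the sketch below computes the simplest "local" variant of each: the mean absolute difference between consecutive cycles divided by the mean value, applied to pitch periods (jitter) and to peak amplitudes (shimmer). This is a common textbook definition and is offered only as an assumption about the general idea; the MDVP measures in the dataset, such as PPQ and APQ, use smoothed variants that are not reproduced here, and the cycle values are invented.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Invented glottal cycle measurements for a short sustained vowel.
periods_ms = [10.0, 10.2, 9.8, 10.0]   # duration of each pitch period
amplitudes = [1.0, 0.9, 1.1, 1.0]      # peak amplitude of each period

jitter_local = relative_perturbation(periods_ms)    # cycle-to-cycle period variation
shimmer_local = relative_perturbation(amplitudes)   # cycle-to-cycle amplitude variation
mean_f0_hz = 1000.0 / (sum(periods_ms) / len(periods_ms))  # F0 from the mean period
```

Both measures are dimensionless ratios and are often reported as percentages by multiplying by 100.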
Most speech features are based on the Multidimensional Voice Program (MDVP), a computer program that analyzes various aspects of voice and can detect abnormal voice patterns in patients with upper airway pathology [21]. The features selected from our dataset are shown in Table 1. All features of the speech dataset serve as input to a set of statistical classifiers. The general algorithm applied in this work is depicted in Figure 1.

RESULTS
Classification of the speech dataset may be a useful tool for diagnosing speech disorders, as well as for evaluating response to treatment. We extracted 22 features from the speech signals to form the feature vectors and applied them to three classification methods. The t-test [22] was used to test the significance of each feature for classifying the different speech types. Student's t-distribution, or simply the t-distribution, is a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown. For the ANN, we used a feed-forward network with 15 hidden neurons. For KNN, different numbers of nearest neighbors were tried, and the best accuracy was obtained at k=1, as seen in Figure 2. The SVM is a powerful classification technique from statistical learning theory that was developed as a binary classifier.
We evaluated the classification performance of the three classifier configurations on the testing set. The sensitivity, specificity, and accuracy of each classifier on the testing set are given in Table 2. The results confirm that the ANN classifier provides the best accuracy, sensitivity, and specificity. The SVM classifier shows the lowest specificity, and the KNN and SVM classifiers show the lowest sensitivity. In particular, the specificity of ANN is noticeably higher than that of the other classifiers.
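The t-test screening of features mentioned above could look roughly like the following Python sketch. It computes the pooled-variance (Student) two-sample t statistic for each feature between the healthy and PD groups and ranks features by its absolute value. Which t-test variant the study used is not stated, so the pooled form is an assumption, and the feature names and values are invented.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student two-sample t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_value = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t_value, dof


# Invented values of two hypothetical features for healthy and PD recordings.
features = {
    "feature_shifted": {"healthy": [1.0, 2.0, 3.0, 4.0, 5.0],
                        "pd": [3.0, 4.0, 5.0, 6.0, 7.0]},
    "feature_flat": {"healthy": [1.0, 2.0, 3.0, 4.0, 5.0],
                     "pd": [1.0, 2.0, 3.0, 4.0, 5.0]},
}

t_values = {name: pooled_t_statistic(groups["healthy"], groups["pd"])[0]
            for name, groups in features.items()}

# Features with a larger |t| separate the two groups more clearly.
ranking = sorted(t_values, key=lambda name: abs(t_values[name]), reverse=True)
```

In practice each |t| would be compared with the critical value of the t-distribution for the given degrees of freedom to decide whether a feature is significant.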
The results show that our classification algorithm using ANN outperformed KNN and SVM. ANN is a practical and useful predictive tool for PD screening, with a correct detection rate of approximately 96.1% (sensitivity 94.7%, specificity 96.6%). Our results are also better than those in [14] for recognizing the speech signals of PD patients.
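For reference, accuracy, sensitivity, and specificity follow directly from the counts of true and false positives and negatives on the testing set. The Python sketch below shows the arithmetic on invented labels (1 = PD, 0 = healthy); the numbers are illustrative only and are not the study's predictions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) with 1 as the positive (PD) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented test-set labels and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = screening_metrics(y_true, y_pred)
```

Sensitivity reflects how many PD recordings are detected, while specificity reflects how many healthy recordings are correctly left unflagged.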

DISCUSSION
Some people with PD experience changes in cognition and language that make it difficult to think quickly, to manage multiple tasks, to find words, or to understand complex sentences. Voice changes may make a person's speech less precise and more difficult to understand. Some individuals with PD may speak more slowly. Others, perhaps 10 percent, accelerate their speech so much that they stumble over sounds and seem to be stuttering. CAD algorithms have therefore been developed to help clinicians give an accurate diagnosis and to reduce the number of wrong decisions regarding PD. In other words, clinicians may use them as a tool to confirm their diagnosis.
Over the years, there has been an improvement in the detection algorithms but their performance is still not perfect. Another issue is extracting and selecting appropriate features that will give the best classification results. Furthermore, the choice of a classifier has a great influence on the final result and classifying abnormalities of speech is a difficult task even for expert clinicians. Further developments in each algorithm step are required to improve the overall performance of computer-aided detection and diagnosis algorithms.

Measuring Performance with ROC Curves
Receiver Operating Characteristic (ROC) curves are a common approach to assessing the performance of a classifier [23]. ROC analysis counts the true positives, true negatives, false positives, and false negatives at each possible cut point, from which accuracy, sensitivity, and specificity follow. The curve plots the true positive rate against the false positive rate across these cut points. The area under the curve (AUC) is a summary statistic that measures discrimination, that is, the ability of the model to separate the positives from the negatives.
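The construction of an ROC curve and its AUC can be sketched in a few lines of Python. The example assumes that each recording receives a continuous score where higher means "more likely PD"; the scores and labels are invented, and the trapezoidal rule is one common way to compute the area.

```python
def roc_points(labels, scores):
    """(false positive rate, true positive rate) points as the cut point is lowered."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Invented classifier scores (higher = more likely PD) and true labels (1 = PD).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

curve = roc_points(labels, scores)
auc = area_under_curve(curve)
```

An AUC of 1.0 corresponds to perfect separation of the two groups, and 0.5 to a classifier that ranks recordings no better than chance.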

CONCLUSION
Speech disturbance is a common symptom in individuals with PD. CAD may be defined as diagnosis in which the clinician takes the computer output into account as a second opinion. The purpose of CAD is to improve diagnostic accuracy and the consistency of speech signal interpretation. It should be of particular value in enabling the early detection of speech disturbance. Based on the high accuracy obtained, our proposed algorithm can be used to improve the discrimination of PD patients from healthy people.
Further work is required to establish appropriate methods of speech investigation that produce objective data, fulfill the demands of validity, reproducibility, and time and cost effectiveness, and best mirror the functional disability of patients.