GitHub - Aastha2104/Parkinson-Disease-Prediction

Introduction

Parkinson’s Disease is the second most prevalent neurodegenerative disorder after Alzheimer’s, affecting more than 10 million people worldwide. Parkinson’s is characterized primarily by the deterioration of motor and cognitive ability. There is no single test that can be administered for diagnosis. Instead, doctors must perform a careful clinical analysis of the patient’s medical history.

Unfortunately, this method of diagnosis is highly inaccurate. A study from the National Institute of Neurological Disorders finds that early diagnosis (having symptoms for 5 years or less) is only 53% accurate. This is not much better than random guessing, yet early diagnosis is critical to effective treatment. Because of these difficulties, I investigate a machine learning approach to accurately diagnose Parkinson’s, using a dataset of various speech features (a non-invasive yet characteristic tool) from the University of Oxford.

Why speech features?

Speech is very predictive and characteristic of Parkinson’s disease: almost every Parkinson’s patient experiences severe vocal degradation (inability to produce sustained phonations, tremor, hoarseness), so it makes sense to use voice to diagnose the disease. Voice analysis has the added benefit of being non-invasive, inexpensive, and very easy to extract clinically.

Background

Parkinson’s Disease

Parkinson’s is a progressive neurodegenerative condition resulting from the death of the dopamine-containing cells of the substantia nigra (which plays an important role in movement). Symptoms include “frozen” facial features, bradykinesia (slowness of movement), akinesia (impairment of voluntary movement), tremor, and voice impairment. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated and 80% of striatal dopamine has been depleted.

Performance Metrics

TP = true positive, FP = false positive, TN = true negative, FN = false negative

Accuracy: (TP+TN)/(P+N), where P = TP+FN and N = TN+FP are the total numbers of actual positive and actual negative instances

Matthews Correlation Coefficient: 1 = perfect, 0 = random, -1 = completely inaccurate

Algorithms Employed

Logistic Regression (LR): Uses the sigmoid logistic equation with weights (coefficient values) and biases (constants) to model the probability of a certain class for binary classification. An output of 1 represents one class, and an output of 0 represents the other. Training the model learns the optimal weights and biases.

Linear Discriminant Analysis (LDA): Assumes that the data is Gaussian and each feature has the same variance. LDA estimates the mean and variance for each class from the training data, and then uses properties of statistics (Bayes’ theorem, the Gaussian distribution, etc.) to compute the probability of a particular instance belonging to a given class. The class with the largest probability is the prediction.

k Nearest Neighbors (KNN): Makes predictions about the validation set using the entire training set. KNN makes a prediction about a new instance by searching through the entire training set to find the k “closest” instances.
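
To make the logistic regression and k nearest neighbors descriptions above more concrete, here is a minimal sketch in plain Python. It is an illustration, not the code used in this project. The function names, the 0.5 decision threshold, the Euclidean distance, and the default k=3 are all assumptions chosen for the example.

```python
import math
from collections import Counter


def logistic_probability(weights, bias, features):
    """Probability of class 1 under a logistic model: sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    # Numerically stable sigmoid: avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def logistic_predict(weights, bias, features, threshold=0.5):
    """Return 1 or 0 by thresholding the probability (threshold is assumed)."""
    return 1 if logistic_probability(weights, bias, features) >= threshold else 0


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training instances closest to `query`.

    "Closest" is taken to mean smallest Euclidean distance (an assumption).
    """
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

In practice the weights and bias would come from training, and the feature rows from the speech dataset. The values passed in here are placeholders supplied by the caller.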
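
The two performance metrics mentioned in this document, accuracy and the Matthews Correlation Coefficient (MCC), can both be computed directly from the four confusion-matrix counts. The sketch below uses the standard MCC formula, (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function names are hypothetical, and returning 0 when the denominator is zero is a common convention assumed here.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Accuracy = (TP + TN) / (P + N), with P = TP + FN and N = TN + FP."""
    return (tp + tn) / (tp + fn + tn + fp)


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Ranges from -1 (completely inaccurate) through 0 (random) to 1 (perfect).
    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

Unlike accuracy, MCC uses all four counts in both numerator and denominator, so it stays informative when one class is much more common than the other.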