Below is a short summary of this project. All relevant Python code can be found in my GitHub repository.
Overview
Natural language processing and machine learning techniques have been successfully employed in predicting and identifying psychiatric illness, especially schizophrenia (see references 1 and 2).
In this project we use similar techniques but turn our attention to Bipolar Disorder (BD) and attempt to identify its multiple states/episodes by analyzing verbal fluency tasks performed by patients.
Word similarity measures were extracted using the fastText model developed by Facebook’s AI Research (FAIR) lab, and supervised classification was performed using logistic regression, multinomial naive Bayes, and support-vector machines.
The Data
The data consisted of verbal fluency tasks, which are psychological tests in which participants have to produce as many words as possible under specific constraints. The tasks were performed at the University Hospital of Strasbourg by 140 subjects with BD and 31 control subjects. A total of nine verbal fluency tasks were administered in French. Six of these were associational, i.e. the patients were given an initial cue word (courage, debut, douleur, piscine, royaume, serpent).
The remaining three consisted of:
- a free fluency trial (flib) where they had to produce as many words as possible;
- a letter fluency trial (flit), where all words should start with the letter ‘p’;
- and a category fluency trial (fcat), where patients had to produce as many words as possible that fit under the semantic category ‘animal’.
Subjects were given two minutes for every task except the free fluency trial, for which they were given 150 seconds. All words were lemmatized, and connectives and pronouns were removed.
Patients with bipolar disorder can exhibit one of five different states: mania, mixed-mania, euthymia, mixed-depression, depression. Each subject was assigned either to the control group or to one of these five categories, according to the state/episode they presented at the time of the task.
Logistic Regression and Multinomial Naive Bayes
For each type of task, a list of all words produced by the subjects was extracted, and for each verbal fluency task performed by each subject a vector of word occurrences over that list was created. For both classifiers, hyper-parameters were tuned by grid search with cross-validation on the training data, and the tuned model was then evaluated on the test data.
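The procedure above can be sketched with scikit-learn as follows. This is a minimal sketch, not the project's actual code: the toy responses, the neutral labels `state_a` and `state_b`, the hyper-parameter grids, and the split sizes are all assumptions made for illustration.

```python
import random

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: one string of lemmatized words per subject for one task.
rng = random.Random(0)
vocab_a = ["chien", "chat", "cheval", "vache", "mouton", "lapin"]
vocab_b = ["lion", "tigre", "zebre", "girafe", "singe", "elephant"]
responses, labels = [], []
for label, vocab in (("state_a", vocab_a), ("state_b", vocab_b)):
    for _ in range(8):
        responses.append(" ".join(rng.sample(vocab, 4)))
        labels.append(label)

X_train, X_test, y_train, y_test = train_test_split(
    responses, labels, test_size=0.25, stratify=labels, random_state=0
)

# Assumed grids; the grids used in the project are not stated in the text.
candidates = {
    "multinomial_nb": (MultinomialNB(), {"clf__alpha": [0.1, 0.5, 1.0]}),
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # CountVectorizer builds the word-occurrence vector for each response.
    pipeline = Pipeline([("counts", CountVectorizer()), ("clf", estimator)])
    search = GridSearchCV(pipeline, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)  # tuning uses the training data only
    results[name] = {
        "best_params": search.best_params_,
        "test_accuracy": search.score(X_test, y_test),
    }
    print(name, results[name])
```

Keeping the vectorizer inside the pipeline means the vocabulary is rebuilt within each cross-validation fold, so no information from held-out responses leaks into tuning.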
The multinomial naive Bayes classifier in general slightly outperformed the logistic regression classifier, yet both reached an accuracy of 70% in separating mania from mixed-mania in at least six of the nine verbal fluency tasks.
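Since results are reported as both accuracy and F1 score, it may help to recall how the two are computed from a confusion matrix. The counts in this sketch are hypothetical and are not taken from the project's test sets.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion-matrix counts for a test set of ten subjects.
tp, fp, fn, tn = 6, 2, 1, 1
acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

The F1 score ignores true negatives, which is why it can be noticeably higher than accuracy when the positive class is the larger one.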
Prediction of mania versus mixed-mania using the multinomial naive Bayes classifier.
Verbal Fluency Task | Accuracy | F1 score |
---|---|---|
courage | 0.5 | 0.67 |
debut | 0.6 | 0.75 |
douleur | 0.7 | 0.82 |
fcat | 0.7 | 0.82 |
flib | 0.7 | 0.82 |
flit | 0.6 | 0.67 |
piscine | 0.7 | 0.82 |
royaume | 0.7 | 0.82 |
serpent | 0.7 | 0.82 |
Prediction of mania versus mixed-mania using the logistic regression classifier.
Verbal Fluency Task | Accuracy | F1 score |
---|---|---|
courage | 0.5 | 0.67 |
debut | 0.7 | 0.82 |
douleur | 0.7 | 0.82 |
fcat | 0.7 | 0.82 |
flib | 0.7 | 0.82 |
flit | 0.7 | 0.82 |
piscine | 0.6 | 0.71 |
royaume | 0.7 | 0.82 |
serpent | 0.7 | 0.82 |
Support Vector Machines
From each task, five features were extracted:
- unique_entries: the number of unique entries;
- repeat_entries: the number of repeated entries ;
- repeat_words: the number of repeated words, since some entries consisted of more than one word;
- avg_global_sim: the mean similarity between all unique words;
- avg_neigh_sim: the mean similarity between every pair of words in sequence.
The similarities between words were extracted using a pre-trained fastText model for the French language trained on Common Crawl.
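The five features listed above could be computed along the following lines. This is a sketch under stated assumptions: cosine similarity is assumed as the similarity measure, similarities are taken between individual words rather than whole entries, and the tiny two-dimensional `toy_vectors` dictionary stands in for real embeddings. With the `fasttext` Python package, each vector would instead come from `model.get_word_vector(word)` on a loaded pre-trained model.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def fluency_features(entries, vectors):
    """Compute the five features for one subject's ordered list of entries."""
    # An entry may contain several words, so split entries into words.
    words = [word for entry in entries for word in entry.split()]
    unique_words = list(dict.fromkeys(words))  # de-duplicate, keep order
    return {
        "unique_entries": len(set(entries)),
        "repeat_entries": len(entries) - len(set(entries)),
        "repeat_words": len(words) - len(unique_words),
        # Mean similarity over all unordered pairs of unique words.
        "avg_global_sim": mean(
            cosine(vectors[a], vectors[b])
            for a, b in combinations(unique_words, 2)
        ),
        # Mean similarity between each word and the next one produced.
        "avg_neigh_sim": mean(
            cosine(vectors[a], vectors[b]) for a, b in zip(words, words[1:])
        ),
    }


# Hypothetical 2-d embeddings, for illustration only.
toy_vectors = {"chien": [1.0, 0.0], "chat": [0.6, 0.8], "loup": [0.0, 1.0]}
entries = ["chien", "chat", "chien", "loup"]
features = fluency_features(entries, toy_vectors)
print(features)
```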
Again, grid search with cross-validation was used on the training data to find the best hyper-parameters (C, gamma) and the best kernel (linear or radial basis function), and the selected model was then evaluated on the test data. As the tables below show, five tasks distinguished mania from depression with accuracy higher than 70%, the best being courage with 91% accuracy (F1 score 0.93). Furthermore, all nine tasks distinguished mania from mixed-mania with an accuracy of 70%.
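A search over kernel, C, and gamma of this kind might look as follows in scikit-learn. The random feature matrix, the grid values, and the `StandardScaler` step are assumptions added for this sketch; the text does not say whether the features were scaled or which values were searched.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: 40 subjects x 5 features, two classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(1.5, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# gamma only applies to the RBF kernel, so each kernel gets its own sub-grid.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {
        "svc__kernel": ["rbf"],
        "svc__C": [0.1, 1, 10],
        "svc__gamma": [0.01, 0.1, 1],
    },
]

# Scaling is an assumption of this sketch, not a step described in the text.
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()), param_grid, cv=5)
search.fit(X_train, y_train)

best_kernel = search.best_params_["svc__kernel"]
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```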
Prediction of mania versus mixed-mania using the support-vector machine classifier.
Verbal Fluency Task | Accuracy | F1 score | Kernel |
---|---|---|---|
courage | 0.7 | 0.82 | rbf |
debut | 0.7 | 0.80 | rbf |
douleur | 0.7 | 0.80 | rbf |
piscine | 0.7 | 0.80 | rbf |
royaume | 0.7 | 0.82 | rbf |
serpent | 0.7 | 0.82 | rbf |
fcat | 0.7 | 0.82 | rbf |
flib | 0.7 | 0.82 | rbf |
flit | 0.7 | 0.82 | rbf |
Prediction of mania versus depression using the support-vector machine classifier.
Verbal Fluency Task | Accuracy | F1 score | Kernel |
---|---|---|---|
courage | 0.91 | 0.93 | rbf |
debut | 0.55 | 0.67 | rbf |
douleur | 0.73 | 0.80 | rbf |
piscine | 0.73 | 0.80 | rbf |
royaume | 0.36 | 0.46 | rbf |
serpent | 0.82 | 0.89 | rbf |
fcat | 0.73 | 0.82 | linear |
flib | 0.45 | 0.4 | rbf |
flit | 0.36 | 0.46 | rbf |
Conclusions and Further Research
The support-vector machine classifier outperformed both the naive Bayes and the logistic regression classifiers in distinguishing the hypomanic states, i.e. mania and mixed-mania, and in distinguishing mania from depression. This seems to indicate a non-linear aspect of these classification problems.
Higher accuracy could most likely be obtained by combining the classifiers of the different verbal fluency tasks into one via an additive model or ensemble.
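One simple form such an ensemble could take is soft voting, where the class probabilities predicted by each per-task classifier are averaged. The sketch below illustrates the idea only; the probabilities are hypothetical, and this combination has not been evaluated in the project.

```python
def soft_vote(task_probabilities):
    """Average per-task class probabilities; return (label, averages)."""
    n_tasks = len(task_probabilities)
    class_labels = next(iter(task_probabilities.values())).keys()
    averaged = {
        label: sum(probs[label] for probs in task_probabilities.values()) / n_tasks
        for label in class_labels
    }
    # The predicted label is the one with the highest averaged probability.
    return max(averaged, key=averaged.get), averaged


# Hypothetical per-task probabilities for a single subject.
task_probabilities = {
    "courage": {"mania": 0.8, "depression": 0.2},
    "serpent": {"mania": 0.6, "depression": 0.4},
    "flit": {"mania": 0.4, "depression": 0.6},
}
prediction, averaged = soft_vote(task_probabilities)
print(prediction, averaged)
```

A weighted average, for example weighting each task by its cross-validated accuracy, would be a natural variation.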
References
1. Figueroa-Barra, A., Del Aguila, D., Cerda, M. et al. Automatic language analysis identifies and predicts schizophrenia in first-episode of psychosis. Schizophr 8, 53 (2022). https://doi.org/10.1038/s41537-022-00259-3
2. Bedi, G., Carrillo, F., Cecchi, G. et al. Automated analysis of free speech predicts psychosis onset in high-risk youths. npj Schizophr 1, 15030 (2015). https://doi.org/10.1038/npjschz.2015.30