Dublin Core
Title
Comparing Sentiment Analysis Emotion Classification Models for Psychotherapy Use
Abstract
Sentiment analysis could be a powerful tool in the evolution of psychotherapy. The number of patients seeking mental health help is increasing rapidly, and NLP could make their experience more accessible and efficient. A sentiment analysis-based journal could help users track their thought patterns, the severity of those patterns, and their progress over time. This paper investigates the effectiveness of the Naive Bayes, Random Forest, Support Vector Machine, XGBoost and BERT algorithms, paired with the TF-IDF, Bag of Words, One-Hot Encoding and Word2Vec feature extraction methods, in emotion classification of text for future journal use. Comparative analysis helps identify which algorithms are best suited to this type of multi-label classification. Testing several algorithms also broadens current research by showing which approaches merit further work in the field and which are best avoided. Many studies test only one or two algorithms, leaving little room for comparison on the same dataset and under the same conditions, so it is unclear whether the accuracy differences between studies derive from a better model or a better dataset. Furthermore, other studies do not provide a comparative analysis of feature extraction methods. The four classical machine learning algorithms were trained on a dataset of 17,449 emotion-annotated sentences after preprocessing steps including tokenization, lemmatization, and vectorization for feature extraction. Among the classical models, Naive Bayes performed the worst with 76% accuracy, and XGBoost performed the best with 88% accuracy. BERT achieved 93% accuracy, making it the best-performing model in the study. Each algorithm performed best with a different vectorization method. These results show an improvement over other research in the field and the potential of sentiment analysis to support psychotherapy.
Keywords
Random Forest, Support Vector Machine, XGBoost, Naïve Bayes, BERT, sentiment analysis, psychotherapy, journal, emotion classification, TF-IDF, Bag of Words, One-Hot Encoding, Word2Vec
Language
English language