Predicting Sleep Disorders Using Machine Learning Algorithms

Dublin Core

Title

Predicting Sleep Disorders Using Machine Learning Algorithms

Author

Fikret Zajmović

Abstract

Sleep disorders such as insomnia and obstructive sleep apnea (OSA) affect millions globally and are linked to significant physical, cognitive, and psychological impairments. Traditional diagnostic methods—including polysomnography and self-reported questionnaires—are resource-intensive, time-consuming, and often unsuitable for large-scale or early-stage screening. To address these limitations, this study proposes a non-invasive, machine learning–based framework for the automated classification of sleep disorders using demographic, behavioral, and physiological features.
The research uses the publicly available Sleep Health and Lifestyle Dataset, comprising 400 records with 13 features, including age, gender, BMI category, sleep duration, stress level, blood pressure, and physical activity level. Five supervised learning models were trained and evaluated: Logistic Regression, Random Forest, Support Vector Machine (SVM), XGBoost, and an Artificial Neural Network (ANN). Each model was trained to classify individuals into one of three sleep health categories: No Disorder, Insomnia, or Sleep Apnea.
A comprehensive preprocessing pipeline was implemented, involving data cleaning, feature scaling, one-hot encoding, and SMOTE-based class balancing. Model development followed a nested 5-fold cross-validation strategy, with hyperparameter optimization conducted using GridSearchCV. Performance was evaluated using standard classification metrics: accuracy, macro-averaged precision, recall, F1-score, and ROC-AUC.
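The SMOTE-based balancing step mentioned above can be illustrated with a minimal sketch. It is not the study's implementation: the function name `smote_oversample`, the two example features, and the toy rows are assumptions made for illustration, and a maintained library would normally be used in practice. The sketch shows only the core idea, which is to create synthetic minority-class rows by interpolating between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class
    rows and their k nearest minority-class neighbours (needs >= 2 rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority rows, sorted by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows: [sleep duration in hours, stress level]
minority = [[5.9, 8.0], [6.1, 7.0], [6.0, 8.0], [5.8, 7.0]]
new_rows = smote_oversample(minority, n_new=3, k=2)
print(new_rows)
```

The abstract does not state at which point of the cross-validation loop the balancing was applied. A common precaution is to oversample only the training portion of each fold, so that synthetic rows derived from held-out records do not leak into evaluation.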

Results showed that XGBoost and ANN achieved the highest performance, with almost all scores exceeding 0.9, indicating strong predictive accuracy and generalization across validation folds. Feature importance analysis revealed that sleep duration, blood pressure, and BMI category were the most influential predictors. Visualization tools—including confusion matrices, radar charts, and feature importance plots—were used to enhance model interpretability and diagnostic transparency.
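For readers unfamiliar with macro-averaging, the following sketch shows how macro-averaged precision, recall, and F1-score are obtained for the three-class problem: each metric is computed per class and then averaged with equal weight per class, regardless of class size. The helper `macro_scores` and the toy label lists are hypothetical and are not taken from the study's results.

```python
def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all classes seen."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels for illustration only (not data from the study).
y_true = ["No Disorder", "No Disorder", "No Disorder",
          "Insomnia", "Insomnia", "Sleep Apnea"]
y_pred = ["No Disorder", "No Disorder", "Insomnia",
          "Insomnia", "Insomnia", "Sleep Apnea"]
precision, recall, f1 = macro_scores(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Because every class contributes equally, a model that performs poorly on a small class such as Sleep Apnea is penalized more under macro-averaging than under plain accuracy.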
Despite these promising results, the study has limitations. The relatively small dataset (n = 400) and the absence of critical variables such as sleep stage architecture, oxygen saturation, and environmental or comorbidity data limit generalizability and clinical applicability. Future research should incorporate larger, more diverse datasets and integrate longitudinal or real-time data from wearable devices to improve predictive robustness.
In conclusion, this study demonstrates the feasibility and effectiveness of machine learning algorithms in classifying sleep disorders using non-invasive inputs. The findings support the development of scalable, AI-driven diagnostic tools that can enhance sleep disorder screening in both clinical and consumer health settings, contributing to the advancement of telemedicine, digital health innovation, and personalized preventive care.

Keywords

sleep disorder classification, insomnia, sleep apnea, machine learning, XGBoost, artificial neural networks, SMOTE, predictive modeling, telehealth, health informatics
