Sentiment Analysis and Price Prediction for Accommodation Reviews in Bosnia and Herzegovina: A Comparative Study of NLTK and Hugging Face NLP Techniques

Dublin Core

Title

Sentiment Analysis and Price Prediction for Accommodation Reviews in Bosnia and Herzegovina: A Comparative Study of NLTK and Hugging Face NLP Techniques

Author

Amila Čaušević

Abstract

Natural language processing (NLP) has considerable potential for analyzing consumer feedback and applying it to pricing strategy in the hospitality industry. This thesis examines sentiment analysis and price prediction for accommodation reviews in Bosnia and Herzegovina through a comparative study of two of the most commonly used NLP approaches: NLTK, representing traditional methods, and Hugging Face, representing modern techniques. First, an extensive text preprocessing stage is performed, comprising tokenization, lemmatization, stopword removal, and filtering of positive and negative reviews. Quantitative analyses, such as word frequency distributions, measures of lexical diversity, and word co-occurrence tests, reveal patterns in language use as well as relationships between review attributes and sentiment.
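
The quantitative measures named above can be illustrated with a minimal, standard-library sketch. It is not the thesis's implementation: the stopword list, the regex tokenizer, the sample reviews, and the function names are all assumptions made for illustration, and lemmatization is omitted. Lexical diversity is computed here as a simple type-token ratio, which is only one of several possible measures.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration only (not the thesis's list).
STOPWORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "in", "very"}

def tokenize(text):
    """Lowercase, split into word tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]

def word_frequencies(reviews):
    """Count how often each token occurs across all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(tokenize(review))
    return counts

def type_token_ratio(reviews):
    """Lexical diversity as distinct tokens divided by total tokens."""
    tokens = [t for review in reviews for t in tokenize(review)]
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def cooccurrences(reviews):
    """Count unordered token pairs that appear in the same review."""
    pairs = Counter()
    for review in reviews:
        unique = sorted(set(tokenize(review)))
        for i, first in enumerate(unique):
            for second in unique[i + 1:]:
                pairs[(first, second)] += 1
    return pairs

# Made-up sample reviews, not data from the thesis.
reviews = [
    "The room was clean and the staff was friendly",
    "Clean room, great location",
    "The staff was rude and the room was dirty",
]

print(word_frequencies(reviews).most_common(3))
print(round(type_token_ratio(reviews), 3))
print(cooccurrences(reviews)[("clean", "room")])
```

Pairs are stored with their tokens in alphabetical order, so a lookup such as `("clean", "room")` needs to follow that order.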

Different frameworks for sentiment analysis are then compared. The Hugging Face sentiment pipeline and recent transformer architectures such as BERT, RoBERTa, and XLNet are compared with more traditional techniques (e.g., NLTK/VADER). Accuracy, precision, recall, and F1-score are used to assess the performance of the sentiment models. To develop predictive price models based on regression techniques such as Linear Regression, Random Forest, and Gradient Boosting, the thesis additionally integrates sentiment scores with quantitative metadata, such as review ratings, location ratings, and accommodation categories. The results show that Random Forest regression is the most effective method for capturing non-linear sentiment-price relationships, while transformer-based sentiment analysis shows promise in identifying subtle signals within guest reviews. Finally, this work offers recommendations to help hoteliers in Bosnia and Herzegovina create focused pricing strategies while also enhancing the overall guest experience.
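
As a rough illustration of the evaluation metrics mentioned in the abstract, the sketch below computes accuracy, precision, recall, and F1-score for a binary sentiment classifier using only the Python standard library. The label names, the sample predictions, and the function name `classification_metrics` are assumptions for illustration; the thesis's actual evaluation code and data are not shown in this record.

```python
def classification_metrics(y_true, y_pred, positive="pos"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up gold labels and model predictions, not results from the thesis.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The same function could be applied to the output of each sentiment model in turn, so that the models are compared on identical labeled reviews.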

Keywords

NLP, Sentiment Analysis, Accommodation Reviews, Price Prediction, Bosnia and Herzegovina
