<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/">
<rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3565">
    <dcterms:title><![CDATA[Application of Machine Learning in Neuromarketing Research for the Analysis of Customer Preferences]]></dcterms:title>
    <dcterms:abstract><![CDATA[<p>Neuromarketing combines neuroscience and marketing to analyze consumer behavior through tools like electroencephalography (EEG), which captures subconscious and emotional responses. This thesis applies machine learning (ML) techniques to EEG data for predicting purchase decisions, addressing the limitations of traditional marketing methods. Using the NeuMa dataset, which includes EEG and eye-tracking data, key features such as frontal alpha asymmetry (FAA), power spectral density (PSD), and alpha-beta power ratios were extracted to build predictive models. Four ML algorithms were evaluated on accuracy, ROC AUC, and execution time: Support Vector Machines (SVM), Random Forest (RF), Artificial Neural Networks (ANN), and Convolutional Neural Networks (CNN). SVM emerged as the best-performing model, achieving 94.3% accuracy and 99% ROC AUC with efficient processing time, making it suitable for neuromarketing research. The results confirm the critical role of EEG features from the frontal region, particularly FAA and alpha-beta power ratios, in predicting consumer preferences. These metrics reflect emotional and subconscious responses, emphasizing their importance in purchase decisions. This study demonstrates the value of integrating EEG with ML for consumer analysis, offering a scalable, unbiased, and data-driven approach to marketing research. By combining neuroscience with modern methods, this research provides a foundation for improving consumer preference analysis. It highlights the potential of EEG-based metrics and ML models to enhance marketing strategies, moving beyond traditional self-report methods toward more objective and accurate insights.</p>]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3564">
    <dcterms:title><![CDATA[Cancer Cells Detection Using Supervised Machine Learning]]></dcterms:title>
    <dcterms:abstract><![CDATA[Cancerous cells invade and destroy the healthy tissue of the body, including organs. Cancer often begins in one part of the body before spreading uncontrollably to other areas of the organism. According to the World Health Organization (WHO), cancer is a leading cause of death worldwide, taking around 10 million lives yearly. The predominant cancers are colon, breast, lung, rectum, and prostate. Early detection is crucial and greatly increases the chances of survival. Machine Learning (ML) tools have the potential to recognize key features in complex datasets, enabling the classification of low- and high-risk patients. This research focuses on the use of supervised machine learning algorithms, such as Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs), and Decision Trees (DTs), for the development of predictive models that are expected to support effective and accurate decision-making based on available scientific experiments. The study showed high accuracy rates (above 90 percent) for all applied models, with the highest accuracy achieved by Feed Forward Neural Networks (approx. 97 percent). The use of machine learning methods can enhance the overall understanding of cancer progression, early detection, and treatment; however, thorough medical validation from professionals is essential for these methods to be adopted into routine clinical practice. The idea of adopting machine learning in the medical field is not to substitute human intelligence but to aid patients in receiving faster healthcare.]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3593">
    <dcterms:title><![CDATA[Comparative Analysis of Machine Learning Algorithms for Real Estate Price Prediction: Bosnia and Herzegovina vs. USA]]></dcterms:title>
    <dcterms:abstract><![CDATA[Real estate markets are impacted by a variety of variables, including changes in the population, urban development projects, and changes in economic policy. This thesis investigates the effectiveness of machine learning algorithms in predicting real estate prices, paying close attention to the particular circumstances of Bosnia and Herzegovina as well as the United States. While the US real estate market has a long history and is well known for its capacity to bounce back from economic downturns, the story of the BiH real estate market is very different. In contrast to the United States, which has seen centuries of economic expansion, financial crises, and legislative changes, Bosnia and Herzegovina's market development is the result of a combination of past influences and present difficulties. Beyond simple quantitative comparisons, our research takes a holistic approach to uncover the predictive capability of machine learning models.<br /><br />We explore the complexities of random forests and decision trees, making use of their ability to reveal intricate patterns in real estate databases. This research also includes time series modeling to recognize and comprehend the evolving patterns that characterize real estate dynamics over time. The analysis of SARIMAX, ARIMA, and Holt-Winters time-series models shows ARIMA's consistent accuracy, while SARIMAX and Holt-Winters excel in stability and trend capture, respectively. In machine learning, Decision Trees offer interpretability, while Random Forests show reduced error rates and enhanced accuracy. In the US dataset, SARIMAX has a Mean Absolute Percentage Error (MAPE) of 3.35%, ARIMA achieves 1.66%, and Holt-Winters shows 3.54%. Decision Trees have a MAPE of 2.97%, and Random Forests achieve 2.10%. In the BiH dataset, SARIMAX has a MAPE of 5.08%, ARIMA achieves 1.22%, and Holt-Winters shows 2.17%. Decision Trees have a MAPE of 0.83%, and Random Forests achieve 0.82%.]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3608">
    <dcterms:title><![CDATA[Comparison of Performance and Security Aspects of Database Access via Stored Procedures and APIs]]></dcterms:title>
    <dcterms:abstract><![CDATA[Modern applications typically access data in one of two ways: through an API acting as an intermediary layer, or through stored procedures inside the database itself. The aim of this study is to compare these methods, primarily in terms of performance and then security, as well as maintainability and scalability. The project will implement equivalent functionality as stored procedures in a PostgreSQL database and as an API backend in Python. Query execution time, resource consumption, and susceptibility to security flaws will be evaluated. Each comparison will be run 10 times so that the results are as accurate and dependable as possible. A further aim is to devise practical recommendations on when to use stored procedures and when to use the API approach, and where to strike the balance between logic kept in the database alongside the data and logic kept in the application layer.<br />
Today, with applications widely deployed in distributed environments, understanding these trade-offs is essential for the smooth and effective development of information systems, particularly in fields where large volumes of information must be processed, such as e-business and banking.]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3567">
    <dcterms:title><![CDATA[Credit Card Fraud Detection Using Machine Learning Algorithms and Data Analysis Techniques]]></dcterms:title>
    <dcterms:abstract><![CDATA[In today’s world, the use of card-based and online payment methods is rapidly increasing, and with this growth comes the issue of cybersecurity and fraud in general. The credit card fraud rate has never been higher, and it continues to grow.<br /><br />Improving credit card fraud detection systems is therefore a main priority for all banks, for systems that provide credit card-based payments, and for all participants in the digital payments market. It also matters because a large percentage of the population uses credit cards daily, from everyday payments to high-value international transactions.<br /><br />The goal was to train multiple models to determine whether a given transaction should be treated as fraud, and the results were measured with standard machine learning metrics. The best-performing model was an ensemble of Decision Tree, Logistic Regression, and K-Nearest Neighbor models, with an overall accuracy of 99.91% with a feature selection algorithm applied. An ensemble method combines multiple models into one that aims for the best possible metrics. Along with this model, we trained Logistic Regression, K-Nearest Neighbors, Support Vector Machine, and Neural Network models, with accuracies of 88.37%, 85.48%, 00.73% and 98.11%, respectively, with features selected.<br /><br />This research also covers data preprocessing, as this step is crucial when building a model for credit card fraud detection systems. These systems must be fast and precise in order to be usable, as they deal with large sets of imbalanced data.<br />
<p>By the end of the study, readers will have better insight into credit card transactions, will be familiar with the different methods for detecting credit card fraud, and will understand which model best suits the needs of this case.</p>]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3638">
    <dcterms:title><![CDATA[ERP Project Failure Prediction Using Machine Learning Algorithms]]></dcterms:title>
    <dcterms:abstract><![CDATA[<div style="text-align:justify;">Enterprise Resource Planning (ERP) systems are of immense importance in simplifying business operations. However, most ERP projects fail owing to the complexity and scope of the projects. The present research attempts to predict the outcomes of ERP projects by employing machine learning methods and examining the factors that determine whether projects fail or succeed. This dissertation uses data on both successful and unsuccessful ERP deployments, covering industry, project size, budget and schedule overruns, team experience, and technical challenges faced, among others.<br />The research applies machine learning methods such as logistic regression, decision trees, and random forests in order to assess the importance of the relevant predictors of project outcome. By training and testing these models on a sample composed of both successful and unsuccessful ERP projects, the research seeks factors and patterns that could help forecast troubling tendencies. It aims to devise a functional framework that project managers can use to take action before issuing their project plans for ERP systems. Such a predictive model could significantly help decrease the rate of ERP failures and hence assist businesses in carrying out successful implementations and enhancing their returns on technology investment.</div>]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3609">
    <dcterms:title><![CDATA[Leveraging of Machine Learning for Early Cancer Risk Identification and Predictive Flagging]]></dcterms:title>
    <dcterms:abstract><![CDATA[Early detection of cancer remains a vital component in reducing mortality and enhancing treatment outcomes. Traditional diagnostic approaches, such as biopsies, imaging scans, and clinical assessments, often identify cancer at a stage where the disease has already advanced. This delay in detection arises because early-stage cancers typically exhibit minimal or no symptoms, increasing the risk of late diagnoses and reduced chances of recovery.<br />
This proposed study investigates the potential of machine learning methodologies in facilitating early cancer risk assessment by analyzing complex medical datasets. The primary objective is to assess whether machine learning models can reliably identify patients at heightened risk before the disease becomes clinically evident. Through this approach, the study aims to contribute to the development of predictive systems that can trigger early interventions and encourage proactive health monitoring.<br />
The research seeks to answer the core question: “Can machine learning models effectively assess the risk of early-stage cancer using molecular-level data, such as gene expression profiles, prior to the onset of clinical symptoms?”<br />
Sub-questions to be explored include the accuracy of early-stage cancer detection using machine learning, the types of data that most influence prediction performance, and the feasibility of using such models to prompt timely medical evaluations in the absence of traditional diagnostic markers.<br />
The findings are expected to support advancements in personalized medicine by laying the groundwork for tools that assist in identifying high-risk individuals, potentially transforming the current approach to cancer screening and prevention.]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3639">
    <dcterms:title><![CDATA[Machine Learning-Driven Prediction of Heart Strokes]]></dcterms:title>
    <dcterms:abstract><![CDATA[<div style="text-align:justify;">Heart strokes remain one of the leading health risks in the world today. Timely prediction can significantly improve patient outcomes and healthcare resource allocation. This study aims to harness machine learning techniques to develop efficient predictive models for the early detection of heart strokes. <br />The research is based on a dataset created by combining five different datasets. The dataset encompasses patient demographics, clinical measurements, and historical medical records. The analysis focused on five machine learning models: Logistic Regression, Decision Tree, K-Nearest Neighbors, Random Forest, and Support Vector Machine. <br />The goal was not only to test different algorithms, but also to understand how data preparation, feature selection, and model choice impact the final results. The models were trained and tested on both the original dataset and an extended version, where new features were added by combining existing ones. <br />The results showed that models such as Logistic Regression, Decision Tree, and KNN performed better when applied to the original data. The Decision Tree model achieved an accuracy of 87.8% and an F1 score of 0.881, while Logistic Regression and K-Nearest Neighbors attained F1 scores of 0.850 and 0.849, respectively. On the other hand, Random Forest and SVM showed significant improvements with the extended dataset. Random Forest performed the best overall, with an F1 score of 0.920 and an accuracy of 91.6% on the extended dataset. <br />SVM also benefited from the extended dataset, improving its F1 score from 0.892 to 0.879, which highlights how specific models can leverage additional features for improved generalization. <br />This tool could help detect risks earlier, allowing for timely interventions and prevention, thereby reducing the burden of strokes on healthcare systems and improving patient care. Limitations include data quality and availability, as well as potential bias in healthcare records.</div>]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3594">
    <dcterms:title><![CDATA[Optimization of Email Marketing Campaigns Leveraging Machine Learning Techniques]]></dcterms:title>
    <dcterms:abstract><![CDATA[Email marketing is widely recognized as an effective digital marketing channel, offering a considerable return on investment (ROI). One key challenge is determining the optimal day and time to send emails to maximize customer response rates. This thesis explores the application of machine learning (ML) algorithms to predict the best send times for email marketing campaigns, focusing on improving response rates. The research utilizes historical email marketing data, including customer demographics, response behavior, and email send dates. Based on this data, various machine learning models, including decision trees, random forests, and finally ensemble methods, will be used to predict the optimal day for sending emails. The study will also examine how factors like customer age and tenure influence response rates at different times. The research question is whether machine learning-based predictions of the optimal send day and time significantly improve response rates compared to traditional methods. In addition, incorporating demographic factors, such as age and tenure, is expected to improve the accuracy of these predictions. The expected outcome is that ML-based optimization will outperform traditional scheduling methods, providing a more effective and data-driven strategy for email campaign timing.]]></dcterms:abstract>
</rdf:Description><rdf:Description rdf:about="https://omeka.ibu.edu.ba/items/show/3640">
    <dcterms:title><![CDATA[Predictive Modeling for Diabetes: A Comprehensive Analysis]]></dcterms:title>
    <dcterms:abstract><![CDATA[<p>Diabetes is a growing global health issue, and early prediction is key to preventing its effects. This thesis develops predictive models for diabetes using various machine learning methods, including Logistic Regression, Decision Trees, K-Nearest Neighbors (KNN), Random Forest, Support Vector Machine (SVM), and XGBoost, using the Diabetes Health Indicators dataset, which covers clinical, lifestyle, and demographic factors. Feature selection identifies the most important diabetic predictors, and model performance is evaluated using macro average and weighted average metrics, accuracy, precision, recall, F1-score, and error metrics (MSE and RMSE) to provide a thorough evaluation of model performance across the classes. Both SVM and Random Forest performed best overall, with an accuracy of 0.86. They also performed exceptionally well in weighted average and macro average measures, with overall recall and F1-scores of 0.86. SVM has the highest precision performance at 0.88, with Random Forest achieving the next best score of 0.87. These models are very dependable for diabetes prediction tasks because of their remarkable balance while handling both classes. SVM and Random Forest offer more dependable performance on a range of metrics, as evidenced by the weaker outcomes of Decision Tree, KNN, XGBoost, and Logistic Regression.</p>]]></dcterms:abstract>
</rdf:Description></rdf:RDF>
