Dublin Core
Title
Real Time AI Music Generation System Based on Weather Conditions
Abstract
Generative artificial intelligence is gaining popularity every day and increasingly uses contextual data to personalize and enhance user experiences. This thesis explores music generation conditioned on weather: MIDI music data is paired with corresponding weather attributes, such as sunny or cloudy conditions, so that the weather influences the generated compositions.
Three generative models are compared for the task of weather-based music generation: a Conditional Variational Autoencoder (cVAE), a Conditional Generative Adversarial Network (cGAN), and a Long Short-Term Memory (LSTM) network. The models are implemented and trained on a large MIDI corpus combined with historical weather information.
The models are evaluated using metrics that capture musical diversity and quality (pitch range, number of unique pitches, and pitch variance), while fidelity to real data is measured by mean squared error and KL divergence. The results show that the cVAE produced the most diverse context-sensitive music. Building on these findings, the thesis presents the architecture and functionality of a real-time full-stack application. The system acquires weather data, processes it through the cVAE, and generates musical compositions, which are then delivered and rendered through a web interface. This research shows the potential of combining environmental data with AI-generated music and provides a framework for applications such as adaptive game soundtracks, mood-based music therapy, and dynamic background music systems that respond to the user's environment. The thesis also points out limitations of the models and outlines future research directions: hybrid architectures, richer environmental data representations, and user perception studies.
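The abstract names the evaluation metrics but not how they are computed. The following is a minimal sketch of one plausible reading, not the thesis implementation. It assumes that each piece is represented as a list of MIDI note numbers and that KL divergence is taken between smoothed pitch-class histograms. The function names and the smoothing constant `eps` are hypothetical.

```python
import math
from collections import Counter
from statistics import pvariance


def pitch_stats(pitches):
    """Diversity metrics over a list of MIDI note numbers (assumed representation)."""
    return {
        "pitch_range": max(pitches) - min(pitches),
        "unique_pitches": len(set(pitches)),
        "pitch_variance": pvariance(pitches),  # population variance
    }


def pitch_class_hist(pitches, eps=1e-8):
    """Normalized 12-bin pitch-class histogram; eps (an assumption) avoids zero bins."""
    counts = Counter(p % 12 for p in pitches)
    smoothed = [counts.get(k, 0) + eps for k in range(12)]
    total = sum(smoothed)
    return [c / total for c in smoothed]


def mse(a, b):
    """Mean squared error between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Under these assumptions, a generated piece would be compared with a real one by calling `pitch_stats` on each and `kl_divergence(pitch_class_hist(real), pitch_class_hist(generated))` on the pair.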
Keywords
AI Music Generation, cVAE, cGAN, LSTM, MIDI
