Symbolic Music Generation using a Variational Autoencoder and LSTM-Based Sequence Modeling
Ketan Kanjiya¹, Piyush Sonani², Upendrasinh Zala³
¹Chief Research Officer, Kshatrainfotech Pvt Ltd, Ahmedabad, Gujarat, India
²Chief Technology Officer, Kshatrainfotech Pvt Ltd, Ahmedabad, Gujarat, India
³Chief Executive Officer, Kshatrainfotech Pvt Ltd, Ahmedabad, Gujarat, India
Abstract - Symbolic music generation has become an important application of deep learning, enabling computational models to learn musical patterns and generate new compositions directly from data. This paper presents a framework based on a Variational Autoencoder (VAE) and a Long Short-Term Memory (LSTM) network for learning latent representations of symbolic music and generating coherent musical sequences. The model is trained on piano-roll representations derived from the Nottingham MIDI dataset, and its latent embeddings capture underlying melodic and temporal structure. The model is evaluated through reconstruction metrics and latent-space analysis. Experimental results show stable training behaviour, a reconstruction F1-score of 84.0%, and a well-organized latent space that supports meaningful music generation. These findings indicate that the proposed VAE-LSTM architecture learns musical representations effectively and can generate diverse symbolic music while preserving important structural characteristics of the training data.
Key Words: Music Generation, Variational Autoencoder (VAE), Long Short-Term Memory (LSTM), Latent Representation Learning, Deep Learning, MIDI