Winter Week 2 - Dataset and Model Research
- Ashish Sareen
- Jan 15, 2020
This week, I collected data to form the dataset for the chord recognition model and researched potential model types. I found several candidate datasets, the most promising of which comes from the Montefiore Institute. It contains more than 2000 audio files representing 10 of the most widely used chords in popular music. The chords are played on different instruments, such as guitar, accordion, piano, and violin. The majority of the audio files, however, come from 4 different guitars, each recorded in both a typical noisy environment and a noiseless environment. The authors intended the dataset to be used for machine learning, so all of the files are clearly labeled and organized. It will require minimal trimming before being used in our model.
The next step was to explore different models to figure out which would enable accurate, real-time prediction. From my research, I found that similar chord recognition projects used a variety of ML model types, such as simple linear classifiers, Hidden Markov Models (HMMs), and Support Vector Machines (SVMs). I decided to move forward with an SVM because it works well for multi-class classification. Chord recognition is a multi-class problem: each chord produces a distinct chromagram pattern, and the model must assign an input to one of those chord classes. An SVM should also work well on our embedded system because prediction has relatively low computational complexity. The bulk of the cost comes in training, which will be done ahead of time. I found a C library called libsvm that handles creating SVM models and running predictions using C data types.
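To illustrate why prediction is cheap once training is done, here is a minimal sketch in Python. It assumes a linear kernel with one-vs-rest scoring, which is a simplified stand-in: libsvm itself uses a one-vs-one scheme and defaults to an RBF kernel. The names `template`, `decision_scores`, and `predict_chord`, the three chord labels, and the weights are all hypothetical and made up for illustration; they are not trained values from our dataset.

```python
# Sketch only: linear one-vs-rest scoring over a 12-bin chroma vector.
# Weights, biases, and labels are hypothetical, not trained values.

def template(pitch_classes):
    """Build a 12-bin vector with 1.0 at the given pitch classes (C=0 ... B=11)."""
    return [1.0 if i in pitch_classes else 0.0 for i in range(12)]

def decision_scores(chroma, weights, biases):
    """Return one linear score (w . x + b) per class."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(w, chroma)) + b
        for w, b in zip(weights, biases)
    ]

def predict_chord(chroma, weights, biases, labels):
    """Pick the label whose score is highest (first one wins on ties)."""
    scores = decision_scores(chroma, weights, biases)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]

# Hypothetical "model": one weight vector per chord, here simply the chord tones.
LABELS = ["C", "Am", "G"]
WEIGHTS = [
    template({0, 4, 7}),   # C major: C, E, G
    template({9, 0, 4}),   # A minor: A, C, E
    template({7, 11, 2}),  # G major: G, B, D
]
BIASES = [0.0, 0.0, 0.0]

if __name__ == "__main__":
    # An idealized chroma vector with energy only on C, E, and G.
    print(predict_chord(template({0, 4, 7}), WEIGHTS, BIASES, LABELS))
```

Under these assumptions, each prediction costs one 12-element dot product per chord class, which is the kind of workload that should fit comfortably on an embedded target.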
This week, my partner successfully tested chromagram construction from chord audio files and experimented with filtering and other signal processing methods. With these pieces in place, my partner and I will be able to begin creating and testing models during the next lab session.
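For readers unfamiliar with chromagrams, the sketch below shows the general idea in Python: spectral energy is folded into 12 pitch classes regardless of octave. This is not necessarily how my partner's implementation works. It assumes the input is already a list of (frequency in Hz, magnitude) peaks, uses A4 = 440 Hz as the tuning reference, and the function names `pitch_class` and `chroma_from_peaks` are hypothetical.

```python
# Sketch only: fold spectral peaks into a 12-bin chroma vector.
# Input format and function names are assumptions for illustration.
import math

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to its nearest pitch class (C=0, C#=1, ..., B=11)."""
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)  # MIDI note number, A4 = 69
    return round(midi) % 12

def chroma_from_peaks(peaks):
    """Sum peak magnitudes per pitch class, then scale so the largest bin is 1.0."""
    chroma = [0.0] * 12
    for freq_hz, magnitude in peaks:
        if freq_hz > 0:
            chroma[pitch_class(freq_hz)] += magnitude
    largest = max(chroma)
    if largest > 0:
        chroma = [value / largest for value in chroma]
    return chroma

if __name__ == "__main__":
    # Approximate frequencies for C4, E4, G4, and C5 (a C major chord).
    peaks = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.6), (523.25, 0.5)]
    print(chroma_from_peaks(peaks))
```

Because C4 and C5 land in the same bin, the resulting vector describes which notes are present rather than which octave they are in, which is what makes it a convenient input for a chord classifier.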