Tablature Notation from Monophonic Guitar Audio Using CNN
Simola, Inkariina (2023)
Master's Programme in Computer Science
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2023-06-12
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202306136720
Abstract
Automatic Music Transcription for instruments with fretboards, such as the guitar, involves transcribing audio into either standard notation or tablature notation. Tablature notation provides a one-to-one mapping between the symbol for a note and the string-fret combination used to produce it, and is often preferred over standard notation for this reason. Detecting the string-fret combination used to produce a note involves pitch detection and string detection, which existing approaches usually perform in that order. This master's thesis focuses on electric guitar string detection from monophonic samples using a convolutional neural network (CNN).
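The reason string detection is needed in addition to pitch detection can be illustrated with a minimal sketch: on a guitar, the same pitch can usually be fretted on several strings, so the pitch alone does not determine the tablature symbol. The sketch below is not taken from the thesis. It assumes standard tuning (E2 A2 D3 G3 B3 E4, given as MIDI note numbers) and a 22-fret instrument, and the names `OPEN_STRINGS`, `NUM_FRETS` and `candidate_positions` are hypothetical.

```python
# Open-string MIDI pitches for a six-string guitar, keyed by string number.
# Assumption: standard tuning (E2 A2 D3 G3 B3 E4); the abstract does not state the tuning.
OPEN_STRINGS = {6: 40, 5: 45, 4: 50, 3: 55, 2: 59, 1: 64}
NUM_FRETS = 22  # assumption: a 22-fret instrument


def candidate_positions(midi_pitch, num_frets=NUM_FRETS):
    """Return every (string, fret) pair that produces the given MIDI pitch."""
    positions = []
    for string, open_pitch in OPEN_STRINGS.items():
        fret = midi_pitch - open_pitch  # one fret raises the pitch by one semitone
        if 0 <= fret <= num_frets:
            positions.append((string, fret))
    return positions


# E4 (MIDI note 64) is playable on five different strings.
print(candidate_positions(64))
# prints [(5, 19), (4, 14), (3, 9), (2, 5), (1, 0)]
```

Under these assumptions, a pitch detector that reports E4 leaves five possible tablature symbols, and a string detector is what selects among them.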
A dataset containing over 10,000 guitar notes with a detectable fundamental frequency was collected from three electric guitars, and a spectrogram, a Mel-spectrogram and a constant-Q transform were extracted from each sample. Three convolutional neural networks were trained, one on each feature type, to detect the guitar string from which each sample originated. The models were evaluated with 6-fold stratified cross-validation. The model trained on the constant-Q transform data achieved a string detection accuracy of 0.932.
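As a rough illustration of what 6-fold stratified cross-validation means for string labels, the sketch below splits a set of labelled samples into six folds so that every fold keeps the same proportion of each string. It is a hand-rolled, standard-library sketch and not the thesis's implementation; the function `stratified_folds` and the toy label list are hypothetical, and a real experiment would more likely rely on an established library routine.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds=6, seed=0):
    """Assign each sample index to one of n_folds folds so that the
    class proportions in each fold match the full set as closely as possible."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Hypothetical toy labels: 24 samples for each of the six strings (1 to 6).
labels = [string for string in range(1, 7) for _ in range(24)]
folds = stratified_folds(labels)

for held_out, test_idx in enumerate(folds):
    # Train on the other five folds, evaluate on the held-out one.
    train_idx = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
    per_string = Counter(labels[i] for i in test_idx)
    print(held_out, len(train_idx), len(test_idx), dict(sorted(per_string.items())))
```

With this toy input, each held-out fold contains 24 samples, four per string, and the remaining 120 samples form the training set. Averaging a model's accuracy over the six held-out folds gives the kind of cross-validated figure the abstract reports.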