Trepo
Score-informed separation of classical music

Tunturi, Eetu (2025)

 
File: TunturiEetu.pdf (1.785 MB)




Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
Approval date
2025-07-02
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202507027494
Abstract
Music source separation is the task of separating a mixture of instruments into its constituent tracks. Music source separation models usually use only the audio mixture to perform the separation, but additional information can improve a model's separation capability. This thesis proposes two new methods of using the musical score to improve music source separation and compares them to a baseline system. The baseline is a neural network architecture called X-UMX, which uses the magnitude spectrograms of the mixture to create separation masks; the masks are applied to the magnitude spectrogram of the mixture and combined with the original phases to obtain the separated signals. The first proposed method, the score-informed model, modifies the X-UMX architecture to take the concatenation of the score and the magnitude spectrogram of the audio mixture as its input. The second, the score-only model, relies solely on the score to perform separation by replacing the magnitude spectrogram of the mixture with the score as the only input to X-UMX.

The models are trained using the synthetic SynthSOD dataset. The musical scores for the SynthSOD dataset had not previously been made available, so they are published as part of this work. The MIDI files used to synthesize the dataset are not temporally aligned with the audio, so an automatic audio-to-score alignment system is used to align them. The alignment and publication of the scores is the second notable contribution of this work, alongside the proposed models.

The proposed systems are evaluated using the URMP and Aalto anechoic orchestra datasets, which contain real recordings. The score-informed model improves on the baseline system, but neither generalizes to the real data. The score-only model performs slightly worse than the other models on synthetic data but achieves a large improvement in synthetic-to-real generalization. The scarcity of real recordings is a major problem in the separation of classical music, and better synthetic-to-real generalization is an important step toward solving that problem.
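
The masking and input-concatenation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the toy data, the piano-roll score representation, the choice to concatenate along the feature axis, and the mask values in [0, 1] are all assumptions made for illustration, and the neural network itself is replaced by a hard-coded mask.

```python
import cmath


def score_informed_input(magnitude_frames, score_frames):
    """Concatenate each spectrogram frame with the score frame at the same time index.

    Assumption: concatenation along the feature axis; the abstract does not state the axis.
    """
    if len(magnitude_frames) != len(score_frames):
        raise ValueError("spectrogram and score must have the same number of frames")
    return [list(mag) + list(score) for mag, score in zip(magnitude_frames, score_frames)]


def apply_mask(mixture_stft, mask):
    """Scale the mixture magnitude by the mask and reuse the mixture phase."""
    separated = []
    for stft_frame, mask_frame in zip(mixture_stft, mask):
        frame = []
        for bin_value, gain in zip(stft_frame, mask_frame):
            magnitude, phase = cmath.polar(bin_value)
            frame.append(cmath.rect(gain * magnitude, phase))
        separated.append(frame)
    return separated


# Toy data: 2 time frames, 3 frequency bins (complex STFT of the mixture).
mixture_stft = [[1 + 1j, 2j, -3 + 0j], [0.5 - 0.5j, 1 + 0j, -2j]]
magnitude = [[abs(v) for v in frame] for frame in mixture_stft]

# Hypothetical piano roll with 2 pitches per frame (1.0 = note active).
score = [[1.0, 0.0], [0.0, 1.0]]

# Input of a score-informed model: spectrogram and score side by side.
model_input = score_informed_input(magnitude, score)

# Stand-in for the network output: one mask value per time-frequency bin.
mask = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.25]]
separated_stft = apply_mask(mixture_stft, mask)
```

In this sketch a score-only variant would pass `score` alone to the model instead of `model_input`, while the predicted mask would still be applied to the mixture as above.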
Collections
  • Master's theses (Opinnäytteet - ylempi korkeakoulututkinto) [41307]
Kalevantie 5
P.O. Box 617
33014 Tampere University
oa[@]tuni.fi