A Matrix-Based Approach To Source Separation Evaluation: Using Source-to-Single-Interference Ratios
Kosonen, Malakias (2026)
Tieto- ja sähkötekniikan kandidaattiohjelma - Bachelor's Programme in Computing and Electrical Engineering
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of acceptance
2026-03-16
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-2025122912232
Abstract
Source separation models are commonly evaluated with four metrics: Source-to-Distortion Ratio (SDR), Source-to-Interference Ratio (SIR), Source-to-Noise Ratio (SNR) and Source-to-Artifact Ratio (SAR). This thesis focuses on SIR and how it can be modified to yield additional information. SIR sums the interference from all other sources into a single value, so it does not show which sources contribute most to the interference. The proposed Source-to-Single-Interference Ratio (SSIR) shows how much each individual source contributes to the interference in each source. In this thesis, the implementation of SSIR is derived and its possibilities are explored. MUSDB18HQ is used as the evaluation dataset. Three state-of-the-art models (XUMX, XUMX-L, HT Demucs) are evaluated with SSIR, and their performance is analyzed based on the metric. The results show that the models separate vocals well from drums and bass, while bass and "other" are harder to separate regardless of the model. The thesis also discusses possible causes for this observation. The results and analysis indicate that a larger training dataset results in better separation overall. In the future, SSIR could be used to evaluate a greater variety of models and possibly to find pitfalls in them or ways to improve them.
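The idea described in the abstract, breaking a single SIR value into one ratio per interfering source, can be illustrated with a minimal sketch. This is not the thesis's implementation: the exact SSIR definition is not given on this page, so the code assumes a simplified, scale-only decomposition in which each estimate is fitted as a weighted sum of the reference sources by least squares (BSS Eval additionally allows time-invariant filters, which are omitted here). The names `ssir_matrix`, `references` and `estimates` are hypothetical.

```python
import numpy as np


def ssir_matrix(references, estimates, eps=1e-12):
    """Return an (n, n) matrix of per-source interference ratios in dB.

    references, estimates: arrays of shape (n_sources, n_samples).
    Entry [i, j] with i != j compares the energy of the target part of
    estimate i with the energy of the part attributed to reference j.
    Diagonal entries are NaN because a source does not interfere with itself.
    ASSUMPTION: scale-only projection, not the thesis's exact definition.
    """
    refs = np.asarray(references, dtype=float)
    ests = np.asarray(estimates, dtype=float)

    # Least-squares fit: ests[i] is approximated by sum_j gains[i, j] * refs[j].
    coeffs, *_ = np.linalg.lstsq(refs.T, ests.T, rcond=None)
    gains = coeffs.T

    # Energy of the component of estimate i that is explained by reference j.
    ref_energy = np.sum(refs ** 2, axis=1)
    comp_energy = gains ** 2 * ref_energy[np.newaxis, :]

    # Ratio of target energy (diagonal) to each single interferer's energy.
    target_energy = np.diag(comp_energy)
    ratio = (target_energy[:, np.newaxis] + eps) / (comp_energy + eps)
    ssir = 10.0 * np.log10(ratio)
    np.fill_diagonal(ssir, np.nan)
    return ssir


# Toy usage: three orthogonal, equal-energy "sources" and leaky estimates.
toy_refs = np.tile(np.eye(3), (1, 4))           # shape (3, 12)
toy_leak = np.array([[1.0, 0.1, 0.01],          # estimate 0 leaks sources 1 and 2
                     [0.1, 1.0, 0.1],
                     [0.0, 0.5, 1.0]])
toy_ests = toy_leak @ toy_refs
print(np.round(ssir_matrix(toy_refs, toy_ests), 1))
```

In this toy case, row 0 reads about 20 dB against source 1 and about 40 dB against source 2, so the matrix makes visible which interferer dominates, which a single aggregated SIR value would hide.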
Collections
- Bachelor's theses [10844]
