Clustering Analysis and Active Learning for Sound Event Detection and Classification

Zhao, Shuyang (2022)

 
Open file
978-952-03-2266-3.pdf (6.995 MB)



Zhao, Shuyang
Tampere University
2022

Doctoral Programme in Computing and Electrical Engineering
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of defence
2022-01-19
The permanent address of the publication is
https://urn.fi/URN:ISBN:978-952-03-2266-3
Abstract
The objective of this thesis is to develop techniques that optimize the performance of sound event detection and classification systems at minimal supervision cost. State-of-the-art sound event detection and classification systems use acoustic models developed with machine learning techniques. Training these acoustic models typically relies on a large amount of labeled audio data. Manually assigning labels to audio data is often the most time-consuming part of the model development process. In many practical cases unlabeled data is abundant, but the number of annotations that can be made is limited. The practical problem is therefore to optimize the accuracy of acoustic models with a limited number of annotations.

The thesis starts from the idea of clustering unlabeled audio data. Clustering itself does not require labeled data, and its results can be used to derive propagated labels from a single label assignment. Based on this idea, an active learning method was proposed and evaluated for sound classification. In the experiments, the proposed active learning method based on k-medoids clustering outperformed reference methods based on random sampling and uncertainty sampling. To optimize sample selection after the k medoids have been annotated, mismatch-first farthest-traversal was proposed. In the experiments, it further improved active learning performance.
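The selection idea in the paragraph above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not define the algorithms in detail, so the nearest-medoid propagation rule, the reading of "mismatch-first" as "samples whose propagated label disagrees with a model prediction are chosen first", the Euclidean distance, and all function names are assumptions made for illustration.

```python
import math


def propagate_labels(points, medoid_idx, medoid_labels):
    """Give every point the label of its nearest annotated medoid (assumed rule)."""
    labels = []
    for p in points:
        nearest = min(
            range(len(medoid_idx)),
            key=lambda j: math.dist(p, points[medoid_idx[j]]),
        )
        labels.append(medoid_labels[nearest])
    return labels


def next_to_annotate(points, annotated_idx, propagated, predicted):
    """Pick the next sample to annotate.

    Assumed reading of mismatch-first farthest-traversal: prefer samples whose
    propagated label differs from the model prediction, and among those take
    the one farthest from every already annotated sample.
    """
    candidates = [i for i in range(len(points)) if i not in annotated_idx]
    mismatched = [i for i in candidates if propagated[i] != predicted[i]]
    pool = mismatched or candidates  # fall back to all unlabeled samples
    return max(
        pool,
        key=lambda i: min(math.dist(points[i], points[j]) for j in annotated_idx),
    )


# Toy 2-D "embeddings"; indices 0 and 2 play the role of annotated medoids.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (2.4, 2.4)]
propagated = propagate_labels(points, [0, 2], ["dog", "siren"])
predicted = ["dog", "dog", "siren", "siren", "siren"]  # hypothetical model output
choice = next_to_annotate(points, [0, 2], propagated, predicted)
```

In this toy example the point at index 4 lies between the two medoids, receives the propagated label of the nearer one, and is selected because the hypothetical model disagrees with that label.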

The active learning method proposed for sound classification was extended to sound event detection. Sound segments were generated based on change point detection within each recording. The sound segments were selected for annotation based on mismatch-first farthest-traversal. During the training of acoustic models, each recording was used as an input to a recurrent convolutional neural network. The training loss was derived only from frames corresponding to annotated segments. In experiments on a dataset where sound events are rare, the proposed active learning method required annotating only 2% of the training data to achieve accuracy similar to that obtained by annotating all the training data.
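Deriving the loss only from annotated frames can be sketched as a masked loss. The example below is an illustration under assumptions, not the thesis code: it uses a single event class, binary cross-entropy, and segments given as hypothetical (start_frame, end_frame) pairs with an exclusive end.

```python
import math


def frame_mask(n_frames, annotated_segments):
    """Return a 0/1 mask with 1 for frames inside any annotated segment."""
    mask = [0] * n_frames
    for start, end in annotated_segments:
        for f in range(start, min(end, n_frames)):
            mask[f] = 1
    return mask


def masked_bce(probs, targets, mask):
    """Mean binary cross-entropy over the frames where mask is 1.

    Frames outside annotated segments contribute nothing, so the network can
    still see the whole recording as input without being penalized on frames
    that have no label.
    """
    eps = 1e-7
    total, count = 0.0, 0
    for p, t, m in zip(probs, targets, mask):
        if not m:
            continue
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


# Six frames, of which only frames 1 and 2 were annotated.
mask = frame_mask(6, [(1, 3)])
loss = masked_bce([0.9, 0.5, 0.5, 0.1, 0.1, 0.1], [0, 1, 0, 1, 1, 1], mask)
```

Changing the predictions or targets on the unannotated frames leaves `loss` unchanged, which is the property the paragraph above describes.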

In addition to active learning, we investigated using cluster analysis to group recordings with similar recording conditions. Feature normalization according to cluster statistics was used to bridge the distribution shift caused by mismatched recording conditions. This approach clearly outperformed feature normalization based on global statistics and on per-recording statistics.
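Normalization by cluster statistics can be sketched as follows. This is a simplified illustration, not the method as implemented in the thesis: features are scalar per frame, the cluster assignment of each recording is assumed to be given, and the function and variable names are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_by_cluster(recordings, cluster_of):
    """Standardize each recording with the mean and std of its cluster.

    recordings: {recording_id: [feature value per frame]}
    cluster_of: {recording_id: cluster_id}, assumed to come from a prior
                clustering of recordings by recording condition.
    """
    # Pool the frames of all recordings that share a cluster.
    pooled = {}
    for rec_id, frames in recordings.items():
        pooled.setdefault(cluster_of[rec_id], []).extend(frames)

    # One (mean, std) pair per cluster; guard against a zero std.
    stats = {c: (fmean(v), pstdev(v) or 1.0) for c, v in pooled.items()}

    normalized = {}
    for rec_id, frames in recordings.items():
        mean, std = stats[cluster_of[rec_id]]
        normalized[rec_id] = [(x - mean) / std for x in frames]
    return normalized


recordings = {"a": [1.0, 3.0], "b": [2.0, 2.0], "c": [10.0, 14.0]}
cluster_of = {"a": 0, "b": 0, "c": 1}
normalized = normalize_by_cluster(recordings, cluster_of)
```

Global normalization would correspond to placing every recording in one cluster, and per-recording normalization to giving each recording its own cluster, so the cluster-based variant sits between the two reference methods named above.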

The proposed active learning methods enable efficient labeling of large-scale audio datasets, potentially saving a large amount of annotation effort in the development of acoustic models. In addition, the core ideas behind the proposed methods are generic and can be extended to other problems such as natural language processing, as investigated in [8].
Collections
  • Doctoral dissertations [5189]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Data protection | Accessibility statement
 

 
