Hyppää sisältöön
    • Suomeksi
    • In English
Trepo
  • Suomeksi
  • In English
  • Kirjaudu
Näytä viite 
  •   Etusivu
  • Trepo
  • Opinnäytteet - ylempi korkeakoulututkinto
  • Näytä viite
  •   Etusivu
  • Trepo
  • Opinnäytteet - ylempi korkeakoulututkinto
  • Näytä viite
JavaScript is disabled for your browser. Some features of this site may not work without it.

Multilabel Sound Event Classification with Neural Networks

Cakir, Emre (2014)

 
Avaa tiedosto
cakir.pdf (615.6Kt)
Lataukset: 



Cakir, Emre
2014

Master's Degree Programme in Information Technology
Tieto- ja sähkötekniikan tiedekunta - Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Hyväksymispäivämäärä
2014-12-03
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:tty-201412051592
Tiivistelmä
There are multiple sound events simultaneously occuring in a real-life audio recording collected e.g. at a busy street in rush hour. The events may include traffic noise, sound of rain, people talking etc. The humans are amazingly good at distinguishing these individual events, but as of yet, there is not any machine that can detect these events with (even close to) human accuracy. Polyphonic nature of the environmental audio recordings makes it hard to detect single sound events when many events are overlapping. With the gigantic audio database and state-of-the-art machine learning methods of the digital age, this is bound to change.
In this thesis, we use frequency-domain features to represent the audio input and multilabel deep neural networks (DNN) to detect multiple, simultaneous sound events in a real-life recording. We extract frequency-domain features from these recordings in short time frames. DNNs are artificial neural networks (ANN) with two or more hidden layers and they are especially good at modeling highly nonlinear relations and finding intermediate representations between system input and output. This is exactly the case in real-life sound event detection. Every feature extract is used as a training example and we train the neural network with these examples.
For the evaluation of this work, we focus on the performance of different topologies of DNNs used in this task. There are a large number of hyper parameters that define the structure of a DNN, such as the number of neurons in a layer, the learning rate used during learning, number of the hidden layers etc. The effects of each of these parameters are investigated in detail. A detection accuracy of 66.5% is achieved, which outperforms the state-of-the-art method by a large margin.
Kokoelmat
  • Opinnäytteet - ylempi korkeakoulututkinto [41871]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste
 

 

Selaa kokoelmaa

TekijätNimekkeetTiedekunta (2019 -)Tiedekunta (- 2018)Tutkinto-ohjelmat ja opintosuunnatAvainsanatJulkaisuajatKokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste