
Multi-Task Networks and Anomaly Detection in Computer Vision

Lagos Benitez, Juan Pablo (2025)

 
File: 978-952-03-3950-0.pdf (27.10 MB)



Lagos Benitez, Juan Pablo
Tampere University
2025

Doctoral Programme in Computing and Electrical Engineering
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of defence
2025-05-28
The permanent address of the publication is
https://urn.fi/URN:ISBN:978-952-03-3950-0
Abstract
In this dissertation, we address key challenges in computer vision, focusing on multi-task learning, unstructured environments, and the use of heterogeneous datasets for image anomaly detection. We propose novel methods and datasets across four core studies, yielding quantitative improvements across tasks.

We first present a multi-task convolutional neural network (CNN) that jointly performs semantic segmentation and depth completion, demonstrating a significant improvement in performance compared to single-task networks. When evaluated on the Virtual KITTI 2 dataset, our approach achieved a notable increase in both depth and segmentation accuracy, underscoring the benefits of joint training.

Next, we extend the multi-task approach to panoptic segmentation and depth completion, again using Virtual KITTI 2. Our model processes RGB images and sparse depth maps to deliver dense depth maps, along with semantic, instance, and panoptic segmentation. Despite handling multiple tasks, the model maintained high accuracy without a significant increase in computational cost.

For real-world applications, we introduce the FinnWoodlands dataset, containing 4,226 manually annotated objects for instance, semantic, and panoptic segmentation, with 60.6% of the annotations corresponding to three tree species ("Spruce," "Birch," and "Pine"). We benchmarked three state-of-the-art models, revealing the challenges posed by unstructured forest environments and the need for more robust models for such scenarios.

Finally, we present two novel datasets, CARS-AD and ROADS-AD, for unsupervised anomaly detection (AD). These datasets introduce diverse anomalies across thousands of samples, with pixel-wise ground truth annotations. Our benchmarks highlight the limitations of existing AD models and the complexity of these new datasets. The best-performing methods were Csflow and U-Flow on CARS-AD and Reverse Distillation on ROADS-AD.

Our results demonstrate the effectiveness of multi-task networks in holistic scene understanding, cost-effective data collection for complex environments, and the critical role of heterogeneous datasets in advancing image anomaly detection. This research paves the way for future work in both structured and unstructured settings, pushing the boundaries of state-of-the-art techniques.
Collections
  • Doctoral dissertations [5015]
Kalevantie 5
P.O. Box 617
33014 Tampere University
oa[@]tuni.fi | Data protection | Accessibility statement
 

 
