Visual Similarity for Object Detection and Alignment
Shokrollahi Yancheshmeh, Fatemeh (2024)
Tampere University
2024
Doctoral Programme in Computing and Electrical Engineering
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of defense
2024-02-02
The permanent address of the publication is
https://urn.fi/URN:ISBN:978-952-03-3257-0
Abstract
Over recent years, the use of digital image and video content in science and industry has increased dramatically. Applications include 3-D reconstruction, skin cancer segmentation, autonomous driving, and robotics. These applications require vision systems capable of interpreting image or video content accurately. For instance, in autonomous driving, an object must be correctly detected and localized before any action is taken. Depending on the type of object (pedestrian, car, traffic sign, etc.) and its distance, the decision made in otherwise similar situations can differ. The fundamental problems in visual object detection and classification are learning discriminative features and modeling the variation in the visual appearance of objects within the same class (e.g., cat). Appearance variation, object viewpoints, and camera viewpoints make these problems even more challenging.
In this study, the author aims to define the similarity between image objects by employing graph theory. This thesis introduces an unsupervised framework for constructing a visual similarity network (VSN) of images. The VSN automatically discovers sub-classes and continuous latent attributes. Experiments show that the constructed VSN improves the accuracy of image alignment and object detection.
Collections
- Doctoral dissertations [5027]