Large Scale Image Retrieval of Small Objects
Granat, Jesper (2020)
Tietotekniikan DI-tutkinto-ohjelma - Degree Programme in Information Technology, MSc (Tech)
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2020-04-08
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202004073115
Abstract
Image retrieval is a classic problem in computer vision. The goal is to find relevant images in a database given an example of a relevant image. This thesis focuses on a specific subset of image retrieval in which the objects in the image can be small, which makes retrieval difficult because of background clutter. The most widely used methods in the literature were developed for images of buildings that cover most of the image, which makes the retrieval task easier than retrieving small objects against varying backgrounds. As a result, existing methods can struggle when the object is small. Additionally, most image retrieval benchmarks use manually annotated bounding boxes for the query object.
This problem is approached with object detection. More specifically, since no additional training data is desired, weakly supervised object detection is used to determine the regions of the image in which the object to be retrieved lies. This means that the same training data can be used in both the object detection and the retrieval phases. Moreover, no extra annotation is needed, as everything can be done with image-level labels.
The empirical results of the method are studied from two points of view: first, to what extent small object retrieval can be improved by locating the object in images before retrieval, and second, to what extent the manual annotation used in the most common benchmarks can be automated with weakly supervised object detection. The results show that weakly supervised object detection slightly improves image retrieval results for small objects seen from similar viewpoints but does not significantly improve results for large objects. On the other hand, when cropping is beneficial at all, using an object detector is better than using the whole image.
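The idea of locating the object before retrieval can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the bounding box `box` is a hypothetical stand-in for the output of a weakly supervised detector, and `describe` is a toy colour-histogram descriptor chosen only to keep the example self-contained. All function and variable names are assumptions made for illustration.

```python
import numpy as np

def crop_to_box(image, box):
    """Crop an H x W x C image to box = (x0, y0, x1, y1), clipped to the image."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    x0, x1 = max(0, x0), min(w, x1)
    y0, y1 = max(0, y0), min(h, y1)
    return image[y0:y1, x0:x1]

def describe(image, bins=4):
    """Toy descriptor (assumption): L2-normalised per-channel intensity histogram."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(image.shape[-1])]
    vec = np.concatenate(hists).astype(np.float64)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def rank_database(query_desc, db_descs):
    """Return database indices sorted by descending cosine similarity."""
    sims = db_descs @ query_desc  # descriptors are unit length, so dot = cosine
    return np.argsort(-sims, kind="stable")

def solid(color, size=8):
    """Helper: a size x size image filled with one RGB colour."""
    return np.full((size, size, 3), color, dtype=np.uint8)

# Toy database: a green image, a red image (the object), a grey image (clutter).
db = np.stack([describe(solid((0, 255, 0))),
               describe(solid((255, 0, 0))),
               describe(solid((128, 128, 128)))])

# Query: a small red object on a large grey background.
query = solid((128, 128, 128), size=32)
box = (10, 10, 16, 16)  # hypothetical detector output (x0, y0, x1, y1)
query[box[1]:box[3], box[0]:box[2]] = (255, 0, 0)

ranked_full = rank_database(describe(query), db)
ranked_cropped = rank_database(describe(crop_to_box(query, box)), db)
```

In this toy setting the whole-image descriptor is dominated by the background, while the cropped descriptor reflects only the object, so the two rankings differ. Real descriptors and detectors behave less cleanly, which is consistent with the modest gains reported above.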