Perceptual Approaches in Image and Video Analysis
Birinci, Murat (2017)
Birinci, Murat
Tampere University of Technology
2017
Faculty of Business and Technology Management
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:ISBN:978-952-15-3982-4
Abstract
Recent advances in digital technology have enabled the use of multimedia in various fields of our lives. Education, health, security, entertainment, business and many other sectors have started using all kinds of multimedia material to provide better services. In order to utilize the full potential of such material and enable its effective consumption in those areas, accurate analysis and understanding of the multimedia content is essential. Content-based multimedia analysis aims to provide this insight through various computer algorithms and to extract relevant information to support different fields. For such algorithms to lead to practical solutions, it is essential to keep in mind that both performance and efficiency are of significant importance. Considering that humans have a remarkable ability to analyze visual content, this thesis presents algorithms for image and video analysis that take the perspective of human visual perception.
The algorithms presented in this thesis follow the perceptual rules proposed by Gestalt Psychology, which suggests that our perceptions are based on the emergent properties that result from the organization of individual percepts. Such a stance is often overlooked – if not ignored – in content analysis algorithms, and the offered solutions are generally based on analyzing individual components only. This typically results in either inadequate or overcomplicated solutions. By following the perceptual organization rules defined by Gestalt Psychology, it has been shown in this thesis that content analysis can be performed in a significantly more efficient and effective manner. These improvements are revealed in miscellaneous topics, such as color content description, image segmentation, object recognition and video shot change detection.
The main contribution of this thesis is to demonstrate the significance of taking a perceptual standpoint in image and video content analysis. This significance can be examined through the benefits it brings, namely improvements in performance and efficiency. Performance improvements in this thesis are realized in the aforementioned fields, specifically by attaining more accurate characterization of the color composition of an image, more precise segmentation of objects, higher accuracy in recognizing objects and higher accuracy in detecting shot boundaries in a video. Achieving such improvements via simple and lightweight algorithms, without overcomplicating or over-engineering the underlying problem, demonstrates the efficiency of the proposed algorithms. The algorithms presented in this thesis are evaluated according to both criteria, i.e. performance and efficiency, and it is shown in the thesis that they achieve exceptional results when compared to the state of the art. In other words, describing the color content of an image, segmenting an image into meaningful objects, recognizing objects and detecting shot changes in a video are all successfully accomplished with minimal effort, just as we humans perform such tasks.
Collections
- Doctoral dissertations [4865]