Image Database Retrieval Methods Based on Feature Histograms
Zhong, Daidi (2008)
Tampere University of Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-200902131002
Abstract
The proliferation of digital capture, storage and networking systems makes it very easy to create image collections. A difficult and unsolved problem is building computerized image databases that enable precise retrieval using key images and specified search criteria, as in the task "find me the most similar picture". The task is easy for humans but hard to implement on computers because of the complexity of visual information and the lack of highly efficient algorithms. In this thesis, a class of algorithms for image database retrieval is proposed and investigated. These algorithms are based on two conceptual principles. The first principle is that the image information used for retrieval should be effectively reduced, in order to preserve key information and prevent growth in computational complexity. The second principle is a proper combination of statistical and structural image information, in which structural information is used as little as possible, since statistical information is much easier to describe and compute.
Several methods are developed in the thesis to fulfill these principles, acting on three levels of a processing hierarchy from local to global. At the bottom level, local features are constructed from the coefficients of quantized block transforms. Block transforms are widely used in image and video compression and are well known for their excellent ability to preserve perceptual information under heavy quantization. Quantization concentrates the block-wise information into a more condensed form, which is highly desirable for retrieval tasks. Several new types of local features are introduced in the thesis and their properties are described. At the intermediate level, histograms of local image features are used as descriptors of global statistical information. A histogram similarity measure is introduced, and methods for combining feature histograms are investigated. Finally, at the top level, the combination of histograms from image sub-areas is defined as a way to incorporate structural information. The three information processing levels are composed into an overall image database retrieval system. The system parameters, such as quantization level, histogram length and image sub-areas, are optimized iteratively using training datasets. The performance of the optimized system is evaluated on available face databases using standardized evaluation procedures. The results show that the performance approaches that of the best previously proposed methods and sometimes exceeds it. This indicates that the proposed methods for describing and combining statistical and structural information are very effective for image database retrieval.
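The three levels described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not specify the transform, the coefficients used, the quantizer, or the similarity measure, so everything concrete here is an assumption made for illustration. The sketch assumes a 4x4 DCT-II as the block transform, ternary quantization of three low-frequency coefficients, a 2x2 grid of sub-areas, and an L1 histogram distance. All function names (`dct2`, `local_feature`, `feature_histogram`, `describe`, `retrieve`) are hypothetical.

```python
import math

BLOCK = 4  # block size; an assumption, the abstract does not fix one


def dct2(block):
    """Plain 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def local_feature(block, threshold):
    """Bottom level: quantize three low-frequency AC coefficients to
    {-1, 0, 1} and pack them into one index in 0..26 (assumed scheme)."""
    c = dct2(block)
    index = 0
    for value in (c[0][1], c[1][0], c[1][1]):
        level = 0 if abs(value) <= threshold else (1 if value > 0 else -1)
        index = index * 3 + (level + 1)
    return index


def feature_histogram(image, top, left, height, width, threshold=8.0):
    """Intermediate level: normalized 27-bin histogram of the block
    features found inside one rectangular sub-area."""
    hist = [0.0] * 27
    count = 0
    for r in range(top, top + height - BLOCK + 1, BLOCK):
        for c in range(left, left + width - BLOCK + 1, BLOCK):
            block = [row[c:c + BLOCK] for row in image[r:r + BLOCK]]
            hist[local_feature(block, threshold)] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist


def describe(image, grid=2, threshold=8.0):
    """Top level: concatenate the histograms of a grid x grid set of
    sub-areas so that coarse spatial structure is retained."""
    rows, cols = len(image), len(image[0])
    h, w = rows // grid, cols // grid
    descriptor = []
    for i in range(grid):
        for j in range(grid):
            descriptor += feature_histogram(image, i * h, j * w, h, w, threshold)
    return descriptor


def l1_distance(a, b):
    """Assumed similarity measure: sum of absolute bin differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(query, database, grid=2):
    """Return the name of the database image closest to the query."""
    q = describe(query, grid)
    return min(database,
               key=lambda name: l1_distance(q, describe(database[name], grid)))
```

Images are passed as lists of rows of grayscale values, and `database` is a dictionary mapping names to such images. Because the DC coefficient is left out of the feature in this sketch, adding a constant brightness offset to an image does not change its descriptor. In the system described in the abstract, parameters such as the quantization level, the histogram length and the sub-area layout are tuned on training data rather than fixed by hand as they are here.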
Collections
- Doctoral dissertations [4906]