Implementation of Depth Map Filtering on GPU
Sen, Sumeet (2016)
Master's Degree Programme in Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2016-04-06
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201603033615
Abstract
The thesis work was part of the Mobile 3DTV project, which studied the capture, coding and transmission of 3D video representation formats in mobile delivery scenarios. The main focus of the study was to determine whether it was practical to transmit and view 3D videos on mobile devices. The chosen approach for virtual view synthesis was Depth Image Based Rendering (DIBR).
The computed depth is often inaccurate, noisy, low in resolution, or even inconsistent over a video sequence. Therefore, the sensed depth map has to be post-processed and refined through proper filtering. A bilateral filter was used for the iterative refinement process, using information from one of the associated high-quality texture (color) images (left or right view).
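For illustration, the C sketch below shows one pass of a joint (cross) bilateral filter of this kind: the range weight is computed on the guiding texture image while the depth samples are averaged. The window radius, sigma values, function name and data layout are assumptions made for this example only; the abstract does not state the parameters used in the thesis.

/* Joint (cross) bilateral filtering of one depth pixel, guided by the
 * grayscale texture image.  Minimal single-pass sketch; window size and
 * sigmas are illustrative assumptions, not the thesis values. */
#include <math.h>

#define RADIUS  5        /* assumed half-window size      */
#define SIGMA_S 3.0f     /* assumed spatial sigma         */
#define SIGMA_R 10.0f    /* assumed range (color) sigma   */

float filter_pixel(const float *depth, const float *gray,
                   int width, int height, int x, int y)
{
    float sum = 0.0f, wsum = 0.0f;
    float center = gray[y * width + x];

    for (int dy = -RADIUS; dy <= RADIUS; ++dy) {
        for (int dx = -RADIUS; dx <= RADIUS; ++dx) {
            int nx = x + dx, ny = y + dy;
            if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                continue;
            /* spatial weight: distance in the image plane */
            float ws = expf(-(dx * dx + dy * dy) / (2.0f * SIGMA_S * SIGMA_S));
            /* range weight: similarity in the guiding texture image */
            float diff = gray[ny * width + nx] - center;
            float wr = expf(-(diff * diff) / (2.0f * SIGMA_R * SIGMA_R));
            sum  += ws * wr * depth[ny * width + nx];
            wsum += ws * wr;
        }
    }
    return (wsum > 0.0f) ? sum / wsum : depth[y * width + x];
}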
The primary objective of this thesis was to perform the filtering operation in real time. Therefore, we ported the algorithm to a GPU. As the programming platform we chose OpenCL from the Khronos Group, because it targets heterogeneous parallel computing environments and is therefore platform-, vendor-, and hardware-independent.
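As a rough illustration of this portability, the host-code fragment below enumerates every OpenCL platform and device present on a machine; the same code runs unchanged against any vendor's OpenCL driver. It is only a sketch for this abstract and is not taken from the thesis.

/* List all OpenCL platforms and devices; illustrates that the host API
 * is identical regardless of the hardware vendor. */
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platforms[8];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(8, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms; ++p) {
        cl_device_id devices[8];
        cl_uint num_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8,
                       devices, &num_devices);
        for (cl_uint d = 0; d < num_devices; ++d) {
            char name[256];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME,
                            sizeof(name), name, NULL);
            printf("platform %u, device %u: %s\n", p, d, name);
        }
    }
    return 0;
}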
It was observed that the filtering algorithm was well suited to GPU implementation: even though every pixel uses information from its neighborhood window, the processing of one pixel has no dependency on the results of its surrounding pixels. Thus, once the data for the neighborhood was loaded into the local memory of the multiprocessor, the device could process several pixels simultaneously.
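A minimal OpenCL C kernel sketch of this scheme is shown below: one work-item per output pixel, with the work-group first staging its neighborhood tile into local memory and then filtering each pixel independently. The tile size, apron width, kernel name and the trivial weights are placeholders for illustration; this is not the kernel used in the thesis.

/* One work-item per output pixel.  The work-group cooperatively loads a
 * (TILE + 2*RADIUS)^2 neighborhood into local memory, then each work-item
 * filters its own pixel with no dependency on its neighbours' results. */
#define RADIUS 5
#define TILE   16

__kernel void refine_depth(__global const float *depth,
                           __global float *out,
                           int width, int height)
{
    __local float tile[TILE + 2 * RADIUS][TILE + 2 * RADIUS];

    int gx = get_global_id(0), gy = get_global_id(1);
    int lx = get_local_id(0),  ly = get_local_id(1);

    /* Cooperative, strided load of the tile plus its apron. */
    for (int ty = ly; ty < TILE + 2 * RADIUS; ty += TILE)
        for (int tx = lx; tx < TILE + 2 * RADIUS; tx += TILE) {
            int sx = clamp(gx - lx - RADIUS + tx, 0, width - 1);
            int sy = clamp(gy - ly - RADIUS + ty, 0, height - 1);
            tile[ty][tx] = depth[sy * width + sx];
        }
    barrier(CLK_LOCAL_MEM_FENCE);   /* neighborhood is now in local memory */

    if (gx >= width || gy >= height) return;

    /* Independent per-pixel filtering; no further synchronization needed. */
    float sum = 0.0f, wsum = 0.0f;
    for (int dy = -RADIUS; dy <= RADIUS; ++dy)
        for (int dx = -RADIUS; dx <= RADIUS; ++dx) {
            float w = 1.0f;   /* spatial/range weights would go here */
            sum  += w * tile[ly + RADIUS + dy][lx + RADIUS + dx];
            wsum += w;
        }
    out[gy * width + gx] = sum / wsum;
}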
The results obtained from our experiments were quite encouraging. We executed the MEX implementation on a Core 2 Duo CPU with 2 GB of RAM, and used an NVIDIA GeForce 240 as the GPU device, which has 96 cores, a 550 MHz graphics clock, a 1340 MHz processor clock, and 512 MB of memory.
The processing speed improved significantly, and the quality of the depth maps was on par with the same algorithm running on a CPU. To test the effect of our filtering algorithm on a degraded depth map, we introduced artifacts by compressing it with an H.264 encoder; the level of degradation was controlled by varying the quantization parameter. The blocky depth map was then filtered separately with our GPU implementation and with the CPU implementation. The results showed a speed-up of up to 30 times, while the refined depth maps had quality measures similar to those obtained with the CPU implementation.