Parallelizing Ground Plane Estimation of Multilayer LiDAR Point Cloud for Anti-Collision System
Angelma, Max (2019)
Automation Engineering
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2019-04-24
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201903291345
Abstract
This thesis investigates applying GPU acceleration to LiDAR point cloud processing. The problem is important because many start-up companies are developing 3D LiDARs towards lower prices and higher resolutions. While this development is driven mainly by the automotive industry, automated container handling cranes could benefit from the same sensors. The selected use case is the development of a more advanced anti-collision system for Automated Rubber Tyre Gantry (ARTG) cranes, based on an existing idea description (Mannari 2018).
The idea description proposed using the RANSAC algorithm for plane segmentation. The literature review showed that RANSAC is indeed somewhat computationally expensive and that previous studies have parallelized it with good results. Other studies on the ground plane estimation problem were also reviewed; in these, the focus seemed to be on inventing more efficient algorithms rather than on parallelizing existing ones.
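For readers unfamiliar with RANSAC plane segmentation, the following is a minimal serial sketch in C++. It is not the code from the thesis or from PCL; the names `Point`, `Plane`, `planeFromPoints` and `ransacPlane`, the fixed seed and the threshold parameter are all hypothetical choices made for illustration. Each hypothesis is evaluated independently of the others, which is the property that makes the algorithm a candidate for parallelization.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point { double x, y, z; };

// Plane a*x + b*y + c*z + d = 0 with unit normal (a, b, c).
struct Plane { double a, b, c, d; std::size_t inliers; };

// Builds the plane through three points.
// Returns false if the points are (nearly) collinear or repeated.
bool planeFromPoints(const Point& p, const Point& q, const Point& r, Plane& out)
{
    const double ux = q.x - p.x, uy = q.y - p.y, uz = q.z - p.z;
    const double vx = r.x - p.x, vy = r.y - p.y, vz = r.z - p.z;
    const double a = uy * vz - uz * vy;   // normal = u x v
    const double b = uz * vx - ux * vz;
    const double c = ux * vy - uy * vx;
    const double norm = std::sqrt(a * a + b * b + c * c);
    if (norm < 1e-12) return false;
    out = {a / norm, b / norm, c / norm,
           -(a * p.x + b * p.y + c * p.z) / norm, 0};
    return true;
}

// Serial RANSAC: tries `hypotheses` random three-point planes and keeps the
// one with the most points within `threshold` of it. If no valid plane is
// found, the returned plane has inliers == 0.
Plane ransacPlane(const std::vector<Point>& cloud, std::size_t hypotheses,
                  double threshold, unsigned seed = 42)
{
    Plane best{0.0, 0.0, 1.0, 0.0, 0};
    if (cloud.size() < 3) return best;

    std::mt19937 rng(seed);  // fixed seed (assumption) for repeatable runs
    std::uniform_int_distribution<std::size_t> pick(0, cloud.size() - 1);

    for (std::size_t h = 0; h < hypotheses; ++h) {
        const Point& p = cloud[pick(rng)];
        const Point& q = cloud[pick(rng)];
        const Point& r = cloud[pick(rng)];
        Plane candidate{};
        if (!planeFromPoints(p, q, r, candidate)) continue;

        // Count points whose distance to the candidate plane is small enough.
        for (const Point& s : cloud) {
            const double dist = std::fabs(candidate.a * s.x + candidate.b * s.y +
                                          candidate.c * s.z + candidate.d);
            if (dist <= threshold) ++candidate.inliers;
        }
        if (candidate.inliers > best.inliers) best = candidate;
    }
    return best;
}
```

The cost grows with the number of hypotheses multiplied by the number of points, so a large hypothesis count on a dense cloud is where serial execution becomes expensive.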
To study the problem experimentally, a development environment was set up with a gaming laptop (Nvidia GTX 1060 graphics card), the necessary software frameworks (CUDA, Point Cloud Library) and a Velodyne VLP-16 LiDAR. Studying CUDA programming again showed that efficient algorithms are the key to accelerating programs.
Integrating CUDA into PCL proved difficult because of multiple problems encountered when compiling the applications, so the scope was changed slightly. To isolate the effect of GPU acceleration, applications based on a C++ math library were written. For a single frame of a 27867-point cloud, using an unnecessarily high maximum number of RANSAC hypotheses (131072), a relative speedup of 129X was achieved. Another test on the same frame compared PCL’s RANSAC function with two different values of probability (1 and 0.99), which led to a 739X speedup. PCL’s better speedup ratio is caused by z = 0.99 allowing the algorithm to adjust its maximum number of iterations, whereas with z = 1 the computation runs until the pre-set maximum is reached. Together, these two tests indicated that using algorithms efficiently might make more sense than GPU acceleration.
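The effect of the probability value can be illustrated with the standard RANSAC iteration bound, k = log(1 - z) / log(1 - w^n), where w is the inlier ratio and n the sample size. The sketch below is only an illustration of that textbook formula; the function name `requiredIterations` and its parameters are hypothetical, and the abstract does not confirm that PCL computes the bound in exactly this way.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: number of hypotheses needed so that, with probability z,
// at least one random sample contains only inliers.
//   z             desired success probability, 0 < z <= 1
//   w             estimated inlier ratio, 0 < w <= 1
//   n             points per sample (3 for a plane)
//   maxIterations pre-set upper bound on the number of hypotheses
std::size_t requiredIterations(double z, double w, int n, std::size_t maxIterations)
{
    assert(z > 0.0 && z <= 1.0 && w > 0.0 && w <= 1.0 && n > 0);

    const double pAllInliers = std::pow(w, n);
    if (pAllInliers >= 1.0) return 1;       // every sample is outlier-free
    if (z >= 1.0) return maxIterations;     // log(1 - z) diverges: no early stop

    const double k = std::log(1.0 - z) / std::log(1.0 - pAllInliers);
    if (k >= static_cast<double>(maxIterations)) return maxIterations;
    return static_cast<std::size_t>(std::ceil(k));
}
```

With an assumed inlier ratio of 0.5 and three-point samples, `requiredIterations(0.99, 0.5, 3, 131072)` evaluates to 35, while `requiredIterations(1.0, 0.5, 3, 131072)` returns the full 131072. The inlier ratio here is an illustrative value, not a figure from the thesis.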
Finally, the PCL application was extended with live stream grabbing from the VLP-16. Comparing the input from VeloView (averaging 101 milliseconds per frame) with the output (107 ms per frame), and measuring the plane segmentation part (0.4 ms per frame), showed that RANSAC accounted for only 6.3 % of the overhead added by the application. In conclusion, GPU acceleration proved to be both useful and useless: it helps if the absolute best ground plane must be found with RANSAC, but for an industrial application such as ARTG anti-collision, especially with a lower-resolution LiDAR such as the VLP-16, serial execution performs just fine.