Visual object detection and 6D pose estimation for robotic applications using deep learning
Äijälä, Tomi (2024)
Master's Programme in Automation Engineering
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2024-08-26
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202408268291
Abstract
Estimating an object’s 6D pose, that is, its 3D position and 3D orientation, from a single image is one of the more challenging tasks in computer vision and machine learning. 6D pose information is used in a variety of fields and is relevant to applications such as robotics, autonomous driving and virtual reality. Many strategies have been developed for 6D pose estimation, but the current trend is toward deep learning-based methods.
This master’s thesis introduces two deep learning-based 6D pose estimation methods, Deep Object Pose Estimation (DOPE) and FoundationPose. DOPE requires an extensive amount of annotated training data for the object of interest. A data generation pipeline was created for this purpose using BlenderProc2. The goal for the pose estimation methods is to detect the pose of an object accurately and quickly enough for real-time applications. The performance of both methods was tested on synthetic data, and FoundationPose was also tested in a real-world application. Both methods are capable of estimating the pose of an object of interest accurately, but the computational time of the 6D pose estimation still has room for improvement.