Demonstration of Object Recognition Using DOPE Deep Learning Algorithm for Collaborative Robotics
Linnosmaa, Essi (2022)
Teknisten tieteiden kandidaattiohjelma - Bachelor's Programme in Engineering Sciences
Tekniikan ja luonnontieteiden tiedekunta - Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Acceptance date
2022-03-22
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202202282200
Abstract
When collaborating on a common task, passing or receiving objects such as tools is one of the most common ways humans interact. Similarly, it is expected to be a common and important interaction method in fluent and natural human-robot collaboration.
This thesis studied human-robot interaction in the context of a unilateral robot-to-human handover task. More specifically, it focused on grasping an object using a state-of-the-art machine learning algorithm called Guided Uncertainty-Aware Policy Optimization (GUAPO). Within the broader scope of GUAPO, the work was limited to demonstrating the object detection and pose estimation part of the task, which was implemented using an object pose estimation algorithm called Deep Object Pose Estimation (DOPE). DOPE is a deep learning approach that learns to predict image key points of an object of interest from a sufficiently large set of training data. The challenge of obtaining enough training data for a supervised, machine learning-based machine vision algorithm was tackled by creating a synthetic (computer-generated) dataset. The dataset needed to represent the real-life scenario closely to overcome the so-called reality gap. It was created with Unreal Engine 4 (UE4) and the NVIDIA Deep learning Dataset Synthesizer (NDDS).
During the experimental part, a 3D model of the object of interest was created in Blender and imported into the UE4 environment built for this work. NDDS was used to create and extract the training dataset for DOPE. DOPE’s functionality was successfully tested with a pre-trained network, and it was then shown manually that training of the DOPE algorithm can be started with the created dataset. However, lack of computing power was the main limitation of this work, and the DOPE algorithm could not be trained long enough to recognize the object of interest. The results suggest that this is an effective way to approach training object recognition algorithms, although it is technically challenging to do from scratch, since it requires knowledge of a broad set of software tools as well as programming skills.
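To make the notion of "image key points" more concrete, the following is a minimal sketch, not the thesis implementation. It assumes that the key points are the eight corners of the object's 3D bounding cuboid plus its centroid, projected into the image with a simple pinhole camera model. The object size, the camera intrinsics (`fx`, `fy`, `cx`, `cy`), the object position and both function names are hypothetical values chosen only for illustration, and object rotation is left out for brevity.

```python
from itertools import product


def cuboid_keypoints(size):
    """Return the 8 corners plus the centroid of a cuboid centred at the origin.

    `size` is (width, height, depth) in metres; values here are hypothetical.
    """
    hx, hy, hz = (s / 2.0 for s in size)
    corners = [(sx * hx, sy * hy, sz * hz)
               for sx, sy, sz in product((-1, 1), repeat=3)]
    return corners + [(0.0, 0.0, 0.0)]


def project(points, translation, fx, fy, cx, cy):
    """Project 3D points to pixel coordinates with a pinhole camera model.

    The object is only translated into the camera frame (no rotation),
    which is a simplifying assumption of this sketch.
    """
    pixels = []
    for x, y, z in points:
        xc = x + translation[0]
        yc = y + translation[1]
        zc = z + translation[2]
        if zc <= 0:
            raise ValueError("point is behind the camera")
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


# Hypothetical object (10 x 6 x 4 cm) placed 0.5 m in front of the camera.
keypoints_3d = cuboid_keypoints((0.10, 0.06, 0.04))
keypoints_2d = project(keypoints_3d, (0.0, 0.0, 0.5),
                       fx=600.0, fy=600.0, cx=320.0, cy=240.0)

for u, v in keypoints_2d:
    print(f"{u:7.2f} {v:7.2f}")
```

In a synthetic dataset, 2D key points of this kind can be computed exactly from the known 3D scene, which is one reason computer-generated training data is attractive for supervised pose estimation.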
Collections
- Bachelor's theses [8996]