Natural Language Pathfinding for Industrial Applications on a Collaborative Robot : Application of CLIPort for OpenDR
Petäjä, Mikael (2023)
Master's Programme in Automation Engineering
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2023-11-21
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202311159680
Abstract
Robotics and automation are common in modern industry but are often limited to pre-known workspaces and rigid tasks. With machine learning, robots can be made to derive task-relevant context from the workspace and act without explicit directions. This could improve task robustness and allow for greater co-operation with humans in collaborative tasks.
Machine learning applications are particularly interesting in human-robot collaboration because of the difficulty of predicting the impact a human actor has on the workspace using traditional algorithms. The framework presented in this thesis attempts to react to changes in the workspace but does not directly detect human behaviour or attempt to avoid human body parts by identifying them in the workspace.
This thesis implements a CLIPort-based natural language instruction tool for controlling a Franka Panda robotic arm. A final dataset-model pair is introduced, along with two prototype dataset-model pairs that illustrate the development process and point to possible future improvements. A literature review briefly discusses similar systems and other applications of machine learning in robotics. Backend systems such as CLIP and CLIPort are also briefly introduced, along with other relevant works.
Results show that the presented model can achieve a location accuracy of 90.42% in the examined industrial tasks. For certain object-task-environment configurations, this accuracy was observed to reach 100.00%, but overall the framework was found to successfully execute a complete pick & place task 75.07% of the time. The datasets with which the models were trained are examined, and future improvements are considered with suggestions based on scope.
The most important contribution of the thesis is the demonstration that the implemented framework is suitable for industrial task execution. Other notable contributions include the identification of error-producing situations, and format and quantity recommendations for demonstrations in datasets.