Comparison of monolithic and hybrid controllers for multi-objective sim-to-real learning
Dag, Atakan (2021)
Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of acceptance
2021-04-28
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202104142975
Abstract
Simulation-to-real (sim-to-real) transfer accomplishes a robotic task by first training a model in a simulator and then transferring the learned model to the real environment. It is one of the most popular ways to solve robotic tasks because it is easier to implement and test than solving the trajectories analytically. Recent successful sim-to-real works have solved single-objective tasks such as "take the object". In reality, however, problems consist of multiple objectives, such as "avoid humans while reaching the target". The standard reinforcement learning (RL) solution for multi-objective problems is to train a single (monolithic) controller using a complicated, problem-specific reward function. The drawback of this approach is the difficulty of shaping the reward function (reward engineering). Recently, a hybrid controller solution was proposed that uses multiple pre-trained single-objective controllers and switches between them according to some conditions.
This research compares the two approaches on a problem in which a robotic manipulator tries to reach a target while avoiding collisions with an obstacle. For both methods, the controllers are trained and tested in a simulator and then verified on a real set-up. Our results show that the hybrid controller performs better in terms of success and failure rates. Training the hybrid controller is also easier than training the monolithic controller, because the reward functions of the single-objective controllers are easier to shape, and it is less time-consuming.