
Voice control of industrial robots with large language models trained on code

Parikka, Ossi (2024)

 

Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2024-12-05
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-2024112910646
Abstract
As industry in the EU and elsewhere moves from mass manufacturing towards specialized, customizable products, the requirements for industrial robots are changing. Increasingly complex manufacturing tasks can't always be fully automated, but having humans do everything manually is also impractical. To enable the industry of tomorrow, intelligent human-robot collaboration (HRC) is needed. In turn, humans must be able to communicate effectively with the robots they are working with.

One important part of this puzzle is voice control. Speech is perhaps the most natural way of communicating for most humans, and it also has the advantage of leaving the user's hands free for other tasks. However, current voice control technology is often limited in functionality, and requires the use of specific phrases to trigger a limited set of functions.

This thesis presents a prototype implementation of a voice control system based on code-generating large language models (LLMs). The prototype combines text-to-speech, robot control, object detection, and the programming and learning of policies by speech to form a highly flexible system that can adapt to different tasks. It is also modular, can be upgraded with improved components, and requires no internet connection to work.

Over the course of this work, the prototype was planned, developed and tested. Planning started by establishing the requirements for effective HRC and finding suitable technologies through a literature review. Based on the results, the choice was made to use code-generating LLMs through a method known as Code as Policies. StarCoder2-15B was selected as the LLM. In addition to the LLM, a speech-to-text component was determined to be necessary, and two options, Wav2Vec 2.0 and Whisper, were identified.

In the design phase, a structure was planned by separating functionality into modules. A set of requirements to guide the process was also chosen. Development then began with the most important modules and was carried out in three stages until a full prototype was completed. The final result has modules for speech-to-text, user interface, code generation, robot control and handling of policies. Communication between modules and hardware is handled with Robot Operating System 2 (ROS 2).

The final prototype was evaluated with practical experiments involving individual commands, teaching capabilities, and an assembly task representing a simplified industrial use case. Additionally, conversational behaviour was shown and command latency measured. Based on these results, the prototype shows promising performance with natural language commands, especially when policy writing and teaching capabilities are used to create specialized functionality. Some concerns were raised regarding the reliability and safety of using AI this way, but overall the prototype offers a decent starting point for future development.
Collections
  • Theses - Master's degree [39885]
Kalevantie 5
P.O. Box 617
33014 Tampere University
oa[@]tuni.fi | Data protection | Accessibility statement
 

 
