Machine Learning Operations Architecture In Healthcare Big Data Environment: Batch versus online inference
Siltala, Ville (2023)
Tietojenkäsittelyopin maisteriohjelma - Master's Programme in Computer Science
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of approval
2023-04-13
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202304123665
Abstract
Developing and operating machine learning systems involves uncertainties that are not comparable to those of traditional software engineering. Managing and mitigating these uncertainties is critical, especially when creating machine learning systems for clinical healthcare use. By incorporating processes and tools to develop and deploy machine learning systems in a controlled, automated, and monitored manner, machine learning operations aims to ensure the quality and reliability of those systems. This study examines machine learning operations in the context of healthcare and big data. First, a study project was conducted to design a machine learning operations architecture for building a machine-learning-based NLP solution to be integrated into an existing clinical healthcare software application. Two separate model deployment and inference architectures were designed. To test the applicability of these architectures in the context of big data, an empirical study was conducted. The results showed that the batch inference architecture using Spark NLP performed better than the Docker-container-based online inference architecture. In conclusion, the study project, which involved designing a machine learning operations architecture, and the empirical comparison of batch and online inference both offer insights into the field of machine learning operations. The proposed model and the results of the comparison can be used to develop machine learning systems and to make informed decisions when selecting an inference architecture.