Demystifying Data Science Projects: a Look on the People and Process of Data Science Today
Aho, Timo; Sievi-Korte, Outi; Kilamo, Terhi; Yaman, Sezin Gizem; Mikkonen, Tommi (2020-11)
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202101051053
Description
Peer reviewed
Abstract
Processes and practices used in data science projects have been reshaping especially over the last decade. These are different from their software engineering counterparts. However, to a large extent, data science relies on software, and, once taken to use, the results of a data science project are often embedded in software context. Hence, seeking synergy between software engineering and data science might open promising avenues. However, while there are various studies on data science workflows and data science project teams, there have been no attempts to combine these two very interlinked aspects. Furthermore, existing studies usually focus on practices within one company. Our study will fill these gaps with a multi-company case study, concentrating both on the roles found in data science project teams as well as the process. In this paper, we have studied a number of practicing data scientists to understand a typical process flow for a data science project. In addition, we studied the involved roles and the teamwork that would take place in the data context. Our analysis revealed three main elements of data science projects: Experimentation, Development Approach, and Multidisciplinary team(work). These key concepts are further broken down to 13 different sub-themes in total. The found themes pinpoint critical elements and challenges found in data science projects, which are still often done in an ad-hoc fashion. Finally, we compare the results with modern software development to analyse how good a match there is.
Collections
- TUNICRIS publications [24199]