Data Vault 2.0 Automation Solutions for Commercial Use

Jenni, Laukkanen (2020)

 

Master's Degree Programme in Computational Big Data Analytics (FM, in English)
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2020-08-28
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202008286755
Abstract
As the volume of data and the need to process and store it have grown, methods for managing and reporting on data have developed rapidly. However, these methods demand considerable skill, time, and manual work.
Efforts have been made to fully automate data warehousing in various areas, such as loading data at the different stages of the warehouse. However, few solutions automate the construction of the data warehouse itself, and learning to use those that do requires expertise and time.
In this research, we discuss options for automating data warehouse construction. From an organization's point of view, the study identifies options such as purchasing a solution, obtaining one in collaboration with other organizations, or building one. In addition to the market analysis, we design and implement an automated tool for building a Data Vault 2.0 data warehouse. The tool uses metadata and the relationships in the source RDBMS to predict critical components of a Data Vault 2.0 warehouse, most of which are usually defined by experts.
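To make the idea of deriving Data Vault 2.0 components from source metadata concrete, here is a minimal sketch. It is not the tool described in the thesis: the `SourceTable` record, the `propose_data_vault` function, the naming scheme, and the one-hub-per-table heuristic are all hypothetical. It assumes only the general Data Vault roles: hubs hold business keys, links connect hubs, and satellites hold descriptive attributes.

```python
from dataclasses import dataclass, field


@dataclass
class SourceTable:
    """Hypothetical metadata record for one source RDBMS table."""
    name: str
    columns: list[str]
    business_keys: list[str]  # assumed to come from a classifier or an expert
    foreign_keys: dict[str, str] = field(default_factory=dict)  # column -> referenced table


def propose_data_vault(tables: list[SourceTable]) -> dict[str, dict[str, list[str]]]:
    """Propose hubs, links, and satellites from table metadata (illustrative heuristic)."""
    model: dict[str, dict[str, list[str]]] = {"hubs": {}, "links": {}, "satellites": {}}
    for t in tables:
        # Hub: one per source table, holding the business key columns.
        model["hubs"][f"hub_{t.name}"] = list(t.business_keys)
        # Satellite: columns that are neither business keys nor foreign keys.
        descriptive = [c for c in t.columns
                       if c not in t.business_keys and c not in t.foreign_keys]
        if descriptive:
            model["satellites"][f"sat_{t.name}"] = descriptive
        # Link: one per foreign-key relationship, connecting the two hubs.
        for referenced in t.foreign_keys.values():
            model["links"][f"link_{t.name}_{referenced}"] = [f"hub_{t.name}", f"hub_{referenced}"]
    return model


# Made-up example tables, for illustration only.
customer = SourceTable("customer", ["customer_no", "name", "city"], ["customer_no"])
order = SourceTable("order", ["order_no", "customer_id", "amount"], ["order_no"],
                    {"customer_id": "customer"})
model = propose_data_vault([customer, order])
```

In this sketch a misclassified business key changes what a hub contains but not which tables are proposed, since the table structure follows from the source tables and their foreign keys.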
Based on the metadata collected and processed, the classification algorithm correctly classified an average of 85.89% of all observations and 55.11% for business keys alone. It classified observations that were not business keys more accurately than the business keys themselves. However, classification accuracy mainly affects what the automation tool inserts into the target tables of the data model, rather than which tables are created and which source tables they are built from. The model generated by the tool corresponded well to the target model implemented at the beginning of the study. As for hubs and satellites, apart from a couple of missing hubs and the content of some hubs, both caused by shortcomings in business key classification, the model could have been used as an enterprise data warehouse. Links differed more from the original target, but in testing the link variations produced by the tool worked well either way.
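The gap between overall accuracy and the business-key figure is typical when one class is rare. The sketch below illustrates this with made-up labels, not data from the thesis. It also assumes that the business-key figure can be read as per-class recall, which the abstract does not state explicitly.

```python
def accuracy(y_true, y_pred):
    """Share of all observations whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Share of observations of class `positive` that were predicted as `positive`."""
    predictions_for_positive = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predictions_for_positive) / len(predictions_for_positive)


# Toy data: 10 columns, of which 2 are business keys ("bk"). Illustrative only.
y_true = ["bk", "bk"] + ["other"] * 8
y_pred = ["bk", "other"] + ["other"] * 8  # one business key is missed

overall = accuracy(y_true, y_pred)          # 9 of 10 correct
bk_recall = recall(y_true, y_pred, "bk")    # 1 of 2 business keys found
```

Because non-business-key columns dominate, a single missed business key barely moves overall accuracy while halving the recall for that class.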
The tool created in this research still has many shortcomings and areas for development; these have, however, been taken into account in its logic and structure. The tool can be implemented with little financial capital but requires considerable experience and expertise in the subject.
Collections
  • Opinnäytteet - ylempi korkeakoulututkinto (Master's theses)
Kalevantie 5
P.O. Box 617
33014 Tampere University
oa[@]tuni.fi
 

 
