Exploiting Genomic Relations in Big Data Repositories by Graph-Based Search Methods
Musa, Aliyu; Dehmer, Matthias; Yli-Harja, Olli; Emmert-Streib, Frank (2019-11-22)
Machine Learning and Knowledge Extraction
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-201910073726
Description
Peer reviewed
Abstract
We live at a time in which mass data can be generated in almost any field of science. In pharmacogenomics, for instance, there are a number of big data repositories, e.g., the Library of Integrated Network-based Cellular Signatures (LINCS), that provide millions of measurements at the genomic level. However, to translate these data into meaningful information, the data need to be analyzable. The first step of such an analysis is the deliberate selection of subsets of raw data for studying dedicated research questions. Unfortunately, this is a non-trivial problem when millions of individual data files are available with an intricate connection structure induced by experimental dependencies. In this paper, we argue for the need to introduce search capabilities that support this selection in big genomics data repositories, with a specific discussion of LINCS. Specifically, we suggest introducing smart interfaces that use graph-based searches to exploit the connections among individual raw data files, which give rise to a network structure.
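To make the idea of a graph-based search over raw data files concrete, the following is a minimal sketch and not the authors' method. It assumes that each file carries a small metadata record, that two files are connected whenever they share a value for a chosen attribute, and that a breadth-first search from a seed file returns the related files. The file names, the attribute names (`cell_line`, `perturbagen`, `plate`), and the helpers `build_graph` and `related_files` are all hypothetical and do not reflect the actual LINCS schema.

```python
from collections import defaultdict, deque

# Hypothetical metadata records; field names and values are illustrative only
# and are NOT the actual LINCS schema.
FILES = {
    "f1": {"cell_line": "MCF7", "perturbagen": "drugA", "plate": "P1"},
    "f2": {"cell_line": "MCF7", "perturbagen": "DMSO", "plate": "P1"},
    "f3": {"cell_line": "PC3", "perturbagen": "drugA", "plate": "P2"},
    "f4": {"cell_line": "PC3", "perturbagen": "DMSO", "plate": "P2"},
    "f5": {"cell_line": "A549", "perturbagen": "drugB", "plate": "P3"},
}


def build_graph(files, keys):
    """Connect two files whenever they share a value for any attribute in keys."""
    by_value = defaultdict(set)
    for name, meta in files.items():
        for key in keys:
            by_value[(key, meta[key])].add(name)
    graph = {name: set() for name in files}
    for group in by_value.values():
        for name in group:
            graph[name] |= group - {name}
    return graph


def related_files(graph, start, max_hops):
    """Breadth-first search: map each file within max_hops of start to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return {name: dist for name, dist in seen.items() if name != start}


# Assumed dependency structure: files are linked by shared perturbagen or plate.
GRAPH = build_graph(FILES, keys=("perturbagen", "plate"))
RESULT = related_files(GRAPH, "f1", max_hops=2)
```

In this toy data set, a search from `f1` reaches its same-plate control `f2` and the same-drug file `f3` in one hop, and the control `f4` of the second cell line in two hops, while the unrelated file `f5` is never reached. A flat keyword search on a single attribute would not recover the two-hop relation.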
Collections
- TUNICRIS publications [19351]