VarioML framework for comprehensive variation data representation and exchange
Byrne, Myles G; Fokkema, Ivo; Lancaster, Owen; Adamusiak, Tomasz; Ahonen-Bishopp, Anni; Atlan, David; Beroud, Christophe; Cornell, Michael; Dalgleish, Raymond; Devereau, Andrew; Patrinos, George; Swertz, Morris A; Taschner, Peter; Thorisson, Gudmundur; Vihinen, Mauno; Brookes, Anthony; Muilu, Juha (2012)
BMC Bioinformatics 13: 254
Biolääketieteellisen teknologian yksikkö - Institute of Biomedical Technology
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:uta-201212111091
Description
BioMed Central open access
Abstract
Background
Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement.
Results
The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs), e.g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easy to integrate into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components.
Conclusions
VarioML is a set of tools and practices that improve the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information easily, clearly, and without ambiguity.
Keywords:
LSDB; Variation database curation; Data collection; Distribution