Novellette: an RNA-sequencing data analysis pipeline for detecting novel transcripts

Seppälä, Janne (2013)

 

Degree Programme in Biotechnology
Faculty of Computing and Electrical Engineering; Faculty of Natural Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2013-12-04
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tty-201312191514
Abstract
Proteins are key factors in every living organism and contribute to almost every biological process and structure in a cell. They are formed through protein synthesis, in which the genetic code of DNA in genes is first transcribed into RNA and then translated into a protein. Each gene is a short part of a longer DNA molecule and encodes the synthesis of a certain protein or proteins. Currently, the state-of-the-art tool for evaluating which genes are expressed in a given biological sample is RNA-sequencing, which captures the RNA content of the cells at a specific time point. Although RNA-sequencing is often used to measure the expression of known genes in the genome, it can also be used to search for new genes, or novel transcripts.
To date, no standard tool has been published that effectively and thoroughly identifies novel transcripts from RNA-sequencing data. This work presents a tool, denoted Novellette, that aims to solve this issue. Novellette attempts to identify differentially expressed regions in the genome that do not overlap with any known genes, and then performs a full gene structure analysis of those regions. In this process, information from both the processed RNA-sequencing data and the known DNA sequence of the studied organism is used to search the novel transcript candidates for features that are common to protein-coding genes. The features are then scored, and the final novel transcript candidates are ranked by their score values. In addition to developing the RNA-sequencing tool, this work introduces the basics of statistical testing and other mathematical methods related to RNA-sequencing data analysis, and assesses the normality of count-based RNA-sequencing data using publicly available data.
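The filtering and ranking steps described above can be sketched as follows. This is a minimal illustration, not Novellette's actual implementation: the `Region` type, the function names, the half-open coordinate convention, and the score values are all assumptions made for the example, and summing per-feature scores is only one plausible way to combine them.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive


def overlaps(a: Region, b: Region) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def novel_candidates(regions, known_genes):
    """Keep only the regions that overlap no known gene."""
    return [r for r in regions if not any(overlaps(r, g) for g in known_genes)]


def rank_by_score(candidates, feature_scores):
    """Sort candidates by their summed feature scores, highest first."""
    return sorted(candidates, key=lambda r: sum(feature_scores[r]), reverse=True)


# Invented annotation and invented differentially expressed regions
known = [Region("chr1", 100, 500), Region("chr1", 900, 1200)]
expressed = [
    Region("chr1", 450, 600),  # overlaps the first known gene, so it is dropped
    Region("chr1", 600, 800),
    Region("chr2", 100, 300),
]
candidates = novel_candidates(expressed, known)

# Hypothetical per-feature scores; the values are made up for illustration
scores = {
    Region("chr1", 600, 800): [1.0, 0.5],
    Region("chr2", 100, 300): [2.0, 1.0],
}
ranked = rank_by_score(candidates, scores)
```

The linear scan over known genes is quadratic in the worst case; for genome-scale annotations an interval tree or a sorted sweep would be the more usual choice.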
The results from analyses performed with various input data show that Novellette is able to reliably detect novel transcripts and, with the proposed scoring approach, to distinguish protein-coding regions from non-coding regions in the genome. In addition, count-based RNA-sequencing data is shown to follow the normal distribution very poorly, which underlines the importance of statistical hypothesis testing methods that do not assume normality. In conclusion, a functional and useful bioinformatics tool has been developed in this work, with the potential to become a standard method for novel transcript identification.
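As one example of a test statistic that does not assume normality, the sketch below computes the rank-based Mann-Whitney U statistic for read counts in two conditions. The abstract does not say which tests the thesis uses, so this choice is an assumption, and the counts are invented.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return the U statistic of sample x relative to sample y."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Invented read counts for one region in two conditions
control = [3, 5, 4, 6]
treated = [40, 55, 38, 120]
u = mann_whitney_u(control, treated)
# u == 0.0: every control count lies below every treated count
```

Because the statistic depends only on the ordering of the counts, the outlying value 120 has no more influence than any other treated count, which is the property that matters for skewed count data.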
Collections
  • Theses - higher university degree [40800]
Kalevantie 5
P.O. Box 617
33014 Tampere University
oa[@]tuni.fi | Data protection | Accessibility statement
 

 
