Data Fusion Methods and an Application on Exploration of Gene Regulatory Mechanisms
Dai, Xiaofeng (2010)
Dai, Xiaofeng
Tampere University of Technology
2010
Tieto- ja sähkötekniikan tiedekunta - Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tty-200912287263
Abstract
Understanding the regulatory mechanisms of gene regulatory networks (GRNs) is an important topic in systems biology. It is widely accepted that holistic approaches are needed to explore biological systems, given, for example, the noisy dynamics of gene expression and the complex interactions among genes and between gene expression products and other cellular components. As new high-throughput technologies emerge and more information sources become available, it is becoming feasible to investigate this problem thoroughly from multiple perspectives.
The main objective of this thesis is to provide solutions to problems related to gene regulatory mechanisms with data fusion methods, aiming at a more precise understanding of a GRN's structure and its dynamics. The thesis is divided into two parts: the presentation of the new data fusion methods proposed here to explore GRN topologies and, subsequently, the application of one of these methods to investigate the dynamics of such networks.
In the 'Methods' chapter, two methods are proposed: one for transcription factor binding site (TFBS) prediction and the other for gene clustering. The results from TFBS prediction can be used as an input for the gene clustering algorithm. In particular, a new data fusion method is developed and novel information sources are explored to improve TFBS prediction accuracy in comparison with previous methods. Three finite joint mixture models are developed to cluster genes from multiple data sources: the beta-Gaussian mixture model (BGMM), the stratified beta-Gaussian mixture model (sBGMM) and the Gaussian-Bernoulli mixture model (GBMM). These methods are shown to significantly improve the accuracy of TFBS predictions and clustering results.
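As a rough illustration of the joint mixture idea only, the sketch below clusters items described by both continuous and binary data sources with a simple expectation-maximization loop. It is not the thesis's GBMM: it assumes diagonal Gaussian components for the continuous features and independent Bernoulli components for the binary features, and the function name `fit_gbmm` and all parameter names are hypothetical.

```python
import numpy as np

def fit_gbmm(X, B, k, n_iter=100, seed=0):
    """EM sketch for a joint mixture (assumed form, not the thesis's model).

    X: (n, d) continuous features, modeled by diagonal Gaussians.
    B: (n, m) binary features in {0, 1}, modeled by independent Bernoullis.
    Returns hard labels of shape (n,) and responsibilities of shape (n, k).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Start from a random hard assignment of items to components.
    resp = np.eye(k)[rng.integers(k, size=n)]
    for _ in range(n_iter):
        # M-step: re-estimate weights and per-component parameters.
        nk = resp.sum(axis=0) + 1e-9
        pi = nk / nk.sum()
        mu = (resp.T @ X) / nk[:, None]
        var = np.maximum((resp.T @ X**2) / nk[:, None] - mu**2, 1e-6)
        theta = np.clip((resp.T @ B) / nk[:, None], 1e-6, 1 - 1e-6)
        # E-step: the joint log-likelihood is the sum over both data sources.
        diff = X[:, None, :] - mu  # shape (n, k, d)
        log_gauss = -0.5 * (diff**2 / var + np.log(2 * np.pi * var)).sum(axis=2)
        log_bern = B @ np.log(theta).T + (1 - B) @ np.log1p(-theta).T
        log_post = np.log(pi) + log_gauss + log_bern
        # Normalize in log space for numerical stability.
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), resp
```

Because both sources contribute additively to the log-posterior, a weak signal in one source can be reinforced by the other, which is the general motivation for fusing data sources in a single mixture model.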
In the 'Application' chapter, one of the developed methods is applied to detect noisy attractors in delayed stochastic models of GRNs. The detection of noisy attractors is carried out for a model of a genetic toggle switch (TS) and for a model of an excitable genetic circuit of Bacillus subtilis responsible for phenotypic changes, by fusing multiple data sources extracted from the dynamics of the corresponding GRN. The results suggest that relying on a single data source is, in general, insufficient to reveal the underlying structure of the GRN or to capture the changes in the dynamics of a GRN modeled according to the delayed stochastic framework.
In summary, this thesis focuses on developing and applying data fusion methods to explore the topology and dynamics of a GRN, including TFBS prediction, gene clustering and noisy attractor detection. The developed algorithms and strategies can be applied to investigate real biological phenomena, and the findings can be used to guide future wet- or dry-lab experiments.
Collections
- Doctoral dissertations [4850]