Homology modeling and docking study of Danio rerio Carbonic Anhydrase VI - Pentraxin protein and bioinformatics analysis of extra-cellular CAs
Manandhar, Prajwol (2015)
Master's Degree Programme in Bioinformatics
BioMediTech - BioMediTech
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2015-11-24
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:uta-201512022478
Abstract
Background and Aims
Computational prediction and protein structure modeling have helped to address a wide range of biological problems. These methods let researchers gain insight into biological questions more efficiently and help them design experimental work. Carbonic anhydrase (CA) is a ubiquitous enzyme found in all living organisms; its main function is to catalyze the reversible interconversion of carbon dioxide and bicarbonate. Higher vertebrates have at least 16 different CA isozymes, which are categorized mainly by sub-cellular localization, broadly as extracellular or intracellular. Recently, a sub-population of the transmembrane isoform CA IX, which is an extracellular CA, has been reported to also exist in the nucleus, i.e. in the intracellular environment. Likewise, CA VI, another extracellular isoform, has been found to carry an additional novel domain related to Pentraxins in non-mammalian vertebrates. The main goal of this research was to computationally predict nuclear-cytoplasmic signals in the sequences of all three transmembrane CAs: CA IX, CA XII and CA XIV. A second goal was to model the complete structure of the complex of the CA VI and Pentraxin domains of the zebrafish Danio rerio. In addition, some preliminary sequence analyses of the extracellular CAs and Pentraxin proteins were carried out.
Methods
For the first goal, orthologous sequences of all transmembrane CAs, CA VI and the Pentraxin proteins CRP and SAP were retrieved from the Ensembl database and analyzed with bioinformatics tools to identify key features. For the transmembrane CAs, nuclear localization signals (NLS) were predicted with the NucPred web server and nuclear export signals (NES) with the NetNES web server. For the other sequence analyses, sub-cellular localization was predicted with the TargetP web server and transmembrane helices with the TMHMM web server. For the second goal, the structures of the CA domain and the Pentraxin domain of zebrafish were first modeled by homology modeling, using template structures selected from the PDB database. The homology modeling was done with the MODELLER interface of the Chimera visualization software. The two resulting comparative models were then docked together computationally using the HADDOCK docking suite available as a web server.
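Homology modeling of the kind described above starts from a target-template alignment, and sequence identity over that alignment is a common first check on how suitable a template is. The abstract does not state which criteria were used to choose the templates, so the following is only a minimal sketch of that general check. The function name and the toy sequences are hypothetical and are not taken from the thesis.

```python
def percent_identity(aligned_target: str, aligned_template: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: the two strings are rows of an existing pairwise alignment,
    so they have equal length and use `gap` for gap positions.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments, invented for illustration only
# (not real CA VI or SAP sequences).
target = "MKT-LLVAGH"
template = "MKSALLV-GH"
print(f"{percent_identity(target, template):.1f}% identity over ungapped columns")
```

Identity definitions differ in their denominator (ungapped columns, alignment length, or the shorter sequence). This sketch uses ungapped columns, which is one convention among several.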
Results
Almost all analyzed transmembrane CA sequences were predicted to have an N-terminal signal peptide; the few exceptions were sequences whose N-terminal regions were missing from the sequence reads. The NetNES web server predicted NES sequence motifs mostly at the start of the transmembrane helical domain of the transmembrane CAs. The NucPred web server predicted NLS sequence motifs in the cytoplasmic domains of the transmembrane CAs, right at the boundary where the transmembrane domain ends and the cytoplasmic domain starts. Most of the analyzed transmembrane CA sequences were predicted to have these nuclear-cytoplasmic signal motifs, with just a few exceptions. Sequence analyses of the transmembrane CAs also revealed dimerization signal motifs in the transmembrane regions of CA XII and CA XIV that could drive dimerization in the tertiary structure of the proteins. Moreover, two extra cysteine residues are conserved in the Pentraxin domain of non-mammalian CA VI that are not present in either of the classical Pentraxins, CRP and SAP. The comparative model of the zebrafish CA VI domain was generated using the human CA VI structure as the template, and its RMSD with reference to the template structure was calculated to be 0.254 Å. Similarly, the comparative model of the zebrafish Pentraxin domain was generated using the human SAP structure as the template, and its RMSD with reference to the template structure was calculated to be 0.288 Å. These comparative models of the two domains were then docked computationally using the HADDOCK web server, which produced a docked complex, i.e. a complete model of zebrafish CA VI with its Pentraxin domain, with a HADDOCK score of -115.9 +/- 5.2 and a Z-score of -2.5.
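The two kinds of summary statistic reported above, an RMSD against the template and a Z-score for a docking cluster, can be illustrated with a short sketch. It makes several assumptions that the abstract does not confirm: the coordinates are taken to be already superposed and paired atom by atom (the superposition step itself is not shown), the Z-score is taken to be the number of standard deviations a cluster's score lies from the mean over all clusters, and the population standard deviation is used. The function names and numbers are hypothetical and are not the thesis data.

```python
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates.

    Assumption: both lists hold (x, y, z) tuples for the same atoms in the
    same order, and the structures have already been superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def cluster_z_scores(scores):
    """Z-score of each cluster score relative to all cluster scores.

    Assumption: population standard deviation. With docking scores where
    lower is better, a more negative Z-score marks a better cluster.
    """
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        return [0.0 for _ in scores]
    return [(s - mean) / sd for s in scores]


# Invented example values, for illustration only.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(f"RMSD: {rmsd(model, template):.3f} Å")

hypothetical_cluster_scores = [-100.0, -80.0, -60.0]
print([round(z, 2) for z in cluster_z_scores(hypothetical_cluster_scores)])
```

A low RMSD between a comparative model and its own template mainly shows that the model stayed close to the template backbone, so it is best read as a consistency check and not as independent validation.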
Conclusion
The transmembrane CAs are predicted to have NLS and NES sequence motifs in their transmembrane and cytoplasmic domains, which are distinct to this group of CA isozymes. These motifs could reflect a secondary role in the nucleus, in addition to the normal CA role in the extracellular region. Computational modeling and docking can also be very useful for generating models of biomolecular complexes whose structures would otherwise be difficult to determine experimentally. A good-quality model of zebrafish CA VI with its Pentraxin domain was generated through computational modeling and docking, and it could help researchers interpret the structure and function of this protein.