Ph.D. Defense of Dissertation: Ivan Antonov

Ph.D. Defense of Dissertation Announcement
Title: Detection of frameshifts and improving genome annotation

Ivan Antonov
School of Computational Science and Engineering
College of Computing
Georgia Institute of Technology

Date: Monday, November 5, 2012
Time: 4:30PM
Location: IBB 1128 (Parker H. Petit Institute for Bioengineering & Bioscience building)

Committee:

  • Prof. Mark Borodovsky (Advisor & Committee Chair, Department of Biomedical Engineering & School of Computational Science and Engineering)
  • Prof. Brian Hammer (School of Biology)
  • Prof. King I. Jordan (School of Biology, Center for Bioinformatics and Computational Genomics)
  • Prof. Kostas T. Konstantinidis (School of Civil and Environmental Engineering, Center for Bioinformatics and Computational Genomics)
  • Prof. Le Song (School of Computational Science and Engineering)
  • Prof. Pavel Baranov (Biochemistry Department, University College Cork, Ireland)


Abstract:
Analysis of intronless gene sequences available in public databases has revealed that some protein-coding regions contain frameshifts, i.e., transitions from one reading frame to another. A frameshift may be related to a sequencing error, an indel mutation, or a recoding event (an effective change of the genetic code unit (codon) that is conserved in evolution and involved in regulation of gene expression). Annotators of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Developing tools for identifying genes with frameshifts, and methods for characterizing their true nature, will help improve genome annotations in databases.
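
As a rough illustration of the frame transition described above, the following Python sketch splits a short sequence into codons before and after a single-base deletion. The sequence, the helper name `codons`, and the deletion position are all made up for illustration and are not taken from the dissertation; this is not how GeneTack works, only a picture of what a frameshift does to the codon reading.

```python
def codons(seq, frame=0):
    """Split seq into complete triplets, starting at offset `frame` (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

# Hypothetical 18-nt intronless coding sequence (not real data).
original = "ATGGCCAAAGGGTTTTAA"

# Remove one base (index 5) to mimic a deletion or a sequencing error.
mutated = original[:5] + original[6:]

print(codons(original))          # codons as read in the original frame
print(codons(mutated))           # same frame after the deletion
print(codons(mutated, frame=2))  # a different frame of the mutated sequence
```

In this toy case, reading the mutated sequence in the original frame loses the terminal stop codon, while the downstream codons of the original sequence reappear in another frame. That switch between frames is the kind of signal a frameshift detector looks for.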

In this dissertation research, we present a new algorithm and software program, GeneTack, for ab initio frameshift detection in intronless protein-coding nucleotide sequences. We demonstrated that the frameshift prediction accuracy of GeneTack was noticeably higher than that of two earlier programs (FrameD and FSFind).
GeneTack was used to screen 1,106 complete prokaryotic genomes and 1,165,799 eukaryotic mRNAs. Genes with predicted frameshifts (fs-genes) were grouped into clusters based on sequence similarity, conservation of predicted frameshift position, and its direction. Notably, 4,302 prokaryotic fs-genes from 146 clusters were predicted to be programmed frameshift candidates.

Wet lab experiments were performed to verify predicted programmed frameshifts in 20 of the 146 clusters; programmed frameshifting with efficiency higher than 10% was experimentally observed for genes in four clusters, a significant advance in understanding gene structure and expression regulation. The clusters of eukaryotic genes included genes with known programmed frameshifts as well as candidate dual-coding genes.

All the tools and the database of fs-genes are available at the GeneTack web site http://topaz.gatech.edu/GeneTack/

Status

  • Workflow Status: Published
  • Created By: Jupiter
  • Created: 10/30/2012
  • Modified By: Fletcher Moore
  • Modified: 10/07/2016
