Ph.D. Defense of Dissertation: Ivan Antonov

Ph.D. Defense of Dissertation Announcement

Title: Detection of frameshifts and improving genome annotation

Ivan Antonov
School of Computational Science and Engineering

College of Computing, Georgia Institute of Technology

Date: Thursday, October 4, 2012
Time: 3:00PM
Location: TBA

Committee:

  • Prof. Mark Borodovsky (Advisor & Committee Chair, School of Computational Science and Engineering)
  • Prof. Brian Hammer (School of Biology)
  • Prof. King I. Jordan (School of Biology)
  • Prof. Kostas T. Konstantinidis (School of Civil and Environmental Engineering)
  • Prof. Le Song (School of Computational Science and Engineering)
  • Prof. Pavel Baranov (Biochemistry Department, University College Cork, Ireland)


Abstract:
Analysis of intronless gene sequences available in public databases revealed that some protein-coding regions contain frameshifts, i.e., sudden transitions from one reading frame to another. A frameshift in a protein-coding gene can be due to a sequencing error, an indel mutation, or a recoding event (programmed frameshift). Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Identifying genes with frameshifts and determining their true nature will improve current genome annotation.
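As a toy illustration of what a frame transition looks like, the sketch below deletes one nucleotide from a short made-up coding sequence and compares the resulting codons. The sequence and the helper names `codons` and `delete_base` are hypothetical and chosen only for this example; this is not how GeneTack detects frameshifts.

```python
def codons(seq, frame=0):
    """Split seq into complete codons, starting at offset frame (0, 1, or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def delete_base(seq, pos):
    """Return seq with the nucleotide at 0-based index pos removed."""
    return seq[:pos] + seq[pos + 1:]


# Hypothetical toy sequence: start codon, four sense codons, stop codon.
original = "ATGGCTGAAACCGGTTAA"

# Single-base deletion inside the third codon (an indel mutation).
mutated = delete_base(original, 7)

print(codons(original))           # codons read in the original frame
print(codons(mutated))            # codons read after the deletion
print(codons(original, frame=1))  # the original sequence read in frame +1
```

From the fourth codon onward, the codons of the mutated sequence coincide with those of the original sequence read in frame +1, and the original stop codon is no longer read in frame.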

In this dissertation research, we present a new program called GeneTack for ab initio frameshift detection in intronless protein-coding nucleotide sequences. We observed that the frameshift prediction accuracy of GeneTack was higher by a significant margin than that of two previously developed programs (FrameD and FSFind). GeneTack was used to screen 1,106 complete prokaryotic genomes and 1,165,799 eukaryotic mRNAs. Genes with predicted frameshifts (fs-genes) were grouped into clusters based on sequence similarity and on conservation of the predicted frameshift position and direction. In total, 5,632 prokaryotic fs-genes from 239 clusters were predicted to be programmed frameshift candidates. Experiments were performed on sequences derived from 20 of the 239 clusters; programmed ribosomal frameshifting with efficiency higher than 10% was observed for four clusters. Eukaryotic clusters included known programmed frameshift genes and several candidate dual-coding genes.

All the tools and the database of fs-genes are available at the GeneTack web site: http://topaz.gatech.edu/GeneTack/

Status

  • Workflow Status: Published
  • Created By: Jupiter
  • Created: 09/24/2012
  • Modified By: Fletcher Moore
  • Modified: 10/07/2016
