Software System for Microbial Genome Sequence Annotation

The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, fully integrated, and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers who need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis.

A standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. AGeS was designed to support three main capabilities. The first is the storage of input contig sequences in FASTA format, together with the resulting annotation data, in a central, customized database, where the data manipulation and visualization steps are performed through easy-to-use graphical user interfaces (GUIs). The second is the annotation of microbial genomes using an integrated software pipeline, which analyzes sequence contigs and locates genomic regions that code for proteins, RNAs, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then annotated using an in-house-developed, high-throughput pipeline, the Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using the open-source genome browser GBrowse. Full genome and protein annotation, storage, and visualization have been implemented for bacterial genomes.
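As an illustration of the input format only, the following minimal Python sketch reads contig sequences in FASTA format into memory before they would be stored or passed to an annotation pipeline. It is not taken from AGeS; the function name `parse_fasta` and the example contig names are hypothetical, and the sketch assumes that the contig identifier is the first whitespace-delimited token of each header line.

```python
def parse_fasta(lines):
    """Return a dict mapping contig ID -> sequence from FASTA-formatted lines.

    Hypothetical helper for illustration; not part of AGeS.
    """
    contigs = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: assume the ID is the first whitespace-delimited token.
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without an identifier")
            current_id = fields[0]
            if current_id in contigs:
                raise ValueError(f"duplicate contig ID: {current_id}")
            contigs[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data before first FASTA header")
        else:
            # Sequence line: a sequence may span several lines, so collect pieces.
            contigs[current_id].append(line.upper())
    return {cid: "".join(parts) for cid, parts in contigs.items()}


# Made-up example input with two short contigs.
example = """>contig_1 length=12
ACGTAC
gtacgt
>contig_2
TTGACA
"""

records = parse_fasta(example.splitlines())
for contig_id, sequence in records.items():
    print(contig_id, len(sequence))
```

Keeping each contig under its identifier mirrors the idea of storing input sequences and their annotations together, although the actual AGeS database schema is not described here.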

The AGeS system was designed and implemented to provide a standalone, integrated solution that users can install on their computers. AGeS can be installed on either a standalone Linux computer or a Linux cluster. When run on a multicore Linux computer or a Linux cluster, AGeS supports OpenMPI for parallel execution and PBS for batch submission. The AGeS system has been designed for easy integration with future sequence analysis modules. Its Web applications use technologies based on open standards, including Java, JavaScript, and XML.
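To make the combination of batch submission and parallel execution concrete, the sketch below generates a generic PBS job script that launches an MPI program with `mpirun`. It is an assumption-laden illustration rather than the AGeS submission mechanism: the function `build_pbs_script`, the job name, and the command `annotate_contigs input.fasta` are hypothetical placeholders, and the `nodes=...:ppn=...` resource syntax follows the TORQUE-style convention, which may differ on other PBS variants.

```python
def build_pbs_script(job_name, nodes, cores_per_node, walltime, command):
    """Return the text of a PBS batch script that runs `command` under mpirun.

    Hypothetical helper for illustration; not part of AGeS.
    """
    total_processes = nodes * cores_per_node
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                          # job name
        f"#PBS -l nodes={nodes}:ppn={cores_per_node}",  # TORQUE-style resource request
        f"#PBS -l walltime={walltime}",                 # wall-clock limit (HH:MM:SS)
        "cd $PBS_O_WORKDIR",                            # directory the job was submitted from
        f"mpirun -np {total_processes} {command}",      # one MPI process per requested core
    ]
    return "\n".join(lines) + "\n"


# "annotate_contigs input.fasta" is a placeholder command, not a real AGeS program.
script = build_pbs_script(
    job_name="ages_annotation",
    nodes=2,
    cores_per_node=8,
    walltime="04:00:00",
    command="annotate_contigs input.fasta",
)
print(script)
```

A script produced this way would typically be written to a file and submitted with `qsub`; on a single multicore computer, the `mpirun` line alone would be enough.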

This work was done by Kamal Kumar, Valmik Desai, Li Cheng, Maxim Khitrov, Deepak Grover, Ravi Vijaya Satya, Chenggang Yu, Nela Zavaljevski, and Jaques Reifman of the Army Medical Research and Materiel Command. ARL-0129



This Brief includes a Technical Support Package (TSP).
AGeS: A Software System for Microbial Genome Sequence Annotation (reference ARL-0129) is currently available for download from the TSP library.
