Genomes of a given bacterial species can show great variation in gene content, and thus systematic analysis of the entire gene repertoire, termed the pan-genome, is important for understanding bacterial intra-species diversity, population genetics, and evolution. Here, we analyzed the pan-genome from 30 completely sequenced strains of the human gastric pathogen Helicobacter pylori belonging to various phylogeographic groups, focusing on 991 accessory (not fully conserved) orthologous groups (OGs). We developed a method to evaluate the mobility of genes within a genome, using the gene order in the syntenically conserved regions as a reference, and classified the 991 accessory OGs into five classes: Core, Stable, Intermediate, Mobile, and Unique. Phylogenetic networks based on the gene content of the Core and Stable classes are highly congruent with that created from the concatenated alignment of fully conserved core genes, in contrast to those of the Intermediate and Mobile classes, which show quite different topologies. By clustering the accessory OGs on the basis of phylogenetic pattern similarity and chromosomal proximity, we identified 60 co-occurring gene clusters (CGCs). In addition to known genomic islands, including the cag pathogenicity island, bacteriophages, and integrating conjugative elements, we identified some novel ones. One island encodes a TerY-phosphorylation triad, which includes the eukaryote-type protein kinase/phosphatase gene pair, and components of a type VII secretion system. Another contains a reverse-transcriptase homolog, which may be involved in defense against phage infection through altruistic suicide. Many of the CGCs contained restriction-modification (RM) genes. Different RM systems sometimes occupied the same (orthologous) locus in the strains. We anticipate that our method will facilitate pan-genome studies in general and help identify novel genomic islands in various bacterial species.
Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan; Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan; Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan; Institute of Medical Sciences, the University of Tokyo, Minato-ku, Tokyo, Japan; Genetic Information Research Institute, Los Altos, California, United States of America; Biostatistics Center, Kurume University, Kurume, Fukuoka, Japan; Tohoku University, Graduate School of Life Sciences, Sendai, Japan; Kyorin University, Faculty of Medicine, Mitaka, Japan
Recommended Citation:
Ikuo Uchiyama, Jacob Albritton, Masaki Fukuyo, et al. A Novel Approach to Helicobacter pylori Pan-Genome Analysis for Identification of Genomic Islands[J]. PLOS ONE, 2016-01-01, 11(8)