2015; 25(6):
|Affiliation||Faculty of Biotechnology, College of Applied Life Sciences, Jeju National University, Jeju 690-756, Republic of Korea|
|Title||Bioinformatic Suggestions on MiSeq-Based Microbial Community Analysis|
J. Microbiol. Biotechnol. 2015
|Abstract||Recent advances in sequencing technology have revolutionized microbial ecology.
MiSeq-based microbial community analysis allows more than a few hundred samples to be
sequenced at a time, which is far more cost-effective than pyrosequencing. The approach,
however, has not been widely adopted, owing to the computational difficulty of processing
huge amounts of data and to known Illumina-derived artefacts in amplicon sequencing.
We discuss the choice of assembly software for exploiting paired-end sequencing and
methods for removing Illumina artefact sequences. The protocol we suggest not only
removed erroneous reads but also dramatically reduced the computational workload,
allowing even a typical desktop computer to process the large amounts of sequence data
generated by Illumina sequencers. We also developed a Web interface
(http://biotech.jejunu.ac.kr/~abl/16s/) that allows users to merge fastq files and create
mothur batch files. This study should provide technical advantages and support for
applying MiSeq-based microbial community analysis.|
|Keywords||MiSeq, mothur, Web interface, 16S rRNA gene, Microbial community, V4|