-----------------------------------------------------------------------
BIOINFORMATICS COLLOQUIUM
College of Science
George Mason University
-----------------------------------------------------------------------

Comprehensive Full Genomic Sequencing of 2009 Novel H1N1 Viruses by High Throughput "Next-Gen" Sequencing

Huo-Shu Houng
USA MEDCOM WRAIR

Background of 2009 Novel H1N1 Pandemic Outbreaks:

The genomes of the last three pandemic influenza viruses (1918 H1N1, 1957 H2N2 and 1968 H3N2) all originated in whole or in part from non-human reservoirs, and the HA genes of all of the pandemic viruses ultimately originated from avian influenza viruses. Novel 2009 influenza A (H1N1) is a new flu virus of swine origin that was first detected in Mexico and the United States in March and April 2009. After its initial identification and the announcement of the unusual outbreaks, the 2009 swine H1N1 virus spread quickly from Mexico into the neighboring United States, mostly via spring-break tourists. Following the cases reported in Mexico and the US, confirmed outbreaks of 2009 swine H1N1 rapidly spread beyond North America to Europe, Asia, Africa and South America, possibly through the efficient modern travel system. WHO then declared the novel 2009 H1N1 infections a worldwide influenza pandemic on June 11, 2009. The novel H1N1 flu spreads mainly in the same way as regular "seasonal influenza": through the air from coughs and sneezes, or by touching contaminated surfaces. New cases in the U.S. and most cases throughout the world appear so far to have been mild relative to the initially reported cases in Mexico. But because this is a new virus, most people have no immunity to it, and illness may eventually become more severe and widespread across different demographic and population groups as a result.

Along with the spread of viral infections, the number of 2009 swine H1N1 sequences deposited in NCBI's GenBank database also grew rapidly from early April through July 2009. Laboratories worldwide generated most of these sequences using the Sanger dideoxy-terminator method. Up-to-date sequence comparisons make clear that not all deposited 2009 swine H1N1 sequences are identical. However, it is uncertain whether the differences among these supposedly identical or similar causative agents reflect clinical differences, i.e., severe versus mild infections, or whether they arose from the different sequencing schemes, with various RT-PCR amplification primers and protocols, employed by laboratories around the world.

Abstract:

Since its initial introduction in 2005, the Roche 454 FLX sequencing platform has been used for ultra-deep sequencing projects on various microorganisms. Massively parallel picoliter-scale amplification and pyrosequencing of individual DNA molecules (Margulies et al. 2005) allow scientists to investigate the heterogeneous populations of the microbial world that play an important role in determining disease outcome and drug resistance. Here, we systematically investigate the potential of ultra-deep pyrosequencing to determine and assemble full genome sequences of 2009 novel H1N1 viruses from worldwide geographic origins. A robust RT-PCR protocol was established to efficiently amplify all eight RNA segments of 2009 novel H1N1 into sufficient cDNA quantities, i.e., greater than 5 µg, to be processed and sequenced on the Roche 454 FLX system using the MID bar-coding system. Massive sequence data, i.e., >1,000,000 reads with a mean length of >200 base pairs, derived from de novo sequencing of the individual cDNA fragments, were readily obtainable from each Roche 454 FLX sequencing run containing up to 24 bar-coded full genomic influenza A cDNAs of different origins. In addition to the consensus sequences routinely obtained by traditional Sanger sequencing, rare genetic variants, i.e., 1-2% of the total viral population, which might play important roles in determining or predicting viral virulence or antiviral drug resistance, could also be detected and confirmed by pyrosequencing. Our readiness to handle the next wave of 2009 H1N1 outbreaks could be greatly enhanced by using Roche 454 as a feasible platform to sequence and analyze large numbers of 2009 novel H1N1 genomes for the imminent large-scale 2009 winter influenza season in the Northern Hemisphere.
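
As an illustration only, the idea of finding rare variants at the 1-2% level from deep read coverage can be sketched in a few lines of Python. This is not the speaker's pipeline: the function name `minor_variants`, the input format (reads already aligned to equal length, with `-` marking gaps or uncovered positions) and the `min_freq` and `min_depth` thresholds are all assumptions made for the example. A real pyrosequencing analysis would also need read alignment and filtering of sequencing errors, which are omitted here.

```python
from collections import Counter


def minor_variants(aligned_reads, min_freq=0.01, min_depth=100):
    """Report non-consensus bases seen at a frequency of at least min_freq.

    aligned_reads: equal-length strings over A/C/G/T, with '-' for gaps or
    uncovered positions (a simplified, hypothetical input format).
    Returns a list of (position, consensus_base, variant_base, frequency),
    with 0-based positions.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    results = []
    for pos in range(length):
        # Count the bases covering this position, ignoring gaps.
        counts = Counter(read[pos] for read in aligned_reads if read[pos] != "-")
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to estimate a 1-2% frequency
        consensus = counts.most_common(1)[0][0]
        for base, n in sorted(counts.items()):
            freq = n / depth
            if base != consensus and freq >= min_freq:
                results.append((pos, consensus, base, freq))
    return results


# Toy data: 1,000 reads, 2% of which carry an A instead of a G at position 2.
reads = ["ACGT"] * 980 + ["ACAT"] * 20
print(minor_variants(reads))
```

A Sanger-style consensus of these toy reads would simply be ACGT. The per-position counts are what expose the 2% subpopulation, which is why read depth matters for this kind of analysis.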