Bugs in the Arctic: how to get and process vast amounts of genomic data

Presenter: James Foster, Professor of Biological Sciences, and Bioinformatics & Computational Biology, University of Idaho

Abstract: The next computational challenge for bioinformatics of genomes is the vast amount of data available from so-called "next generation sequencers". These devices can generate over 54TB of data a year, and the data must usually be analyzed within a few weeks of acquisition. This requires high-end computing. The Initiative for Bioinformatics and Evolutionary STudies (IBEST) at the University of Idaho has recently installed a Roche 454/FLX high-throughput sequencer. To process the data from this device, we have more than tripled our computing capacity to over 750 cores in multiple clusters, with over 30TB of available storage. We plan to more than quadruple this capacity in the next two years. In this talk I will describe the 454 technology and present a case study of a research project to characterize the bacterial diversity exposed as a glacier retreats on the island of Spitsbergen, far above the Arctic Circle.