It may be a while before it shows up on your desktop, but the Pacific Northwest National Laboratory has just made a massive leap in the ability to sequence genomes.
ScalaBLAST is a software tool for use with multiprocessor systems, dividing up the work of analyzing biological information. With the PNNL's supercomputer -- number 30 in the latest Top 500 supercomputer list -- the software allows genome sequencing to happen hundreds of times faster than before. Prior to ScalaBLAST, sequencing the DNA of a single organism took 10 days, a remarkably short time compared to the months and years such a process took less than a decade earlier. With ScalaBLAST, the same machine can analyze 13 organisms in 9 hours, or about 42 minutes per organism.
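The arithmetic behind those figures is easy to check. The snippet below is only a back-of-the-envelope sketch using the numbers quoted above (10 days per organism before, 13 organisms in 9 hours after); the function and constant names are made up for illustration and have nothing to do with ScalaBLAST's own code.

```python
# Figures quoted in the post; everything else here is illustrative.
BASELINE_DAYS_PER_ORGANISM = 10   # time per organism before ScalaBLAST
ORGANISMS = 13                    # organisms analyzed in one batch
BATCH_HOURS = 9                   # wall-clock time for that batch


def minutes_per_organism(organisms: int, hours: float) -> float:
    """Average wall-clock minutes spent per organism in a batch."""
    return hours * 60 / organisms


def speedup(baseline_days: float, new_minutes: float) -> float:
    """How many times faster the new per-organism time is than the baseline."""
    baseline_minutes = baseline_days * 24 * 60
    return baseline_minutes / new_minutes


per_org = minutes_per_organism(ORGANISMS, BATCH_HOURS)
factor = speedup(BASELINE_DAYS_PER_ORGANISM, per_org)

print(f"{per_org:.1f} minutes per organism")
print(f"roughly {factor:.0f}x faster than the 10-day baseline")
```

That works out to a bit under 42 minutes per organism and a speedup of roughly 350 times, which squares with "hundreds of times faster."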
The PNNL scientists are enthusiastic about the opportunities this could provide:
"Access to and understanding the pieces of genome sequences will allow researchers to understand the body's cellular machinery and discover clues to some types of cancer. And it will help in developing drugs or detection methods to be used for particular diseases," said T.P. Straatsma, a PNNL senior research scientist. "And it likely will help in other areas of human health. It's fair to say that, in the realm of human health and disease, if you can solve a problem in one area, you can often solve it in others – that's the nature of human biology," Straatsma said.
Having the ability to process large data sets with this computational tool can also provide new insight into how microorganisms break down toxic pollutants through processes like bioremediation. It can also help researchers understand the components of biological systems, leading to better detection methods for homeland security purposes and making it possible to more quickly identify and respond to threats or develop biological countermeasures.
But the critical element to keep in mind is the rate at which supercomputing is improving, and the reduction in cost of extremely fast systems. The PNNL's supercomputer is #30 now, but last November, it was #16, and it was the fifth fastest supercomputer in the world in November of 2003. By next year, it probably won't be in the top 50. And while we're not likely to see 2,000-processor desktop systems any time soon, we could well see distributed processing arrangements offering equivalent power at a fraction of the cost.
Sequencing an organism in about 42 minutes brings individual genome sequencing within reach. We talked a bit about what that means last August, including the Harvard Personal Genome Project. It also makes sequencing the planet and "frozen arks" far simpler -- making it possible to preserve, at least in code, some of the planet's vanishing biodiversity.









