Chemistry and the life sciences tend to push the envelope of research, demanding data-intensive operations, simulations and imaging capabilities that can put a major strain on supercomputing systems. Cray is bridging the gap between supercomputing performance and the needs of the life sciences, giving researchers the tools to handle sophisticated project requirements.
The 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and 12th Annual European Conference on Computational Biology (ECCB) will be held in Berlin, Germany, from July 21 – 23, 2013. ISMB/ECCB is one of the most important bioinformatics conferences worldwide. Organizers expect more than 1,500 participants to attend this year’s event. ISMB/ECCB provides an opportunity for attendees to learn about the latest developments in the field of computational biology. Multiple parallel sessions offer experimentalists, participants from industry and mainstream ISMB attendees an opportunity to learn from each other’s fields. Researchers working in new areas such as next-generation sequencing (NGS) will have an opportunity to explore how unique workflow challenges in the field depend on combining traditional experimentation methods with computational simulations.
Improving DNA sequencing technology
Recent progress in DNA sequencing technology has yielded a new class of devices that allow for the analysis of genetic material with unprecedented speed and efficiency. Next-generation sequencing increasingly shifts the burden from chemistry done in a laboratory to a string manipulation problem, well suited to high-performance computing (HPC). The outputs of these machines are short strings of letters from the DNA, where each letter corresponds to a nucleotide. Each string is called a “read.” The aggregate of all the reads contains a person’s genome. The challenge is to assemble all these small reads into a larger sequence of characters that can be used for interpretation. Accomplishing this depends on software that can leverage high-performance computing architectures.
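To make the "string manipulation" framing concrete, here is a minimal Python sketch of the most basic operation involved: finding where the end of one read matches the start of another and joining the two. The function names, the example reads and the `min_len` threshold are illustrative assumptions, not part of any particular assembler.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    # Try the longest possible overlap first and shrink until one matches.
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a, b, min_len=3):
    """Join two reads on their suffix/prefix overlap, or return None if there is none."""
    n = overlap(a, b, min_len)
    return a + b[n:] if n else None


# Two made-up reads that share the four letters "TGGC".
joined = merge("ATGGC", "TGGCG")
print(joined)
```

Real data sets contain many millions of reads, plus sequencing errors and repeated regions, which is why the problem calls for specialized software and HPC resources rather than a loop like this one.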
In general, assembling short reads into a useful form is done either by piecing the individual reads together without a reference (de novo) or by aligning them against a reference sequence (mapping). However, the success of the new technology in generating data faster has come at a price. Sequencers produce reads that are too short (< 150 bp) for overlap-layout-consensus assemblers. Instead, de Bruijn graph-based assemblers have proven to be successful at assembling short reads.
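The de Bruijn idea can be illustrated with a short Python sketch: every read is cut into overlapping substrings of length k (k-mers), each k-mer becomes an edge between its first and last k-1 letters, and a sequence is read off by following the edges. This is a toy version under simplifying assumptions (error-free reads, no repeats, a single unbranched path); the function names and example reads are hypothetical, and production assemblers are far more involved.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # The k-mer links its prefix node to its suffix node.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unbranched_path(graph):
    """Spell out the sequence along a single unbranched path (toy case only)."""
    targets = {t for successors in graph.values() for t in successors}
    starts = [node for node in graph if node not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one start node in this toy example")
    node = starts[0]
    contig = node
    visited = {node}
    # Follow edges while the path is unambiguous and has not looped back.
    while len(graph.get(node, ())) == 1:
        (node,) = graph[node]
        if node in visited:
            break
        visited.add(node)
        contig += node[-1]
    return contig


# Three made-up, error-free reads drawn from the sequence "ATGGCGT".
reads = ["ATGGC", "TGGCG", "GGCGT"]
graph = de_bruijn_graph(reads, k=4)
contig = walk_unbranched_path(graph)
print(contig)
```

Because the graph is built from k-mers rather than from comparisons between whole reads, the work per read stays small, which is one reason this approach suits very large numbers of short reads.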
Cray Presenting at ISMB
Mark your calendars for a presentation by Bill Long and me on the use of high-performance computers and de novo assemblers for next-generation sequencing applications. (Session TT25 – Genomic Applications: De Novo Assemblers Parallelization and Code Optimization, Monday, July 22 at 2:40 p.m.)
Carlos P. Sosa, Principal Engineer