At last week’s Bio-IT Conference & Expo, Cray’s Urika-XA™ extreme analytics platform was selected as Best of Show in the IT Infrastructure and Hardware category. For Cray, this recognition validates the work we’ve undertaken in recent years to apply our experience and expertise in supercomputing and analytics to the unique challenges of the life sciences industry. I’d like to highlight the three reasons I believe we were selected.
First, IT organizations in the life sciences struggle to address two corporate imperatives at once: reducing the cost of computing while enabling the timely delivery of results from complex computing tasks. Together, these create an almost overwhelming challenge. For many companies, shifting computing to the cloud has become a standard answer. But what if the resulting “time to result” is unacceptable for emerging processes like next-generation sequencing for precision medicine? And what about IT groups that experience “sticker shock” when the monthly bills for their operational and research workloads far exceed what a few proof-of-concept workloads cost?
An alternative, as discussed by Kevin Leong, is the move toward converged high performance platforms that reduce the need for in-house expertise, particularly the skills required to “stand up” a complex big data analytics system, while providing the performance capacity to meet current and future genomics workloads. Eliminating time-consuming setup and administration, while accelerating “time to result” on a high performance platform, seemed to resonate well with the Bio-IT audience.
A second reason I believe we were selected relates to the results we demonstrated with innovative software providers. Dave Anstey recently wrote a blog post discussing our work with third parties that are moving NGS workflows to Hadoop® and Spark™ environments.
- Starting with the Halvade scalable sequence wrapper for MapReduce (promoted by Intel and others), we ran performance tests using the GATK (Genome Analysis Toolkit from The Broad Institute) workflow running the Burrows-Wheeler Aligner (BWA). We were able to show that moving from a single-node Hadoop cluster to a 50-node cluster reduces runtimes from five days to approximately three hours, and that a related move to our Urika-XA platform reduces total runtimes to just two hours.
- We also demonstrated the value of combining the Urika-XA system with Lumenogix Bioinformatics-in-a-Box™. We demonstrated the completion of a 50x whole genome sequence in less than 45 minutes, a process that took almost three times longer when run in a cloud environment.
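To put the Halvade runtimes in perspective, here is a rough back-of-the-envelope calculation of the implied speedups. It is only a sketch: the inputs are the rounded figures quoted above ("five days," "approximately three hours," "just two hours"), the variable and function names are illustrative, and the efficiency estimate assumes the single-node run is a fair baseline for each node in the 50-node cluster.

```python
# Rough speedup arithmetic from the rounded runtimes quoted in the text.
# All inputs are approximations, so the results are approximations too.
HOURS_PER_DAY = 24

single_node_hours = 5 * HOURS_PER_DAY  # "five days" on a single-node Hadoop cluster
cluster_hours = 3.0                    # "approximately three hours" on 50 nodes
urika_hours = 2.0                      # "just two hours" on the Urika-XA platform
cluster_nodes = 50


def speedup(baseline_hours: float, new_hours: float) -> float:
    """Return how many times faster the new runtime is than the baseline."""
    return baseline_hours / new_hours


cluster_speedup = speedup(single_node_hours, cluster_hours)
# Parallel efficiency: achieved speedup divided by the number of nodes used.
cluster_efficiency = cluster_speedup / cluster_nodes
urika_speedup = speedup(single_node_hours, urika_hours)
urika_vs_cluster = speedup(cluster_hours, urika_hours)

print(f"50-node cluster vs. single node: {cluster_speedup:.0f}x")
print(f"50-node parallel efficiency:     {cluster_efficiency:.0%}")
print(f"Urika-XA vs. single node:        {urika_speedup:.0f}x")
print(f"Urika-XA vs. 50-node cluster:    {urika_vs_cluster:.1f}x")
```

On these rounded numbers, the 50-node cluster is roughly 40 times faster than the single node (about 80 percent parallel efficiency), and the Urika-XA run is roughly 60 times faster than the single node, or about 1.5 times faster than the 50-node cluster.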
Finally, I believe that the reputation and track record Cray brings to HPC environments caught the judges’ attention. It’s one thing to be an integrator assembling commodity subsystems into a cluster, but when we say we have experience and expertise in designing and delivering supercomputing solutions, we can back that up with success story after success story. We were asked directly by the judges why our platform was innovative. Our response was simply this: It’s easy enough to throw together parts to make a system, but we integrated compute and storage subsystems well suited to the task at hand, incorporated the industry-leading Cloudera Hadoop big data and Apache Spark analytics packages, and wrapped the system with the Urika-XA management system to provide unified management for compute, storage and Hadoop. This is a system designed to address the problem of extreme analytics from proof of concept through production.
The expert panel that recognized Cray included judges from academia and industry who screened eligible new products and heard presentations on site. To quote Bio-IT World Editor Allison Proffitt, “The innovation on display by Bio-IT World exhibitors never disappoints.” We are pleased to be recognized for the innovation we were able to share with the attendees at Bio-IT as well.