Hail Powers Precision Medicine on the Urika-GX Platform

Hail is an open-source platform for analyzing variants in genomic data on top of Apache Spark™. It takes advantage of three key elements of Spark’s design: scalability, since datasets are multi-terabyte and growing rapidly; simpler APIs, which hide the complexity of distributed computing and parallel execution and let biologists explore data using familiar biological terms; and algorithms for large-scale linear algebra and ML, where performant code builds on both legacy linear algebra libraries and custom ones (for example, Cray compilers and tools). Hail provides a parallel Scala API as well as a parallel Python API, and adds powerful, expressive high-level layers that include: fast, easy data ingest in various data formats, especially if you’re ... [ Read More ]

SeisSpace and Parallel File Systems: What are You Using?

Here are a few questions for all you SeisSpace® doodlebuggers out there. What parallel file system are you using for your secondary storage requirements? Have you ever thought about using a different file system? Have you ever wondered whether you could combine your primary and secondary storage requirements on the same parallel file system, and what would happen if you did? Well, stop wondering. Over the last several months, Cray's oil & gas performance engineering team has been testing those questions, along with a few others that may interest you. Cray’s team, along with Dan Grygier, CTO of Taming Traces Consulting, used Cray’s CS400™ cluster supercomputer (which we qualified for SeisSpace last year) ... [ Read More ]

DataWarp™ I/O Accelerator Speeds Scientific Discovery at NERSC

Scientists and researchers worldwide rely on the scalability of Cray® XC™ supercomputing systems to solve their problems faster than our competition can. So when the productivity of a highly capable Cray system is throttled by the limited capabilities of the storage system, it has to be aggravating to have so much valuable compute power at their fingertips and then have to … wait … for … the … data … to … arrive. In a recent podcast led by Addison Snell, CEO of Intersect360 Research, Debbie Bard, a big data analyst at the National Energy Research Scientific Computing Center (NERSC), talks about the fifth-fastest computing system in the world*, the Cray® XC40™-based “Cori” system at NERSC. Six thousand scientists and ... [ Read More ]

2017 Spark Summit East Reveals Progress but Not Disruption

This was a conference to marvel at, and not just for the lineup. How many gigs start with a New England Patriots Super Bowl victory parade and end with a quasi-blizzard? Boston’s “ambiance” aside, engaged attendees sank their teeth into a feast of tech, much of it centered on real-time performance of analytics workflows, especially in the context of the latest buzz: machine and deep learning (ML and DL), along with AI. It’s hard to believe we’re in the third wave of AI! I started contributing during the second wave in the late 1980s, doing research with neural networks and heuristic search algorithms to auto-sort packages for shipping companies. Among the challenges in the latest wave, tech like Apache® Spark™ plays a significant role on ... [ Read More ]

Inventions at Cray: Solving the Hard Problems

In the U.S., Feb. 11 is National Inventors’ Day, timed to coincide with the birthday of Thomas Edison. That’s reason enough to celebrate the inventors among us. The phrase “computer vector register processing” may not sound very inspiring, but that’s what Seymour Cray’s patent for the supercomputer, issued in 1976, was called. Forty years later, his invention still inspires scientists and engineers to change the world. Cray continues to nurture the spirit of invention both internally and among its customers and partners. For his part, Seymour Cray obtained numerous patents throughout his career, but it was U.S. Patent No. 4,128,880 (“Computer vector register processing”) that got him inducted into the National Inventors Hall ... [ Read More ]