Big Data Analytics: How Much Will it Cost?

If you’re concerned about how long it will take before you see ROI from an investment in big data analytics technology, you’re not alone. According to research by Enterprise Strategy Group (ESG), 77 percent of individuals who lead their organizations’ big data and analytics strategies believe ROI can take six months to start showing up. Add to that the fact that data analytics technology is just one part of a much broader environment that cuts across numerous teams, and it’s no wonder that finding the best solution can be a daunting task. To help organizations sift through the challenges, ESG, an IT industry analysis and strategy firm, has published a white paper titled “Improving Analytics Economics with Cray; Comparisons to the ... [ Read More ]

2017 Spark Summit East Reveals Progress but Not Disruption

This was a conference to marvel at, and not just for the lineup. How many gigs start with a New England Patriots Super Bowl victory parade and end with a quasi-blizzard? Boston’s “ambiance” aside, engaged attendees sank their teeth into a feast of tech, much of it centered on real-time performance of analytics workflows, especially in the context of the latest buzz: machine and deep learning (ML and DL), along with AI. It’s hard to believe we’re in the third wave of AI! I started contributing during the second wave in the late 1980s, doing research with neural networks and heuristic search algorithms to auto-sort packages for shipping companies. Among the challenges in the latest wave, tech like Apache® Spark™ plays a significant role on ... [ Read More ]

Cray “Blue Waters” Supercomputer Tackles Gerrymandering

Yan Liu and Wendy K. Tam Cho

Redistricting — the process by which congressional and state legislative district boundaries are drawn — sounds like an unremarkable government chore. And, in theory, it should be. But, too often, it is subject to “gerrymandering,” or manipulation, by the majority political party. Decades ago, University of Illinois political science professor Wendy K. Tam Cho (pictured above) realized that what’s needed is a computational tool that would help the courts objectively measure the fairness of a legislative map. She developed a tool that could generate hundreds of millions of voter district maps that would serve as a “comparison set” — a way to measure the level of partisanship exhibited by any particular electoral map. But any further work ... [ Read More ]

“LEBM”: Cray Creating New Extension to LUBM Benchmark

I’ve written a few posts about the Cray Graph Engine (CGE), a robust, scalable graph database solution. CGE uses Resource Description Framework (RDF) triples to represent the data, SPARQL as the query language, and extensions to call upon a set of “classical” graph algorithms. CGE has two main advantages. One is that it scales a lot better than most other graph databases, because the other ones weren’t designed by supercomputer wizards. (Of course I would say that.) The other is that CGE performs unusually well on complex queries on large, complex graphs. Typical of a lot of complex graph queries: Where are all the places where the red pattern matches a pattern in ... [ Read More ]

50 Billion Triples: Digging a Bit Deeper

In the last two installments of this series (part 1 and part 2), we discussed some higher-level thoughts on striking a balance between easy ingest and more prep work, as well as some initial queries to get a sense of an unknown graph’s structure and other characteristics. The queries that we have run to this point were intended to discover structure within our dataset at a macro level. The algorithms that we will run now require us to consider the dimensionality of our graph. In other words, using algorithms such as centrality or community detection on the entire graph without context is meaningless; we need to run these algorithms on subsets of the data. Prior to delving into the following queries, a quick note: All of the algorithms ... [ Read More ]