Making Sense of 50 Billion Triples: Getting Started


In my last post I outlined some of the reasons why capitalizing on the promise of graph analytics takes thought and planning. Now I’d like to focus on getting started, appropriately, at the beginning: with understanding the data itself. Let’s discuss methods of gaining an initial understanding of our data so that we can feed that newfound understanding back into our ingestion process. First, we need to be able to query the database for some basic information that tells us not only what is there but how the relationships are expressed. Ideally, we would also find out whether there is a hierarchy to the relationships we find. By hierarchy, I mean that graphs can use containers, if you will, to organize ... [ Read More ]
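As a minimal sketch of the idea in this excerpt (not taken from the post itself), here is one way to ask a triple dataset "what relationships are expressed, and how often?" using plain Python tuples as stand-in triples. All subject and predicate names below are hypothetical examples.

```python
# Sketch: survey an RDF-style dataset by tallying which predicates
# (relationship types) appear. The ex: names are made-up examples.
from collections import Counter

triples = [
    ("ex:alice", "ex:knows",    "ex:bob"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob",   "ex:knows",    "ex:carol"),
    ("ex:bob",   "ex:worksFor", "ex:acme"),
]

# Count how each relationship is expressed -- the graph analog of
# asking a relational database "what columns do I have?"
predicate_counts = Counter(p for _, p, _ in triples)
for predicate, count in predicate_counts.most_common():
    print(predicate, count)
```

Against a real store, the same survey would typically be a SPARQL query grouping on the predicate position; the principle is identical.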

Big Data Advantage, part 3: “The Dude Abides.”


In my prior two posts about analytics, I highlighted the vast opportunity available in big data and the obstacles that prevent organizations from attaining tangible benefits:

- Complexity across fronts
- An onslaught of analytics tools
- The difficulty retaining the right skillsets
- Slowdowns in getting to insights and decisions

But these hurdles can be overcome. For innovative businesses grappling with the realities of big data, an agile analytics environment provides the best of all approaches. Such a platform enables you to seize your big data advantage with a potent combination of system agility and the pervasive speed needed to deliver high-frequency insights. To address this need, Cray has fused supercomputing technology ... [ Read More ]

Making Sense of 50 Billion Triples: No Free Lunch


A lot of grandiose claims have been made promising that graph databases would allow easy ingest of all manner of disparate data, make sense of it, and uncover hidden relationships and meaning. This is, in fact, possible, but there are a few considerations you need to account for to make your database useful to an analyst charged with making sense of the information. There simply is no free lunch; where time and effort are saved in one place, they must be expended (at least partially) elsewhere. Let’s take a look at the fundamental difference between graph databases and relational databases from which these claims stem: rather than store data in rows and columns, graph databases store data in a simpler format that describes a ... [ Read More ]
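To make the rows-and-columns contrast concrete, here is a small sketch (my own illustration, not from the post) of the same record expressed both ways. The table fields and identifiers are hypothetical.

```python
# Sketch: one relational-style row versus the equivalent
# subject-predicate-object triples. Field names are made up.
row = {"id": "emp42", "name": "Alice", "manager": "emp7"}

# Decompose the row into triples: one statement per non-key field,
# with the row's key as the subject of every statement.
triples = [
    (row["id"], field, value)
    for field, value in row.items()
    if field != "id"
]
print(triples)
```

Each triple stands alone, which is why disparate sources can be ingested without first agreeing on a shared schema; the cost, as the post argues, shows up later when an analyst has to reassemble meaning from those fragments.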

Big Data Advantage, part 2: “This is a very complicated case . . . You know, a lotta ins, a lotta outs, lotta what-have-yous.”


In my last post on big data advantages, I wrote about the potential impact of big data and the types of things companies are looking to get out of their information. However, only four percent of companies extract the full value of their information assets, while 43 percent “obtain little tangible benefit from their information,” according to PwC and Iron Mountain in their report “Seizing the information advantage.” Four percent — ouch! You see, although the internet of things and big data analytics have paved the way to previously unimaginable possibilities, they have also opened a Pandora’s box of complexity. And that leads me to another famous quote from “The Big Lebowski”: “This is a very complicated case . . . You know, a lotta ins, ... [ Read More ]

Cray and Forrester Research on Agile Analytics Platform


A few weeks ago, we hosted a webinar, “The Need for an Agile Analytics Platform,” featuring Cray’s SVP of Products, Ryan Waite, and guest Mike Gualtieri, principal research analyst at Forrester Research. They looked at how analytics environments are becoming more dynamic and how a more agile platform can eliminate many challenges. I found Mike’s characterization of “perishable” insights an important reminder of how quickly some decisions need to be made and of the impact of unknowingly stale information. However, even more interesting for me were the concepts around the analytics workflows themselves becoming more sophisticated and what that means for data scientists and their productivity. Here’s a three-minute snippet where Mike ... [ Read More ]