50 Billion Triples: Digging a Bit Deeper

In the last two installments of this series (part 1 and part 2), we discussed some higher-level thoughts on striking a balance between easy ingest and more prep work as well as some initial queries to get a sense of an unknown graph’s structure and other characteristics. The queries that we have run to this point were intended to discover structure within our dataset at a macro level. The algorithm that we will run now requires us to consider the dimensionality of our graph. In other words, using algorithms such as centrality or community detection on the entire graph without context is meaningless; we need to run these algorithms on subsets of the data. Prior to delving into the following queries, a quick note: All of the algorithms ... [ Read More ]

Making Sense of 50 Billion Triples: Getting Started

In my last post, I outlined some of the reasons why capitalizing on the promise of graph analytics takes thought and planning. Now, I’d like to focus on getting started, appropriately, at the beginning: with understanding the data itself. So let’s discuss methods of gaining an initial understanding of our data such that we can then feed that newfound understanding back into our ingestion process. First, we need to be able to query the database for some basic information that will tell us not only what is there but how the relationships are expressed. We would also, ideally, find out if there is a hierarchy to the relationships we find. By hierarchy, I mean that graphs can use containers, if you will, to organize ... [ Read More ]

Big Data Advantage, part 3: “The Dude Abides.”

In my prior two posts about analytics, I highlighted the vast opportunity available in big data and the obstacles that prevent organizations from attaining tangible benefits: complexity across fronts; an onslaught of analytics tools; the difficulty of retaining the right skillsets; and slowdowns in getting to insights and decisions. But these hurdles can be overcome. For innovative businesses grappling with the realities of big data, an agile analytics environment provides the best of all approaches. Such a platform enables you to seize your big data advantage with a potent combination of system agility and the pervasive speed needed to deliver high-frequency insights. To address this need, Cray has fused supercomputing technology ... [ Read More ]

Making Sense of 50 Billion Triples: No Free Lunch

A lot of grandiose claims have been made promising that graph databases would allow easy ingest of all manner of disparate data, make sense of it, and uncover hidden relationships and meaning. This is, in fact, possible, but there are a few considerations that you need to account for to make your database useful to an analyst charged with making sense of the information. There simply is no free lunch; where time and effort are saved in one place, they must be expended (at least partially) elsewhere. Let’s take a look at the fundamental difference between graph databases and relational databases from which these claims stem: Rather than store data in rows and columns, graph databases store data in a simpler format that describes a ... [ Read More ]

Big Data Advantage, part 2: “This is a very complicated case . . . You know, a lotta ins, a lotta outs, lotta what-have-yous.”

In my last post on big data advantages, I wrote about the potential impact of big data and the types of things companies are looking to get out of their information. However, only four percent of companies extract the full value of their information assets, while 43 percent “obtain little tangible benefit from their information,” according to PwC and Iron Mountain in their report “Seizing the information advantage.” Four percent. Ouch! You see, although the internet of things and big data analytics have paved the way to previously unimaginable possibilities, they have also opened a Pandora’s box of complexity. And that leads me to another famous quote from “The Big Lebowski.” “This is a very complicated case . . . You know, a lotta ins, ... [ Read More ]