Making Sense of 50 Billion Triples: Getting Started

In my last post I outlined some of the reasons why capitalizing on the promise of graph analytics takes thought and planning. Now, I’d like to focus on getting started, appropriately, at the beginning: understanding the data itself. So let’s discuss methods of gaining an initial understanding of our data that we can then feed back into our ingestion process. First, we need to be able to query the database for some basic information that will tell us not only what is there but how the relationships are expressed. We would also, ideally, find out if there is a hierarchy to the relationships we find. By hierarchy, I mean that graphs can use containers, if you will, to organize ... [ Read More ]

Urika-GX: A Game Changer for Big Data Analytics

There’s a lot of hype around big data in healthcare and the life sciences. But big data is here to stay. Information is what drives the entire industry. When I worked in big pharma, I learned that the product of a pharmaceutical company is not a pill, it's a label. And a label is just a specific assemblage of information, carefully distilled from terabytes of information collected and analyzed over the course of many years by many intelligent people. To compete, companies have to be very good at turning data into information, and information into knowledge. The stakes couldn't be higher, because every day millions of patients rely on the quality of this data and the strength of the analyses done by researchers. Analyzing big data is ... [ Read More ]

Graph Databases: Key Thoughts from Online Chat

It’s pretty interesting to see graph analytics gain traction in the world of big data. We’ve been focusing on graph databases to round out Hadoop® and Spark™ ecosystems, allow for more advanced analytics and enable people to uncover never-before-seen patterns. (Tell me that’s not cool!) From solving real-world problems such as detecting cyberattacks and creating value from IoT sensor data to precisely identifying drug interactions faster than ever before, graph has become a powerhouse in looking at complex, irregular and very large datasets to identify patterns in near real time. On March 16, we hosted an online chat titled “Graph: The Missing Link in Big Data Analytics” with industry experts from Deloitte and Mphasis. Sixty-one ... [ Read More ]

How CGE Achieves High Performance and Scalability

In our graph series so far, we have explored what graph databases are and when they are valuable to use, as well as the Cray Graph Engine (“CGE”), a robust graph solution. For this last installment, we dive into how hardware affects the performance of a graph database. Cray’s main product line, the XC™ series, is mostly used for scientific computing. From the point of view of an applications programmer, there is an important difference between scientific computing and the kind of computations done on a graph database. Programmers call it spatial locality. In a nutshell, if a computation has a lot of spatial locality, then when it has to fetch some value from memory, the next value it’s going to need is usually stored nearby in the ... [ Read More ]

Graph: The Missing Link in Big Data Analytics

Graph analytics is gaining traction in the world of big data and IoT. From solving real-world problems such as detecting cyberattacks and creating value from IoT sensor data to precisely identifying drug interactions faster than ever before, graph has become a powerhouse in detecting never-before-seen connections and emergent patterns. It’s critical to understand how graph can be added to traditional Hadoop® and Spark™ workflows for successful results. Join us Wednesday, March 16, for a live online chat, “Graph: The Missing Link in Big Data Analytics,” to learn and discuss all things graph analytics. You can easily participate using a Twitter, LinkedIn or Facebook account. Hear from industry experts from Deloitte, Mphasis and Cray who ... [ Read More ]