What are the Causes of Your Big Data Problem?

In my last blog post, I shared some interesting articles and hopefully got you thinking, “Why can’t I solve my big data problems with all this progress in technology?” But let’s not get ahead of ourselves. Let’s start with the cause of the problem. Why all the problems? Technologies fail for reasons that are as wide-ranging as those that challenge almost any human endeavor. Sometimes the technology is complicated and misunderstood, resulting in its incorrect application. Other times the reasons are far more mundane and bureaucratic. Let’s look at some of the common patterns of failure so you don’t have to repeat them.

Failure to understand the data

This is by far the most common reason for reaching imprecise conclusions. It’s not ... [ Read More ]

Gooooooooooooaaaaaaaal! Data Analytics Could Improve Soccer Results

I’m a diehard soccer fan and have been glued to the screen during the World Cup games. While watching the last game (viva Brazil!), I was thinking about how one of Cray's customers uses our technology to improve pitcher/batter lineups in baseball. I realized that data analytics could also be used for soccer (known in much of the world as fútbol), with some potentially interesting results. Players are equipped with various devices to monitor heart rate and other factors so the amount of time spent strengthening, training and resting can be optimized. With the explosion of data and realization of the value behind it, every movement is now being recorded, from which foot the players are using to pass the ball to the number of steps ... [ Read More ]

Can You Have a Big Data Problem and Not Know It?

It’s been a while since my last blog post, and the only excuse is that we’ve been busy working on some really exciting technologies. You’ll hear more about them in the coming weeks, so I’ll hold off on the details, but I promise it will be worth the wait. For now, I’d like to share several news articles that caught my attention over the past few months and discuss a few cause-and-effect problems associated with big data. If you’re in the big data space and involved in making analytical decisions, you’ll find this relevant. The links are an interesting read by themselves, but I’ll give you my perspective as well.

Say “Big Data” One More Time! There, I Said It.

With increasing frequency, we’re not only hearing about the promises, but also ... [ Read More ]

Don’t use a hammer to screw in a nail: Alternatives to REGEX in SPARQL

In the past I've talked about some tips for tuning SPARQL performance, but one interesting type of query that I didn't touch on comes up time and again as a SPARQL performance problem. In fact, more often than not, it is actually user naiveté that is the real cause. So what does this horrible query look like? Strangely, it is quite innocuous, and I bet a good number of you have written exactly this at some point in the past:

    SELECT * WHERE {
      ?s <http://some/predicate> ?o .
      FILTER (REGEX(?o, "search"))
    }

While this looks like a perfectly sane query, the fact of the matter is that it is really anything but. Any kind of FILTER in SPARQL involves iterating over all the possible solutions found at the point where the filter ... [ Read More ]

Making Big Data (Centers) More Energy-Efficient

In all this talk about big data, it doesn’t seem like much of it is focused on the centers around the world which house this technology. Data centers are getting bigger and bigger, as companies need more servers, infrastructure, storage and cooling equipment than ever before. This, of course, means the rise of energy costs, which are already at an all-time high: The LA Times reported that in California alone, residential electricity prices rose 30% between 2006 and 2012, so it’s easy to imagine data centers under a similar level of duress. But, according to The New York Times, the real problem is that data centers can waste 90% or more of the electricity they draw! This level of inefficiency is damaging both the environment and the ... [ Read More ]