What Should Your Big Data Strategy Be?

Figure 1. Data driven strategy

In previous articles, I discussed some common big data problems and their causes. In the final part of this series, we can now get to the icing on the cake: your big data strategy. Read on, my friends. When do you have a big data problem? Since our main interest here is big data, that is a fitting question to ask. The answer is not as straightforward as we’d all like, mostly because we need a paradigm shift in how we think about the problem. This HBR article has some really good insight into how data visualization is helping companies understand complex consumer behaviors. The key is to think in aggregates, and this is harder than it first appears because finer obvious details … [Read more...]

What are the Causes of Your Big Data Problem?

In my last blog, I shared some interesting articles and hopefully got you thinking, “Why can’t I solve my big data problems with all this progress in technology?” But let’s not get ahead of ourselves. Let’s start with the cause of the problem. Why all the problems? Technologies fail for reasons as wide-ranging as those that challenge almost any human endeavor. Sometimes the technology is complicated and misunderstood, resulting in its incorrect application. Other times the reasons are far more mundane and bureaucratic. Let’s look at some of the common patterns of failure so you don’t have to repeat them. Failure to understand the data: This is by far the most common reason for reaching imprecise conclusions. It’s not … [Read more...]

Gooooooooooooaaaaaaaal! Data Analytics Could Improve Soccer Results


I’m a diehard soccer fan and have been glued to the screen during the World Cup games. While watching the last game (viva Brazil!), I was thinking about how one of Cray's customers uses our technology to improve pitcher/batter lineups in baseball. I realized that data analytics could also be used for soccer (known in much of the world as fútbol), with some potentially interesting results. Players are equipped with various devices to monitor heart rate and other factors so the amount of time spent strengthening, training and resting can be optimized. With the explosion of data and realization of the value behind it, every movement is now being recorded, from which foot the players are using to pass the ball to the number of steps … [Read more...]

Can You Have a Big Data Problem and Not Know It?

It’s been a while since my last blog post, and my only excuse is that we’ve been busy working on some really exciting technologies. You’ll hear more about them in the coming weeks so I’ll hold off on the details, but I promise it will be worth the wait. For now, I’d like to share several news articles that caught my attention over the past few months and discuss a few cause-and-effect problems associated with big data. If you’re in the big data space and involved in making analytical decisions, you’ll find this relevant. The links are an interesting read by themselves, but I’ll give you my perspective as well. Say “Big Data” One More Time! There, I Said It. With increasing frequency, we’re not only hearing about the promises, but also … [Read more...]

Don’t use a hammer to screw in a nail: Alternatives to REGEX in SPARQL

In the past I've talked about some tips for tuning SPARQL performance, but one interesting type of query that I didn't touch on comes up time and again as a SPARQL performance problem. In fact, more often than not, it is actually user naiveté that is the real cause. So what does this horrible query look like? Strangely, it is quite innocuous, and I bet a good number of you have written exactly this at some point in the past:

    SELECT * WHERE {
      ?s <http://some/predicate> ?o .
      FILTER (REGEX(?o, "search"))
    }

While this looks like a perfectly sane query, the fact of the matter is that it is really anything but. Any kind of FILTER in SPARQL involves iterating over all the possible solutions found at the point where the filter … [Read more...]
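As a rough illustration of why the query above can be costly, here is a small Python sketch of a filter being evaluated once per candidate solution. It is not a SPARQL engine: the `solutions` list, the `ex:` names, and both helper functions are hypothetical stand-ins invented for this example. It also shows that when the pattern is a plain literal such as "search", a simple substring test selects the same rows as the regular expression (SPARQL 1.1 offers the CONTAINS function for that kind of check).

```python
import re

# Hypothetical solutions for the pattern { ?s <http://some/predicate> ?o }.
# A real store would produce these bindings; they are made up here.
solutions = [
    {"s": "ex:a", "o": "full text search engine"},
    {"s": "ex:b", "o": "unrelated value"},
    {"s": "ex:c", "o": "research notes"},
]


def filter_regex(rows, pattern):
    """Keep rows whose ?o value matches the regular expression.

    The test runs once for every candidate row, which mirrors how a
    FILTER is checked against each solution in turn.
    """
    compiled = re.compile(pattern)
    return [row for row in rows if compiled.search(row["o"])]


def filter_contains(rows, needle):
    """Keep rows whose ?o value contains the literal substring."""
    return [row for row in rows if needle in row["o"]]


regex_hits = filter_regex(solutions, "search")
contains_hits = filter_contains(solutions, "search")

# For a literal pattern with no regex metacharacters, both filters agree.
print(regex_hits == contains_hits)
```

Both helpers still visit every row, so the sketch only shows that the two tests agree for a literal pattern; whether a substring test is actually faster depends on the engine.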