It’s pretty interesting to see graph analytics gain traction in the world of big data. We’ve been focusing on graph databases to round out the Hadoop® and Spark™ ecosystems, allow for more advanced analytics, and enable people to uncover never-before-seen patterns. (Tell me that’s not cool!) From solving real-world problems such as detecting cyberattacks and creating value from IoT sensor data to precisely identifying drug interactions faster than ever before, graph has become a powerhouse for examining complex, irregular and very large datasets and identifying patterns in near real time.
On March 16, we hosted an online chat titled “Graph: The Missing Link in Big Data Analytics” with industry experts from Deloitte and Mphasis. Sixty-one fellow graph enthusiasts joined to ask questions and share knowledge.
Below are several interesting comments from the #GetGraph chat. Which do you find most intriguing? Tweet us your answer with #GetGraph.
- Graph databases have no concept of tables, thus no concept of joins. This is a blessing for irregular data and complex queries.
- Graph analytics offers the capability to search for and identify different characteristics of a graph dataset: nodes connected to each other, communities containing nodes, the most influential nodes, chokepoints in a dataset, and nodes similar to each other.
- Graph is the best choice to analyze human and human-generated activities and systems, such as computer networks, road networks, power grids and social networks.
- Graph analytics is very popular in the life sciences, because nature does not tend to form tables. Natural data tends to be very irregular.
- There has been some recent interesting work in using graphs to analyze the spread of epidemics (Ebola).
- Graph is actually used for oilfield logistics in the Norwegian sector of the North Sea. https://epim.no/
- Data needs to be prepared before a graph DB can consume it. One way to do that is to convert the data to RDF format, which is built from “tuples/triples.” We more or less speak in subject-predicate-object triples.
- While a conventional DB returns definitive values, a graph DB combined with knowledge models can return “partially true responses.” This can be used to provide suggestions in money-laundering investigations.
- When a hypothesis needs deeper data analysis to prove or disprove it, graph is often the better choice. An example is a financial crime investigation.
- MapReduce can prep a graph, graph discovers some knowledge, and streaming/CEP handles detecting/acting on that knowledge in “production.”
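To make a couple of those comments concrete, here is a minimal Python sketch of how subject-predicate-object triples can be treated as a graph and queried for connected nodes, the most connected node, and similar nodes. Everything in it is an assumption made for illustration: the triples are made-up data, the function names are hypothetical, and degree and Jaccard overlap are just simple stand-ins for "influence" and "similarity." A real graph database or RDF store would do this very differently and at far larger scale.

```python
from collections import defaultdict, deque

# Hypothetical subject-predicate-object triples (illustrative data only).
TRIPLES = [
    ("alice", "transfers_to", "bob"),
    ("bob", "transfers_to", "carol"),
    ("carol", "transfers_to", "alice"),
    ("carol", "transfers_to", "dave"),
    ("erin", "transfers_to", "frank"),
]


def build_adjacency(triples):
    """Build an undirected adjacency map; the predicate is ignored here."""
    adj = defaultdict(set)
    for subject, _predicate, obj in triples:
        adj[subject].add(obj)
        adj[obj].add(subject)
    return adj


def connected_components(adj):
    """Group nodes that can reach each other, using breadth-first search."""
    seen = set()
    components = []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(adj[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components


def most_connected(adj):
    """Return the node with the most neighbors (a crude proxy for influence)."""
    return max(sorted(adj), key=lambda node: len(adj[node]))


def jaccard(adj, a, b):
    """Similarity of two nodes as the overlap of their neighbor sets."""
    union = adj[a] | adj[b]
    return len(adj[a] & adj[b]) / len(union) if union else 0.0


if __name__ == "__main__":
    graph = build_adjacency(TRIPLES)
    print(connected_components(graph))
    print(most_connected(graph))
    print(jaccard(graph, "alice", "bob"))
```

Notice there is no join anywhere: each question is answered by walking from a node to its neighbors. Community detection and chokepoint analysis need more involved algorithms and are left out of this sketch.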
Want more graph? Here are a few resources to check out.
Deloitte white paper: “Change the Game: Cyber Reconnaissance” http://www2.deloitte.com/content/dam/Deloitte/us/Documents/public-sector/us-fed-change-the-game-understand-your-organization-through-an-adversarial-lens.pdf
World Wide Web Consortium standards (tuples/triples) https://www.w3.org/standards/semanticweb/
Cray’s basic introduction to graphs in handy video format: