Graph Databases: Key Thoughts from Online Chat

It’s pretty interesting to see graph analytics gain traction in the world of big data. We’ve been focusing on graph databases to round out Hadoop® and Spark™ ecosystems, allow for more advanced analytics, and enable people to uncover never-before-seen patterns. (Tell me that’s not cool!) From solving real-world problems such as detecting cyberattacks and creating value from IoT sensor data to identifying drug interactions faster than ever before, graph has become a powerhouse for examining complex, irregular and very large datasets and identifying patterns in near real time.

On March 16, we hosted an online chat titled “Graph: The Missing Link in Big Data Analytics” with industry experts from Deloitte and Mphasis. Sixty-one fellow graph enthusiasts joined to ask questions and share knowledge.

Below are several interesting comments from the #GetGraph chat. Which do you find most intriguing? Tweet us your answer with #GetGraph.

  • Graph databases have no concept of tables, thus no concept of joins. This is a blessing for irregular data and complex queries.
  • Graph analytics offers the capability to search a graph dataset and identify its different characteristics: nodes connected to each other, communities containing nodes, the most influential nodes, chokepoints in a dataset, and nodes similar to each other.
  • Graph is the best choice for analyzing human and human-generated activities, like computer networks, road networks, power grids and social networks.
  • Graph analytics is very popular in the life sciences, because nature does not tend to form tables. Natural data tends to be very irregular.
  • There has been some recent interesting work in using graphs to analyze the spread of epidemics (Ebola).
  • Graph is actually used for oilfield logistics in the North Sea off Norway.
  • The data needs to be prepared before a graph DB can consume it. One way to do that is to convert the data to RDF format, which is built from “tuples/triples.” We more or less speak in subject-predicate-object triples.
  • While a conventional DB returns definitive values, a graph DB combined with knowledge models can return “partially true responses.” This can be used to provide suggestions in money-laundering investigations.
  • Graph is a better choice when a hypothesis calls for deeper data analysis to prove or disprove it. An example is a financial crime investigation.
  • MapReduce can prep a graph, graph discovers some knowledge, and streaming/CEP handles detecting/acting on that knowledge in “production.”
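
To make the subject-predicate-object idea from the chat a little more concrete, here is a minimal Python sketch. It is not how any particular graph database or RDF store works internally; the data, the predicate names and the helper functions `match` and `reachable` are all made up for illustration. It only shows that once data is stored as triples, a pattern lookup and a multi-hop traversal need no tables and no joins.

```python
from collections import deque

# Hypothetical example data: (subject, predicate, object) triples.
# None of these names come from a real dataset.
TRIPLES = [
    ("alice", "transfers_to", "shell_co"),
    ("shell_co", "transfers_to", "bob"),
    ("bob", "owns", "account_42"),
    ("alice", "knows", "carol"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def reachable(triples, start, predicate):
    """Breadth-first search that follows a single predicate from start.

    Returns the nodes reached, in the order they were discovered.
    """
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _, _, nxt in match(triples, subject=node, predicate=predicate):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    # Everything we know about "alice".
    print(match(TRIPLES, subject="alice"))
    # Everyone downstream of "alice" along "transfers_to" edges.
    print(reachable(TRIPLES, "alice", "transfers_to"))
```

In a financial crime scenario like the ones mentioned above, the second call is the kind of question an investigator might ask: starting from one party, who sits further along the chain of transfers? A real deployment would use a query language such as SPARQL against a proper triple store rather than a Python list.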

Want more graph? Here are a few resources to check out.

Deloitte white paper: “Change the Game: Cyber Reconnaissance”

World Wide Web Consortium standards (tuples/triples)

Cray’s basic introduction to graphs in handy video format
