In my previous blog article, I talked about the data management challenges facing the Oil and Gas industry and how the joint industry project, IOHN, picked semantic technologies as a way to bring this data together. In this article, I want to tell you why semantic technologies are proving so useful for data integration.
In the IOHN information platform, each source is represented as graph data, which, according to W3C standards for distributed data, means RDF for the data and SPARQL for the queries. This allows for an open-ended data management structure, in which new data sources can be brought into the architecture as they become available. The graph nature of the linked data structure supports novel queries without the need to restructure the data for each new kind of query that it encounters.
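To make the point about novel queries concrete, here is a minimal sketch in plain Python. It is a toy stand-in for RDF and SPARQL, not the IOHN implementation: the identifiers (`well:A1`, `installedOn`, and so on) are made up for illustration, and the pattern matcher only imitates the way a SPARQL query matches triple patterns against a graph.

```python
from typing import Iterator, List, Optional, Set, Tuple

# A graph is just a set of (subject, predicate, object) triples.
Triple = Tuple[str, str, str]

# Hypothetical facts, as if merged from two different sources.
GRAPH: Set[Triple] = {
    ("well:A1", "locatedIn", "field:Alpha"),    # from a field register
    ("well:A2", "locatedIn", "field:Alpha"),
    ("pump:P7", "installedOn", "well:A1"),      # from a maintenance system
    ("pump:P7", "hasStatus", "degraded"),
    ("pump:P9", "installedOn", "well:A2"),
    ("pump:P9", "hasStatus", "ok"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def degraded_equipment_in_field(graph: Set[Triple], field: str) -> List[str]:
    """A query nobody planned for: join three patterns across both sources."""
    wells = {subj for subj, _, _ in match(graph, p="locatedIn", o=field)}
    found = []
    for equipment, _, well in match(graph, p="installedOn"):
        is_degraded = any(match(graph, s=equipment, p="hasStatus", o="degraded"))
        if well in wells and is_degraded:
            found.append(equipment)
    return sorted(found)


print(degraded_equipment_in_field(GRAPH, "field:Alpha"))
```

The new question is answered by combining patterns over the same triples. Nothing about the stored data had to be reshaped to support it, which is the property the graph representation is chosen for.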
At the center of the IOHN effort is the idea of a Semantic Model, which builds on the templating structure of the ISO 15926 standard. The goal of the Semantic Model is to build a linked data enterprise as described in Linking Enterprise Data. In short, a linked data enterprise is one in which it is the responsibility of every data provider to make their data accessible to the enterprise in a standard, re-usable way, moving the emphasis from ‘need to know’ to ‘responsibility to provide.’
The IOHN project takes a simple hub-and-spoke approach to this problem. Data integration is usually done in an ad hoc manner, in which each data source is linked on an as-needed basis directly to a particular data consumer. This is, by and large, the status quo for data management in the North Sea oil industry.
In contrast, IOHN seeks to replace this massive number of connections with a central, shared representation of meaning, to which each provider maps its data and each consumer matches its requests.
This approach has two advantages. First, fewer data translations are required, since each source or consumer needs to be connected only to the Semantic Model, and not to a multitude of other suppliers. Second, there is no limitation on how many data sources a consumer is connected to; once the connection is made to the ‘hub,’ a connection is established to every data source, even ones that join the architecture later on.
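Both advantages can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the IOHN design: the source names, field names, and shared terms (`wellbore`, `oilVolume`) are hypothetical, and the point-to-point count assumes the worst case in which every consumer needs every provider.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def point_to_point_links(providers: int, consumers: int) -> int:
    """Worst case: each consumer is wired directly to each provider."""
    return providers * consumers


def hub_links(providers: int, consumers: int) -> int:
    """Hub-and-spoke: each party maps once, to the shared model."""
    return providers + consumers


# Hypothetical mappings from local field names to shared model terms.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "drilling_db": {"wb_name": "wellbore", "oil_sm3": "oilVolume"},
    "production_log": {"WELL": "wellbore", "OilProduced": "oilVolume"},
}

# Hypothetical records, each in its source's own vocabulary.
SOURCES: Dict[str, List[Record]] = {
    "drilling_db": [{"wb_name": "A-1", "oil_sm3": 120.0}],
    "production_log": [{"WELL": "A-2", "OilProduced": 95.5}],
}


def to_shared(source: str, record: Record) -> Record:
    """Translate one record into shared-model terms."""
    mapping = MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def query(shared_term: str) -> List[Any]:
    """A consumer asks in shared terms and reaches every mapped source."""
    values = []
    for source, records in SOURCES.items():
        for record in records:
            shared = to_shared(source, record)
            if shared_term in shared:
                values.append(shared[shared_term])
    return values


def register_source(name: str, mapping: Dict[str, str], records: List[Record]) -> None:
    """A source that joins later only has to supply its own mapping."""
    MAPPINGS[name] = mapping
    SOURCES[name] = records


print(point_to_point_links(10, 8), hub_links(10, 8))
print(sorted(query("wellbore")))
```

With ten providers and eight consumers, the worst-case direct wiring needs 80 translations, while the hub needs 18. Calling `register_source` with a new mapping makes that source visible to the unchanged `query` function, which mirrors the second advantage. Because both mappings point `wb_name` and `WELL` at the same shared term, comparing the mappings also shows that the two fields mean the same thing.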
The semantic model addresses many of the specific issues in the North Sea oil situation. The reliance of interpretation on specialized knowledge is treated explicitly: this specialized knowledge is managed in the semantic model, which puts the data into context in a formal way. It therefore becomes possible to resolve differences between datasets by comparing how they are mapped into the semantic model.
The IOHN Semantic Model project was completed in 2012 and has already been influential in a number of information management projects. Of particular note is the Exploration & Production Information Management Association’s (EPIM) Reporting Hub (see Delivering Semantic Reporting System for the Oil and Gas Industry, press release, 2011). The Reporting Hub focuses on exploration and production data from operating North Sea oil rigs.
While much of the Reporting Hub effort has gone into the data transfer infrastructure to manage the massive amounts of production data, the key to the success of the project is the common semantics that the IOHN model provides. The Reporting Hub is a working example of how a standards-based semantic model can provide focus to real-world data integration problems.
The Reporting Hub confirms that the issues addressed by the IOHN project, as well as its technical approach, were sound and applicable in a real-world information management project in the North Sea. The Reporting Hub encountered the multi-disciplinary situation anticipated by the IOHN effort, which underlined the challenge of understanding the context of each data source. The hub-and-spoke model of the system allowed it to encompass a large number of data sources, and to introduce new ones as the project progressed, without complex re-design of the data architecture.
The Reporting Hub requirements place high importance on the ability to manage novel and sometimes complex queries based on new information needs. Finally, compliance with an international standard like ISO 15926 gave the whole project the organizational focus needed to produce an information sharing system on an industrial scale.
Key to the success of the Reporting Hub was the emphasis in IOHN on a graph-based linked data approach to representing distributed data. This approach was pivotal in allowing the system to manage information context and support the queries required by the complex North Sea oil environment.
 Linking Enterprise Data
 Delivering Semantic Reporting System for the Oil and Gas Industry, press release, 2011
Dean Allemang, co-author of the bestselling book, Semantic Web for the Working Ontologist, is a consultant, thought leader, and entrepreneur focusing on industrial applications of distributed data technology. He served nearly a decade as Chief Scientist at TopQuadrant, the world’s leading provider of Semantic Web development tools, producing enterprise solutions for a variety of industries. As part of his drive to see the Semantic Web become an industrial success, he is particularly interested in innovations that move forward the state of the art in distributed data technology. Dean’s current work is concentrated on the life sciences and finance industries, where he currently sees the most promising industrial interest in this technology.