Data integration and data synchronization in the business application

July 01, 2016

As a heads-up to SEMANTiCS 2016, we invited several experts from Linked Enterprise Data Services (LEDS), a “Wachstumskern” project supported by the German Federal Ministry of Education and Research (BMBF), to talk a bit about their work and visions. They will share their insights into natural language processing, e-commerce, e-government, data integration and quality assurance right here. So stay tuned.

In many cases semantic technologies are still academic ideas that lack real-world use cases. For Robert Isele, however, it matters more that semantic technologies make sense and improve efficiency in everyday business. As the person responsible for data integration at eccenca GmbH, he needs tangible technologies that satisfy both market requirements and growing legal requirements. Having worked in the Linked Data environment for the past ten years, he has constantly pushed the boundary between academic theory and commercial application, and he earned a PhD on the scalable integration of data sources using genetic algorithms.

While Natanael Arndt illuminated the academic side of semantic data integration two weeks ago, Robert Isele shares his insights on business challenges and the corresponding requirements for semantic technologies.

Which use cases do businesses face in terms of data processing and in the context of Big Data and IoT?

Traditional applications for data integration, in particular the integration of product and customer data, are still important in the context of big data.

In addition, new regulatory requirements have increasingly raised the demands on enterprise-wide data integration in recent years. The banking sector in particular faces this issue with the mandatory risk reporting according to BCBS 239.

In the context of IoT, the publication, extraction and composition of distributed and heterogeneous information is certainly the biggest challenge. Current research projects, such as the EU research project bIoTope, apply advanced real-time data integration technologies to enable, among other things, efficient route planning systems in the Smart City environment.

Which challenges do companies face in implementing these use cases?

The biggest challenge is still the operational cost of collecting data that is typically distributed across different areas of the company, as well as the subsequent identification and linking of related information.

The growing volume of available data is made even more complex by high data heterogeneity and by the demand to keep data quality under control through automatic validation of incoming data.

How is data currently being processed? What are the disadvantages?

According to the 2015 Data Management Industry Benchmark Report of the EDM Council, which mainly focuses on companies in the financial industry, companies still largely lack a comprehensive data management infrastructure. This includes the mapping of complex data flows together with the tracking of data lineage, as well as the use of unique identifiers, that is, an enterprise-wide ontology for a uniform mapping of critical data elements across the company.

How will semantic technologies improve data processing?

Semantic data management allows you to centrally define taxonomies and ontologies that capture the structural relations between different entities. You can thus precisely and unambiguously define the relevance of critical data elements (CDE), ensuring that data is managed consistently and transparently across corporate divisions.

Since a uniform representation of data across different company divisions is often hardly possible, semantic technologies make it possible to map the physical data layer onto specialized ontologies, thereby decoupling the physical from the logical data representation.
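To make the decoupling of physical and logical representation more concrete, here is a minimal sketch in Python. It is an illustration only, not the approach used at eccenca or in LEDS: the field names, the `ex:` ontology terms and the helper `to_logical` are all hypothetical. Two divisions store the same customer under different physical column names, and a per-source mapping translates both into one shared, ontology-keyed form.

```python
# Hypothetical ontology terms: the "ex:" prefix and property names are
# illustrative assumptions, not taken from a real enterprise ontology.
CRM_MAPPING = {"cust_name": "ex:customerName", "cust_no": "ex:customerId"}
BILLING_MAPPING = {"NAME": "ex:customerName", "KUNDENNR": "ex:customerId"}


def to_logical(record, mapping):
    """Translate a physical record into ontology-keyed form.

    Fields without a mapping entry are dropped.
    """
    return {mapping[field]: value
            for field, value in record.items()
            if field in mapping}


# The same customer as stored by two different divisions.
crm_row = {"cust_name": "ACME GmbH", "cust_no": "C-1001", "internal_flag": "x"}
billing_row = {"NAME": "ACME GmbH", "KUNDENNR": "C-1001"}

logical_crm = to_logical(crm_row, CRM_MAPPING)
logical_billing = to_logical(billing_row, BILLING_MAPPING)

# Both sources now agree on the logical level.
print(logical_crm == logical_billing)
```

Under these assumptions the physical schemas can change independently: only the mapping for the affected source needs to be updated, while consumers keep working against the ontology terms.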

The processes that handle data can also be managed as metadata. This allows processed data to be annotated so that it can be traced back to its system of origin, which establishes data lineage. In addition, processes can be versioned and reproduced easily.
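The idea of managing processes as metadata can be sketched as follows. This is a simplified illustration under stated assumptions: the classes `ProcessInfo` and `TracedValue`, the function `apply_step` and the process names are hypothetical and do not describe a specific product. Each processing step is described by an identifier and a version, and every value carries its system of origin plus the list of steps applied to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    """Metadata describing one versioned processing step (hypothetical model)."""
    process_id: str
    version: str


@dataclass(frozen=True)
class TracedValue:
    """A value annotated with its system of origin and the steps applied to it."""
    value: object
    source_system: str
    steps: tuple = ()  # ProcessInfo entries, oldest first


def apply_step(traced, func, info):
    """Apply func to the value and append the step's metadata to its lineage."""
    return TracedValue(func(traced.value), traced.source_system,
                       traced.steps + (info,))


raw = TracedValue("  acme gmbh ", source_system="crm")
trimmed = apply_step(raw, str.strip, ProcessInfo("trim-whitespace", "1.0"))
result = apply_step(trimmed, str.upper, ProcessInfo("normalize-case", "2.1"))

print(result.value)
print(result.source_system, [(s.process_id, s.version) for s in result.steps])
```

Because each step records its version, a result can in principle be reproduced by replaying the same process versions against the original input.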

What features and technical solutions do companies need to make the integration and synchronization of data from multiple heterogeneous sources reasonable, cost-effective and profitable for them?

Governance is the key to successful data management. It defines the organizational model and ensures that the principles of data management are implemented. Closely related is the establishment of a data management life cycle, which defines a clear process for the management and processing of corporate data.

Incoming data must be validated and subjected to a consolidation process, which ensures the necessary data quality for business processes. To be able to cope with the complexity of data integration, the harmonization of data across all processes throughout the company needs to be ensured.
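A minimal sketch of such a validate-then-consolidate step might look like this. The rules, field names and the merge policy (later non-empty values win) are assumptions chosen for illustration; real consolidation processes are usually far more elaborate.

```python
def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    if not record.get("customer_id"):
        errors.append("missing customer_id")
    if "@" not in (record.get("email") or ""):
        errors.append("invalid email")
    return errors


def consolidate(records):
    """Merge records sharing a customer_id; later non-empty values win."""
    merged = {}
    for record in records:
        target = merged.setdefault(record["customer_id"], {})
        target.update({key: value for key, value in record.items() if value})
    return list(merged.values())


incoming = [
    {"customer_id": "C-1001", "email": "info@example.com", "city": ""},
    {"customer_id": "C-1001", "email": "info@example.com", "city": "Leipzig"},
    {"customer_id": "", "email": "broken"},
]

accepted = [r for r in incoming if not validate(r)]
rejected = [r for r in incoming if validate(r)]
golden = consolidate(accepted)

print(len(accepted), len(rejected))
print(golden)
```

Keeping validation separate from consolidation means rejected records can be reported back to the source system without affecting the consolidated view.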

Do you see any current shortcomings of semantic technologies?

The goal of big data solutions like data lakes is the most cost-efficient and scalable processing of large amounts of data. But data lakes generally only manage large amounts of weakly structured data sets.

By contrast, semantic technologies are often subject to the disadvantage of poor scalability.

Projects like LEDS use semantic technologies to apply a virtual knowledge graph to an existing data lake, thus combining the advantages of both approaches.
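As a rough illustration of what "virtual" means here, the following sketch leaves the raw rows in the lake untouched and derives triples from them only when a query asks for them. It does not describe the LEDS implementation: the in-memory `LAKE`, the `MAPPINGS` structure, the `ex:placedBy` property and the `triples` function are all hypothetical.

```python
# Stand-in for weakly structured files in a data lake (hypothetical content).
LAKE = {
    "orders.csv": [
        {"order": "O-1", "customer": "C-1001"},
        {"order": "O-2", "customer": "C-2002"},
    ],
}

# Per-file mapping: which column identifies the subject and which columns
# correspond to which (hypothetical) ontology properties.
MAPPINGS = {
    "orders.csv": {"subject": "order", "properties": {"customer": "ex:placedBy"}},
}


def triples(predicate=None):
    """Lazily yield (subject, predicate, object) triples derived from the lake.

    Nothing is materialized up front; rows are translated as they are read.
    """
    for name, rows in LAKE.items():
        mapping = MAPPINGS[name]
        for row in rows:
            for column, prop in mapping["properties"].items():
                if predicate is None or prop == predicate:
                    yield (row[mapping["subject"]], prop, row[column])


result = list(triples("ex:placedBy"))
print(result)
```

In this simplified picture the lake remains the scalable storage layer, while the mapping supplies the semantic view on demand.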

The timely processing and integration of incoming data streams is another current trend which needs to be addressed and is part of the research at LEDS.

Partners

LEDS is a joint research project addressing the evolution of classic enterprise IT infrastructure to semantically linked data services. The research partners are Leipzig University and Technical University Chemnitz, together with the semantic technology providers Netresearch, Ontos, brox IT-Solutions, Lecos and eccenca.

brox IT-Solutions GmbH

Leipzig University

Ontos GmbH

TU Chemnitz

Netresearch GmbH & Co. KG

Lecos GmbH

eccenca GmbH