A large part of the free knowledge on the Web is available as heterogeneous, semi-structured data that is only weakly interlinked and generally lacks any semantic classification. Given the enormous amount of information, preparing this data for integration into the Web of Data requires automated processes. Knowledge extraction from structured as well as unstructured data has already been a topic of research. However, extraction solutions are missing in particular for the semi-structured data format JSON, which is widely used as a data exchange format, e.g., in social networks. Based on our findings from analyzing existing extraction methods, we present KESeDa, our approach for extracting knowledge from heterogeneous, semi-structured data sources. We describe the analysis and processing steps through which this knowledge is extracted. The resulting semantically enriched data makes it possible to exploit the potential of Linked Data.
Michael Krug completed his studies in applied computer science with a diploma in 2011 and is currently a PhD student at the professorship for distributed and self-organizing systems at the Technische Universität Chemnitz. He has worked on several research projects (FP7, ESF) on web mashups, media enrichment and component technologies. Since mid-2015, he has been one of two researchers from TU Chemnitz working on the LEDS project, where he focuses on knowledge extraction, quality assessment and new search paradigms for linked data.