Fürber | Data Quality Management with Semantic Technologies | E-Book | www.sack.de

E-book, English, 230 pages

Fürber Data Quality Management with Semantic Technologies


1st edition 2016
ISBN: 978-3-658-12225-6
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark




Christian Fürber investigates the useful application of semantic technologies for the area of data quality management. Based on a literature analysis of typical data quality problems and typical activities of data quality management processes, he develops the Semantic Data Quality Management framework as the major contribution of this thesis. The SDQM framework consists of three components that are evaluated in two different use cases. Moreover, this thesis compares the framework to conventional data quality software. Besides the framework, this thesis delivers important theoretical findings, namely a comprehensive typology of data quality problems, ten generic data requirement types, a requirement-centric data quality management process, and an analysis of related work. 

Dr. Christian Fürber completed his doctoral studies under the supervision of Prof. Dr. Martin Hepp at the E-Business and Web Science Research Group of the Universität der Bundeswehr München. He is the founder and CEO of the Information Quality Institute GmbH, a company that advises organizations of any size on improving the quality of their data.




Further Information & Material


1;Foreword;6
2;Preface;10
3;Table of Content;12
4;List of Figures;18
5;List of Tables;21
6;List of Abbreviations;23
7;PART I – Introduction, Economic Relevance, and Research Design;26
7.1;1 Introduction;26
7.1.1;1.1 Initial Problem Statement;26
7.1.2;1.2 Economic Relevance;28
7.1.3;1.3 Organization of this Thesis;31
7.1.4;1.4 Published Work;31
7.1.4.1;1.4.1 Book Chapters;32
7.1.4.2;1.4.2 Papers in Conference Proceedings;32
7.1.4.3;1.4.3 Other Publications;32
8;2 Research Design;33
8.1;2.1 Semantic Technologies and Ontologies;33
8.2;2.2 Research Goal;34
8.3;2.3 Research Questions;36
8.4;2.4 Research Methodology;37
8.4.1;2.4.1 Design Science Research Methodology;38
8.4.2;2.4.2 Ontology Development Methodology;43
9;PART II – Foundations: Data Quality, Semantic Technologies, and the Semantic Web;45
9.1;3 Data Quality;45
9.1.1;3.1 Data Quality Dimensions;46
9.1.2;3.2 Quality Influencing Artifacts;49
9.1.3;3.3 Data Quality Problem Types;51
9.1.3.1;3.3.1 Quality Problems of Attribute Values;53
9.1.3.2;3.3.2 Multi-Attribute Quality Problems;55
9.1.3.3;3.3.3 Problems of Object Instances;57
9.1.3.4;3.3.4 Quality Problems of Data Models;59
9.1.3.5;3.3.5 Common Linguistic Problems;63
9.1.4;3.4 Data Quality in the Data Lifecycle;64
9.1.4.1;3.4.1 Data Acquisition Phase;65
9.1.4.2;3.4.2 Data Usage Phase;66
9.1.4.3;3.4.3 Data Retirement Phase;67
9.1.4.4;3.4.4 Data Quality Management throughout the Data Lifecycle;67
9.1.5;3.5 Data Quality Management Activities;68
9.1.5.1;3.5.1 Total Information Quality Management (TIQM);68
9.1.5.2;3.5.2 Total Data Quality Management (TDQM);72
9.1.5.3;3.5.3 Comparison of Methodologies;74
9.1.6;3.6 Role of Data Requirements in DQM;74
9.1.6.1;3.6.1 Generic Data Requirement Types;75
9.1.6.2;3.6.2 Challenges Related to Requirements Satisfaction;79
10;4 Semantic Technologies;81
10.1;4.1 Characteristics of an Ontology;81
10.2;4.2 Knowledge Representation in the Semantic Web;83
10.2.1;4.2.1 Resources and Uniform Resource Identifiers (URIs);83
10.2.2;4.2.2 Core RDF Syntax: Triples, Literal Triples, and RDF Links;84
10.2.3;4.2.3 Constructing an Ontology with RDF, RDFS, and OWL;85
10.2.4;4.2.4 Language Profiles of OWL and OWL 2;88
10.3;4.3 SPARQL Query Language for RDF;89
10.4;4.4 Reasoning and Inferencing;90
10.5;4.5 Ontologies and Relational Databases;92
11;5 Data Quality in the Semantic Web;94
11.1;5.1 Data Sources of the Semantic Web;94
11.2;5.2 Semantic Web-specific Quality Problems;96
11.2.1;5.2.1 Document Content Problems;97
11.2.2;5.2.2 Data Format Problems;97
11.2.3;5.2.3 Problems of Data Definitions and Semantics;98
11.2.4;5.2.4 Problems of Data Classification;99
11.2.5;5.2.5 Problems of Hyperlinks;100
11.3;5.3 Distinct Characteristics of Data Quality in the Semantic Web;101
12;PART III – Development and Evaluation of the Semantic Data Quality Management Framework;103
12.1;6 Specification of Initial Requirements;103
12.1.1;6.1 Motivating Scenario;103
12.1.2;6.2 Initial Requirements for SDQM;104
12.1.2.1;6.2.1 Task Requirements;105
12.1.2.2;6.2.2 Functional Requirements;107
12.1.2.3;6.2.3 Conditional Requirements;108
12.1.2.4;6.2.4 Research Requirements;110
12.1.3;6.3 Summary of SDQM’s Requirements;111
13;7 Architecture of the Semantic Data Quality Management Framework (SDQM);112
13.1;7.1 Data Acquisition Layer;113
13.1.1;7.1.1 Reusable Artifacts for the Data Acquisition Layer;114
13.1.2;7.1.2 Data Acquisition for SDQM;115
13.2;7.2 Data Storage Layer;116
13.2.1;7.2.1 Reusable Artifacts for Data Storage in SDQM;116
13.2.2;7.2.2 The Data Storage Layer of SDQM;117
13.3;7.3 Data Quality Management Vocabulary;119
13.3.1;7.3.1 Reuse of Existing Ontologies;120
13.3.2;7.3.2 Technical Design of the DQM Vocabulary;121
13.4;7.4 Data Requirements Editor;124
13.4.1;7.4.1 Reusable Artifacts for SDQM’s Data Requirements Editor;125
13.4.2;7.4.2 Data Requirements Wiki;126
13.5;7.5 Reporting Layer;129
13.5.1;7.5.1 Reusable Artifacts for SDQM’s Reporting Layer;130
13.5.2;7.5.2 Semantic Data Quality Manager;130
14;8 Application Procedure of SDQM;135
14.1;8.1 Prerequisites;135
14.2;8.2 The Data Quality Management Process with SDQM;136
15;9 Evaluation of the Semantic Data Quality Management Framework (SDQM);147
15.1;9.1 Evaluation of Algorithms;147
15.1.1;9.1.1 Algorithm Evaluation Methodology;147
15.1.2;9.1.2 Application Procedure;148
15.1.3;9.1.3 Results;149
15.2;9.2 Use Case 1: Evaluation of Material Master Data;149
15.2.1;9.2.1 Scenario;150
15.2.2;9.2.2 Setup and Application Procedure of SDQM;150
15.2.3;9.2.3 Results and Findings;152
15.3;9.3 Use Case 2: Evaluation of Data from DBpedia;157
15.3.1;9.3.1 Scenario;157
15.3.2;9.3.2 Specialties of Semantic Web Scenarios;158
15.3.3;9.3.3 Setup and Application Procedure;158
15.3.4;9.3.4 Results and Findings;160
15.4;9.4 Use Case 3: Consistency Checks Among Data Requirements;166
15.4.1;9.4.1 Scenario;167
15.4.2;9.4.2 Application Procedure;167
15.4.3;9.4.3 Summary;169
15.5;9.5 Comparison with Talend OS for Data Quality;170
15.5.1;9.5.1 Representation and Management of Data Requirements;170
15.5.2;9.5.2 Data Quality Monitoring and Assessment Reporting;173
15.5.3;9.5.3 Summary;176
16;PART IV – Related Work;178
16.1;10 Related Work;178
16.1.1;10.1 High-Level Classification Schema;178
16.1.2;10.2 Categorization Schema;179
16.1.2.1;10.2.1 Supported Data Lifecycle Step;179
16.1.2.2;10.2.2 Supported Data Representation;180
16.1.2.3;10.2.3 Supported Data Quality Task;181
16.1.3;10.3 Conventional Rule-Based Approaches;182
16.1.4;10.4 Ontology-based Approaches;183
16.1.4.1;10.4.1 Information System-oriented Approaches;183
16.1.4.2;10.4.2 Web-oriented Approaches;190
16.1.5;10.5 Summary;193
17;PART V – Conclusion;196
17.1;11 Synopsis and Future Work;196
17.1.1;11.1 Research Summary;196
17.1.2;11.2 Contributions;198
17.1.3;11.3 Conclusion and Future Work;199
18;Appendix A – Comparison of TIQM and TDQM;202
19;Appendix B – Rules for the Evaluation of SDQM;207
20;Appendix C – Test Data for SDQM’s Evaluation;212
21;Appendix D – Evaluation Results of SDQM’s Data Quality Monitoring Queries;216
22;Appendix E – Evaluation Results of SDQM’s Data Quality Assessment Queries;218
23;References;220


