Shahbaz | Data Mapping for Data Warehouse Design | E-Book | www.sack.de
E-Book, English, 180 pages

Shahbaz Data Mapping for Data Warehouse Design


1st edition, 2015
ISBN: 978-0-12-805335-5
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark




Data mapping in a data warehouse is the process of creating a link between the tables and attributes of two distinct data models (source and target). Data mapping is required at many stages of the data warehouse (DW) life cycle to help save processor overhead, and every stage has its own requirements and challenges. Many data warehouse professionals therefore want to learn data mapping in order to move from an ETL (extract, transform, load) developer role to a data modeler role. Data Mapping for Data Warehouse Design provides basic and advanced knowledge of business intelligence and data warehouse concepts, including real-life scenarios that apply the standard techniques to projects across various domains. After reading this book, readers will understand the importance of data mapping across the data warehouse life cycle.
- Covers all stages of data warehousing and the role of data mapping in each
- Includes a data mapping strategy and techniques that can be applied to many situations
- Based on the author's years of real-world experience designing solutions

Qamar Shahbaz Ul Haq is currently a senior business intelligence consultant with Stewart Title, where he creates cloud-based business intelligence and SaaS Big Data applications. He has more than 9 years of experience designing business intelligence and data warehouse solutions and has spent most of this time in data mapping, working across different industries and cultures and learning different aspects of the field. In previous roles he has created solutions ranging from billing systems to semantic design to performance optimization for maximum data processing throughput.

Further Information & Material


1;Front Cover;1
2;Data Mapping for Data Warehouse Design;4
3;Copyright Page;5
4;Dedication;6
5;Contents;8
6;1 Introduction;12
6.1;Definition;12
7;2 Data Mapping Stages;14
7.1;Mapping from the Source to the Data Warehouse Landing Area;14
7.2;Mapping from the Landing Area to the Staging Database;14
7.3;Mapping from the Staging Database to the Load Ready or Target Database;14
7.4;Mapping from Logical Data Model to the Semantic or Access Layer;15
8;3 Data Mapping Types;16
8.1;Logical Data Mapping;16
8.2;Physical Data Mapping;16
9;4 Data Models;18
9.1;Definition;18
9.1.1;Entity;19
9.1.2;Relationship;20
9.1.3;Attributes;22
9.2;Normalized Data Model;22
9.2.1;First Normal Form;22
9.2.2;Second Normal Form;22
9.2.3;Third Normal Form;22
9.3;Dimensional Data Model;23
9.3.1;Fact;23
9.3.2;Dimension;23
9.3.3;Measure;24
9.3.4;Drill-Down and Roll-Up;24
9.4;Star Schema;25
9.4.1;Fact Tables;26
9.4.2;Dimension Tables;26
10;5 Data Mapper’s Strategy and Focus;28
10.1;Mapper Who? How Does He or She Do It?;28
11;6 Uniqueness of Attributes and its Importance;32
11.1;Telecom;32
11.2;Manufacturing;33
11.3;Finance;34
11.4;Uniqueness in Data Warehouse;34
12;7 Prerequisites of Data Mapping;36
12.1;Logical Data Model;36
12.2;Entities and Their Description;36
12.3;Attributes and Their Description;36
12.3.1;Primary Key of Entities;36
12.3.2;Relationship Between Entities;37
12.3.3;Cardinality of the Relationship;38
12.3.4;Change Capture Column of History-Handled Entities;38
12.4;Physical Data Model;38
12.5;Source System Data Model;39
12.6;Source System Table and Attribute Details;39
12.7;Subject Matter Expert;39
12.8;Production Quality Data;39
13;8 Surrogate Keys versus Natural Keys;40
13.1;Natural Keys;40
13.2;Surrogate Keys;40
14;9 Data Mapping Document Format;42
14.1;Header-Level Rules;42
14.2;Column-Level Rules;42
14.3;Major Parts of the Data Mapping Document;42
14.4;Data Mapping Columns Explained;43
14.4.1;Change Date;43
14.4.2;Subject Area;43
14.4.3;Target Table Name;43
14.4.4;Target Column Name;43
14.4.5;Data Type;43
14.4.6;PK;43
14.4.7;Nullable;43
14.4.8;Source System;43
14.4.9;Record ID;43
14.4.10;Source Table Name;45
14.4.11;Source Column Name;45
14.4.12;Data Type of Source Column;45
14.4.13;Transformation Category;45
14.4.14;Transformation Rule;45
14.4.15;Updated By;46
14.4.16;Mapping Priority or Sequence;46
15;10 Data Analysis Techniques;48
15.1;Source Data Sample;48
15.1.1;Direct Access;49
15.1.2;Extraction from a Source;49
15.1.3;Data Files;49
15.2;What to Look For;50
15.2.1;High-Level Inter-Source System Relationship;50
15.2.2;Intra-Source System Table-Level Analysis;52
15.2.3;Column-Level Analysis;53
15.3;Uniqueness;54
15.3.1;Full Row Duplicates;54
15.3.2;Primary Key Duplicates;56
15.3.3;Multiple Extracts;57
15.3.4;Source System Updates;57
15.4;History Pattern Analysis;57
15.4.1;Type 0;58
15.4.2;Type 1;58
15.4.3;Type 2;59
15.4.4;Type 3;60
15.4.5;Type 4;61
15.4.6;Type 6;62
15.4.7;Temporal Database;64
15.4.7.1;Transaction Time;66
15.4.7.1.1;Definition;66
15.4.7.1.2;Limitations;66
15.4.7.2;Valid Time;66
15.4.7.2.1;Definition;66
15.4.7.2.2;Limitations;66
15.4.8;History Data Verification;66
15.5;SQL Tools;70
15.5.1;Automatic Query Generators;70
15.5.2;Aggregate Functions;71
15.5.3;Window and Rank Functions;72
15.6;Microsoft Excel and Other Tools;73
15.6.1;Remove Duplicates;73
15.6.2;Sort;74
15.6.3;Pivot Tables;75
16;11 Data Quality;78
16.1;What Is Data Quality?;79
16.2;How Do You Benefit from Data Quality?;81
16.3;Factors Determining Data Quality;82
16.3.1;Accurate Data;83
16.3.2;Complete Data;83
16.3.3;Legible Data;84
16.3.4;Relevant Data;84
16.3.5;Reliable Data;85
16.3.6;Timely Data;85
16.3.7;Valid Data;85
16.4;Stages of Data Warehousing Susceptible to Data Quality Problems;86
16.5;Classification of Data Quality Issues;87
16.5.1;Data Quality Issues at Data Sources;87
16.5.2;Data Quality Issues During the Data Profiling Stage;88
16.5.3;Data Quality Issues During the Extract, Transform, Load Phase;89
16.5.4;Data Quality Issues During Data Modeling;90
16.6;How Can You Assess Data Quality?;91
16.7;What Can You Do to Make Data Quality a Success?;92
17;12 Data Mapping Scenarios;94
17.1;Data Transformation (Normalized Model);94
17.1.1;Source;94
17.1.2;Target;94
17.1.3;Mapping;95
17.2;Data Joining (Normalized Model);96
17.2.1;Source;99
17.2.2;Target;99
17.2.3;Mapping;99
17.3;Data Integration from Multiple Sources (Normalized Model);101
17.3.1;Source;101
17.3.2;Target;102
17.3.3;Mapping;102
17.4;Data Quality Improvement;102
17.4.1;Source;104
17.4.2;Target;104
17.4.3;Mapping;104
17.5;Prioritized Data Consolidation or Joining;105
17.5.1;Source;105
17.5.2;Target;108
17.5.3;Mapping;108
17.6;History Handling (Normalized Model);109
17.6.1;Source;109
17.6.2;Target;113
17.6.3;Mapping;113
17.7;History Handling Done in the Source (Normalized Model);113
17.7.1;Source;115
17.7.2;Target;115
17.7.3;Mapping;115
17.8;History Handling with No Rules on Date or Time;115
17.8.1;Source;115
17.8.2;Target;117
17.8.3;Mapping;117
17.9;Joining the Source Data with the Target Table;117
17.9.1;Source;120
17.9.2;Target;120
17.9.3;Mapping;120
17.10;History Handling from Snapshots;122
17.10.1;Source;122
17.10.2;Target;122
17.10.3;Mapping;122
17.11;Master Data (Normalized Model);124
17.11.1;Source;125
17.11.2;Target;126
17.11.3;Mapping;126
17.12;Surrogate Keys;129
17.12.1;Source;129
17.12.2;Target;130
17.12.3;Mapping;130
17.13;Call Detail Record (CDR) Mapping;130
17.13.1;Source;133
17.13.2;Target;133
17.13.3;Mapping;133
17.14;Performance Issue Handling in Mapping;136
17.14.1;Source;136
17.14.2;Target;136
17.14.3;Mapping;136
17.15;Business Mapping, Reference, and Lookup Data (Normalized Model);137
17.15.1;Source;141
17.15.2;Target;141
17.15.3;Mapping;141
17.16;Business Key, Surrogate, or Helping Table with Multiple Unique IDs for the Same Logical Concept;143
17.16.1;Source;143
17.16.2;Target;143
17.16.3;Mapping;144
17.16.3.1;Mapping 1;144
17.16.3.2;Mapping 2;144
17.16.3.3;Mapping 3;144
17.16.4;Mapping;145
17.17;Denormalized or Data Mart Table;148
17.17.1;Source;148
17.17.2;Target;148
17.17.3;Mapping;148
17.18;Access, Semantic, or Presentation Layer Attributes Mapping;151
17.18.1;Source;151
17.18.2;Target;151
17.18.3;Mapping;151
17.19;Dimensions Mapping;154
17.19.1;Source;154
17.19.2;Target;154
17.19.3;Mapping;154
17.20;Apply Logic versus Transformation Logic;156
17.21;Dividing the Dataset Into Smaller Chunks;157
17.21.1;Source;157
17.21.2;Target;157
17.21.3;Mapping;157
17.22;Unstructured Data;159
17.22.1;Source;159
17.22.2;Target;163
17.22.3;Mapping;163
17.23;Data Transpose;164
17.23.1;Source;165
17.23.2;Target;165
17.23.3;Mapping;165
17.23.3.1;Transpose: Converting Columns to Rows;165
17.23.3.2;Transpose: Converting Rows to Columns;166
17.24;Aggregate Functions and Loading Cycle;167
17.24.1;Source;167
17.24.2;Target;167
17.24.3;Mapping;168
17.24.3.1;Store Raw Data Separately;168
17.24.3.2;Store Row Count with Aggregated Column;169
17.25;Initial Load versus Delta Load;169
17.26;Recursive Query;170
17.26.1;Source;170
17.26.2;Target;171
17.26.3;Mapping;171
17.27;Loading Sequence of Mapping;173
17.27.1;Source;173
17.27.2;Target;174
17.27.3;Mapping;174
18;Glossary and Nomenclature List;178
19;Bibliography;180
20;Back Cover;181


