E-Book, English, 400 pages
Inmon / Strauss / Neushloss DW 2.0: The Architecture for the Next Generation of Data Warehousing
1st edition 2010
ISBN: 978-0-08-055833-2
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark
DW 2.0: The Architecture for the Next Generation of Data Warehousing is the first book on the new generation of data warehouse architecture, DW 2.0, by the father of the data warehouse. The book describes the future of data warehousing that is technologically possible today, at both an architectural level and technology level. The perspective of the book is from the top down: looking at the overall architecture and then delving into the issues underlying the components. This allows people who are building or using a data warehouse to see what lies ahead and determine what new technology to buy, how to plan extensions to the data warehouse, what can be salvaged from the current system, and how to justify the expense at the most practical level. This book gives experienced data warehouse professionals everything they need in order to implement the new generation DW 2.0. It is designed for professionals in the IT organization, including data architects, DBAs, systems design and development professionals, as well as data warehouse and knowledge management professionals.
* First book on the new generation of data warehouse architecture, DW 2.0.
* Written by the 'father of the data warehouse', Bill Inmon, a columnist and newsletter editor of The Bill Inmon Channel on the Business Intelligence Network.
* Long overdue comprehensive coverage of the implementation of technology and tools that enable the new generation of the DW: metadata, temporal data, ETL, unstructured data, and data quality control.
Further Information & Material
1;Front cover;1
2;DW 2.0: The architecture for the next generation of data warehousing;4
3;Copyright page;5
4;Contents;8
5;Preface;18
6;Acknowledgments;21
7;About the authors;22
8;CHAPTER 1 A brief history of data warehousing and first-generation data warehouses;24
8.1;Data base management systems;24
8.2;Online applications;25
8.3;Personal computers and 4GL technology;26
8.4;The spider web environment;27
8.5;Evolution from the business perspective;28
8.6;The data warehouse environment;29
8.7;What is a data warehouse?;30
8.8;Integrating data—a painful experience;30
8.9;Volumes of data;31
8.10;A different development approach;31
8.11;Evolution to the DW 2.0 environment;32
8.12;The business impact of the data warehouse;34
8.13;Various components of the data warehouse environment;34
8.13.1;ETL—extract/transform/load;35
8.13.2;ODS—operational data store;36
8.13.3;Data mart;36
8.13.4;Exploration warehouse;36
8.14;The evolution of data warehousing from the business perspective;37
8.15;Other notions about a data warehouse;37
8.16;The active data warehouse;38
8.17;The federated data warehouse approach;39
8.18;The star schema approach;41
8.19;The data mart data warehouse;43
8.20;Building a "real" data warehouse;44
8.21;Summary;45
9;CHAPTER 2 An introduction to DW 2.0;46
9.1;DW 2.0—a new paradigm;47
9.2;DW 2.0—from the business perspective;47
9.3;The life cycle of data;50
9.4;Reasons for the different sectors;53
9.5;Metadata;54
9.6;Access of data;56
9.7;Structured data/unstructured data;57
9.8;Textual analytics;58
9.9;Blather;61
9.10;The issue of terminology;61
9.11;Specific text/general text;63
9.12;Metadata—a major component;63
9.13;Local metadata;66
9.14;A foundation of technology;68
9.15;Changing business requirements;70
9.16;The flow of data within DW 2.0;71
9.17;Volumes of data;73
9.18;Useful applications;74
9.19;DW 2.0 and referential integrity;75
9.20;Reporting in DW 2.0;76
9.21;Summary;76
10;CHAPTER 3 DW 2.0 components—about the different sectors;78
10.1;The Interactive Sector;78
10.2;The Integrated Sector;85
10.3;The Near Line Sector;94
10.4;The Archival Sector;99
10.5;Unstructured processing;109
10.6;From the business perspective;113
10.7;Summary;115
11;CHAPTER 4 Metadata in DW 2.0;118
11.1;Reusability of data and analysis;119
11.2;Metadata in DW 2.0;119
11.3;Active repository/passive repository;122
11.4;The active repository;123
11.5;Enterprise metadata;124
11.6;Metadata and the system of record;125
11.7;Taxonomy;127
11.8;Internal taxonomies/external taxonomies;127
11.9;Metadata in the Archival Sector;128
11.10;Maintaining metadata;129
11.11;Using metadata—an example;129
11.12;From the end-user perspective;132
11.13;Summary;133
12;CHAPTER 5 Fluidity of the DW 2.0 technology infrastructure;134
12.1;The technology infrastructure;135
12.2;Rapid business changes;137
12.3;The treadmill of change;137
12.4;Getting off the treadmill;138
12.5;Reducing the length of time for IT to respond;138
12.6;Semantically temporal, semantically static data;138
12.7;Semantically temporal data;139
12.8;Semantically stable data;140
12.9;Mixing semantically stable and unstable data;141
12.10;Separating semantically stable and unstable data;141
12.11;Mitigating business change;142
12.12;Creating snapshots of data;143
12.13;A historical record;143
12.14;Dividing data;144
12.15;From the end-user perspective;144
12.16;Summary;145
13;CHAPTER 6 Methodology and approach for DW 2.0;146
13.1;Spiral methodology—a summary of key features;147
13.2;The seven streams approach—an overview;152
13.3;Enterprise reference model stream;152
13.4;Enterprise knowledge coordination stream;152
13.5;Information factory development stream;156
13.6;Data profiling and mapping stream;156
13.7;Data correction stream;156
13.8;Infrastructure stream;156
13.9;Total information quality management stream;157
13.10;Summary;160
14;CHAPTER 7 Statistical processing and DW 2.0;164
14.1;Two types of transactions;164
14.2;Using statistical analysis;166
14.3;The integrity of the comparison;167
14.4;Heuristic analysis;168
14.5;Freezing data;169
14.6;Exploration processing;169
14.7;The frequency of analysis;170
14.8;The exploration facility;170
14.9;The sources for exploration processing;172
14.10;Refreshing exploration data;172
14.11;Project-based data;173
14.12;Data marts and the exploration facility;175
14.13;A backflow of data;175
14.14;Using exploration data internally;178
14.15;From the perspective of the business analyst;178
14.16;Summary;179
15;CHAPTER 8 Data models and DW 2.0;180
15.1;An intellectual road map;180
15.2;The data model and business;180
15.3;The scope of integration;181
15.4;Making the distinction between granular and summarized data;182
15.5;Levels of the data model;182
15.6;Data models and the Interactive Sector;184
15.7;The corporate data model;185
15.8;A transformation of models;186
15.9;Data models and unstructured data;187
15.10;From the perspective of the business user;189
15.11;Summary;190
16;CHAPTER 9 Monitoring the DW 2.0 environment;192
16.1;Monitoring the DW 2.0 environment;192
16.2;The transaction monitor;192
16.3;Monitoring data quality;193
16.4;A data warehouse monitor;194
16.5;The transaction monitor—response time;194
16.6;Peak-period processing;195
16.7;The ETL data quality monitor;197
16.8;The data warehouse monitor;199
16.9;Dormant data;200
16.10;From the perspective of the business user;201
16.11;Summary;202
17;CHAPTER 10 DW 2.0 and security;204
17.1;Protecting access to data;204
17.2;Encryption;204
17.3;Drawbacks;205
17.4;The firewall;205
17.5;Moving data offline;205
17.6;Limiting encryption;207
17.7;A direct dump;207
17.8;The data warehouse monitor;208
17.9;Sensing an attack;208
17.10;Security for near line data;210
17.11;From the perspective of the business user;210
17.12;Summary;211
18;CHAPTER 11 Time-variant data;214
18.1;All data in DW 2.0—relative to time;214
18.2;Time relativity in the Interactive Sector;215
18.3;Data relativity elsewhere in DW 2.0;215
18.4;Transactions in the Integrated Sector;216
18.5;Discrete data;217
18.6;Continuous time span data;217
18.7;A sequence of records;219
18.8;Nonoverlapping records;220
18.9;Beginning and ending a sequence of records;220
18.10;Continuity of data;221
18.11;Time-collapsed data;221
18.12;Time variance in the Archival Sector;222
18.13;From the perspective of the end user;223
18.14;Summary;223
19;CHAPTER 12 The flow of data in DW 2.0;226
19.1;The flow of data throughout the architecture;226
19.2;Entering the Interactive Sector;226
19.3;The role of ETL;228
19.4;Data flow into the Integrated Sector;228
19.5;Data flow into the Near Line Sector;230
19.6;Data flow into the Archival Sector;232
19.7;The falling probability of data access;232
19.8;Exception-based flow of data;233
19.9;From the perspective of the business user;236
19.10;Summary;237
20;CHAPTER 13 ETL processing and DW 2.0;238
20.1;Changing states of data;238
20.2;Where ETL fits;238
20.3;From application data to corporate data;239
20.4;ETL in online mode;239
20.5;ETL in batch mode;240
20.6;Source and target;241
20.7;An ETL mapping;242
20.8;Changing states—an example;242
20.9;More complex transformations;244
20.10;ETL and throughput;245
20.11;ETL and metadata;246
20.12;ETL and an audit trail;246
20.13;ETL and data quality;247
20.14;Creating ETL;247
20.15;Code creation or parametrically driven ETL;248
20.16;ETL and rejects;248
20.17;Changed data capture;249
20.18;ELT;249
20.19;From the perspective of the business user;250
20.20;Summary;251
21;CHAPTER 14 DW 2.0 and the granularity manager;254
21.1;The granularity manager;254
21.2;Raising the level of granularity;255
21.3;Filtering data;255
21.4;The functions of the granularity manager;257
21.5;Home-grown versus third-party granularity managers;259
21.6;Parallelizing the granularity manager;260
21.7;Metadata as a by-product;260
21.8;From the perspective of the business user;261
21.9;Summary;261
22;CHAPTER 15 DW 2.0 and performance;262
22.1;Good performance—a cornerstone for DW 2.0;262
22.2;Online response time;263
22.3;Analytical response time;264
22.4;The flow of data;264
22.5;Queues;265
22.6;Heuristic processing;266
22.7;Analytical productivity and response time;266
22.8;Many facets to performance;267
22.9;Indexing;268
22.10;Removing dormant data;268
22.11;End-user education;269
22.12;Monitoring the environment;269
22.13;Capacity planning;270
22.14;Metadata;271
22.15;Batch parallelization;272
22.16;Parallelization for transaction processing;272
22.17;Workload management;273
22.18;Data marts;274
22.19;Exploration facilities;275
22.20;Separation of transactions into classes;276
22.21;Service level agreements;277
22.22;Protecting the Interactive Sector;277
22.23;Partitioning data;278
22.24;Choosing the proper hardware;279
22.25;Separating farmers and explorers;279
22.26;Physically group data together;280
22.27;Check automatically generated code;280
22.28;From the perspective of the business user;281
22.29;Summary;282
23;CHAPTER 16 Migration;284
23.1;Houses and cities;284
23.2;Migration in a perfect world;285
23.3;The perfect world almost never happens;285
23.4;Adding components incrementally;285
23.5;Adding the Archival Sector;287
23.6;Creating enterprise metadata;288
23.7;Building the metadata infrastructure;289
23.8;“Swallowing” source systems;289
23.9;ETL as a shock absorber;290
23.10;Migration to the unstructured environment;290
23.11;From the perspective of the business user;292
23.12;Summary;293
24;CHAPTER 17 Cost justification and DW 2.0;294
24.1;Is DW 2.0 worth it?;294
24.2;Macro-level justification;294
24.3;A micro-level cost justification;295
24.4;Company B has DW 2.0;296
24.5;Creating new analysis;296
24.6;Executing the steps;297
24.7;So how much does all of this cost?;299
24.8;Consider company B;299
24.9;Factoring the cost of DW 2.0;300
24.10;Reality of information;301
24.11;The real economics of DW 2.0;302
24.12;The time value of information;302
24.13;The value of integration;303
24.14;Historical information;303
24.15;First-generation DW and DW 2.0—the economics;304
24.16;From the perspective of the business user;305
24.17;Summary;305
25;CHAPTER 18 Data quality in DW 2.0;308
25.1;The DW 2.0 data quality tool set;310
25.2;Data profiling tools and the reverse-engineered data model;311
25.3;Data model types;312
25.4;Data profiling inconsistencies challenge top-down modeling;317
25.5;Summary;319
26;CHAPTER 19 DW 2.0 and unstructured data;322
26.1;DW 2.0 and unstructured data;322
26.2;Reading text;322
26.3;Where to do textual analytical processing;323
26.4;Integrating text;324
26.5;Simple editing;325
26.6;Stop words;325
26.7;Synonym replacement;326
26.8;Synonym concatenation;326
26.9;Homographic resolution;326
26.10;Creating themes;327
26.11;External glossaries/taxonomies;327
26.12;Stemming;328
26.13;Alternate spellings;328
26.14;Text across languages;328
26.15;Direct searches;329
26.16;Indirect searches;329
26.17;Terminology;330
26.18;Semistructured data/VALUE = NAME data;330
26.19;The technology needed to prepare the data;331
26.20;The relational data base;332
26.21;Structured/unstructured linkage;332
26.22;From the perspective of the business user;333
26.23;Summary;333
27;CHAPTER 20 DW 2.0 and the system of record;336
27.1;Other systems of record;342
27.2;From the perspective of the business user;342
27.3;Summary;344
28;CHAPTER 21 Miscellaneous topics;346
28.1;Data marts;346
28.2;The convenience of a data mart;347
28.3;Transforming data mart data;348
28.4;Monitoring DW 2.0;349
28.5;Moving data from one data mart to another;350
28.6;Bad data;352
28.7;A balancing entry;353
28.8;Resetting a value;353
28.9;Making corrections;353
28.10;The speed of movement of data;354
28.11;Data warehouse utilities;355
28.12;Summary;360
29;CHAPTER 22 Processing in the DW 2.0 environment;362
29.1;Summary;368
30;CHAPTER 23 Administering the DW 2.0 environment;370
30.1;The data model;370
30.2;Architectural administration;371
30.2.1;Defining the moment when an Archival Sector will be needed;371
30.2.2;Determining whether the Near Line Sector is needed;372
30.3;Metadata administration;374
30.4;Data base administration;375
30.5;Stewardship;376
30.6;Systems and technology administration;378
30.7;Management administration of the DW 2.0 environment;381
30.7.1;Prioritization and prioritization conflicts;381
30.7.2;Budget;381
30.7.3;Scheduling and determination of milestones;382
30.7.4;Allocation of resources;382
30.7.5;Managing consultants;382
30.8;Summary;384
31;Index;386
31.1;A;386
31.2;B;386
31.3;C;387
31.4;D;387
31.5;E;388
31.6;F;389
31.7;G;389
31.8;H;389
31.9;I;389
31.10;K;390
31.11;L;390
31.12;M;390
31.13;N;391
31.14;O;391
31.15;P;391
31.16;Q;392
31.17;R;392
31.18;S;392
31.19;T;393
31.20;U;394
31.21;V;394
31.22;W;394
31.23;Z;394





