E-book, English, 400 pages
Johnston Bitemporal Data
1st edition, 2014
ISBN: 978-0-12-408055-3
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark
Theory and Practice
Dr. Tom Johnston is the Chief Scientist at Asserted Versioning, LLC, which has developed a middleware product that supports the standard theory of bitemporal data and also implements the Asserted Versioning extensions to that theory. He is the co-author of Managing Time in Relational Databases (Morgan Kaufmann, 2010). He lives in Atlanta, Georgia.
Authors/Editors
Further Information & Material
1;Front Cover;1
2;Bitemporal Data;4
3;Copyright Page;5
4;Dedication;6
5;Contents;8
6;Foreword;16
7;Preface;20
7.1;Perspectives on the Relational Paradigm of Data;21
7.2;The Temporal SQL Standards: ISO 9075:2011 and TSQL2;22
7.3;Audience;23
7.3.1;Business Analysts;24
7.3.2;Enterprise Data Modelers;25
7.3.3;Enterprise Data Architects;26
7.3.4;Database, Data Warehouse, and Data Mart Developers;26
7.3.5;Business Ontologists;26
7.3.6;Computer Science Students;27
7.3.7;Computer Scientists;28
7.4;A Companion Volume to Managing Time in Relational Databases;29
7.5;An Extensive Glossary;29
7.6;A Note on Style;29
7.7;Looking Forward;30
8;Acknowledgments;32
9;1 Bitemporal Data: Preliminaries;34
9.1;Nontemporal, Unitemporal and Bitemporal Data;36
9.1.1;Nontemporal Tables;37
9.1.1.1;Table Names;37
9.1.1.2;Schema Rows;38
9.1.1.3;Foreign Keys and Primary Keys;38
9.1.1.4;Data Model Diagrams;38
9.1.1.5;Stand-Alone Representation of Rows;38
9.1.2;Unitemporal Tables;39
9.1.2.1;Table Names;41
9.1.2.2;Schema Rows;41
9.1.2.3;Temporal Foreign Keys and Temporal Primary Keys;41
9.1.2.4;Stand-Alone Representation of Temporal Rows;42
9.1.2.5;A Note on Assertion-Time Tables;42
9.1.3;Bitemporal Tables;43
9.1.3.1;Table Names;43
9.1.3.2;Temporal Foreign Keys and Temporal Primary Keys;43
9.2;Semantics and its Implementations;44
9.3;Glossary List;46
10;1 Theory;48
10.1;2 Time and Temporal Terminology;52
10.1.1;Time;52
10.1.1.1;Instants and Moments;52
10.1.1.2;Clock Ticks;53
10.1.1.2.1;Hardware Clock Ticks and DBMS Clock Ticks;54
10.1.1.2.2;DBMS Clock Ticks and Chronons;55
10.1.1.2.3;Chronons, Time Periods, and Other Dependent Concepts;55
10.1.1.3;Time Periods;57
10.1.1.3.1;Four Conventions for Time Period Representation With Delimiters;57
10.1.1.3.2;Representing Open Time Periods;59
10.1.2;Temporal Terminology;59
10.1.2.1;Temporal Dimensions;60
10.1.2.1.1;The TSQL2 and Computer Science Terminology;60
10.1.2.1.2;The ISO and IBM Terminology;62
10.1.2.1.3;The Asserted Versioning Terminology;63
10.1.2.2;Types of Tables;63
10.1.2.2.1;The Computer Science Terminology;63
10.1.2.2.2;The Asserted Versioning Terminology;64
10.1.2.3;A Choice of Terminologies;65
10.1.3;Glossary List;66
10.2;3 The Relational Paradigm: Mathematics;68
10.2.1;Tables and Columns;68
10.2.2;Columns and Domains;69
10.2.3;Cartesian Products;70
10.2.4;Functions and Primary Keys;71
10.2.5;Relations;74
10.2.6;Glossary List;74
10.3;4 The Relational Paradigm: Logic;76
10.3.1;Propositional Logic;76
10.3.1.1;Connectives;76
10.3.1.1.1;AND;77
10.3.1.1.2;OR;77
10.3.1.1.3;NOT;78
10.3.1.1.4;IF/THEN;79
10.3.1.2;Well-Formed Formulas;80
10.3.1.3;Transformation Rules;81
10.3.1.4;Rules of Inference;84
10.3.2;Predicate Logic;86
10.3.2.1;Statements and Statement Schemas;88
10.3.3;Logic and the Relational Paradigm;90
10.3.4;Glossary List;91
10.4;5 The Relational Paradigm: Ontology;92
10.4.1;Types and Instances;94
10.4.1.1;A Data Modeling Perspective;94
10.4.1.2;A Philosophical Perspective;94
10.4.1.3;A Set Theoretic Perspective;95
10.4.1.4;A Logic and Language Perspective;96
10.4.1.5;An Analogy;96
10.4.1.6;Summary;96
10.4.2;Instances and Identity;97
10.4.3;The Relational Paradigm Ontology: Aristotelian Roots;97
10.4.3.1;Aristotle on Substance;97
10.4.3.2;Aristotle on Accidents;98
10.4.3.3;Beyond the Aristotelian Roots;100
10.4.4;The Relational Paradigm Ontology;101
10.4.4.1;A Middle Level Extension to the Relational Paradigm Ontology;103
10.4.4.2;States and Change;103
10.4.4.3;Primary Keys, Natural Keys, Foreign Keys;105
10.4.4.4;Objects, Events and Change;105
10.4.5;On Using Ontologies;106
10.4.6;Integrating the Mathematics and Ontology of the Relational Paradigm;108
10.4.7;Glossary List;111
10.5;6 The Relational Paradigm: Semantics;112
10.5.1;Rows, Statements, Assertions and Kindred Notions;113
10.5.2;Rows, Inscriptions and Sentences;114
10.5.3;Statements;115
10.5.3.1;Disambiguating Statements;117
10.5.3.2;Statements and Statement Schemas;121
10.5.4;Speech Acts;121
10.5.4.1;Statements and Assertions;122
10.5.4.2;Propositions;124
10.5.4.3;Expressing Assertions Explicitly;125
10.5.5;Glossary List;130
10.6;7 The Allen Relationships;132
10.6.1;Why the Allen Relationships are Important;132
10.6.2;A Taxonomy of the Allen Relationships;133
10.6.2.1;The Basic Allen Relationships;133
10.6.2.1.1;[Starts];134
10.6.2.1.2;[Finishes];135
10.6.2.1.3;[During];136
10.6.2.1.4;[Equals];137
10.6.2.1.5;[Overlaps];137
10.6.2.1.6;[Before];138
10.6.2.1.7;[Meets];139
10.6.2.2;Combinations of the Allen Relationships;140
10.6.3;A Binary Partitioning of the Allen Relationships Taxonomy;140
10.6.3.1;Common Timeline Time Periods: [Includes] or [Excludes];141
10.6.3.2;[Includes]: [Contains] or [Overlaps];141
10.6.3.3;[Contains]: [Equals] or [Encloses];142
10.6.3.4;[Encloses]: [Aligns With] or [During];143
10.6.3.5;[Aligns With]: [Starts] or [Finishes];144
10.6.3.6;[Excludes]: [Before] or [Meets];145
10.6.4;An Allen Relationship Thought Experiment;147
10.6.5;Glossary List;148
10.7;8 Temporal Integrity Concepts and Notations;150
10.7.1;Cubes, Slices and Cells: Data in Three-Dimensional Temporal Space;150
10.7.2;Semantically Anomalous Relational Tables;155
10.7.3;Implicit Bitemporal Time;157
10.7.4;Glossary List;159
10.8;9 Temporal Entity Integrity;160
10.8.1;Entity Integrity;160
10.8.2;Bitemporal Entity Integrity;161
10.8.2.1;Some Bitemporal Transactions;161
10.8.3;State-Time Entity Integrity;168
10.8.4;Conventional Entity Integrity;170
10.8.5;Glossary List;172
10.9;10 Temporal Referential Integrity;174
10.9.1;Temporal Foreign Keys;174
10.9.2;Episodes;176
10.9.3;State-Time Referential Integrity;179
10.9.3.1;A State-Time Delete: Block Mode;180
10.9.3.2;A State-Time Delete: Cascade Mode;181
10.9.3.3;A State-Time Delete: Set Null Mode;183
10.9.4;Bitemporal Referential Integrity;183
10.9.5;Conventional Referential Integrity;187
10.9.6;Glossary List;191
11;2 Practice;192
11.1;11 Temporal Transactions;198
11.1.1;An Overview Of Temporal Transactions;198
11.1.2;Basic Temporal Transactions on State-Time Tables;200
11.1.2.1;A State-Time Insert With Default Time;201
11.1.2.2;A State-Time Update With Default Time;201
11.1.2.3;A State-Time Update With Specified State Time;203
11.1.2.4;A State-Time Delete With Default Time;204
11.1.3;Basic Temporal Transactions on Bitemporal Tables;204
11.1.3.1;A Bitemporal Insert With Default Time;205
11.1.3.2;A Bitemporal Update With Default Time;205
11.1.3.3;A Bitemporal Update With Specified State Time;208
11.1.3.4;A Bitemporal Delete With Default Time;209
11.1.4;Whenever Temporal Transactions;210
11.1.4.1;A Whenever Insert Transaction;212
11.1.4.2;A Whenever Update Transaction;213
11.1.4.3;A Whenever Delete Transaction;214
11.1.5;Temporal Merge Transactions;215
11.1.6;Glossary List;218
11.2;12 Basic Temporal Queries;220
11.2.1;Temporal Query Syntax;220
11.2.2;Bitemporal Tables and Views;223
11.2.2.1;The Conventional Table View;223
11.2.2.2;The Logfile View;225
11.2.2.3;The Version View;228
11.2.3;Point-In-Time Range Queries;230
11.2.4;Range Queries;232
11.2.5;Glossary List;236
11.3;13 Advanced Temporal Queries;238
11.3.1;A Basic Temporal Range Multi-Table Query;238
11.3.1.1;Step 1: Decoalesce and Restrict on Assertion Time;240
11.3.1.2;Step 2: Decoalesce and Restrict on State Time;241
11.3.1.3;Step 3: Drop Assertion-Time Period Columns;241
11.3.1.4;Step 4: Align on State-Time Boundaries;243
11.3.1.5;Step 5: Join on RefId and State Time;245
11.3.2;A Complex Temporal Range Multi-Table Query;247
11.3.2.1;Step 1: Decoalesce and Restrict on Assertion Time;248
11.3.2.2;Step 2: Decoalesce and Restrict on State Time;249
11.3.2.3;Step 3: Drop Assertion-Time Period Columns;250
11.3.2.4;Step 4: Align on State-Time Boundaries;252
11.3.2.5;Step 5: Join on RefId and State-Time;254
11.3.3;Why Temporal Range Multi-Table Queries are Complex;255
11.3.4;Glossary List;256
11.4;14 Future Assertion Time;258
11.4.1;Future Assertion Time: Semantics;258
11.4.1.1;The Six-Fold Way;259
11.4.1.2;Challenging the Six-Fold Way;260
11.4.1.2.1;Past Assertion Time;261
11.4.1.2.2;Future Assertion Time;261
11.4.1.2.3;The Incompleteness of the Six-Fold Way;263
11.4.1.3;The Nine-Fold Way;263
11.4.1.3.1;Future Assertion Time and Future State Time;264
11.4.2;Future Assertion Time: Implementation;266
11.4.2.1;The Time Travel Paradox;267
11.4.2.2;Future Assertion Time Locking;269
11.4.2.3;Future Transactions With Assertion-Time Locking;270
11.4.3;Glossary List;273
11.5;15 Temporal Requirements;274
11.5.1;Updates and Corrections to Conventional Tables;274
11.5.2;Timestamped Tables;278
11.5.3;Double-Timestamped Tables;282
11.5.4;Double-Timestamps and Corrections;285
11.5.5;The Double-Timestamped Dilemma;286
11.5.6;The Bitemporal Data Solution;287
11.5.7;Glossary List;293
11.6;16 Bitemporal Data and the Inmon Data Warehouse;294
11.6.1;A Brief History of the Data Warehouse;294
11.6.2;What is an Inmon Data Warehouse?;298
11.6.2.1;Subject Orientation;298
11.6.2.2;Integration;299
11.6.2.3;Time-variance;301
11.6.2.4;Nonvolatility;302
11.6.2.5;Support for Management Decision-Making;302
11.6.3;Why Unitemporal Tables Cannot Be Both Time-Variant and Nonvolatile;303
11.6.3.1;Two Senses of “As-Was”;304
11.6.4;The Enterprise Data Warehouse Redefined;305
11.6.5;The Semantics of the EDW and the Question of its Physical Instantiation;306
11.6.5.1;Inmon’s Arguments for a Physical EDW;307
11.6.6;Glossary List;309
11.6.7;Inmon Terms;309
11.7;17 Semantic Integration via Messaging;310
11.7.1;The Objectives of an Enterprise Database;310
11.7.2;Two Paths to Semantic Integration;312
11.7.3;The Enterprise Data Model as a Canonical Message Model;313
11.7.3.1;The Failed Mission of the Enterprise Data Model;314
11.7.3.2;A New Mission for the Enterprise Data Model;315
11.7.3.2.1;Point-to-Point Messaging;315
11.7.3.2.2;Hub-and-Spoke Messaging;316
11.7.3.2.3;A Mapping Rules Dictionary;319
11.7.3.2.4;The Extended Enterprise Data Model;320
11.7.4;Glossary List;322
11.8;18 Bitemporal Data and the Kimball Data Warehouse;324
11.8.1;Star Schemas and Relational Databases;326
11.8.2;The Star Schema Data Warehouse Architecture;327
11.8.3;The Star Schema Design Pattern;327
11.8.4;Reconceptualizing Star Schemas: Fact Tables and Dimension Tables;328
11.8.4.1;Events and Objects;328
11.8.4.2;Surrogate Keys and Natural Keys;330
11.8.5;A Bitemporal Star Schema;332
11.8.5.1;A Bitemporal Dimension Case Study;333
11.8.5.1.1;Dimension Update #1;334
11.8.5.1.2;Dimension Update #2;335
11.8.5.1.3;Dimension Update #3;336
11.8.5.1.4;Dimension Update #4;337
11.8.5.1.5;Dimension Update #5;339
11.8.5.2;Fact Table Analysis;340
11.8.5.3;Summary of the Case Study;342
11.8.6;Bitemporal Dimensions Versus Slowly-Changing Dimensions;344
11.8.7;Glossary List;346
11.8.8;Kimball and Ross Terms;346
11.9;19 Time, Types and the Future of Relational Databases;348
11.9.1;Tritemporal Data and Statement Provenance;349
11.9.1.1;Inscription Time, State Time, Speech Act Time;351
11.9.2;Ontologizing Relational Databases;351
11.9.2.1;The Extended Relational Paradigm Ontology;354
11.9.2.2;The Extended Relational Paradigm Metamodel;356
11.9.2.2.1;The Semantic Component of the Extended Relational Paradigm Metamodel;358
11.9.2.2.1.1;Referent;359
11.9.2.2.1.2;Attribute;359
11.9.2.2.1.3;Statement-Schema;360
11.9.2.2.1.4;Atemporal State;360
11.9.2.2.1.5;Proposition;361
11.9.2.2.1.6;Statement;361
11.9.2.2.1.7;Referent/Statement-Schema;362
11.9.2.2.1.8;Attribute/Statement-Schema;362
11.9.2.2.1.9;Statement-Schema/Atemporal-State;362
11.9.2.2.1.10;Atemporal-State/Statement;363
11.9.2.2.1.11;Proposition/Statement;363
11.9.2.2.2;The Pragmatic Component;363
11.9.2.2.2.1;Source;364
11.9.2.2.2.2;Source-Association;364
11.9.2.2.2.3;Inscription-Act;364
11.9.2.2.2.4;Inscription;365
11.9.2.2.2.5;Speech-Act;365
11.9.2.2.2.6;Inscription/Inscription;365
11.9.2.2.2.7;Source/Speech-Act;366
11.9.2.2.2.8;Source/Inscription-Act;366
11.9.2.2.2.9;Inscription/Inscription-Act;366
11.9.2.2.3;The Integrated Metamodel: Semantic and Pragmatic Components;366
11.9.2.2.3.1;Statement/Inscription;366
11.9.2.2.3.2;Statement/Speech-Act;366
11.9.3;Atomic Statements and Binary Tables;367
11.9.4;Looking Ahead;368
11.9.5;Glossary List;369
11.10;20 Recommendations;370
11.10.1;Recommendations for IT Professionals in End-User IT Organizations;370
11.10.2;Recommendations for Standards Committees and Vendors;371
11.10.2.1;Remove the Ability to Correct Data in State-Time Tables;372
11.10.2.2;Specify Referent Identifiers in SQL Table Definitions;372
11.10.2.3;Specify Temporal Unique Identifiers in SQL Table Definitions;373
11.10.2.4;Package the Bitemporalization of Conventional Tables;373
11.10.2.5;Modify SQL Query Syntax to Clarify Semantics;374
11.10.2.6;Add Whenever Temporal Transactions to the Standard;375
11.10.2.7;Add Future Transaction Time to the Standard;375
11.10.3;Glossary List;375
12;Afterword: Reflections on Mindfulness and Bitemporality;376
13;Bibliography;380
14;Index;392
Preface
Time present and time past
Are both perhaps present in time future
And time future contained in time past.
T. S. Eliot, The Four Quartets: Burnt Norton
In this fragment from Burnt Norton, Eliot describes a Buddhist conception of time, one which encourages us to think of past time, present time and future time as interwoven with one another. This Buddhist concept is a useful counter-balance to our mechanistic notion of time as a linear sequence of moments which occur one after the other, and which constitute a series which can be traversed in one direction only.
Anything at all – you, or me, or any of the changeable objects around us – is at the present moment the latest stage in the history of what we are. With a different history, we would, at this present moment, be other than what we are now. In this sense, William Faulkner was correct when he wrote (in Requiem for a Nun), “The past is never dead. It’s not even past.”
It is perhaps with human beings, and the short-term and long-term projects and plans that inform their lives, that it is most obviously true that time present and time past are present in time future. Somewhere, a store manager is reviewing a history of product price changes and their effect on sales. She isn’t doing this out of simple curiosity. She is doing it because she wants to maximize future profits for her store. Somewhere, an author is working on the Great American Novel. He isn’t doing it just to pass the time. He imagines a future in which he has accomplished the great work of his life, in which accolades are heaped on him, and in which royalty checks are more than pittances. If and when either of those futures is achieved, it will be because of a history of present moments, each the culmination of a sequence of past moments during which those people worked towards those future goals.
So the intimate relationships of past, present and future manifest themselves in the changes that take place in the world. But they also manifest themselves in the changes that take place in what we say about the world.
This brings us to the subject of this book: temporal data and, in particular, bitemporal data. Bitemporal data is data that is associated with two kinds of time. One of these is the time in which things happen in the world; the other is the time in which descriptions of the world accumulate. The first kind of time is about when things were, are, or will be as the data which describes those things says they were, are, or will be. The second kind of time is about that data itself. It is about when we once thought, or still think, or may eventually come to think, that that data correctly describes what things were, are, or will be like; or at least when we once thought or still think that that data constitutes the best descriptions currently available to us.
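The two kinds of time can be made concrete with a small sketch. It is an illustration only, not the book's notation: the class and field names, the `FOREVER` sentinel, and the closed-open period convention are all assumptions made for this example. Each row carries one period for when things are as the data says (state time) and one for when we hold that description to be correct (assertion time).

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sentinel standing in for "until further notice".
FOREVER = date(9999, 12, 31)


@dataclass(frozen=True)
class BitemporalRow:
    customer_id: str    # hypothetical identifier of the thing described
    name: str           # what the row says about that thing
    state_begin: date   # when things were, are, or will be as described
    state_end: date
    assert_begin: date  # when we held, or hold, the description to be correct
    assert_end: date


def contains(begin, end, point):
    """Closed-open period test, begin <= point < end (an assumed convention)."""
    return begin <= point < end


def as_of(rows, state_point, assert_point):
    """Rows describing state_point as we believed things to be at assert_point."""
    return [
        r for r in rows
        if contains(r.state_begin, r.state_end, state_point)
        and contains(r.assert_begin, r.assert_end, assert_point)
    ]


rows = [
    # Asserted from January until March: the customer is named Smith.
    BitemporalRow("C1", "Smith", date(2014, 1, 1), FOREVER,
                  date(2014, 1, 1), date(2014, 3, 1)),
    # Asserted from March onward: a correction, the name was Smyth all along.
    BitemporalRow("C1", "Smyth", date(2014, 1, 1), FOREVER,
                  date(2014, 3, 1), FOREVER),
]

# Same point in state time, two different points in assertion time.
believed_in_february = as_of(rows, date(2014, 2, 1), date(2014, 2, 15))
believed_in_april = as_of(rows, date(2014, 2, 1), date(2014, 4, 1))
print([r.name for r in believed_in_february])  # ['Smith']
print([r.name for r in believed_in_april])     # ['Smyth']
```

Both queries ask about the same moment in the world; they differ only in when the description was held to be true, which is the distinction the second kind of time is meant to capture.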
This book is about bitemporal data that is persisted in relational databases, and about the information which that data provides. However, the extension to non-relational ways of persisting data is straightforward. I talk about data in relational databases, first of all, because that is the prevalent way of storing character set data, and because character set data is still the prevalent kind of data that describes the things an enterprise engages with, and the processes in which it engages with them.
I talk about data in relational databases, secondly, because the language of relational data and relational databases is a lingua franca among data management professionals. For example, we all know what tables, rows and columns are, and we all know what entity integrity and referential integrity are. Or, at least, we all should know these things.
But I also talk about data in relational databases, thirdly and most importantly, because relational theory is the richest and most mathematically informed of theories of data management. It is thus best suited to incorporate extensions needed to manage bitemporal data while itself remaining stable and well-grounded.
Relational theory also has both an ontology and a semantics, although neither is much discussed. To the best of my knowledge, little has been written about how the ontology and the semantics of the Relational Paradigm (as I will call the use of relational theory in data management) give meaning to the mathematical structures of sets, Cartesian Products and relations, and to their concrete manifestations as tables, columns and rows.
But in this book, I would like to say something about the ontology and the semantics of the Relational Paradigm – a set of concepts based on the relational theory invented by Dr. E. F. Codd, and on the implementation of that theory in the world’s major Database Management Systems (DBMSs). In fact, I don’t think that the Relational Paradigm can be correctly extended to accommodate bitemporal data unless these perspectives are understood and taken into consideration.
Perspectives on the Relational Paradigm of Data
One of the distinctive features of this book is that it discusses relational concepts, and their extension into the realm of bitemporal data, from several perspectives. In these discussions, I try to avoid explanations which mix these perspectives because I think that when that happens, explanations become pseudo-explanations which in fact explain nothing at all. In these discussions, I will occasionally point out examples of perspectival confusion so the reader may be better prepared to recognize it when she encounters it in her own working environment.
One perspectival distinction is the distinction between syntax and semantics. This distinction will become clearer through repeated use, but this much can be said at the outset. The syntax of the Relational Paradigm describes relational data structures, instances of those structures, and transformations made to those instances. It’s about the things that DBAs and programmers construct and manipulate. A Customer table is a data structure, for example, and one row in that table is an instance of that structure. An update to a row in that table is a transformation made to that instance. Syntax describes structures and their instances, and transformations on those instances. Those transformations add instances to a database, change instances in a database, and remove instances from a database. The instances have the structure described by their syntax. The transformations add and remove syntactically valid instances, and change valid instances into other valid instances.
The semantics of the Relational Paradigm is about the information expressed in those data structures and in their instances. Data is created and modified so that it accurately conveys information. If customer Smith changes her name to “Jones”, then we change her name on her row in the Customer table to reflect that change.
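The Smith-to-Jones example can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, assuming a two-column Customer table whose column names (`cust_id`, `name`) are invented for the example. The table definition is the structure, the inserted row is an instance, and the update is the transformation; the reason for the update, that the customer's name really did change, is the semantics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structure: a Customer table (column names are assumptions for illustration).
conn.execute(
    "CREATE TABLE Customer ("
    " cust_id TEXT NOT NULL PRIMARY KEY,"
    " name TEXT NOT NULL)"
)

# Instance: one row with that structure.
conn.execute("INSERT INTO Customer (cust_id, name) VALUES ('C1', 'Smith')")

# Transformation: the row is changed so that it keeps describing the customer.
conn.execute("UPDATE Customer SET name = 'Jones' WHERE cust_id = 'C1'")

current_name = conn.execute(
    "SELECT name FROM Customer WHERE cust_id = 'C1'"
).fetchone()[0]
print(current_name)  # Jones
```

Note that after the update the table no longer records that the customer was ever called Smith.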
The important point here is that what we do to data, we do in order to preserve its value as an embodiment of information. That is all too obvious, of course. But once we get deep into the syntax of data and its management, it is easy to lose sight of this important fact. Information is the master; data is the servant.
Here is a brief example. Relational entity integrity is often explained as the rule that no primary key in a table can be null, and that each primary key must be unique. That is a rule of syntax that a relational DBMS enforces.
Is the semantics of entity integrity left undescribed because it is too obvious to be worth mentioning? Well, consider the fact that the semantics of entity integrity is that a database may never contain contradictory statements. Is this so widely recognized and so obvious as to not be worth mentioning? I don’t think so.
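A short sketch may help connect the syntactic rule to its semantic reading. It uses Python's built-in `sqlite3` module and a hypothetical Customer table; the column names and the `try_insert` helper are assumptions made for this example. The DBMS enforces only the syntax (a unique, non-null primary key), but the effect is that the table cannot hold two rows saying different things about the same customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly because SQLite does not imply it for a
# TEXT PRIMARY KEY column.
conn.execute(
    "CREATE TABLE Customer ("
    " cust_id TEXT NOT NULL PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
conn.execute("INSERT INTO Customer (cust_id, name) VALUES ('C1', 'Smith')")


def try_insert(cust_id, name):
    """Attempt an insert; return True if accepted, False if the DBMS rejects it."""
    try:
        conn.execute(
            "INSERT INTO Customer (cust_id, name) VALUES (?, ?)",
            (cust_id, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# A second row for C1 would say the customer's name is both Smith and Jones.
duplicate_accepted = try_insert("C1", "Jones")
# A row with a null key would be a statement about no identifiable customer.
null_key_accepted = try_insert(None, "Brown")

row_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
print(duplicate_accepted, null_key_accepted, row_count)  # False False 1
```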
A consideration of contradictory statements is an entry into the realm of propositional logic and predicate logic. I discuss these perspectives on the Relational Paradigm in this book because we data management professionals should have some understanding of that logic, of how it is expressed in the Relational Paradigm, and of how it is used to manage data in relational databases.
We are all willing to do the hand-waving which acknowledges that relational theory is based on mathematics and logic. But if we can catch on to the trick of mathematics and logic embedded in the data structures and transformations that we manage, then we will build better databases and better applications. In particular, we will be more likely to provide generalized solutions to specific problems. These solutions are always more stable in the face of changing requirements than point solutions to specific problems are. They are easier to code and to maintain because they express simpler and clearer patterns than do idiosyncratic implementations of solutions to narrowly conceptualized problems. They are always better solutions.
The Temporal SQL Standards: ISO 9075:2011 and TSQL2
In late 2011, the ISO published the latest release of its SQL standard, ISO 9075:2011. This was the first ISO release to include support for bitemporal data. Prior to that, in 1994, a group of computer scientists published the TSQL2 proposed standard for the management of bitemporal data, but this proposal was never accepted by the ISO. Nonetheless, I will refer to it as a standard because it is a draft standard which represented, at the time, a consensus among a significant part of the computer science community.
A current implementation of the ISO SQL standard can be found in IBM’s DB2 10 DBMS and its successive releases, and a current implementation of the TSQL2 standard can be found in the Teradata 13 DBMS and its...




