Note: A glossary of technical terms used in these columns can be found on the Web sites of the authors. In addition, a listing of earlier articles and/or columns in this series can be found on DMReview.com, MindfulData.com and InbaseInc.com.
In this column, we will describe and illustrate the asserted versioning schema. We begin with Figure 1, which uses our familiar policy table as an example.
Temporal primary key (PK): These three columns make up the PK of an asserted version (AV) table. In this logical schema, we do not show all three left-most PK columns, although that is how PKs usually are shown on diagrams. Instead, we want to emphasize the two temporal dimensions in which each AV row is located: effective time and assertion time. Thus, we place the begin and end dates for each temporal dimension together as pairs.
Object identifier (oid): The object identifier is a surrogate-valued, unique identifier for an object - for a policy, a client, a wellness program, a wellness program category or an enrollment. When surrogate keys are used in a table, the business column or columns that constitute the business-key unique identifier should also be included, and a unique index should be defined on them. But in order to be able to show sample tables across the width of a page, we will not include business keys in our examples.
Effective and assertion begin dates: These are the other two columns that make up an AV PK. The logic of multicolumn PKs permits any number of rows to have identical values in all but one of their columns, provided that the values in the remaining column still distinguish them. The object identifier plus an effective begin date is commonly used as a PK for versioned tables. Thus, the AV PK permits an AV table to contain multiple versions of the same object, beginning on the same effective date.
Temporal foreign key (TFK): A TFK is a type of foreign key, but one that the database management system (DBMS) cannot enforce. Its target is an AV table. But a TFK does not point to a specific row in its target table. It does not point to a specific version of an object or to a specific episode. It designates an object. Our example says that only the policy of which this row is a version belongs to a specific client. But because no row in particular represents clients, i.e. because clients are versioned, the TFK points to no row in particular.
Effective period: A pair of dates defines the effective time period for each version. As explained in part 26 of our series, we use the closed-open convention in which the time period starts on its begin date and ends one day prior to its end date. Effective begin and end dates indicate when the version began and ceased to be effective. This time period defines when a version was the then-current description of its object.
Assertion period: Using the same closed-open convention, this pair of dates indicates when we began to assert the version as true and when, if ever, we ceased to assert it as true. In most cases, we assert a version to be true by the act of inserting it into the table. In these cases, the assertion begin date is the same as the transaction create date. In most cases, once we assert a version to be true, we continue to do so "for all time." So even if a version ceases to be effective, we will usually continually to assert that it was, is and always will be true during its effective time period. However - and this is a very significant point - there are exceptions. Assertions may begin some time after their versions are added to the AV table, and assertions may end at some specified time as well.
Type and co-pay: If this table was a non-temporal table, these columns would contain the business data describing a policy. But because this table is an AV table, these columns describe a temporally restricted assertion of an effective-time delimited version of a policy.
Transaction create (txn-crt): This column contains the date on which the row was physically added to its table. Although we are using simple dates throughout our examples, other granularities of time could be used instead, such as a timestamp.
Figure 2 shows the AV policy table with one row in it.
- Oid - P861 is the unique identifier of a specific policy.
- Eff-beg - This version of P861 became effective on 1/01/04.
- Eff-end - This version of P861 is currently in effect. In these examples, 9999 represents the latest date that the DBMS can represent. For SQL Server, for example, it is 12/31/9999. This version is currently in effect because its eff-beg date is in the past and its eff-end date is in the future. The present is what lies between the past and the future.
- Asr-beg - We began to assert this version as a true representation of what P861 looked like during its indicated effective-time period on 1/01/04. We could have begun asserting this version either before or after its eff-beg date.
- Asr-end - We continue to assert that this version correctly describes policy P861, as it was beginning on 1/01/04 and as it continues to be to this day.
- Epis-beg - An episode of an object begins when its first version begins. Because eff-beg and epis-beg are the same date, we know that this is the first version of this episode of policy P861. This does not necessarily mean that we have no earlier information about P861 in this table. As we will see later, there may be earlier episodes of this policy. All we can conclude from this row is that there is no earlier information about P861, that is temporally contiguous with this version (i.e. no version with an eff-end date of 1/01/04).
- Tfk - This TFK tells us two things. First, it tells us that client C882 is the owner of this policy. Second, because it is a TFK, it tells us that the client table is an AV table, not a non-temporal table (i.e. that the Client AV table may have any number of versions for C882). What a TFK does not tell us is which episode of C882 corresponds to this episode of P861. As we will see in our later discussions of temporal referential integrity, referential integrity constraints between a pair of AV tables that are constraints between episodes in those tables - not between objects or versions. Also, that referential integrity constraints from a non-temporal to an AV table are constraints between rows in the non-temporal tables and episodes in the referenced AV tables.
- Type, co-pay - This rows tells us that from 1/01/04 until a new version of P861 appears in this table this policy was an HMO policy with a $15 co-pay. Moreover, even if newer versions of P861 are added to this table, as long as none of them have an effective time period that overlaps this one, this row will always be the one that describes this policy during this time period.
- Transaction create - Txn is the date on which this row was added to this table. It happens to be the same date as eff-beg and asr-beg, but it could be a different date from either or both of them. Note in particular that with the AV interpretation of bi-temporality the date on which a row begins to be asserted is not necessarily the date on which the row was added to the table. In the extensive computer science literature on bi-temporality, assertion dates are called transaction dates, and the begin date is always identical to the date the row was physically added to the table. As we will see later on, there are useful things we can do with bi-temporal tables if we break this strict correlation between an assertion/transaction begin date and the date the row was physically added.
Next time, we will look at the complete data model of the example we will use to illustrate asserted versioning. This data model is an extension of the model for our sample database of clients and policies. The extension consists of three additional tables. One is a table of wellness programs, the second a table of wellness program categories, and the third is a table of enrollments of clients in those programs.
Immediately following that column, we will begin to present asserted versioning in a series of scenarios. This is an example-driven approach. We will begin with a series of scenarios involving only our AV policy table. This means that the discussion of temporal-referential integrity - a complex issue - will not occur until we fully understand how asserted versioning works in the single-table scenarios.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access