Note: a glossary of technical terms used in these articles can be found on the Web sites of the authors at MindfulData.com and InbaseInc.com.
In this column, we discuss a bi-temporal transaction which results in the merger of two episodes.
Figure 1 shows two episodes, four months apart. In this example, it is now March 1, 2009. An original insert transaction has been submitted, one intended to create a version of policy P861 that is effective from 4/1/08 to 8/1/08. For eleven months, we have overlooked the fact that P861 was in effect during those four months of last year. We are now going to correct our oversight.
Figure 1: Merging Episodes - an Invalid Transaction
We already know that an original insert does not update an existing episode; it creates a new one. We also know that, for the same object, every episode is separated from adjacent episodes by at least one clock tick. And for this transaction, that is a problem. Perhaps the person submitting the transaction does not know there already is an episode for P861 that is effective up to 4/01/08. That person may not know that there already is an episode for P861 that begins on 8/01/08. But either episode is sufficient to render this original transaction invalid. If it were applied to the database, the resulting version would not create a new episode, because it would be clock-tick contiguous with both the episodes already in the table.
But it reasonable to require the user to know about these episodes, and to require him/her to write an original update transaction instead of an insert? Why can't our asserted versioning framework code, having determined that the transaction does not specify any effective time for P861 that is already specified in the target table, simply apply the transaction?
Asserted versioning framework code certainly could do this, and the result would be a valid state of the database one continuous episode from January 2008 through to the present. But just as an insert against a conventional table expresses the user's assumption that a matching row does not already exist in that table, so too an insert against an asserted version table expresses the user's assumption that his/her transaction will create a new episode, not modify an existing one. And in this case, that assumption is wrong.
Some database management systems support what is called an "upsert" transaction. This is a transaction that will be treated as an insert if a matching row does not already exist in the target table but will be treated as an update if a matching row does exist. We could define an asserted versioning upsert transaction and write the framework code to handle it. If we did so and the transaction in Figure 1 was changed to an upsert, then it would be a valid transaction and we could apply it to its target table.
But lacking an asserted versioning upsert (for now), we must require the user to submit an original update instead of an original insert. This new transaction is shown in Figure 2.
Figure 2: Merging Episodes - a Valid Transaction
In earlier columns, we examined a transaction that extended an episode forward in time, and also a transaction that extended an episode backward in time. This transaction fills in an effective time gap between two episodes, and so it can be thought of either as extending the earlier episode forward to meet the later one or as extending the later episode backward to meet the earlier one.
In fact, because one of the purposes of bi-temporal tables is to retain the ability to present the state of the data as of any past point in time, this update transaction requires us to do both. Prior to the transaction, the database contains two episodes; after the transaction it contains a single episode that includes in its effective time period the effective time periods of the original two episodes, as well as the entire effective time between them. We must not lose the information that, prior to the transaction, there were two episodes with a five-month gap between them.
Applying the Update, Stage 1
As we know from previous columns, an original update causes at least one assertion to be withdrawn, replaced and superceded. That withdrawn assertion is the assertion of the row that the new version will be contiguous with. It is withdrawn so that it can be replaced by a version that contains the same business data, but with an effective time period that is adjusted to be contiguous and non-overlapping with the time period of the new version being created by the update.
In our example, two assertions must be withdrawn, because the new version will be contiguous with both of them. We begin by withdrawing the single asserted version that comprises the earlier episode. This is done by physically updating row 1, changing its assertion end date from 9999 to the transaction date, 3/01/09. The result is shown as row 1 in Figure 3. The row number is lightly shaded to indicate that the original transaction is not yet complete. We are looking at an intermediate result, one hidden from the outside world by the ACID (atomicity, consistency, isolation and durability) properties of the original transaction, and in particular the isolation property. The row itself is heavily shaded to indicate that it is no longer asserted.
The second step is to replace the withdrawn version with a new version that is identical to it except that its effective time period ends on the same clock tick that the new version will begin on. In this case, the result happens to be a new version that is completely identical to the withdrawn version, because the original version's effective end date already ends on the correct clock tick. Row 3 in Figure 3 shows us the result of replacing the withdrawn version. Its row number is also lightly shaded.
Note the dotted assertion timeline in Figure 3. It originates in a version whose effective time period is January to March of last year, but it does not begin to be asserted until March of this year. So, its assertion arrowhead is located in March, and the attached dotted line shows us which version that arrowhead belongs to.
Figure 3: Merging Episodes, Stage 1 - Set Up Earlier Episode
In ordinary circumstances, our next step would be to insert the new version, the one specified in the transaction. But these are not ordinary circumstances. Having withdrawn and replaced the earlier of the two versions adjacent to the new one, we must now withdraw and replace the later of the two versions adjacent to the new one. The result of doing so is shown in Figure 4.
Figure 4: Merging Episodes, Stage 1 - Set Up Later Episode
We have withdrawn and replaced the later of the two adjacent versions. Our penultimate step is to create the new version, the one that will fill the gap between the two episodes.
Applying the Update, Stage 2
Row 5 in Figure 5 is our new version. It fills the gap from 4/01/08 to 8/01/08. Because it is the result of an original update, it cannot create a new episode; and so its episode begin date must be set to the begin date of the episode it will become a part of. In Figure 5, we have shown the new version being assigned to the earlier episode, the one beginning in 1/01/08. If we assigned it to the later episode, the result would have been an episode with a begin date later than the effective begin date of its earliest version. That is invalid, because we have defined an episode begin date as identical to the begin date of its earliest version.
Figure 5: Merging Episodes, Stage 2 - Create the New Version
This definition is not arbitrary. An episode is an unbroken period of effective time for an object. More precisely, it reaches all the way backward in effective time until there is a break, and either all the way forward until there is a break, or to 9999. It is the asserted version parent in all temporal referential integrity relationships.
However, it really doesn't matter at this point, because our transaction is not yet complete. Therefore, all the rows it touches are still isolated and are not available to other transactions or to queries.
There are a couple of things left to do with this table. One is to adjust the episode begin date on all the versions belonging to the episode created by this merger of two prior episodes. From the discussion just completed, we know that the episode begin date for the new episode will be 1/01/08.
This completes the semantics of the update. We have withdrawn and replaced two versions, merged two episodes, added a version, and assigned all versions the correct episode begin date. But we can apply one more transformation to this set of rows that will improve the efficiency of the physical representation of these semantics. We can reduce the total number of rows required to represent this information by one row.
Notice rows 3 and 5. They are effective-time contiguous, running from 1/01/08 to 8/01/08. But it also happens to be the case that the business data they contain is identical. Both say that, in their effective time periods, P861 was an HMO policy with a $15 co-pay. Because both are true, we can physically replace these two rows with one row, as shown in Figure 6.
Figure 6: Merging Episodes, Stage 2 - Compact and Consolidate. (For the definitions of compact and consolidate, see the glossary on our Web sites.)
Viewing Asserted Version Tables
As with all asserted version tables, this one is the physical amalgamation of a series of states of the table. We propose that nearly all access to asserted version tables be through views that include various filters. The most common view, of course, would be the current state of the table. That is materialized (or virtualized) with a CREATE VIEW statement that includes the following clause:
WHERE asr-beg =< CURRENT-DATE AND CURRENT-DATE < asr-end
Figure 7 shows us two views of this table. With three of the columns dropped, this view will present the uni-temporal version table shown above the vertical line prior to March 1. Then, beginning on March 1, the same view statement will present the version table shown below the vertical line.
Figure 7: Before and After March 1
Figure 8 shows our taxonomy of asserted version state transformations. We have now reviewed an original update transaction that merged two episodes into a single episode. Next time, we see how an original transaction may result in one episode being split into two. As a reminder, what we are doing is traversing a taxonomy of bi-temporal state transformations to assure ourselves there are none that asserted versioning cannot handle.
Figure 8: Where We Are
Note: The authors are writing a book on temporal data management, based in large part on the material in this series. It will be published by Morgan-Kaufmann, and will be available in the second quarter of 2010.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access