Note: A glossary of technical terms used in these articles can be found on the Web sites of the authors. The URLs are MindfulData.com and InbaseInc.com.
This glossary is being constructed as a controlled vocabulary. By that we mean that any expression involving temporal concepts used in the definition of a glossary entry must itself be a glossary entry.
One reason for being this careful with definitions is that this kind of rigor must be applied to definitions before they can be "ontologized," i.e., expressed in a formal notation like first-order logic. That, in turn, is needed to make definitions available for manipulation by software-realized inferencing engines.
Another reason for being this careful with definitions is that this rigor must be applied to any set of technical definitions before we can claim not only that we say what we mean, but also that we know what we mean when we say it.
In our previous three articles, we introduced the distinction between original and temporal transactions. We then reviewed temporal referential integrity (RI) rules as they apply to temporal deletes. This article continues the discussion by reviewing temporal RI rules as they apply to temporal updates.
These same discussions also provide the insight necessary to decide whether we should attempt to enforce temporal RI on the updates themselves, or instead defer that integrity checking until data is retrieved. With respect to the deferral option, we will repeat what we said about it in the previous article:
It's easy to disparage [the deferral option] by describing it as letting temporal "garbage" onto the database, and then trying to hide that garbage from queries. But this is indeed the approach we have taken on temporal databases that we have designed, built, put into production and supported, and we had good reason for doing so. Like almost all database professionals, we were concerned that there would be an overwhelming volume of cascade updates if we tried to enforce temporal RI at time of update, especially when the update involves the delete of a versioned object that was the "parent" of possibly many other objects, in many other tables. (See the introduction to Part 10 for a more complete description of this "cascade update" fear.)
For the sake of continuity, we repeat the following definitions which are especially important for understanding the material in this article. These, and other technical terms, are all defined in the glossary.
Temporal Referential Integrity: Definition
If there is a temporal RI dependency from table Y to version table X (not necessarily distinct), then no exposed state of the database is valid in which any row in table Y is not object-RI linked to a row in table X, or in which the effectivity time period of a row in table Y is not wholly contained within the effectivity time period of its parent row in table X. No queries are valid which would expose such a state.
Original and Temporal Transactions: Definitions
An "original" transaction is a transaction against a versioned object.
A "temporal" transaction is the result of translating an original transaction into one or more transactions against one or more episodes of that versioned object.
Updating a Versioned Object: the Original Update Transaction
To the business, when they are issuing an update against an object, they don't care if the object is versioned or not, and, indeed, they may not know. They are directing us to apply the specified change to the object. Thus, an original update transaction doesn't need to specify a version, just the object being updated and the date the update is to take effect. This is illustrated in Figure 1.
As we have pointed out before, we are using dates throughout these articles only for convenience. The approach to versioning that we are presenting here applies no matter what the granularity of the clock ticks that measure out time. The granularity used here would be the correct granularity only for that updates take place in batch mode and are applied at most once per day. As soon as online transactional updates are involved, however, we would need a finer granularity, such as a full timestamp.
Figure 1: An Original Update Transaction
Update transactions can be submitted as either complete images of the updated row or else by specifying first the row, and then the column/value pairs for each column to be updated in that row. If these transactions are to become effective exactly when they are applied, the effectivity begin date in Figure 1 is not required. In those cases, an original update transaction against versioned data will be identical to the same update submitted against non-versioned data.
Updating a Versioned Object: the Existence Constraint
The original transaction we are considering is an update. Just as with an update to a non-versioned object, the first thing that must be done is to find the target row(s). Given that the target table is a versioned table, one target row must be the most recent version of the current episode of the object. The original transaction provides the EID for the object. The most recent version of the current episode of the object is the one version of that object with an effectivity begin date in the past and an effectivity end date either in the future, or containing the "temporal null" value "12/31/9999."
First of all, note the assumption that there cannot be more than one current version of an object in a versioned table. This assumption is guaranteed by the logic of inserts and deletes against versioned objects. This reflects our commonsense belief that at any point in time, there cannot be two different things that are true about an object.
When a user submits an update transaction against a non-versioned table, she is asserting that the indicated object already exists in that table. If the object doesnt already exist, the transaction is invalid. When a corresponding original transaction is submitted, i.e., when the update will be applied to a versioned table, the update logic must also ensure that the object already exists in that table. Specifically, the update logic must ensure that the object exists right now. Just as with non-versioned tables, it's not enough that the object may have existed in the table at some point in the past or will exist at some point in the future. It must exist right now. There must also be only one row in the table representing that object. In a versioned table, that means there must be only one current version of the object, where a current version is one whose effectivity begin date is in the past and whose effectivity end date is not.
If a current version of the indicated object is found, the update can proceed. Otherwise, it is rejected as invalid. (We consider updates which are explicitly against future versions in a later article. For now, we assume that all original transactions are against current versions.)
An original update of a versioned object requires us to supercede the most recent version of the current episode of the object (which is, by definition, the current version of the object) with an updated version, and also to do the same for any active future episodes of that object. An active future episode is a future episode whose most recent version is not a delete version.
Updating a Versioned Object: Temporal Update Transactions
Let's suppose that we have identified the most recent version of the current episode of the object specified on the original update, and also the most recent versions of any active future episodes of that object. We then add a temporal update transaction to each of these episodes, superceding the most recent version in each of them with a new most recent version.
Note that these temporal updates do not flag and rewrite these versions. That would be to do an update in place, which defeats the whole purpose of versioning by overwriting the past.
Temporal updates insert a new most recent version for each active episode. The effectivity end date on the pre-update most current version is set to one click tick prior to the effectivity begin date on the version that is superceding it.
This is because an original update is a business transaction. Its intent is to place the updated information in the database, as of the indicated date. For example, if a client is being updated, the intent is that after the transaction is complete, the current description of that client is different than it was before the update, reflecting new information we have come across about the current description of that client.
But what about temporal RI? If we supercede a version of a client, what about the policies for that client that are temporally RI dependent on the client? And what about the claims for those policies that are temporally RI dependent on them?
Updating a Versioned Object: the Concern
As weve mentioned before, a pervasive fear among business IT professionals is that temporal RI enforcement on updates raises the likelihood of an extensive cascade effect, one we have called the foreign key ripple effect in other articles.
We know that normal, database management systems (DBMS)-enforced RI doesn't just link child tables to parent tables. When using the set null or cascade options, we must also consider child tables to those child tables, and continue down the RI-dependency chain as far as it extends. Thus, in a situation where we have clients, policies and claims, a cascade or set null delete of a client may require one or more policies to be set null or deleted. And if any are, the DBMS must go on to set null or delete the claims dependent on those policies.
If these RI checks must be done on normal tables whenever an object is deleted, don't temporal RI checks need to be done on versioned tables whenever a version of an object is superceded?
Contrary to a widespread belief among data management professionals, the answer is that they do not need to be done whenever a version of an object is superceded. To understand why this fear is largely unfounded, note first of all that we implement temporal RI with object foreign keys (OFKs), not with normal DBMS-enforced FKs. OFKs in a temporally dependent table do not point to a specific parent row in the versioned table they are dependent on. They point only to the object they are dependent on. Thus, in a Policy table, each row has an OFK to the client whose policy they are. But the table the Policy table is temporally RI dependent on is a versioned table, and thus an OFK does not pick out a specific row in that table.
Initially, we might want to conclude that temporal updates have no effect on temporal RI at all, and that just like with normal RI, a temporal update does not require any RI checking at all. However, this is not so. Most temporal updates have no effect on temporal RI, and thus incur no RI-related performance penalties. But there is one kind of temporal update that does require checking of dependent tables, and that may involve a lengthy chain of dependencies that must be checked so that the temporal update can either be blocked, or trigger a cascade of temporal updates, or trigger a cascade of temporal set nulls to the OFKs in the chain.
We will examine this temporal RI-triggering original update in Part 24. In the meantime, we recommend that our readers think about temporal updates, and try to figure out for themselves what kinds of temporal updates can trigger a temporal RI cascade and what kinds cannot.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access