JAN 18, 2008 5:24pm ET

Related Links

Predictive Modeling Making Insurer Inroads
February 8, 2012
Biting the Bullet for a Core Upgrade
February 6, 2012
The CRM Shift
February 3, 2012

Web Seminars

Getting Started with Big Data
Available On Demand
Transactions & Interaction: The Correlation of Structured and Unstructured Data
Available On Demand
Deliver Better Enterprise Data through Better Reference Data Management
Available On Demand

Time and Time Again: Managing Time in Relational Databases, Part 18 - Physical Implementation of Version Pattern 3

Print
Reprints
Email

In this article, we will describe the physical implementation of Version Pattern 3. This pattern expands on Patterns 1 and 2 by retaining a history of updates. This pattern was logically described in part 4 and part 5 of this series.

 

Pattern 2 provided the ability to query objects that were previous deleted. However, it failed to save history created by updates. In Pattern 3, we will start saving this update history by using versioning, and will lay the groundwork for the more robust history patterns that follow.

 

Version Tables

 

One way to keep track of updates is to use history tables that are nearly clones of the object table, and are populated by using update triggers. There are pros and cons with using this approach, which we will discuss later. Pattern 3 takes an approach where all of the versions (current and past) are stored in the same table. A Policy was the object in the Policy table in Patterns 1 and 2. Now, the table will contain multiple versions of the same object. Therefore, to distinguish these, the table that contains the versions of a policy will be called the “Policy Version” table.

 

While you might assume that having a policy version table would also require a policy (object) table, this is not so. We will only populate and query the Policy Version table for our current and previous versions of an object. Some organizations might demand both tables for various reasons, but it is not required to support the semantics of this pattern. There are benefits to having only one (version) table to update, manage and query, so we will describe this pattern with this design decision.

 

Version Primary Key

 

Having versions of the same object in the same table requires a change to the primary key of the table. Policy number (policy_nbr) was the primary key in Patterns 1 and 2. However, policy number alone is no longer a valid primary key now that we are storing versions in the same table. We need something else as part of the primary key to distinguish multiple versions of the same policy. An alternative might be to replace policy number with a surrogate key. But since surrogate keys were not introduced as part of these earlier patterns, we will use a composite key made up of the policy_nbr and something else to define the version.

 

Figure 1: Correlated Subselect for Max (see PDF)

 

That “something else” could be a sequence number (version number), but for our examples we will use the version beginning (system activity) date (ver_beg_dt). Remember, we use dates in our examples to simplify them, but we really would use a date/timestamp to allow for multiple events per day. The primary key will be policy_nbr and ver_beg_dt.

 

The current version is identified by being the row for the policy that has the highest (most recent) version begin date. So, to find the most current version you would need to find the highest (MAX) date for the policy, then use that date to qualify the ver_beg_dt to get the single current row. This can be accomplished by using a correlated subselect, as shown in Figure 1. Either way, this approach has negative performance implications in applications that read these rows. We have seen this performance problem in implementations of this pattern, and it is often not discovered until there is a larger quantity of history and a load is placed on the application. This issue will be addressed and overcome in future patterns in this series, so stay tuned.

 

Returning a version of an object “as of” a certain point in time, while possible, is a bit more complex. You need to find the specific row that began (beg_dt) prior to the input date, but it must less than the next version that is higher. It is not a simple “between” predicate. This kind of query is shown in Figure 2.

Filed under:

Advertisement

Twitter
Facebook
LinkedIn
Login  |  My Account  |  White Papers  |  Web Seminars  |  Events |  Newsletters |  eBooks
FOLLOW US
Please note you must now log in with your email address and password.