First, as soon as the first update takes place, the row, as originally entered, is lost because the update overwrites the row. So we lose the information that was true of the policy on the date it became effective. Next, considering the entire row as a single unit of data, all updates but the most recent one are also lost, including even the knowledge of whether or not there were any previous updates. Sometimes this is acceptable; but sometimes it is not.
Note: this may suggest that we should manage versions at the level of individual row/columns, not at the level of entire rows. Although that would certainly be possible, we believe that the same semantics can be supported by versioning entire rows, and at a lower overall system cost. We will not consider row/column level versioning in the remainder of these articles.
Now we will consider a more stringent business requirement for history.
{BR-3}. Support {BR-2}, and also keep a complete queryable history on policies, i.e., on all inserts, updates and deletes to all policies.
As our history taxonomy indicates, our first choice is to keep this history as a chronological record of either events or states. We could keep an event-based history of changes to policies by adding a row to the Policy table each time a new policy was created, and after that keeping a transaction table in which each transaction was an update or delete to the policy.
Event-based history is most appropriately used to manage changes to metric values of relationships among persistent objects, values such as counts, quantities and amounts. These relationships that contain metrics are often called balances. Each change to a relationship metric is recorded as a transaction, and in the traditional star schema design, each transaction contains a date, the change to each metric, and a foreign key to each object that is affected by the transaction, objects such as customers, products, suppliers and so on.
However, event-based history is not appropriate for managing changes to non-metric values of persistent objects. For them, state-based history is the preferred option. Because we have chosen to keep a state-based history of policies, we must now decide whether to do so using snapshots or versions. Snapshots are the preferred method for recording the coordinated history of many rows, usually across many tables, as of a single point in time. However, if the same column in the same row is updated more than once between snapshots, only the last update will be captured. Therefore, since the business requirement is to track all changes to policies, we must do so using versions.
What Version Tables Version
Before we dive into the details of managing versioned history, we must first understand what is being logically inserted, updated and deleted in versioned tables. It is not versions. It is the objects they are versions of. The physical insertion of a version logically inserts, updates or deletes an object, and does so in the following way.
- The physical insert of the first version for a policy is a logical insert of that object, i.e., that policy.
- The physical insert of subsequent versions for that policy, which are not designated as a logical delete of the policy, are logical updates of the policy.
- The physical insert of a version marked as a logical delete logically deletes the policy.
Note that "object" is being used here in the sense of "anything that persists over time," not in its narrower object-oriented sense. Customers, products, suppliers, contracts, employees, facilities, inventories, packages, invoices, purchase orders - all these are objects. For the most part, we are using insurance policies as illustrative objects in these articles.
A Policy Version Table
Each row in the original Policy table represents a policy (of course). But no row in the Policy Version table represents a policy! Instead, each row represents what we know about what a policy looked like during a designated period of time.
Multiple rows on the Policy Version table may have the same policy-nbr, of course. If this were a Policy table, that would be an error. But on a Policy Version table, it is not an error. On a Policy Version table, those rows, and only those rows, with a given policy number are versions of that policy. Those rows, and only those rows, contain information about that policy during a period of time.
So a version is a timeslice of an object. It is not the object itself. A complete sequenced set of versions is a full timeline for an object.
Since none of these rows in the Policy Version table represent policies, where are the rows that do? If this table is not a table of policies, where is that table?
The answer is that, on our implementation of versioned history, there is no Policy table. The reason is that we don't need one. For any point in time during which a policy was a policy, we can retrieve a row from the Policy Version table that represents the policy as of that point in time. The first row in the Policy Version table for a policy shows what the policy was like when it became effective. Subsequent rows show what the policy was like after each of the changes that happened to it and until the next one happened. If the Policy terminated, there is a version for that event also.









