In this PDM segment, we will describe the physical implementation of Version Pattern 2. This pattern was described, from a logical point of view, in part 3 of this series. This pattern expands on pattern 1 by adding a logical delete capability.
This pattern is more semantically expressive than Version Pattern 1, just as each pattern described in this series is more expressive than the patterns prior to it. This pattern allows us to keep a record of deleted rows, and thus information about when they were first inserted, last updated and, finally, deleted. A delete request which is carried out as a physical delete loses all this information, removing all evidence, in the online table, that the row ever existed.
To support logical deletes, we need an additional column, either some kind of a delete indicator or else a delete date. A delete indicator would be set to "N" when the row is first inserted, and set to "Y" if and when the row is logically deleted. A delete date would be set to {null} when the row is first inserted, and set to the date of the logical deletion if and when the row is logically deleted.
Remember, in this series when we reference a column as date we normally would consider a date/timestamp to be more accurate and to provide more flexibility as to when we can change the state of a row. We use dates in our examples to help keep them simple.
We prefer to use a delete date because it provides more information than a delete indicator. If a delete indicator is used, we must decide to either leave the last update date alone, or to put the delete date in that column. If we choose the first option, we have lost the delete date. If we choose the second option, we have overwritten the date of the last update. But if we use a delete date instead of a delete indicator, we can preserve both dates. In addition, as we will show later, it becomes easier and more efficient to query when looking for rows that were active at a given point in the past. Also discussed later in this article is the option of using a special date value rather than {null}.
A variant of Version Pattern 2 would be to add a delete reason column. Together with a delete date, this would tell us both when and why rows were logically deleted. For example, a row in a Customer table might have been logically deleted because of no recent orders, bad credit, etc. And, of course, the combination of insert date and delete date tells us how long the row was logically present in the table. (This may not correspond to how long the customer was active, however. It will correspond only if the business chooses to treat row insertion and deletion dates as also representing customer effectivity dates.)
With rows logically rather than physically deleted, it is also possible to maintain referential integrity for rows in dependent tables. Otherwise, a physical deletion would force us either to set foreign keys to {null}, or to cascade delete those dependent rows. In certain types of queries, data from a logically deleted parent row might frequently be usefully joined to nondeleted child rows. This would provide data about the parent row related to those child rows, a relationship that was in effect between the later of the insert dates for the parent and the child rows, and the delete date of the parent row.
Foreign Key Constraints
Most popular database management systems (DBMSs) allow a Set Null delete option. With this option, if there are one or more rows in one or more dependent tables that have a foreign key to a row being deleted, then those foreign keys are set to {null}. As just mentioned, while this approach retains the dependent rows, it loses the link to the parent row. So a logical delete allows us to retain more information than a physical delete which uses the Set Null option.
These same DBMSs also provide for a Cascade delete option. With this option, if there are one or more rows in one or more dependent tables that have a foreign key to a row being deleted, then those dependent rows are also deleted. If other rows have foreign keys to these dependent rows, then those other rows will also be deleted. Usually, Cascade delete options are avoided in most financial and other mission-critical applications. A Cascade delete retains less information than a Set Null delete against the same set of rows because a Cascade delete removes the dependent rows while a Set Null delete retains the rows and merely sets their foreign keys to {null}. So a logical delete also allows us to retain more information than a physical delete with the Cascade option.









