DEC 1, 2002 1:00am ET

Much Ado About Very Little – Reply to Johnston

When I was notified by the editor of DM Direct of Johnston’s article, I had no idea who he was and had not seen his articles (probably because I have a different notion of what "scholarly research" is). I am used to people disassociating themselves from my ideas (an indicator that I am doing something right), so a claim of authorship worried me: Where have I gone wrong? I’ve been promoting the same ideas almost single-handedly since the early 1980s (I’ve just been taken to task for precisely that; see "Silly Seeley," forthcoming at TDAN), and a claim of authorship made in the 1990s by somebody unknown to me as a relational proponent did not compute.

Skimming quickly through his article, I realized that Johnston had been involved quite a few years back in an exchange with Chris Date, David McGoveran and Hugh Darwen on the subject of missing data, which I do remember (that exchange is included in the references at the end of chapter 10 of my PRACTICAL ISSUES IN DATABASE MANAGEMENT). As far as I can recall, Johnston defended many-valued logic, as distinct from the two-valued logic on which the relational model is based. No wonder I did not remember him: that kind of argument, particularly from somebody who professes to be a logician, is not likely to leave a lasting impression, and the article under consideration here won’t either. Frankly, it is very difficult to take seriously anybody who accuses Chris Date, of all people, of invective (see Chris Date’s extended personal reply in this newsletter, posted at DATABASE DEBUNKINGS).

Fortunately, it is not necessary to consider the whole Johnston article, or even his earlier two series, to show that even if he were the first to publish what he claims to be his idea (which is not the same as being the first to be aware of it), a) it is not as big a deal as he makes of it and b) it would not and does not make much of a difference in the specific context of my article. Most important, however, is that I was actually trying to convey a different idea but failed to express it correctly.

I will not dignify Johnston’s accusation of plagiarism with a reply; the reader is free to make up his own mind. Since much of his article consists of arm-waving and posturing, I will limit myself to what little technical content there is, namely the three points that Johnston "claims credit for being the first to point out" in his 1991 and 1993 articles:

  1. "[An] architectural flaw which is shared by all major relational DBMSs"
  2. "The implications and cost of that flaw"
  3. "A solution to it"

The reader is urged to keep in mind, first, that the article which prompted Johnston’s accusation focused on the integrity implications of denormalization. The issue of why current products sometimes force users to denormalize for performance is only briefly mentioned in the conclusion, in relation to a new implementation technology that I’ve been alluding to in several of my writings, and which prompted my article (more on this in the last section).

Second, what follows should not be construed as the only evidence for my case or the whole case against Johnston. It is just the minimum necessary within the constraints of time, space and inclination.

Some background. In 1985, Ted Codd published his now famous 12 rules for relational fidelity, essentially some (not necessarily all) basic requirements of a DBMS, if it is to be deemed relational. Four of those rules, 8-11, specify four kinds of data independence, that is, independence of applications from certain functions, which must be supported by an RDBMS at the database level, namely: physical, logical, integrity and distribution independence. Our concern here is with physical data independence (Rule 8) and distribution independence (Rule 11).

Physical Independence: Interactive applications and application programs should not have to be modified whenever changes in internal storage structures and access methods are made to the database.

Note that how products should achieve this is not specified, and no specific physical implementation is imposed, or prohibited. The relational model is nothing but logic applied to databases, and logic has nothing to say about physical implementation. Moreover, as I explained in my article, this is a major relational advantage, because it leaves implementers free to do whatever they darn please at the physical level to maximize performance, as long as they do not expose it to users in applications. It follows that any performance problem encountered in practice cannot possibly be due to the relational model, or to the relational nature of a DBMS. It is a product implementation issue by definition.
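To make the principle concrete, here is a minimal sketch, not taken from any of the books or articles discussed here, using Python's built-in sqlite3 module. The employee table, its columns and the index name are hypothetical, chosen only for illustration. The application's query names no storage structure or access path, so adding an index (a purely physical change) leaves both the query text and its result untouched.

```python
import sqlite3


def run_query(conn):
    # The application's query is purely logical: it names a table and
    # columns, never an index, file, or access method.
    return conn.execute(
        "SELECT name FROM employee WHERE dept = ? ORDER BY name",
        ("sales",),
    ).fetchall()


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Ann", "sales"), ("Bob", "hr"), ("Cy", "sales")],
)

before = run_query(conn)

# Physical-level change: add an access path. The application code above
# is not modified in any way.
conn.execute("CREATE INDEX employee_dept_idx ON employee (dept)")

after = run_query(conn)

# Same query text, same result; only the performance characteristics
# may differ.
assert before == after
```

Whether the DBMS actually uses the new index is its own business; the point is only that the application cannot tell the difference except, possibly, in response time.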

In SQL AND RELATIONAL BASICS (written in 1988, published in 1989) I explained the concept of physical data independence in detail and used Figure 8.1 to convey it (pp. 124-6). Note very carefully that while Tables 1 and 3 are each stored in one physical file, Tables 2 and 4 are stored in two physical files each! I was even more explicit in UNDERSTANDING RELATIONAL DATABASES (written in 1992, published in 1993), where I used a similar figure and stated as follows (pp. 46-7):
