During a recent data modeling master class, a participant was describing a data integration challenge and used the terms “munged” and “refarkle” in his explanation. For those of you who are unfamiliar with the words:
Munge: “to transform data in an undefined or unexplained manner” (Wikipedia)
Refarkle: “to break a complex data structure or set of data into simpler pieces.”
Using both in a sentence: The source system had completely munged vendor and purchase order data, adding weeks of unanticipated coding for the data warehouse developers to refarkle vendor.
I am always interested in expanding my vocabulary (5NF and stability partitioning only go so far!), so I asked the Design Challenger group: What other cool data terms (along with their definitions) do you use or have encountered in the data world?
The responses revealed these top 10 cool terms to toss around, which will definitely impress your data buddies!
Business Normal Form (BNF)
Submitted by Allan B. Kolber, senior enterprise architect
The result of normalization applied to a business data model (a.k.a. semantic data model); a data model at Zachman row 2.
Cowbird
Submitted by Steve Hoberman, data modeler, who first heard this term from someone in one of his classes
A cowbird is a data element that is used for a purpose for which it was not originally designed. For example, the data element Company Name is being used to store an email address. From Wikipedia: “cowbirds feed on insects, including the large numbers that may be stirred up by cattle. In order for the birds to remain mobile and stay with the herd, they have adapted by laying their eggs in other birds' nests. The cowbird will watch for when its host lays eggs, and when the nest is left unattended, the female will come in and lay its own eggs.”
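One way to hunt for the cowbird described above, sketched in Python. The column name `company_name`, the sample vendor rows and the deliberately loose email pattern are all assumptions made for illustration; they do not come from any real system.

```python
import re

# Deliberately loose pattern (an assumption): something@something.tld
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_cowbirds(rows, column="company_name"):
    """Return the rows whose `column` value looks like an email address."""
    return [
        row for row in rows
        if EMAIL_PATTERN.match(str(row.get(column, "")).strip())
    ]


# Hypothetical sample data: one name field has been taken over by an email.
vendors = [
    {"vendor_id": 1, "company_name": "Acme Tooling Ltd"},
    {"vendor_id": 2, "company_name": "orders@example.com"},
    {"vendor_id": 3, "company_name": "Globex Corporation"},
]

suspects = find_cowbirds(vendors)
print([row["vendor_id"] for row in suspects])
```

A check like this only flags candidates. Deciding whether a flagged value really is a cowbird still takes someone who knows what the data element was designed for.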
Elegant
Submitted by Steve Hoberman
Elegant sometimes refers to a data model that meets the requirements but may have some issues with actually working. For example, an extremely abstract model that will be very challenging to implement is sometimes referred to as “elegant.”
Fattribute
Submitted by Jeff Lawyer, lead data modeler
A fattribute has a business name that is so unabbreviated, verbose and unsuccinct that the number of characters in the name exceeds the maximum character limitation of most data modeling software packages.
Submitted by Trey Peters, ETL developer
1. (v.) - To completely screw up data so badly that it is barely recognizable.
2. (adj.) - Unrecognizable and unusable.
3. (n.) - A team member who is completely useless and unhelpful.
Living ERD
Submitted by Georgia Prothero, data modeler
A living ERD is a physical entity relationship diagram kept in line with production by virtue of the fact that its maintenance is integrated with the database change management system. “This phrase was coined by a member of the release management team in my company. I like it because it is meaningful and has nuances regarding the way data modelers view their creations,” says Georgia.
Physiological Data Model
Submitted by Steve Hoberman, who first heard of this term on a consulting assignment
A physiological data model has properties of both the logical and physical data model. This type of model is most often seen when there is not enough time given to produce both the logical and physical. Therefore, they are produced at the same time on the same model. Often, physiological models are suboptimal because they have neither the benefits of a logical nor the benefits of a physical model. Part of the model might be fully normalized and part of the model might be designed for speed.
Polyunsaturated Fact
Submitted by Johnny Gay, data analyst, who borrowed this and the next term from the world of nutrition
A polyunsaturated fact has one or more corroborating sources capable of supporting the validity or accuracy of the fact. Polyunsaturated facts help reduce the level of data integrity risk, and increase the quality of the information shown on reports. “I see these as the really good data found in healthier data marts. The more of these found there, the better. They probably have been shown to lower the risk of heart attacks in data modelers,” says Johnny.
The antonym of a polyunsaturated fact has no corroborating alternative source capable of supporting its validity or accuracy. The worst of these are found in ETL: hard-coded with no supporting source. You can think of these as the “free radicals” found in an unhealthy data mart.
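The distinction between the two kinds of fact can be sketched in Python. This sketch assumes, purely for illustration, that a fact counts as supported when at least two distinct sources report the same value. The source names (`erp`, `crm`, `etl_hardcoded`), the fact keys and the labels are hypothetical.

```python
from collections import defaultdict


def classify_facts(observations):
    """Map each (fact, value) pair to 'corroborated' or 'uncorroborated'.

    Assumption: two or more distinct sources reporting the same value
    count as corroboration; a single source does not.
    """
    sources = defaultdict(set)
    for fact, value, source in observations:
        sources[(fact, value)].add(source)
    return {
        key: "corroborated" if len(found) >= 2 else "uncorroborated"
        for key, found in sources.items()
    }


# Hypothetical observations: (fact, value, source system)
observations = [
    ("vendor_42.country", "DE", "erp"),
    ("vendor_42.country", "DE", "crm"),
    ("vendor_42.rating", "A", "etl_hardcoded"),
]

result = classify_facts(observations)
for key, label in result.items():
    print(key, label)
```

Here the country value is backed by two systems, while the rating exists only as a hard-coded ETL value, the "free radical" case.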
Submitted by Gordon Everest, professor emeritus
This mental condition, found in professional data modelers, is the habit of thinking about tables too early in the database design process. Some people popularly refer to this condition as “table think,” a phrase coined by Dr. Everest in the early 21st century.