Like many concepts in our field, ontology and taxonomy have more than one definition. In addition, these definitions can sometimes contain vague or ambiguous phrases.

The Challenge

In simple terms and using an example if possible, how would you define ontology and taxonomy, and how do they differ?

The Response

Gordon Everest, professor emeritus, provides this succinct explanation: “The synonym for ontology would be model (of something in data), and the synonym for taxonomy would be tree.” Robert Ruffin, data architect, offers this example: “The taxonomy of a tiger is that it is a subtype of cat (classification), but an ontological description may be that the tiger has a relationship to Asia, the continent on which it lives.”
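Ruffin's tiger example can be made concrete with a small Python sketch. The names used here (`taxonomy_parent`, `ontology_triples`, the relationship labels `is_a` and `lives_in`, and the extra levels `mammal` and `animal`) are illustrative assumptions rather than terms from the contributors. The sketch only aims to show that a taxonomy fits in a child-to-parent tree, while an ontology needs labeled relationships of any kind.

```python
# Taxonomy: each thing points to its single parent, forming a tree.
taxonomy_parent = {
    "tiger": "cat",
    "cat": "mammal",      # assumed extra level, for illustration
    "mammal": "animal",   # assumed extra level, for illustration
}

# Ontology: labeled relationships of any kind, stored as
# (subject, relationship, object) triples.
ontology_triples = {
    ("tiger", "is_a", "cat"),
    ("tiger", "lives_in", "Asia"),
    ("Asia", "is_a", "continent"),
}


def ancestors(thing, parent_of):
    """Walk up the tree and return every ancestor, nearest first."""
    result = []
    while thing in parent_of:
        thing = parent_of[thing]
        result.append(thing)
    return result


def related(subject, triples):
    """Return the sorted (relationship, object) pairs for one subject."""
    return sorted((rel, obj) for subj, rel, obj in triples if subj == subject)


print(ancestors("tiger", taxonomy_parent))
print(related("tiger", ontology_triples))
```

The taxonomy can only answer "what kind of thing is a tiger?", whereas the triples can also answer "where does a tiger live?".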

An ontology is a formal way of organizing information. It involves putting things into categories and relating these categories to each other. The most quoted definition of ontology is from Tom Gruber: “Explicit specification of a conceptualization.” In other words, an ontology is a model, where a model is a simplification of something complex in our environment using a standard set of symbols. Kinds of ontologies include but are not limited to glossaries, data dictionaries and, yes, even data models.

Steve Turnock, database engineer, says an ontology is a representation of a body of knowledge. “Ontology is closely related to semantics, the primary distinction being that ontology concerns itself with the organization of knowledge once you know what it means.” The body of knowledge can include both classes and instances. We often find that one model’s class is another model’s instance. For example, wine is an instance of the class liquid, and zinfandel is an instance of the class wine, so wine is an instance in one model and a class in the other.

Dave Hay, industry expert, adds, “In the modern world, the word is used to describe a list of the things that exist in an organization or an industry. Or, more specifically, it refers to the list of terms identifying those things. This includes a defined syntax and approach to specifying the relationship among those things. Ontology was originally the Greek word for the philosophical study of ‘that which exists.’ It turns out that identifying exactly what exists in our world is trickier than you might think.”

A taxonomy is an ontology in the form of a hierarchy. Steve Turnock provides this example: “The most commonly known of these is the biological classification of the structure of life itself. This is described in terms of phylum, family, genus, species and so on.” Nandi Iyer, solutions architect, adds a data twist to the definition: “Taxonomies are things of interest arranged in a hierarchical structure, typically in a supertype/subtype relationship.”

Whereas ontologies can have any type of relationship between categories, in a taxonomy there can only be hierarchies. In a hierarchy, each child has a single parent and a parent can have one or more children. If a child can have more than one parent (the term is poly-hierarchy), then the child is typically repeated for each parent. Examples of kinds of taxonomies are product categorizations, supertype/subtype relationships on a relational data model and dimensional hierarchies on a dimensional data model.
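The single-parent rule and the usual poly-hierarchy workaround can be sketched in a few lines of Python. The product categories below (with `tomato` filed under both `fruit` and `vegetable`) and the function names are hypothetical, chosen only to illustrate the rule described above.

```python
from collections import defaultdict

# (child, parent) pairs. "tomato" has two parents, so this is a poly-hierarchy.
edges = [
    ("apple", "fruit"),
    ("tomato", "fruit"),
    ("tomato", "vegetable"),
    ("carrot", "vegetable"),
    ("fruit", "produce"),
    ("vegetable", "produce"),
]


def parents_by_child(pairs):
    """Collect the set of parents recorded for each child."""
    found = defaultdict(set)
    for child, parent in pairs:
        found[child].add(parent)
    return found


def is_strict_hierarchy(pairs):
    """True when every child has exactly one parent."""
    return all(len(parents) == 1 for parents in parents_by_child(pairs).values())


def children_by_parent(pairs):
    """List children under each parent; a multi-parent child is repeated."""
    tree = defaultdict(list)
    for child, parent in pairs:
        tree[parent].append(child)
    return {parent: sorted(kids) for parent, kids in tree.items()}


print(is_strict_hierarchy(edges))
print(children_by_parent(edges))
```

Here `tomato` shows up once under `fruit` and once under `vegetable`, which is the repetition that a poly-hierarchy forces on a tree-shaped structure.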

Gordon Everest suggests taxonomy best practices: “Given a population of some things, we build a taxonomy to help us classify the members of the population into groups and subgroups within subgroups, etc. In a good taxonomy, every sibling set under a parent node (class) enables us to divide the parent population into mutually exclusive and collectively exhaustive subsets.”
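Everest's rule of thumb can be checked mechanically. The following Python sketch tests whether one sibling set divides its parent population into mutually exclusive and collectively exhaustive subsets. The function name `is_mece` and the sample wine groupings are assumptions made for illustration, not part of Everest's description.

```python
def is_mece(parent_members, sibling_groups):
    """Check that sibling groups are mutually exclusive and collectively exhaustive.

    parent_members: set of everything in the parent population.
    sibling_groups: dict mapping each subgroup name to its set of members.
    """
    seen = set()
    for members in sibling_groups.values():
        if seen & members:
            # A member already appeared in another group: not mutually exclusive.
            return False
        seen |= members
    # Every member of the parent must be covered, with nothing extra.
    return seen == set(parent_members)


wines = {"zinfandel", "merlot", "chardonnay", "riesling"}

by_color = {
    "red": {"zinfandel", "merlot"},
    "white": {"chardonnay", "riesling"},
}

overlapping = {
    "red": {"zinfandel", "merlot"},
    "favorites": {"merlot", "chardonnay", "riesling"},
}

incomplete = {
    "red": {"zinfandel", "merlot"},
    "white": {"chardonnay"},
}

print(is_mece(wines, by_color))
print(is_mece(wines, overlapping))
print(is_mece(wines, incomplete))
```

Only the grouping by color passes: the second grouping places merlot in two subsets, and the third leaves riesling unclassified.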

Cheryl Rimes, senior business analyst, offers a health care example: the International Statistical Classification of Diseases and Related Health Problems (ICD), which provides a taxonomy to classify diseases and a wide variety of signs, symptoms and causes. Dave Hay provides another example and also raises a challenge: “The most famous of these is the Dewey decimal system for cataloging library books. It starts out with 10 major categories, and subcategories are defined by tacking digits to the end of the number. This was very useful for locating books that could physically be stored in only one place. It is less useful as a way to catalog a body of knowledge. Where do you put a book about the history of mathematics in the Islamic world? History? Mathematics? Religion? This points out the problem with most taxonomies. Most of our knowledge is not hierarchical. To cram a body of knowledge into a hierarchical structure leads to all kinds of problems.”

If you would like to become a Design Challenger and have the opportunity to submit modeling solutions, please add your email address at http://www.stevehoberman.com/. If you have a challenge you would like our group to tackle, please email me a description of the scenario at me@stevehoberman.com.

Steve's publishing company, Technics Publications, recently published the first edition of the DAMA Dictionary of Data Management, a CD-ROM containing over 800 terms spanning 40 topics, including finance and accounting, knowledge management, architecture, data modeling, XML and analytics. You can order a copy from the DMReview.com Bookstore at www.dmreview.com/books.
