APR 14, 2008 11:06am ET

Ontology and Taxonomy

Like many concepts in our field, ontology and taxonomy have more than one definition. In addition, these definitions can sometimes contain vague or ambiguous phrases.

The Challenge

In simple terms and using an example if possible, how would you define ontology and taxonomy, and how do they differ?

The Response

Gordon Everest, professor emeritus, provides this succinct explanation: “The synonym for ontology would be model (of something in data), and the synonym for taxonomy would be tree.” Robert Ruffin, data architect, offers this example: “The taxonomy of a tiger is that it is a subtype of cat (classification), but an ontological description may be that the tiger has a relationship to Asia, the continent on which it lives.”
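Ruffin's tiger example can be sketched in a few lines of code. This is only an illustration, not a standard ontology format: it assumes an ontology can be approximated as a set of (subject, relation, object) triples and that the taxonomy is the subset using a single classification relation. The relation names `is_a` and `lives_in`, the extra "mammal" level and the function names are all hypothetical, added for the example.

```python
# Hypothetical triples: (subject, relation, object).
ONTOLOGY = {
    ("tiger", "is_a", "cat"),       # classification: the taxonomic part
    ("cat", "is_a", "mammal"),      # assumed extra level, for illustration only
    ("tiger", "lives_in", "Asia"),  # a non-hierarchical relationship
}

def taxonomy_of(triples):
    """Keep only the classification links as (child, parent) pairs."""
    return {(s, o) for (s, r, o) in triples if r == "is_a"}

def ancestors(term, triples):
    """Walk is_a links upward from term, nearest parent first.

    Assumes each term has at most one parent; stops if a cycle is found.
    """
    parent_of = dict(taxonomy_of(triples))
    chain = []
    while term in parent_of and parent_of[term] not in chain:
        term = parent_of[term]
        chain.append(term)
    return chain

def other_facts(term, triples):
    """Relationships about term that are not classification links."""
    return sorted((r, o) for (s, r, o) in triples if s == term and r != "is_a")

print(ancestors("tiger", ONTOLOGY))    # ['cat', 'mammal']
print(other_facts("tiger", ONTOLOGY))  # [('lives_in', 'Asia')]
```

In this sketch the taxonomy answers "what kind of thing is a tiger?" while the remaining triples carry everything else the ontology says about it.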

An ontology is a formal way of organizing information. It includes putting things into categories and relating these categories with each other. The most quoted definition of ontology is from Tom Gruber: “Explicit specification of a conceptualization.” In other words, an ontology is a model, that is, a simplification of something complex in our environment using a standard set of symbols. Kinds of ontologies include but are not limited to glossaries, data dictionaries and, yes, even data models.

Steve Turnock, database engineer, says an ontology is a representation of a body of knowledge. “Ontology is closely related to semantics, the primary distinction being that ontology concerns itself with the organization of knowledge once you know what it means.” The body of knowledge can include both class and instance. We often find that one model’s class is another model’s instance. For example, wine is an instance of the class liquid, and zinfandel is an instance of the class wine.

Dave Hay, industry expert, adds, “In the modern world, the word is used to describe a list of the things that exist in an organization or an industry. Or, more specifically, it refers to the list of terms identifying those things. This includes a defined syntax and approach to specifying the relationship among those things. Ontology was originally the Greek word for the philosophical study of ‘that which exists.’ It turns out that identifying exactly what exists in our world is trickier than you might think.”

A taxonomy is an ontology in the form of a hierarchy. Steve Turnock provides this example: “The most commonly known of these is the biological classification of the structure of life itself. This is described in terms of phylum, family, genus, species and so on.” Nandi Iyer, solutions architect, adds a data twist to the definition: “Taxonomies are things of interest arranged in a hierarchical structure, typically in a supertype/subtype relationship.”

Whereas ontologies can have any type of relationship between categories, a taxonomy allows only hierarchical relationships. In a hierarchy, each child has a single parent, and a parent can have one or more children. If a child can have more than one parent (the term is poly-hierarchy), then the child is typically repeated for each parent. Examples of taxonomies are product categorizations, supertype/subtype relationships on a relational data model and dimensional hierarchies on a dimensional data model.
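The single-parent rule, and the usual way of handling a poly-hierarchy, can be sketched as follows. This is a minimal illustration under stated assumptions: the hierarchy is given as (parent, child) pairs, the product categories are invented, and any child with several parents is assumed to be a leaf so that renaming it does not disturb other links. The function names are hypothetical.

```python
from collections import defaultdict

def parents_by_child(edges):
    """Map each child to the set of its parents, given (parent, child) pairs."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return parents

def is_strict_hierarchy(edges):
    """True when every child has exactly one parent."""
    return all(len(p) == 1 for p in parents_by_child(edges).values())

def repeat_under_each_parent(edges):
    """Give a multi-parent child one separately named copy per parent.

    Assumes multi-parent children are leaves (they have no children).
    """
    parents = parents_by_child(edges)
    result = []
    for parent, child in edges:
        if len(parents[child]) > 1:
            child = f"{child} (under {parent})"
        result.append((parent, child))
    return result

# Invented product categorization: "Blender" sits under two parents.
EDGES = [("Kitchen", "Blender"), ("Gifts", "Blender"), ("Kitchen", "Toaster")]

print(is_strict_hierarchy(EDGES))                            # False
print(is_strict_hierarchy(repeat_under_each_parent(EDGES)))  # True
```

Repeating the child keeps the structure a tree, at the cost of the same item appearing in more than one place.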

Gordon Everest suggests taxonomy best practices: “Given a population of some things, we build a taxonomy to help us classify the members of the population into groups and subgroups within subgroups, etc. In a good taxonomy, every sibling set under a parent node (class) enables us to divide the parent population into mutually exclusive and collectively exhaustive subsets.”
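Everest's test for a good sibling set can be expressed as a small check. The sketch below assumes the parent population and each subgroup are available as plain collections of member names; the animal names and the function name `is_mece` are invented for illustration.

```python
def is_mece(parent_members, sibling_groups):
    """Check that sibling groups split the parent population cleanly.

    Mutually exclusive: no member appears in two groups.
    Collectively exhaustive: together the groups cover every member
    of the parent, and nothing outside it.
    """
    seen = set()
    for group in sibling_groups:
        group = set(group)
        if seen & group:  # overlap, so not mutually exclusive
            return False
        seen |= group
    return seen == set(parent_members)

animals = {"tiger", "lion", "wolf"}

print(is_mece(animals, [{"tiger", "lion"}, {"wolf"}]))          # True
print(is_mece(animals, [{"tiger", "lion"}, {"lion", "wolf"}]))  # False: overlap
print(is_mece(animals, [{"tiger"}, {"wolf"}]))                  # False: lion missing
```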

Cheryl Rimes, senior business analyst, offers a health care example: the International Statistical Classification of Diseases and Related Health Problems (ICD). ICD provides a taxonomy to classify diseases and a wide variety of signs, symptoms and causes. Dave Hay provides another example and also raises a challenge: “The most famous of these is the Dewey decimal system for cataloging library books. It starts out with 10 major categories, and subcategories are defined by tacking digits to the end of the number. This was very useful for locating books that could physically be stored in only one place. It is less useful as a way to catalog a body of knowledge. Where do you put a book about the history of mathematics in the Islamic world? History? Mathematics? Religion? This points out the problem with most taxonomies. Most of our knowledge is not hierarchical. To cram a body of knowledge into a hierarchical structure leads to all kinds of problems.”

If you would like to become a Design Challenger and have the opportunity to submit modeling solutions, please add your email address at http://www.stevehoberman.com/. If you have a challenge you would like our group to tackle, please email me a description of the scenario at me@stevehoberman.com.

Steve's publishing company, Technics Publications, recently published the first edition of the DAMA Dictionary of Data Management, a CD-ROM containing over 800 terms spanning 40 topics, including finance and accounting, knowledge management, architecture, data modeling, XML and analytics. You can order a copy from the DMReview.com Bookstore at www.dmreview.com/books.
