I recently taught a data modeling basics course to a group of college seniors getting ready to enter the workforce. I was extremely excited to have the opportunity to explain the value of data models and data modeling concepts to this group, untainted by any project experience.

However, with few exceptions, during the entire time I taught no one paid attention to me! While I was talking (about pretty exciting stuff like normalization and abstraction), attendees browsed the Internet, chatted with Facebook friends and checked personal email. I am not sure how much modeling this group actually learned during the class. I did some research trying to determine what caused the lack of interest, and one idea that occurred to me is that the latest group to enter the workforce may question the value of data management. Sure, there are big data, data science, and other cool areas that deal with data, but maybe there is a belief that new technologies allow us to skip the data management step. For example, do I really need to know the rules between Customer and Account if I am throwing everything into a MongoDB collection? I learned from our most recently published book, “Business unIntelligence,” by Dr. Barry Devlin, that the Millennials are the most technologically advanced generation. They know the technologies, but as a whole, does the generation value (and, even more importantly, understand) the data? I know many Millennials do value data management, but what about those recent college graduates just entering the workforce?

How do we teach the newest group to enter our workforce the value of data management and data modeling in particular? Any ideas?

The Response

What I found very interesting when I posed this challenge to the Data Design Challenges group was the number of respondents who said the under-appreciation of data management was not limited to just a particular graduating class, but also included more experienced IT professionals. The suggestions fit into one of three categories: coming up with more relevant examples, calling “data modeling” something different, and experiencing life without data modeling.

More Relevant Examples 

Many engaging methods can be used to explain data modeling. For example, Anita Huq, data architect, recommends using Facebook as a case study in data modeling training. Kate Platonova, data architect, agrees: “Perhaps if relevant and interesting examples are used to illustrate abstract concepts young grads will pay more attention.”

Sai Koduri, senior manager, adds, “We need to mix this with real-life situations that they experience. Situations where they have to manage their academic data, visa data along with their identity data for entering into organizations, which is small but needs utmost privacy and security. That is an example for data management where integrity rules need to be explained.”

Andrew Wynn, director of information architecture, takes a similar approach: “I think it might be a good idea to spend the opening of the course asking them to list things they’ve wanted to create databases of.  Ask them if there is any type of public or private information they are aware of that they think could be better organized and/or integrated for analysis.  List what they come up with, pick a few of the juiciest ideas and facilitate conceptual data models for them right then and there.”

Vishal Aggarwal, data architect, explains data modeling by starting off with a few concepts important to the audience before getting into more advanced logical and physical modeling techniques. “My approach is to first discuss the day-to-day life subjects like Library, Student, etc., and ask them to provide the various scenarios that can happen and for which data can be captured. Once they provide the scenarios properly, we can target one by one and try to model them. This way the whole team would participate.”

Ray Doggendorf, data architect, would start explaining data modeling this way: “Regardless of technology or process, all decisions made and actions taken are best completed when they are based on fact. A fact is a recording of information (or data) about any person, place, thing or event. Example one: a customer (is a person) - we record their (facts) name, address and telephone number. Example two: a product (is a thing) - we record its (facts) size, color, style, and cost. Example three: an order (is an event, some might say a relationship) - it records information between a customer and a product (e.g., Order 123 recorded that on 9/30/2013 customer Bob went online and ordered a pair of shoes, he paid $100 and had them shipped to his home address of 123 Alphabet Road). So, data modeling is a way for us to capture the essence of data. It’s a method where we actually draw pictures and take notes so that we can visualize and understand what data is and how it relates to each other.”
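For a classroom that responds better to code than to diagrams, Doggendorf's three examples could be written down roughly as follows. This is only a minimal sketch in Python, not anything Doggendorf proposed: the class and field names are my own assumptions, and the telephone number and the shoe's size, color and style are placeholder values that do not appear in his example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Customer:
    """A person: we record name, address and telephone number."""
    name: str
    address: str
    telephone: str


@dataclass(frozen=True)
class Product:
    """A thing: we record size, color, style and cost."""
    description: str
    size: str
    color: str
    style: str
    cost: float


@dataclass(frozen=True)
class Order:
    """An event (or relationship) linking one customer to one product."""
    order_number: int
    order_date: date
    customer: Customer
    product: Product
    amount_paid: float
    ship_to: str


# Facts taken from the quoted example; the telephone number, size, color
# and style below are hypothetical placeholders.
bob = Customer(name="Bob", address="123 Alphabet Road", telephone="unknown")
shoes = Product(description="pair of shoes", size="unknown",
                color="unknown", style="unknown", cost=100.0)
order_123 = Order(order_number=123, order_date=date(2013, 9, 30),
                  customer=bob, product=shoes,
                  amount_paid=100.0, ship_to=bob.address)
```

Even this small sketch forces the questions a data modeler would ask: can an order hold more than one product, and is the shipping address always the customer's home address?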

Different Terminology

Professor Emeritus Gordon Everest recommends not using the phrase “data modeling.” Everest says, “On the surface it implies that we are building models of data. But really, we are modeling the business in data. A data model is the end product. It is a model or representation of the business, reflected in a data model. To be sure, after we have built a database according to the model design then the data model is a model of the database, i.e., a ‘data model.’ If you want to help the business, you must first know the business. One result of trying to know the business is to investigate and represent it in a model, we like to call it a ‘data model’. A data model is a model of the things that are, that exist in the business world. Today we hear a lot about big data and business intelligence, but success there is predicated on an understanding of the business – data models (of the business) are essential to the effectiveness of big data efforts. It starts with a vocabulary describing the things in the business, a definition of those things, the relationships among those things, and the (integrity) constraints (business rules) relating to those things and their relationships. That’s what data modeling is all about.”

George Rakauskas, project architect, takes a similar viewpoint, recommending we take a broader data-driven perspective which includes metadata and data mappings. Asoka Diggs, enterprise architect, agrees and recommends we read the paper “Strength in Numbers: How does Data Driven Decision-Making Affect Firm Performance.”  Diggs says, “The essence of the message is that firms that are data-driven decision-makers are more productive/valuable than firms that aren’t, and the firms that are doing this (when you can get them to talk) indicate that useful data/data management is their primary bottleneck to doing more.”

Life without Data Modeling

Sometimes the best way to learn is to explain what happens when we don’t build a data model. Navin Ladda, data architect, uses this approach. Dean Myshrall, director of data services, agrees: “I think the approach has to be to let them discover the true value of data as it relates directly to each of them. Very soon, if not now, they will be hit with the reality of what happens if data is not properly understood and cared for. They will hit aberrations in the data behind their credit scores, charge accounts, medical backgrounds, etc.”

George Burnette, senior data architect, recommends two exercises right in the beginning to illustrate what happens when things are not well-understood or well-organized: “Illustration one: give them a picture of a physical object that requires assembly - nothing simple, but something that requires lots of different kinds of parts and pieces, but do not give them the names of any of those parts and pieces. In actuality, the finished product represents a system such as a data warehouse. You can even give instructions, but do not name the parts - just call them part 1, 2, 3 and do not be consistent with objects that you call part 1 or 2 or 3 and so on; many different objects may be referred as part 1 and even some of the same may be referred to as part 2 and so on. This is a lesson on its own. Illustration two: after they have finished exercise one, let them build something else, but be consistent with part names, give them instructions, but give them a huge bucket/vat/tub containing all of the parts mixed in except add lots and lots of extra and unnecessary parts (these do not have to have a name). This is another valuable lesson.”

Emma Fortnum, data and information architect, brings up the interesting point that it can be difficult to explain what happens when we don’t build a data model because of how easy technology makes things appear: “There are a lot of people who don’t value data management because there are other hordes of people who are shielding them from the true pain that it causes.  It’s a bit like machine coding - no-one really needs to machine code much anymore, because there are 3rd/4th/5th generation assemblers/compilers/higher level languages that do that work for you - you sacrifice the ability to define exactly what you want for the convenience of using standard patterns.  This, I think, is true of data modeling up to a point.”

Summary

I very much appreciate the approaches shared in this challenge, and I was surprised how many of us have witnessed under-appreciation for data management at much more experienced levels in our organizations. As John Giles, principal consultant, remarked in his response to this challenge, “What I find perplexing are the agilists who exhibit a half-hearted approach towards data modeling, in spite of the clear articulation of benefits by people such as Larry Burns and Scott Ambler. Is it like eating fruit and vegetables - everyone knows it’s good for you, but it isn’t exciting news?”

As an aside, I have already taken some of the advice from responses to this challenge, including updating our Data Modeling Master Class by making it even more problem-driven and updating several of the exercises to make them relevant to a wider audience.

Until the next challenge!

If you’d like to join the more than 4,000 data modelers in the Data Design Challenges group, sign up at www.stevehoberman.com.
