JUL 1, 2010

How Long Will the Modeling Take?

Don't you just dread being asked the question, "How long will the modeling take?" I usually react to this question with another question: "How much time are you giving me?" Often, however, it comes down to me rolling up my sleeves and coming up with an estimate. I have noticed over the years that my estimates hover somewhere between 20 and 40 percent of the overall project effort. (Note that the 20 percent estimate is given rarely - only when the requirements are very well defined and the modeling and mappings are relatively simple.) What techniques do you use to estimate the modeling portion of your projects?

After determining the resources that will be provided to us and clarifying which artifacts the data modeler will be responsible for delivering, we can apply an equation to come up with a reasonable estimate to complete the data models.

Resources and Expectations

So what will the data modeler deliver? Richard Kooijman, data warehouse architect, suggests tackling this question first and lists the artifacts he includes: "We see this as the design phase ... including the data model, mapping rules and metadata information for end user tooling." Both Norman Daoust, data modeling consultant, and Georgia Prothero, data modeler, ask several questions before providing an estimate. Norman asks:

a) Could you produce me a complete list of attributes, including definitions, of your current system tomorrow?

b) Who is the customer for the model?

c) Who will approve the model?

d) Is this for a new system or for existing systems?

"I use the answers as the basis for my estimate," Norman explains. "If the answer to a) is 'We don't have any existing documentation,' I know what I'm in for and my response won't be at the low end of the range. I use the answer to b) to then speak with those people to understand what they want. I use the answer to c) to talk to those people and understand their acceptance criteria. I use the answer to d) to determine who will be the source for the data requirements and then speak with them."

Georgia asks three questions that neatly complement Norman's: "First, is the project using an existing database? Second, is the project an interface, a data entry application or a processing engine? ... Interfaces are the easiest, so the data modeling is likely to take up about 40 percent [of the project]. Applications and processing engines are more complex, so the data modeling is likely to take a smaller percentage of the overall effort. And third, how experienced is your data modeler?"

Estimate Approaches

Here are some of the estimating approaches and formulas that were submitted in response to this challenge:

  • Data warehouse architect John Stinnett says, "If we're building a new mart, we roughly determine the number of fact and dimension tables and apply an hourly multiplier for each type. We'll also apply a complexity factor for a large number of attributes, snowflakes, versioning, alternate keys, etc."
  • Jan Cohen, data modeler, says, "My definition of modeling is completing the physical model. My manager guesstimates how much time I can afford to put into a project per week, say 60 percent of my time. I ask the project manager when they need to start development, say eight weeks from now. So 40 hours times eight weeks times .6 equals how much time it will take to complete 80 percent of the model. The other 20 percent is reserved for requirement changes that come up during development."
  • Richard Kier, data architect, shares his technique: "My preference has usually been to allocate based on the number of developers. I generally assume that one modeler can support a team of four to eight developers, depending on the skills of both the development and modeling team. For a clean sheet project going through a full analysis and design session, I allocate a modeler 100 percent for each set of developers and then scale back - small enhancements to add functionality to an existing system would usually get one-fourth of a resource."
  • Chris Welch, data administrator, says: "I typically apply a confidence level as a factor. If I am very confident in my overall domain knowledge and I feel the time is about 20 percent, then I'll give a confidence rating of one and the estimate is at 20 percent. A two to five confidence level I will tend to add a quarter percent. So that breaks down as:

    2 = +25% or 25% of total time
    3 = +50% or 30% of total time
    4 = +75% or 35% of total time
    5 = +100% or 40% of total time"
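
Three of the approaches above reduce to simple arithmetic, so a short Python sketch may help make them concrete. It is only an illustration of the quoted formulas, not code supplied by the contributors. The function names are invented here, and the hourly multipliers and complexity factor passed to stinnett_hours in the usage lines are hypothetical placeholders, since John Stinnett does not give his actual values.

```python
def stinnett_hours(fact_tables, dimension_tables,
                   hours_per_fact, hours_per_dimension,
                   complexity_factor=1.0):
    """Table-count estimate: hours per table type, scaled by a complexity factor.

    The multipliers and the factor are inputs because the article does not
    state them; any values passed in are the caller's assumptions.
    """
    base = fact_tables * hours_per_fact + dimension_tables * hours_per_dimension
    return base * complexity_factor


def cohen_hours(hours_per_week=40, weeks=8, allocation=0.6):
    """Time-box estimate: hours per week x weeks until development x allocation.

    In the quote, this budget is the time to complete about 80 percent of the
    model; the remaining 20 percent is held for changes during development.
    """
    return hours_per_week * weeks * allocation


def welch_share(confidence_level, base_share=0.20):
    """Confidence-adjusted share of total project time.

    Level 1 keeps the base share; each further level adds 25 percent of the
    base, so levels 1 to 5 give 20, 25, 30, 35 and 40 percent.
    """
    if confidence_level not in (1, 2, 3, 4, 5):
        raise ValueError("confidence_level must be an integer from 1 to 5")
    uplift = 0.25 * (confidence_level - 1)
    return base_share * (1 + uplift)


if __name__ == "__main__":
    # Hypothetical multipliers: 16 hours per fact table, 8 per dimension.
    print(f"Stinnett: {stinnett_hours(2, 5, 16, 8, complexity_factor=1.5):.0f} hours")
    print(f"Cohen: {cohen_hours():.0f} hours")
    for level in range(1, 6):
        print(f"Welch level {level}: {welch_share(level):.0%} of total time")
```

With the figures from Jan Cohen's quote (40 hours, eight weeks, 60 percent), the time box works out to 192 hours, and the confidence levels reproduce the 20 to 40 percent range listed above.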

There were other creative estimation approaches submitted, including what Bob Schork, senior principal engineer, calls rapid data modeling: "After the initial [joint application design] session or use case creation, when I have most of what I need, I tell the business user to give me a couple of days and let's see what I come up with. When I show the user the logical data model, I always get more detail because when they see the model visually, they start to think of things they have missed. So I get those missing elements and then modify the model and I am back the next day or two with the updated data model."

Steve Hoberman is one of the world's best-known data modeling gurus. He taught his first data modeling class in 1992 and has educated more than 10,000 people about data modeling and business intelligence techniques since then. Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!), and organizations around the globe have brought Steve in to teach his Data Modeling Master Class, which is recognized as the most comprehensive data modeling course in the industry. Steve is the author of "Data Modeling Made Simple," "Data Modeler’s Workbench" and "Data Modeling for the Business" (Technics Publications). He is the founder of the Design Challenges group and inventor of the Data Model Scorecard.
