This month we are going to preview the analysis and design process for decision support systems. As with most things, if we haven’t done this before or are unsure of the process, we fall back on methods and processes that are comfortable and familiar. Most of us grew up with transaction-based systems design, where we delve into the details of how the operational processes function with our business domain. Data warehousing turns the tables on us by posing an impossible user requirement: I’ll tell you what I need when I see it.
Based on this, most traditional methods of analysis and modeling fall short. They also fail for another reason they take too long. We need to get a data mart project increment out in 30 days and a data warehouse increment out in less than 60. Most start-up tasks for traditional systems development life cycle processes (SDLC) take longer than this. However, the need remains to engineer the data warehouse database and decision support application within acceptable limits for systems engineering. What do we do?
Breaking the problem down is a good place to start. We need to decide how many activities need to be completed and how to bundle them in terms of sequential and/or parallel streams. An initial list includes:
- Program and project management tasks.
- Team training in DW building methods and tools.
- Tools selection and deployment.
- Technical infrastructure identification and deployment for the development environment (followed by the same for an acceptance and production environment).
- Data asset analysis and data staging.
- Database design.
- Applications design.
- Meta data management, data security, etc.
Our next task is to decide which internal and external project groups will undertake each stream. For example, technical infrastructure (selection and/or allocation of processors, disk, network and back-up resources) is often delegated to a systems support group. Once this subset task list is determined, we need to identify time frame, milestones, deliverables and resources for each activity (within our primary area of responsibility) and how we will interface/delegate all secondary activities (such as outsourcing team training).
An example of what we need is illustrated in Figure 1.
It’s always hardest the first time around because you must complete a lot of setup tasks; but, those tasks can simply be augmented and tuned for each succeeding project (as the warehouse will be successful and you will be growing the environment, right?). Our primary priority remains to capture and utilize the expertise of our client and get them what they need as fast as possible. Not only is their attention span limited, but business pressure will force us into quick decisions. From a best- practice perspective and keeping the critical time frames in mind, what works best is a rapid prototyping process designed as an exploratory warehouse (standalone to start). Simply obtain a decent size server and supporting infrastructure, sample and collect sufficient data that can be engineered into an initial star schema data model design through the deployment of automated extraction tools (often with some consulting help if this is your first time), which is then engineered into a relational or multidimensional database engine. Then decide on two or three decision support tools to preview with the client (remember we and our client are still trying to discover requirements). Selecting a tool that allows us to pose a lot of questions can help determine what kind of front-end application will suit the client’s needs. Often it is operational-based (structured reporting). Over time as the client gains experience it becomes more OLAP in nature. Don’t get too hung up on the decision support tool (if you consult industry experts or organizations, they usually have lists of tools based on processor classes that work best together). In addition, you will deploy more than one if you end up with an EIS, DSS, operational reporting and data mining tool. If you have time, think ahead and look for bundled tools which can share meta data across these four layers. But, first things first. We need to develop and rapidly deploy a database engine based on a star schema model that our data staging folks can populate. Again, speed is of the utmost importance, so collect just enough requirements to fuel source system selection and database design. The application requirements can be collected once you get the tool in the users’ hands to determine what really is needed (versus what you thought was needed if you went through a process of elaborate JAD sessions and papered the walls with business-process models). Don’t get too attached to your first, second or even third DBMS design because it is going to change. Leave your exploratory environment alone for a few months after the initial design is completed and let the client hammer away at it for awhile. What you need to do while this is going on is tune and refine the data-staging process and collect and review the SQL being generated by the users to determine what really is going on and what is needed. After determining a good cut-off point (two months minimum), you can then stage a more stable version of database, application and data extraction into your production environment.
Next month we’ll continue discussing the process of analysis and design of decision support data structures and related processes. For a more complete description of the process and deliverables discussed in this month’s edition, please contact the author at firstname.lastname@example.org.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access