Efforts to deliver business intelligence to end users have come up short, despite massive investments in data warehousing and data marts. The drivers for business success in information-rich industries, such as pharmaceuticals, will exacerbate this shortfall as integration needs increase, sales force automation projects accelerate and customer data moves from monthly tracking to weekly, or even daily, tracking. Sometimes the shortfall can be attributed to a lack of business rules. However, much of it continues to stem from poor data integration and the lack of a designated area in which that integration can be performed. One approach to addressing this "business intelligence gap" is to implement an architecture that incorporates an information distribution hub into the warehousing environment. This concept recognizes the need for distinct environments outside the warehouse: one that cleans and validates the data before it goes into the warehouse, and one that publishes the data in a format that is conducive to business intelligence for data marts. An information distribution hub links these environments together with automated processes, creating a platform that delivers meaningful business intelligence to users on a timely basis.

The current environment is a result of the drive to rapidly deliver meaningful business applications to end users. This has resulted in the proliferation of data marts based on different access environments and with different processing infrastructures to load and populate the data. As the number of these data marts has increased, the data management and application development functions have become very difficult to support from a corporate perspective. The information distribution hub addresses this problem by standardizing the data acquisition and storage functions for all data marts and by creating a common "publishing platform" from which all data marts can be populated.

Creating an information distribution hub with a common infrastructure for data acquisition and receipt into the warehouse has numerous benefits. First, it insulates the decision support infrastructure from changes to the operational systems infrastructure. When operational systems are reengineered, replatformed or remediated, only the single interface to the distribution hub is updated rather than numerous feeds to warehouses and marts. Second, since there is now a single point of entry into the decision support environment for each data source, each feed of information can be optimized from an operational standpoint: for electronic delivery, for automated load and validation, and for vendor formatting. This is made possible by both the organizational focus and the financial leverage that result from optimizing fewer interfaces from operational systems.
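To make the single point of entry concrete, the following is a minimal sketch of one receipt step for one data source: each row is validated, clean rows are passed on for loading, and problem rows raise alerts for manual intervention. The field names (`product_id`, `units`), the validation rules and the function names are hypothetical and chosen only for illustration; a real hub would apply the business rules agreed for each feed.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiptResult:
    loaded: list = field(default_factory=list)  # rows that passed validation
    alerts: list = field(default_factory=list)  # messages for manual intervention


def validate_row(row):
    """Return the problems found in one row (an empty list if the row is clean)."""
    problems = []
    # Hypothetical rule: every row must identify a product.
    if not row.get("product_id"):
        problems.append("missing product_id")
    # Hypothetical rule: units must be a non-negative number.
    try:
        if float(row.get("units", "")) < 0:
            problems.append("negative units")
    except (TypeError, ValueError):
        problems.append("units is not a number")
    return problems


def receive_feed(source_name, rows):
    """Single entry point for one data source: validate, then load or alert."""
    result = ReceiptResult()
    for line_no, row in enumerate(rows, start=1):
        problems = validate_row(row)
        if problems:
            result.alerts.append(
                f"{source_name} row {line_no}: {', '.join(problems)}"
            )
        else:
            result.loaded.append(row)
    return result
```

Because every feed passes through `receive_feed`, a change to an operational system would, under this sketch, require updating only the rules in `validate_row` for that source rather than each downstream warehouse or mart load.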

Creating a hub with a common infrastructure for publishing data marts also has key advantages. It leverages the common data infrastructure deployed in the hub and utilizes tools to rapidly spin off and deploy data marts. This is what allows the business to quickly adapt to changes in the marketplace and deliver customized marts. Key parts of this infrastructure should include: a very large disk staging area for data transformations and parallel summary table builds (the size of this staging area can, in some cases, approach the size of the warehouse); available processing power to leverage parallel processing and transform/summarize the data rapidly; data transformation tools to reduce development time; and database design specialists who can exploit the capabilities of the analysis tools through optimized model design.
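As a rough illustration of the publishing side, the sketch below builds several mart-level summary tables from the same set of detail rows, submitting each build as a separate task. The sample rows, the mart names and the functions `build_summary` and `publish_marts` are assumptions made for this example. Python threads merely stand in here for the parallel-enabled tools and processing power described above; they are not a recommendation of a particular technology.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical detail rows, stored at the lowest level of granularity.
WAREHOUSE_ROWS = [
    {"week": "W01", "territory": "East", "product": "A", "units": 10},
    {"week": "W01", "territory": "West", "product": "A", "units": 4},
    {"week": "W02", "territory": "East", "product": "A", "units": 7},
    {"week": "W02", "territory": "East", "product": "B", "units": 3},
]


def build_summary(rows, keys):
    """Aggregate units by the given key columns to form one summary table."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["units"]
    return dict(totals)


def publish_marts(rows, mart_definitions):
    """Build one summary table per mart, submitting the builds concurrently."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(build_summary, rows, keys)
            for name, keys in mart_definitions.items()
        }
        return {name: future.result() for name, future in futures.items()}


marts = publish_marts(
    WAREHOUSE_ROWS,
    {"by_territory": ["territory"], "by_week_product": ["week", "product"]},
)
```

Each mart is defined only by the columns it summarizes on, so a new mart is a new entry in `mart_definitions` rather than a new load process. The summaries live in the marts, and the detail rows themselves stay unsummarized.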

Figure 1 describes some of the "best practices" for deploying an information distribution hub based on projects deployed in the pharmaceutical industry.

By deploying an information distribution hub and following the "best practices," you will position your organization to rapidly deploy data marts that deliver meaningful business intelligence. You will also do this in an environment that is maintainable, supportable and adaptable to rapidly changing business requirements.

Best Practices

For Data Receipt:

  • Transient store of data
  • Automated electronic delivery
  • Automated validation and cleansing processes
  • Alerts for manual intervention
  • Automated load to warehouse

For Warehouse Data Storage:

  • Store data at lowest level of granularity
  • Retain historical data
  • Don't store summaries or "pre-integrated" fast tables
  • No end-user access

For Business Rules:

  • Clarity
  • Cross-departmental delineation
  • Established ownership
  • Data stewards
  • Change current procedures
  • Business/IT steering committee
  • Subject-matter experts
  • Knowledge repository

For Third-Party Data Acquisition:

  • Data cleansing and validation done by vendor
  • Third-party data sources should be ready to load
  • Normalized fact table
  • No summarization
  • Integration done post acquisition

For Data Mart Publishing:

  • Perform aggregation and integration using parallel-enabled tools
  • Standardize around selected tools and analysis suites
  • Decentralize some application development to users/business analysts

Figure 1: Best Practices
