Faisal Shah would like to thank Jay Desai for his contributions to this month’s column.
During the last 30 years, enterprises have invested billions of dollars to automate and optimize business operations through information technology. Such investments have paid off handsomely through improvements in operational productivity, reduction in operating costs and revenue enhancements. Enterprises are beginning to make the same level of capital investment in building environments to support business intelligence and enable knowledge workers to make timely, accurate and actionable decisions.
As a result of this new emphasis on knowledge work, data has emerged as a vital resource for every enterprise. Data is generated and consumed to support day-to-day business operations as well as strategic decision making throughout the enterprise. Day-to-day business operations are supported by operational data systems, whereas knowledge workers and decision-makers are supported by data warehouse and decision support systems.
First-Generation Implementations Don’t Meet Long-Term Needs
Research shows typical environments for knowledge workers and processes have evolved around federated point solutions for reporting and decision support. While such approaches have helped enterprises address near-term, parochial needs, they sometimes compromise long-term information requirements. Decision making is an evolving practice, and knowledge workers constantly need more data: more breadth, more depth and more spread. When environments are built with a narrow purpose in mind, they do not easily scale or expand to meet changing needs.
Figure 1: Typical "First-Generation" Decision Support Implementation
Figure 1 illustrates a first-generation decision support implementation: a federated, functionally aligned environment. Typically, there is a considerable amount of redundancy in data acquisition processes and in the data itself. First-generation implementations often face difficulties regarding the timeliness, consistency and accuracy of data. Moreover, when data transformation occurs in several places, it produces "multiple meanings" for the same piece of data and results in multiple versions of the "truth" throughout the enterprise. While such approaches have provided some respite to enterprises’ thirst for information, they do not provide long-term answers.
Decision making is not a fad. It is a permanent fixture, a core competency for every enterprise. It directly demonstrates an organization’s value proposition to its customers, partners and employees. Because decision making relies heavily on data, enterprises need a foundation upon which they can build and deploy decision support solutions for both the short and the long term.
Data architecture is that foundation. It provides a blueprint for an enterprise to establish a permanent, robust and comprehensive class of data repositories to support actionable decision making. Data architecture also provides a road map for methodical, high-confidence decision support implementations. Additionally, by promoting a build-once-and-reuse approach, data architecture ensures efficacy, consistency and maintainability. All in all, enterprises that are willing to invest time and effort in creating a data architecture will attain a meaningful competitive advantage.
Establishing a Next-Generation Data Architecture
There are some basic questions an enterprise needs to answer before beginning to create a data architecture:
- What are the business and technical requirements for reporting and decision support?
- Is there a need to support integrated operational reporting and "as-of-now" reporting? If the answer is yes, it establishes the need for an operational data store (ODS).
- Is there a need to support historical reporting based on periodicity (days, weeks, months, years)? If the answer is yes, it establishes the need for historical data stores (HDS).
- Is there a need to support analytic reporting (e.g., multidimensional cubes)? Is there a need to support forecasting, data mining, decision engines, etc.? If the answer to either question is yes, it establishes the need for aggregate data stores (ADS).
- What is the scope of the data architecture? What are the source systems, data processes, etc.?
- What are the timetables for delivery?
- What is the quality of the source data?
- What are the balancing, audit and control needs?
- What are the meta data requirements?
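The three yes/no questions above map directly onto repository types, which can be captured in a few lines. The following is a minimal sketch only; the class name, field names and function name are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReportingRequirements:
    """Answers to the three yes/no questions in the checklist (hypothetical names)."""
    needs_as_of_now_reporting: bool   # integrated operational, "as-of-now" reporting
    needs_historical_reporting: bool  # reporting by day, week, month, year
    needs_analytic_reporting: bool    # cubes, forecasting, data mining, decision engines


def required_repositories(req: ReportingRequirements) -> list[str]:
    """Return the repository classes implied by the answers, in tier order."""
    stores = []
    if req.needs_as_of_now_reporting:
        stores.append("ODS")  # operational data store
    if req.needs_historical_reporting:
        stores.append("HDS")  # historical data store
    if req.needs_analytic_reporting:
        stores.append("ADS")  # aggregate data store
    return stores
```

For example, an enterprise that needs "as-of-now" and analytic reporting but no historical reporting would call `required_repositories(ReportingRequirements(True, False, True))`. The remaining questions (scope, timetables, data quality, controls, meta data) shape how each repository is built rather than whether it is needed.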
Most organizations need data architectures that support wide-ranging reporting and decision support requirements: operational, historical and aggregate. Operational reporting provides an integrated view of operational information "as of now." Historical reporting provides time-invariant behavioral information over a period of time. Aggregate reporting enables effective macro analysis over large scales.
A tiered, multirepository data architecture, in which each repository supports a discrete reporting function, meets all of these needs. Operational data stores support operational reporting, the data warehouse preserves the history of the operational data for analysis and comparison over time, and the data marts support specialized analytics.
This tiered approach drives discrete, dedicated ETL processes, including balancing, auditing and controls, for each class of repository. Typically, populating the data warehouse is the most complex because differences in the ODS over a period of years must be rationalized into a single, meaningful structure.
Figure 2: Typical Next-Generation Data Architecture
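One common form of balancing control compares row counts and a control total between the source and the target of a load. The sketch below illustrates that idea under stated assumptions: rows are plain dictionaries, and the `amount` field and the `balance_check` function name are hypothetical, not part of any particular ETL tool.

```python
from decimal import Decimal


def balance_check(source_rows, target_rows, amount_field="amount"):
    """Build an audit record comparing row counts and a control total.

    Amounts are converted through str() to Decimal so that the control
    totals are compared exactly rather than as binary floats.
    """
    source_total = sum(
        (Decimal(str(row[amount_field])) for row in source_rows), Decimal("0")
    )
    target_total = sum(
        (Decimal(str(row[amount_field])) for row in target_rows), Decimal("0")
    )
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "source_total": source_total,
        "target_total": target_total,
        "balanced": (
            len(source_rows) == len(target_rows) and source_total == target_total
        ),
    }
```

A load step might call `balance_check` after writing to each repository and store the returned record as audit meta data, halting downstream loads when `balanced` is false.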
Figure 2 provides a representation of a typical next-generation data architecture and its three primary types of repositories:
- Operational data stores support day-to-day operations. They serve as the comprehensive and consolidated store for enterprise data, merging data from multiple sources throughout the organization. Data organization is normalized, closely aligned with operational systems and optimized to support data consistency, flexibility, integrity and maintainability.
- The data warehouse provides a historical, point-in-time view of enterprise data. Data organization is denormalized, incorporates periodicity and follows a dimensional model. Data organization is also optimized to support periodicity, consistency, maintainability (loading/unloading) and accessibility.
- Data marts serve as the functionally aligned aggregate information stores. Data organization is denormalized, incorporates limited periodicity and aggregation and follows a dimensional model. Data organization is also optimized to support data accessibility and data analytics.
What does this mean? Increased data integrity and less duplication mean the enterprise has one "truth" that it can use as the basis for management decisions. A tiered approach to data architecture accomplishes this with less ETL effort and requires less movement of data around the enterprise.
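The flow of data through the three repository types can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the customer and order data, the field names, and the `load_warehouse` and `build_mart` functions. A real implementation would use database tables and a full dimensional model rather than in-memory dictionaries.

```python
from collections import defaultdict
from datetime import date

# Hypothetical ODS content: normalized, current operational data.
ods_customers = {1: {"region": "East"}, 2: {"region": "West"}}
ods_orders = [
    {"order_id": 1, "customer_id": 1, "order_date": date(2024, 1, 15), "amount": 100},
    {"order_id": 2, "customer_id": 2, "order_date": date(2024, 1, 20), "amount": 250},
    {"order_id": 3, "customer_id": 1, "order_date": date(2024, 2, 3), "amount": 50},
]


def load_warehouse(customers, orders):
    """Denormalize ODS orders into warehouse facts that carry a period key."""
    facts = []
    for order in orders:
        facts.append({
            "order_id": order["order_id"],
            # Customer attribute copied onto the fact (denormalization).
            "region": customers[order["customer_id"]]["region"],
            # Periodicity: the month the order belongs to.
            "period": order["order_date"].strftime("%Y-%m"),
            "amount": order["amount"],
        })
    return facts


def build_mart(facts):
    """Aggregate warehouse facts into totals per (region, period)."""
    totals = defaultdict(int)
    for fact in facts:
        totals[(fact["region"], fact["period"])] += fact["amount"]
    return dict(totals)


warehouse_facts = load_warehouse(ods_customers, ods_orders)
sales_mart = build_mart(warehouse_facts)
```

Because the mart is derived from the warehouse, and the warehouse from the ODS, each transformation is defined once, which is how the tiered approach avoids competing versions of the same figure.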
A tiered data architecture is especially helpful for enterprises that are dealing with large data volumes. Organizations that have a terabyte or more of data can’t afford to choose the wrong architecture, given the high cost of systems and labor involved in big-data implementations. The right data architecture helps these enterprises harness their terabytes of data while minimizing effort and expense. A well-designed architecture can also improve the timeliness and accuracy of information delivery in organizations that are struggling to collect and disseminate massive amounts of data.
Well-Designed Data Architectures Achieve Technical Objectives
Enterprises can achieve some fundamental technical objectives through a well-thought-out, well-planned data architecture.
- Optimizing data value: A solid data architecture maximizes the value of enterprise data by providing a holistic view of the enterprise and its customers; ensuring that data is reliable, accurate and consistent; and supporting the analysis and reporting needs of knowledge workers throughout the organization.
- Optimizing ETL processes: A solid data architecture enables high-performance data acquisition processes, supports scalability and availability, and optimizes loading and unloading. A solid data architecture also allows quick recovery from ETL process failures and enables the collection of run-time statistics to support meta data needs.
- Optimizing usage of decision support applications: Additional technical benefits of a data architecture include enhanced query performance, flexible and timely reporting, and optimal integration with decision support tools and technologies.
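The ETL objectives above mention quick recovery from failures and the collection of run-time statistics. One simple way to get both is to run a load as a sequence of named steps, record statistics per step, and skip already-completed steps on restart. The sketch below assumes each step is a callable that returns a row count; the `run_etl` function and its arguments are hypothetical.

```python
import time


def run_etl(steps, completed=None):
    """Run (name, callable) steps in order and collect run-time statistics.

    Steps whose names are in `completed` are skipped, so a failed run can be
    restarted from the point of failure. Returns (stats, completed).
    """
    completed = set(completed or [])
    stats = []
    for name, step in steps:
        if name in completed:
            continue  # finished in an earlier run
        started = time.perf_counter()
        try:
            rows = step()
        except Exception as exc:
            stats.append({
                "step": name,
                "status": "failed",
                "error": str(exc),
                "seconds": time.perf_counter() - started,
            })
            break  # stop here; later steps depend on this one
        stats.append({
            "step": name,
            "status": "ok",
            "rows": rows,
            "seconds": time.perf_counter() - started,
        })
        completed.add(name)
    return stats, completed
```

After a failure, passing the returned `completed` set back into `run_etl` resumes at the failed step, and the accumulated `stats` records can be stored as operational meta data.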
Well-Designed Data Architectures Provide Business Benefits
As a data architecture achieves the technical objectives previously mentioned, it leads to several important business benefits for enterprises:
- Preserving investment in information: Through careful design of the data architecture, an enterprise maximizes the value of the data it collects from customers, partners and operational systems. A solid data architecture makes valuable information more secure, usable and accessible.
- Using resources more efficiently: A solid data architecture makes it easier for an organization to efficiently disseminate information and get the most out of its information technology resources.
- Maximizing the productivity of knowledge workers: A solid data architecture provides an efficient path for the transfer of enterprise information to knowledge workers, thereby allowing them easy access to the data they need.
- Gaining a competitive advantage: By providing more timely, complete and accurate information, a solid data architecture helps an enterprise gain a better understanding of its customers and its markets. A solid data architecture also creates efficiencies in information flow, leading to faster time-to-market performance for organizations.
In conclusion, a well-designed data architecture provides a solid foundation upon which an enterprise can build solutions to meet both the current and future needs of its decision-makers and knowledge workers. This foundation maximizes the value of enterprise data and supports the organization’s investment in information as a strategic resource.