In good times or bad, businesses must transform data into useful information to not only run the business, but to grow it and increase profits. In addition to sales and profitability, businesses need data to provide transparency, privacy and security for government and industry-specific regulations or initiatives.
Why We Must Change
Although demands for information expand exponentially, businesses keep falling behind because they continue to look at data integration as an IT task. This narrow focus inhibits businesses from transforming data into the comprehensive, clean, consistent and current information they and their customers, partners, suppliers and stakeholders crave. They are losing competitive opportunities and are unable to provide transparency, leaving them perpetually behind the curve of information demands.
Unable to wait for late, over-budget IT projects, businesses rely too much on data shadow systems (or spreadmarts). They lose productive time and resources hunting, gathering and merging data when they should be focusing on running the business and increasing revenue and profit.
The Old Ways
Businesses need to take a new approach to data integration if they want cost- and resource-effective access to information. They can start by recognizing the following misconceptions:
- Data integration is IT's responsibility. (Reality check: data integration is only as good as the quality of the definitions of data, transformations and metrics that only the businesspeople can provide.)
- Data integration is just a coding task. (Reality check: it's less expensive to buy a tool.)
- Data integration is only ETL. (Reality check: there are many other aspects of data integration, such as data profiling, data quality and consistency, and operational processing, that go beyond basic ETL capabilities.)
- Data integration is handled project by project, as a tactical effort. (Reality check: it should be enterprise-wide, like phones and email systems.)
The Leopard Needs to Change its Spots
When things are not working, it is time to change your approach. Learn from the many companies of all sizes across various industries that have been successful in implementing data integration. These companies have broken through the misconceptions and reoriented their data integration efforts.
It's the Business
Restricting data integration to IT is a recipe for disappointment. Information won't represent what is really happening in the business, forcing businesspeople to fill the information gap by building data shadow systems that sap productivity and result in poor data.
The business should be involved in three ways: sponsorship, participation and governance (see Figure 1).
First, although the business needs to sponsor or fund a data integration project, sponsorship is more than writing a check. Sponsorship means commitment of resources and time, not just for the current critical priority but on an ongoing basis. It does little good to sponsor data integration activities if your businesspeople don't have the time or inclination to get involved in governance and in the project. If the business sponsor isn't willing to change business processes and priorities, then his or her subordinates will get the message that it's a waste of their time, and the investment will be lost.
Second, business participation in data integration projects from beginning to end is necessary to get relevant data. This starts with gathering the business and data requirements for the project but goes much further. The most often overlooked aspect of these projects is data quality. Many people are surprised at the end of the project that data quality hasn't improved and that, in fact, more data quality problems have been uncovered. IT needs to work with the business to know the current state of data quality within the source systems. More importantly, it is necessary to understand the state of data quality across business systems, especially as it applies to inconsistent product, customer and supplier data. The business must be involved in uncovering data quality problems, determining what to do about them and establishing priorities for what should be fixed.
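To make the idea of checking data quality across business systems concrete, here is a minimal sketch in Python. It assumes two hypothetical source systems (a CRM and a billing system) that hold customer records keyed by a shared customer ID. The record layout, field names and sample values are illustrative assumptions, not taken from any particular product.

```python
def normalize(value):
    """Collapse whitespace and lowercase so cosmetic differences are ignored."""
    return " ".join(str(value).split()).lower()


def find_inconsistencies(system_a, system_b, fields):
    """Compare records that share a key and report the fields that disagree.

    system_a, system_b: dicts mapping a customer ID to a dict of fields.
    Returns a list of (customer_id, field, value_a, value_b) tuples.
    """
    issues = []
    for customer_id in sorted(system_a.keys() & system_b.keys()):
        for field in fields:
            value_a = system_a[customer_id].get(field, "")
            value_b = system_b[customer_id].get(field, "")
            if normalize(value_a) != normalize(value_b):
                issues.append((customer_id, field, value_a, value_b))
    return issues


def find_unmatched(system_a, system_b):
    """Return the IDs that appear in only one of the two systems."""
    return sorted(system_a.keys() ^ system_b.keys())


# Hypothetical sample data standing in for two source-system extracts.
crm = {
    "C001": {"name": "Acme Corp", "country": "US"},
    "C002": {"name": "Globex  Inc", "country": "DE"},
    "C003": {"name": "Initech", "country": "US"},
}
billing = {
    "C001": {"name": "ACME Corp", "country": "US"},
    "C002": {"name": "Globex Inc", "country": "FR"},
    "C004": {"name": "Umbrella", "country": "UK"},
}

issues = find_inconsistencies(crm, billing, ["name", "country"])
unmatched = find_unmatched(crm, billing)
print(issues)     # fields that disagree for customers found in both systems
print(unmatched)  # customers found in only one system
```

A script like this can only surface the mismatches. Deciding which system is authoritative for each field, and which discrepancies are worth fixing first, is the part that needs the business.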
Finally, data governance is essential both on the current project and as an ongoing activity. Only businesspeople can provide the business definition of the data, how that data needs to be transformed into useful business information and the metrics that measure business performance. Just as the business evolves, so does the accompanying business data. Unlike a fine wine, data does not age well. It needs constant attention from the business to update changes to the definitions, transformation and metrics.
It's Not Just the Tool (but You Need One)
Hand coding remains pervasive in data integration projects because ETL tools once were expensive and hard to use. Although times have changed, it's still a common practice for IT to crank out the SQL code for ETL. Because the tasks are complex, the SQL code stretches on for many pages for each data source, and there are numerous SQL scripts scattered across the enterprise to gather data from all the necessary data sources. Before taking this cumbersome approach, ask:
- Are the ETL SQL scripts and hand-coded applications documented?
- Is the coding totally dependent on having the best SQL guru on the project?
- What happens with the code when the current developers leave?
- Do you know how well the ETL processes perform when run?
- How do you recover from errors?
- How responsive is your code to changes in sources or BI requirements?
- Do you have a huge queue of new ETL requirements that keeps getting bigger?
Hand-coded ETL applications do not age well. The larger they get, the harder it is to update them or fix problems. Although the initial costs are higher when purchasing a tool, after a while the hand-coded applications get increasingly costly to add to and modify (see Figure 2). In addition, responsiveness using hand-coded ETL decreases with time.
There's no need to reinvent the wheel and hand code ETL. ETL uses a lot of common processes, and it's doubtful that your programmers can code them better than the tool's vendor. Even if they could code it as well as the vendor, is it worth their time and your budget to do so? The best use of their time is to employ these prebuilt processes to transform your data rather than building the processes from scratch.
The days of costly ETL tools with limited functionality are gone. There are many tools across various price and skill set ranges to meet different levels of data integration tasks. Prices range from six figures for data integration suites down to ETL tools that come bundled with databases or are available as open source offerings. Some of the reasonably priced "best fit" or "best buy" tools might be just right for your situation.
If budget is a big concern, it would be better to select a bundled or open source tool rather than hand coding. Do not let the price fool you; these tools are still better than your best ETL programmer in the long term.
It is Bigger than ETL
ETL tools get all the publicity, but robust enterprise data integration also requires these technologies and processes:
- Data profiling: Analyzing the state and condition of data in source systems.
- Data quality: Cleansing and ensuring consistency of data.
- Auditing and process management: Managing the overall data integration workflows and enabling data auditing.
- Real-time integration: Enabling real-time access and integration of data scattered across an enterprise.
- MDM: Ensuring consistency of a company's master data (i.e., product, customer and employee reference data).
- Metadata management: Defining and managing all the data, transformations and processes used within data integration processes.
The core technology is the basic ETL process, but each of these processes builds upon and extends the basics. They start appearing in a company's data integration effort as that effort matures and as business expectations rise with success in implementing basic ETL. Like ETL, they're difficult or nearly impossible to build cost-effectively, so buying tools is best. Although many have been available as standalone tools, the top-tier ETL products have evolved into data integration suites incorporating all of these extended functions, and many of these processes are being built into the entire spectrum of ETL tools. When evaluating tools, check which ones have these extended functions. When using a data integration tool, leverage as much of this functionality as is appropriate to your business needs.
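For readers unfamiliar with data profiling, the first item in the list above, the following Python sketch shows the kind of summary a profiling step produces for each column of a source extract. It is only an illustration of the concept under assumed inputs: the `profile_column` helper, the column names and the sample rows are all hypothetical, and a commercial or open source profiling tool covers far more (patterns, ranges, cross-column rules) than this does.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, top value."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing.
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical extract from a product source system.
rows = [
    {"product_id": "P1", "unit": "EA"},
    {"product_id": "P2", "unit": "ea"},
    {"product_id": "P3", "unit": ""},
    {"product_id": "P3", "unit": "EA"},
]

profile = {col: profile_column(rows, col) for col in ("product_id", "unit")}
for col, summary in profile.items():
    print(col, summary)
```

Even on this tiny sample the summary raises questions for the business: `product_id` has fewer distinct values than rows, which suggests a duplicate, and `unit` has both a missing value and inconsistent casing.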
It is Bigger than the Project
Data integration is taking place all over your company, whether through formal tools, hand coding or filling the gaps with shadow systems. It's important to step back from a tactical project and look at the bigger enterprise perspective to determine how you or others can leverage each other's work. Too often, each data integration project works as a silo, which not only costs time, money and resources, but is also likely to mean that data is not consistent across these silos. As each data integration silo gets built, ironically, the enterprise gets farther from obtaining consistent information.
How you handle data integration activities across the enterprise depends on your company's size, information maturity, process orientation and use of standards. Approaches to breaking down these silos range from sharing standards and tools, to pooling resources, to creating a specific group dedicated to enterprise integration.
At a minimum, your company should define integration standards, reuse processes and share a common toolset. With this approach your enterprise can avoid duplication and focus its efforts on solving business needs. People will be sharing knowledge and leveraging past work while learning from others in the company.
The other end of the spectrum is to create a group dedicated to designing, developing and deploying integration applications across your company. This enables a group to become highly skilled and specialized in data integration. Generally, this happens in larger companies.
An approach in the middle is to establish a data integration center of excellence that pools resources, establishes standards and creates reusable processes.
Achieve Deliberate Data Integration
If you're in a large corporation that has years of data integration efforts, you are probably ready to expand your scope by leveraging data integration suites. If you do not yet have an extensive data governance program, it is an opportune time to expand business involvement to increase the business ROI from your efforts.
If your company has not invested in data integration, it is time to move away from accidental data integration (i.e., business and IT hand coding integration in an ad hoc manner) and make it deliberate. It saves money and helps businesspeople get the information they need.
Whether you've started data integration or not, the key is to make sure that it's pervasive and purposeful.