OK, what do system integrators, enterprise application integrators (EAI), B2B integrators, data warehouse architects, customer relationship management (CRM) designers, enterprise resource planners (ERP), and even architects of mergers and acquisitions (M&A) all have in common? Answer: for whatever purpose, they integrate data in a way that makes it of greater value to the enterprise, and they place it where it can be acted upon. That’s right – it’s “data” that they’re integrating. Ever wonder why your data management organization isn’t playing a bigger role in these projects? You should.

Data integration is something mankind has been doing since the advent of record keeping. Even the Bible points to such events as the census that required Joseph and Mary to travel to Bethlehem in approximately 8 B.C. Or consider Joseph, son of Jacob, whose management of a seven-year surplus around 1800 B.C. saved the people of Egypt during the seven-year famine that followed (talk about challenges with your supply chain).

According to one well-known IT management journal, data integration is at the root of most IT failures and miscellaneous other problems. The author observes, “…There are just so many different ways to integrate data and thus so many ways to fumble the project management fundamentals common to any IT initiative…[Data] integration has clearly become a basic block-and-tackle endeavor for many organizations – a low-profile service that user groups and senior management have come to take for granted.”

In this two-part article we will identify some of the different ways that data can be integrated – perhaps there are more than you might think. We will also take a look at what these methods all have in common, and we will conclude by focusing on the processes that make them different.

Different Varieties of Data Integration

The term “data integration” is one of those terms, like data quality, that get thrown around a lot. It can mean a lot of things to a lot of people. In the realm of EAI or B2B, it might simply mean to transport data from point A to point B. In a sales application, it might mean to merge two similar lists of customers into one. The IRS will be particularly interested that you integrate or aggregate every source of individual or joint income onto your 1040, line 7. In a data warehouse, integration can take on yet another meaning for the data architect who needs to combine the business’s customer information from checking account, savings account and CD account files into one common Customer Table. I have heard IT managers use the term data integration to refer to each of these situations, yet each scenario has quite a different scope and effort, and requires a different set of techniques to complete. These forms of data integration can be classified within one of four types:

Data Sharing/Data Transportation

In the final analysis, implementing EAI simply means moving information between applications. The challenge is to set the applications up to send, receive and react to information. One definition of EAI states that it is “the unrestricted sharing of data and business processes throughout the networked applications or data sources in an organization.”1 During the two decades leading to the mid-1990s, business applications were designed to run independently, or as business silos. It was during the 1990s that ERP and data warehousing architectures sought to share data between applications in an effort to gain a strategic or operational advantage. As this awareness continues to grow, companies will continue to provide their application systems with the ability to transfer and to share data between systems – and it is EAI they seek as the silver bullet.

Figure 1: Data Sharing
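
To make "send, receive and react" concrete, here is a minimal sketch of one application publishing a change and another reacting to it. It is only an illustration: the MessageBus class, the topic name and the record fields are hypothetical, and a real EAI product would add transport, queuing and error handling that this in-process toy leaves out.

```python
class MessageBus:
    """Toy in-process bus: applications subscribe to topics and react to messages."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        # Register a handler to be called for every message on this topic.
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler subscribed to the topic.
        for handler in self._subscribers.get(topic, []):
            handler(message)


# Hypothetical receiving application: billing keeps its own copy of addresses.
billing_addresses = {}


def on_address_changed(message):
    # React to the shared data by updating the local record.
    billing_addresses[message["customer_id"]] = message["address"]


bus = MessageBus()
bus.subscribe("customer.address_changed", on_address_changed)

# Hypothetical sending application: sales announces an address change.
bus.publish(
    "customer.address_changed",
    {"customer_id": "C001", "address": "12 Elm St"},
)
```

The point of the sketch is that the sender never calls the receiver directly; each application only needs to agree on the message it shares.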

Data Compilation

Data compilation is the process of accumulating a large number of similar things – records or rows in a file or table, for example. I am reminded of a bank whose departments operated as independent data silos – checking, savings, CDs, loans, etc. In order to know who their customers were they needed to compile a master file of customer records from each department.

Figure 2: Data Compilation
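
The bank example can be sketched in a few lines. This is an illustration under a generous assumption: every department is presumed to identify a customer by the same customer_id, which real silos rarely do. The department extracts, field names and the compile_master function are all hypothetical.

```python
# Hypothetical department extracts; names and fields are illustrative only.
checking = [
    {"customer_id": "C001", "name": "Ann Lee"},
    {"customer_id": "C002", "name": "Raj Patel"},
]
savings = [
    {"customer_id": "C002", "name": "Raj Patel"},
    {"customer_id": "C003", "name": "Mia Chen"},
]
loans = [
    {"customer_id": "C001", "name": "Ann Lee"},
]


def compile_master(sources):
    """Accumulate records from every department into one master list.

    The first record seen for a customer_id is kept, and each master
    record remembers which departments the customer appeared in.
    """
    master = {}
    for source_name, records in sources.items():
        for record in records:
            key = record["customer_id"]
            if key not in master:
                master[key] = {**record, "sources": [source_name]}
            else:
                master[key]["sources"].append(source_name)
    return list(master.values())


master_file = compile_master(
    {"checking": checking, "savings": savings, "loans": loans}
)
```

Without a shared key, the matching step would need name and address comparison, which is where most of the real effort goes.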

Data Aggregation

Another use of the term data integration applies when different sets of data need to be consolidated into a single occurrence. Such is the case when numeric data is aggregated to a higher (more summarized) level of granularity. For example, by definition a data warehouse is granular – each of its dimensions must be analyzed to determine the appropriate level of granularity. In preparing the data for a data warehouse, the ETL process is usually called on to aggregate its source data, combining records of a lower (more detailed) level of granularity, as in the case of aggregating weekly transactions into monthly totals.

Figure 3: Data Aggregation
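
A rough sketch of the aggregation step follows. It rolls dated transactions up to monthly totals; the sample rows and the monthly_totals function are hypothetical, and the transactions are dated by day so that each one falls cleanly into a single month.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail-level transactions: (transaction date, amount).
transactions = [
    (date(2024, 1, 5), 100.0),
    (date(2024, 1, 19), 250.0),
    (date(2024, 2, 2), 75.0),
    (date(2024, 2, 16), 25.0),
    (date(2024, 2, 23), 50.0),
]


def monthly_totals(rows):
    """Combine detail rows into one total per (year, month)."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


totals = monthly_totals(transactions)
```

Five detail records become two summary records, and the individual transactions can no longer be recovered from the result, which is why the level of granularity has to be decided before the load.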

Data Synthesis

Data synthesis is a combining or blending of data elements into a whole that becomes more valuable to the enterprise than the individual elements. In constructing the row of a customer table, for example, the customer name and address data might come from sales, credit information might come from billing, customer demographics might be found in marketing, etc. Only when these fields are combined into a single row of a table, where they can be seen at one time, do they empower the company to make better decisions about that customer (e.g., one-to-one marketing).

Figure 4: Data Synthesis
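
As a simple sketch, the customer row described above could be assembled like this. The three source systems, their field names and the synthesize_customer function are hypothetical, and the sketch assumes the systems already share a customer_id.

```python
# Hypothetical source systems, each holding different facts about a customer.
sales = {"C001": {"name": "Ann Lee", "address": "12 Elm St"}}
billing = {"C001": {"credit_rating": "A"}}
marketing = {"C001": {"age_band": "35-44"}}


def synthesize_customer(customer_id, *systems):
    """Blend the fields each system holds for one customer into a single row."""
    row = {"customer_id": customer_id}
    for system in systems:
        # A system with no record for this customer contributes nothing.
        row.update(system.get(customer_id, {}))
    return row


customer_row = synthesize_customer("C001", sales, billing, marketing)
```

Unlike compilation, which stacks similar records, synthesis widens a single record with fields that no one source held on its own.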

In next month’s column of Data e.Quality, we will examine the common set of processes that each variety of data integration uses. We will also review the unique processes that each variety needs to perform its mission.

Reference:

1. See www.webopedia.com.
