OCT 19, 2006 1:00am ET

Six Degrees of Separation: Connecting the Dots Between Business and Technology Using ETL

"Six degrees of separation" refers to the assertion that any person on the planet is connected to every other by five relationships with other people. For instance, if I needed to get a message to central Mongolia but did not have the phone number or address, I would communicate it to my neighbor, who knows someone in Bangalore, India, who, in turn, knows someone in Singapore, who knows someone in Beijing, who knows someone who is a neighbor of the recipient in Ulaanbaatar, Mongolia, and who hands it off. Five hops, six degrees of separation.1

In the case of extract, transform and load (ETL) technology, connecting the dots between the business and the underlying technology can be a tricky undertaking, and it ends up working much like our example of passing that postcard from Chicago to Mongolia through six degrees of separation. This will require a bit of setup and some work, but it is well worth considering given the need to build the business case for technologies that are not always intuitively or directly relevant to business operations.

At first glance, ETL looks like a technology that is more relevant to infrastructure than to a conversation about business results. The ETL platform is furnished with a design workstation at which a developer uses a high-level drag-and-drop, point-and-click interface to generate an application. Often the ETL platform occupies a key point in the system architecture - a data hub - through which heterogeneous data must pass to map upstream data elements, typically from transactional systems, to downstream data elements in a data warehouse, data mart or your target data store of choice. Along the way, a wide variety of operations are applied to transform the data, including changes in format, lookups to adjust content and actions to improve data or information quality. Predefined connectors or adaptors enable access to an extensive list of data sources and targets, extending from relational databases to XML to enterprise resource planning (ERP) or other proprietary systems and sources. The rules of interoperation between different systems are captured in a centralized metadata repository for subsequent impact analysis, easing system maintenance. The metaphor of an information supply chain that unfolds in data stages is powerful and relevant. Yet, as described so far, ETL is not a technology that addresses what keeps most businesspeople up at night. How can we make the connection in order to build the business case?
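To make the description of ETL above more concrete, here is a minimal sketch in Python of the three steps: extracting rows from an upstream source, transforming them (a format change, a lookup and a simple quality rule) and loading them into a target. Everything in it is an assumption made for illustration: the field names, the sample rows, the `STATE_TO_REGION` lookup table and the rule that rejects unknown states. A commercial ETL platform would generate something comparable from its graphical design interface rather than from hand-written code.

```python
from datetime import datetime

# Hypothetical upstream rows from a transactional system (illustrative data only).
SOURCE_ROWS = [
    {"cust_id": "C1", "sale_dt": "10/19/2006", "state": "IL", "amt": "19.99"},
    {"cust_id": "C2", "sale_dt": "10/20/2006", "state": "??", "amt": "5.00"},
]

# Hypothetical lookup table used to adjust content during the transform step.
STATE_TO_REGION = {"IL": "Midwest", "NY": "Northeast"}


def extract():
    """Extract: read rows from the upstream source."""
    return list(SOURCE_ROWS)


def transform(row):
    """Transform: map upstream fields to downstream fields.

    Returns None when the row fails the (assumed) quality rule.
    """
    region = STATE_TO_REGION.get(row["state"])
    if region is None:
        return None  # quality rule: reject rows whose state is not in the lookup
    return {
        "customer_key": row["cust_id"],
        # Format change: MM/DD/YYYY text becomes an ISO 8601 date string.
        "sale_date": datetime.strptime(row["sale_dt"], "%m/%d/%Y").date().isoformat(),
        "region": region,              # content adjusted via lookup
        "amount": float(row["amt"]),   # text becomes a number
    }


def load(rows, target):
    """Load: append the transformed rows to the target data store."""
    target.extend(rows)


def run_etl():
    """Run all three steps; return the loaded rows and the rejected rows."""
    warehouse, clean, rejected = [], [], []
    for row in extract():
        out = transform(row)
        if out is None:
            rejected.append(row)
        else:
            clean.append(out)
    load(clean, warehouse)
    return warehouse, rejected


if __name__ == "__main__":
    loaded, rejected = run_etl()
    print("loaded:", loaded)
    print("rejected:", rejected)
```

Keeping the rejected rows instead of silently dropping them is a design choice in this sketch: it leaves a trail that someone responsible for data quality can review later.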

What does keep businesspeople up at night? At the risk of oversimplifying, the CFO is worried about the bottom-line number and the integrity of the preceding lines from which it is derived; the marketing manager, about the coherence of the messaging going forth in the firm's communications; the sales staff, about meeting their quota; the product manager, about inventory levels (low but not too low) and the quality of the production process; the HR manager, about inspiring teamwork and collaboration; the executive function, about market trends, competitors, substitute products, legal issues, regulatory pitfalls and compensation plans. Of course, each of these descriptions is an oversimplification, and each role is concerned with its own relevant metrics, messages, inventories, quality and teamwork. If the enterprise is to succeed as a whole, each of them must have a concept of serving the customer (or if they are in the public sector, the constituent) and a perspective on the overall enterprise. If they do not get their heads above the day-to-day struggle for survival, then a whole set of dysfunctional behaviors can result, with damaging consequences.

Now let us (finally) return to our hypothesis about six degrees of separation. Any valid data point is separated from a business problem by a number of degrees of separation - sometimes six, sometimes fewer, sometimes more. The first level is the transactional one - a customer buys a product or service. This generates an atomic data point, a sale. Of course, the transaction itself may entail degrees of separation, such as credit checking if the purchase is made with a payment card, or validation if an insurance claim is being processed. In turn, this data is aggregated at a second level for purposes of statutory accounting and regulatory reporting, categorized according to distinctions that mean something to government auditors. At a third level, the transactions are aggregated according to other relevant master data dimensions - which customer bought which product, and when and where this occurred. A fourth level consists of adjustments to aggregates. In retail, products are returned and must be added into inventory and subtracted from revenue. In insurance, losses must be accrued, deducted from reserves and added to payouts. If the organization has a flat structure and consistent systems, we are done sooner rather than later. We have traced a path from interaction in the market to a fundamental business entity (such as customer or product) that can solve a business problem or answer a customer question. We are now able to answer basic business questions about trends in the market and related issues. In this example, we have four levels, or three degrees of separation, unless we count the one hidden in the transactional layer, in which case we have four degrees of separation. This is relatively simple, but this is rarely the case.
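The four levels just described can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the sample sales and returns, the field names and the three helper functions are all hypothetical, and the second level is reduced to a single revenue total where real statutory reporting would use many more categories.

```python
from collections import defaultdict

# Level 1: atomic transactions (hypothetical sample data, whole-dollar amounts).
SALES = [
    {"customer": "A", "product": "widget", "amount": 100},
    {"customer": "A", "product": "gadget", "amount": 50},
    {"customer": "B", "product": "widget", "amount": 100},
]

# Returned goods, used for the level 4 adjustment.
RETURNS = [
    {"customer": "B", "product": "widget", "amount": 100},
]


def total_revenue(sales):
    """Level 2: one aggregate of the kind a statutory report might carry."""
    return sum(s["amount"] for s in sales)


def by_dimension(sales, *keys):
    """Level 3: aggregate by master data dimensions such as customer or product."""
    totals = defaultdict(int)
    for s in sales:
        totals[tuple(s[k] for k in keys)] += s["amount"]
    return dict(totals)


def apply_returns(totals, returns, *keys):
    """Level 4: adjust the aggregates by subtracting returned amounts."""
    adjusted = dict(totals)
    for r in returns:
        key = tuple(r[k] for k in keys)
        adjusted[key] = adjusted.get(key, 0) - r["amount"]
    return adjusted


if __name__ == "__main__":
    print(total_revenue(SALES))
    per_product = by_dimension(SALES, "product")
    print(per_product)
    print(apply_returns(per_product, RETURNS, "product"))
```

Each function call stands for one hand-off: the output of one level is the input of the next, which is what the text counts as a degree of separation.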

There is a "gotcha." In this discussion, "degrees of separation" is a proxy for "steps in a process" or "hand-offs between system interfaces." A quick and easy result such as the one we just obtained does not really correspond to our intuitions about the complexity of business operations. This result presumes we have intermediate access to something similar to a consistent representation of customers and products or other essential reference data in the form required. In general, if each system interface (a hand-off between different representations of the same data) represents a degree of separation, then the task of merging a half dozen customer or product files across each of these interfaces represents a potentially astronomically large number of degrees of separation. The exponential fan-out goes in the wrong direction - instead of converging on your next-door neighbor who hands off the postcard to his second cousin in Bangalore, we get proliferating system interfaces. It is as though you need to get there by way of the planet Mars. This is the state of the spaghetti-like diagrams of system interrelations that represent the "before" state in data mart consolidation initiatives.
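A rough count shows how quickly interfaces proliferate. The sketch below rests on a simplifying assumption that is not in the text: every pair of systems that exchanges data needs its own interface, whereas a hub needs only one interface per system. Under that assumption the point-to-point count grows quadratically; counting the chains of hand-offs that data can follow through those interfaces would grow faster still. The function names are hypothetical.

```python
def point_to_point_interfaces(n_systems):
    """Interfaces needed if every pair of systems is connected directly."""
    return n_systems * (n_systems - 1) // 2


def hub_interfaces(n_systems):
    """Interfaces needed if every system connects only to a central hub."""
    return n_systems


if __name__ == "__main__":
    # The half dozen customer or product files mentioned above, and beyond.
    for n in (6, 12, 24):
        print(n, point_to_point_interfaces(n), hub_interfaces(n))
```

For the half dozen files mentioned above, this gives 15 pairwise interfaces against 6 through a hub, and the gap widens as systems are added.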

Filed under:
ETL
