The introduction of a technology to an organization can create conflicts and opportunities for productivity improvements. This is the case with extract, transform and load (ETL) technology.

Though technology changes rapidly, human behavior does not; and matters can degenerate into territorial politics unless management provides leadership. The ETL tool often operates at an architectural choke point, where transaction data is funneled, transformed and filtered while being collected and aggregated for query-intensive processes in the decision support data warehousing system. It sits at an architectural control point – a sort of Strait of Hormuz – through which the data must pass. If you wanted to seize control of the government of Chile, for example, you would grab the TV station and the airport; if you wanted to seize control of business intelligence in the enterprise, you would seize control of the ETL tool and its metadata repository. While that is not a probable scenario, what is likely is that staff will feel uncertain about their roles and responsibilities in the face of the technological change occasioned by the introduction of an ETL tool. Therefore, organizations must plan and implement the transition to the new technology with an understanding of how the cross-functional team will operate.

Defining and implementing the optimal ownership model for ETL technology (and indeed any tool) can be a challenge in a highly matrixed organization, especially when getting results quickly becomes a priority. Further complications arise when a third party, such as a consulting firm, provides project support or even leadership on a fixed-price basis. Data management and application development teams often conflict over who should control the ETL tool. Typically, the application team has control but feels that the data administration function is slowing development. The data administration function, for all its power and influence, may sense that it is being marginalized or made obsolete, especially if the ETL tool operator understands and can implement a data model. From the outside consultant's perspective, it is better if the consultant operating the ETL tool can interface directly with the end user, gather all requirements, build the data model and implement the applications. Both security and data center operations will also be involved: authorization to access source and target data stores must be defined and maintained, both for the developers who use the tool and for the tool itself. If the ETL tool happens to be a transformation engine, then operations staff will have to be authorized to launch production processes using it. Therefore, assembling and managing the cross-functional team is on the critical path.

Best Practices for Managing the ETL Tool

The application without data is empty, and the data without the application is useless. The application team is responsible for the business rules, the initial interface, the gathering of end-user requirements and graphical user interface (GUI) design; the data management group is responsible for data integrity, including backup and recovery, as well as the definition and design of data structures as a shared resource. The two functions cross paths as the physical representation of the data model is imported into and made visible in the design workstation of the ETL tool, hence the requirement for cross-functional management. Because data integration is a key problem addressed by ETL technology, the data management team has an important role to play in the selection of the tool as well as in its operation and support. However, the database administrators (DBAs) act more as reviewers and coordinators of the data integration process than as implementers of the applications that further data integration on a case-by-case basis. Much of the work of developing the application consists of mapping source to target data elements and applying transformation functions in the process of mapping.
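To make the mapping work concrete, here is a minimal sketch of what an ETL tool generates under the covers: a table of source-to-target field mappings, each paired with a transformation function. All field names and values here are hypothetical, invented purely for illustration; a real tool would capture these mappings graphically at the design workstation.

```python
# Minimal sketch of source-to-target mapping with per-field
# transformation functions. All names are hypothetical examples.
from datetime import date

# Each entry: target field -> (source field, transformation function)
MAPPINGS = {
    "customer_name": ("CUST_NM", str.title),                 # normalize case
    "order_total":   ("ORD_AMT", lambda c: int(c) / 100),    # cents -> dollars
    "order_date":    ("ORD_DT",  lambda s: date.fromisoformat(s).isoformat()),
}

def transform_row(source_row: dict) -> dict:
    """Apply each mapping to one source record, producing a target record."""
    return {
        target: fn(source_row[source])
        for target, (source, fn) in MAPPINGS.items()
    }

# Example source record as it might arrive from a transaction system
row = {"CUST_NM": "ACME WIDGETS", "ORD_AMT": "129900", "ORD_DT": "2003-06-15"}
print(transform_row(row))
```

Note that the mapping table, not the loop, carries the business knowledge; this is why the metadata repository that stores such mappings becomes a point of organizational leverage.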

In short, the ETL tool is most like a fourth-generation language (4GL) in that it generates an application (even when it executes as a transformation engine) based on visibility into the data structures. It properly belongs with the application group as an application development tool. The actual operator of the ETL tool – the person who sits at the design workstation – is most appropriately a developer, someone who would otherwise be writing and implementing procedural code (though that is not what happens in this case). The data management team retains responsibility for the data integrity on which the application relies, and the data structures made visible within the ETL tool will be designed and implemented by data management.
