Industry Data Challenges


In recent years, global energy and financial markets have grown significantly. As these markets continue to expand, they require an increasing amount of transaction data to be captured, managed and delivered for analysis and actionable decision-making. Factors such as the California and Texas nodal initiatives, new financial instruments and the increasing popularity of derivatives have contributed to the dramatic increase in data creation. This growth in data is expected to continue exponentially, placing higher demands on processes, technologies and resources.


Data quality is highly correlated with the accuracy of market research and thus with the profitability of clients. Missing and incorrect information can be extremely costly to decision-makers. The importance of accurate, complete, consistent and timely data cannot be overstated. Unfortunately, data from leading vendors is often inconsistent, incorrect or missing. A recent study examined four providers of energy futures data and concluded that each data vendor had inaccurate and incomplete data [1]. This study shows that an organization must add intellectual capital to every sourced piece of data it relies on.


Enterprises are shifting toward real-time consumption and distribution of information and away from traditional batch approaches. Contributing to the shift are the increasing accessibility and reliability of real-time feeds and the expanding need for a competitive advantage based on the timing of trade execution. Information that arrives late is at best less valuable and often completely worthless.


The increasing importance of internal audits and regulatory compliance, such as the Sarbanes-Oxley Act and Financial Accounting Standards Board (FASB) 133/157 accounting standards, increases the workload of client IT departments.


The combination of these challenges and the need to address them concurrently multiplies the degree of difficulty for energy and financial capital market companies. Specifically, organizations need the ability to acquire, standardize, correct, store, manage, audit and deliver more data than before - faster and with higher data quality.


This article introduces a framework for energy and financial capital markets (EFCMs) master data management (MDM). The framework is specifically developed for these industries and addresses the aforementioned challenges.


Energy and Financial Capital Markets MDM Framework


MDM provides the authoritative, reliable foundation for data used across the organization, with the goal of providing a single version of the truth. EFCMs differ in subtle ways from traditional operational MDM frameworks. Specifically:

  • Distributed time-series architecture is required to maximize storage and retrieval performance and efficiency,
  • Data quality focuses on quality assurance rather than cleansing and merging of information and
  • Reliability, throughput and real-time ability are pushed to extremes when millions of dollars are at stake every second.

A high-level architectural representation of a proposed EFCM-specific MDM framework with associated source and target systems is depicted in Figure 1 (located at the bottom of the article).

The EFCM MDM framework in Figure 1 includes two main components: the MDM process hub and the client data mart system. The MDM process hub handles centralization, quality and consistency of information. The ideal architecture for the MDM hub is an integrated, centralized MDM repository that houses the “gold copy” of all master data and metadata information. This integrated approach provides the most complete, accurate and consistent view of master data.


The federated and hybrid architectures are not well suited for energy and financial systems because of the volume of data, the real-time needs and the importance of consistency and quality. The MDM hub handles the lifecycle of data, master data and metadata. It includes facilities for process management (automation, notification, measurement, monitoring, logging and reporting), hierarchy management, model management, rule management, security and data governance. EFCM MDM systems need to be highly scalable and able to dynamically process millions of data points every hour.


The client data mart system provides a highly efficient and optimized client-centric view of the subset of information required by each department, division or company. The ideal database architecture for EFCM client data mart systems is a time-series database, which inherently understands time-based relationships, requires less storage, eliminates the need for indices and provides a much faster response time for energy and financial market data. The client data mart system provides facilities for importing, querying and exporting all energy and financial information in a simple, intuitive way. In effect, the role of the client data mart system is to empower departments, divisions or companies to interact with their custom version of a much larger universal truth. An example of the client-centric data mart view is depicted in Figure 2.
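The appeal of time-ordered storage can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `TimeSeries` class and made-up prices; it is not the design of any particular time-series database. Because observations are kept sorted by date, a date-range query reduces to two binary searches and a slice, with no separate index to maintain.

```python
from bisect import bisect_left, bisect_right
from datetime import date


class TimeSeries:
    """Hypothetical in-memory series: observations kept sorted by date."""

    def __init__(self):
        self._dates = []   # observation dates, always in increasing order
        self._values = []  # values aligned position by position with _dates

    def append(self, day, value):
        # Assumes observations arrive chronologically, as with a daily feed.
        if self._dates and day <= self._dates[-1]:
            raise ValueError("observations must arrive in increasing date order")
        self._dates.append(day)
        self._values.append(value)

    def between(self, start, end):
        # Binary search finds the slice boundaries directly in the sorted dates.
        lo = bisect_left(self._dates, start)
        hi = bisect_right(self._dates, end)
        return list(zip(self._dates[lo:hi], self._values[lo:hi]))


# Illustrative values only, not real market data.
prices = TimeSeries()
prices.append(date(2008, 4, 1), 100.98)
prices.append(date(2008, 4, 2), 104.83)
prices.append(date(2008, 4, 3), 103.83)
window = prices.between(date(2008, 4, 2), date(2008, 4, 3))
```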



Master data for EFCM MDM systems includes the following categories: supplier (data vendor), energy market products, financial market products, customer and enterprise information. EFCM MDM solutions must provide the ability to easily integrate client-generated forward curves, models and other proprietary data with external energy and financial information, as reference data or as master data based on business requirements. Representative energy capital market (ECM) MDM solutions include: agriculture, coal, futures, electricity, emissions, freight, metals, natural gas, nuclear, petrochemicals, oil and weather - just to name a few. Financial capital market (FCM) MDM solutions include: credit derivatives, currencies, equity pricing, equity fundamentals, fixed income, index constituents, indicators, indices and options. The similarity of transactional data types and the overlap of categories (such as futures) across ECM and FCM support a single EFCM architecture that accommodates both industries.


A notable distinction exists between the transactional data, which flows from suppliers through the stages of the data lifecycle (acquisition, standardization, quality assurance, normalization, validation, distribution and archival) to client data marts, and the energy or financial product master data hierarchies which capture and describe the single version of the truth through the hierarchical canonical schema (HCS).


Client, energy and financial data are made available in a variety of frequencies, methods and sources. The acquisition layer handles real-time, intraday, daily, weekly and monthly data through pull or push acquisition methods and over an array of technologies (such as FTP, HTTP(S), Web services, email, fax and client application interfaces) in a consistent, automated fashion.


The standardization layer addresses structure and data transformation requirements. Structure transformation provides a consistent file layout to data that arrives in a plethora of formats (such as Excel, comma separated, fixed length, delimited, fax and email). Data transformation standardizes supplier, customer and product information against the master data repository.
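The two transformations can be sketched as follows. This is a minimal illustration under stated assumptions: the vendor names, symbols, file layouts, prices and the `MASTER_PRODUCT_IDS` lookup are all hypothetical. Structure transformation turns each file format into the same record layout, and data transformation then replaces each vendor's symbol with the identifier held in the master data repository.

```python
import csv
import io

# Hypothetical master data: each vendor symbol maps to one master product ID.
MASTER_PRODUCT_IDS = {
    ("VENDOR_A", "NG1"): "NATGAS.FRONT_MONTH",
    ("VENDOR_B", "NGAS-F1"): "NATGAS.FRONT_MONTH",
}


def parse_csv(vendor, text):
    # Assumed layout: a header row followed by symbol,date,price rows.
    for row in csv.DictReader(io.StringIO(text)):
        yield {"vendor": vendor, "symbol": row["symbol"],
               "date": row["date"], "price": float(row["price"])}


def parse_fixed_width(vendor, text):
    # Assumed layout: symbol in columns 0-9, date in 10-19, price in 20-29.
    for line in text.splitlines():
        if line.strip():
            yield {"vendor": vendor, "symbol": line[0:10].strip(),
                   "date": line[10:20].strip(), "price": float(line[20:30])}


def standardize(record):
    # Data transformation: swap the vendor symbol for the master product ID.
    product_id = MASTER_PRODUCT_IDS[(record["vendor"], record["symbol"])]
    return {"product_id": product_id, "vendor": record["vendor"],
            "date": record["date"], "price": record["price"]}


csv_feed = "symbol,date,price\nNG1,2008-04-07,9.75\n"
fixed_feed = "NGAS-F1".ljust(10) + "2008-04-07" + "9.76".rjust(10) + "\n"

raw = list(parse_csv("VENDOR_A", csv_feed)) + list(parse_fixed_width("VENDOR_B", fixed_feed))
standardized = [standardize(record) for record in raw]
```

After this step, both vendors' quotes carry the same `product_id`, so downstream stages no longer need to know which file format or symbol convention a record came from.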


Quality assurance is provided upstream through quality checks in the MDM hub prior to distribution of information to the client data mart(s). EFCM MDM systems replace the traditional data cleansing phase with quality assurance because most vendor data cannot be changed due to contractual agreements. Corrections for out-of-range or missing values, and any other changes, need to be logged for auditing purposes.
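A quality check of this kind might look like the sketch below. The price bounds, field names and sample records are assumptions made for illustration; in practice the rules would be defined by data stewards. The check flags missing and out-of-range values and writes an audit entry for each, while leaving the vendor's original values untouched.

```python
from datetime import datetime, timezone

# Hypothetical plausibility bounds per product.
PRICE_BOUNDS = {"NATGAS.FRONT_MONTH": (0.0, 50.0)}


def quality_check(records, audit_log):
    """Annotate each record with a QA status without altering the vendor's values."""
    checked = []
    for record in records:
        low, high = PRICE_BOUNDS[record["product_id"]]
        price = record["price"]
        if price is None:
            issue = "missing value"
        elif not (low <= price <= high):
            issue = "out of range"
        else:
            issue = None
        if issue is not None:
            # Every exception is logged so that it can be traced in an audit.
            audit_log.append({
                "logged_at": datetime.now(timezone.utc).isoformat(),
                "product_id": record["product_id"],
                "date": record["date"],
                "issue": issue,
            })
        status = "passed" if issue is None else "flagged"
        checked.append({**record, "qa_status": status})
    return checked


# Illustrative records only, not real market data.
vendor_records = [
    {"product_id": "NATGAS.FRONT_MONTH", "date": "2008-04-07", "price": 9.75},
    {"product_id": "NATGAS.FRONT_MONTH", "date": "2008-04-08", "price": None},
    {"product_id": "NATGAS.FRONT_MONTH", "date": "2008-04-09", "price": 975.0},
]
audit_log = []
checked = quality_check(vendor_records, audit_log)
```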


The normalization stage shapes and categorizes data into the HCS. A metadata view example of the HCS is depicted on the left side of Figure 2. A client-centric view of the data is shown on the right side.
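The article does not specify the internal layout of the HCS, so the following is only a sketch of one way a hierarchical schema could be represented: a tree of nested dictionaries in which each product hangs off a category path. The category names and product identifiers are hypothetical.

```python
def add_to_hierarchy(root, path, product_id):
    """Place a product under the node reached by following path from the root."""
    node = root
    for level in path:
        node = node["children"].setdefault(level, {"children": {}, "products": []})
    node["products"].append(product_id)


def products_under(node):
    """Collect every product at or below a node, for example everything under Energy."""
    found = list(node["products"])
    for child in node["children"].values():
        found.extend(products_under(child))
    return found


hcs = {"children": {}, "products": []}
add_to_hierarchy(hcs, ["Energy", "Natural Gas", "Futures"], "NATGAS.FRONT_MONTH")
add_to_hierarchy(hcs, ["Energy", "Electricity", "Spot"], "POWER.DAY_AHEAD")
add_to_hierarchy(hcs, ["Financial", "Currencies"], "FX.EURUSD")

energy_products = products_under(hcs["children"]["Energy"])
```

A client-centric view can then be expressed as a selection of subtrees, which is one way to read the relationship between the full hierarchy and each client's subset.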


The validation stage guarantees the quality of data distributed to the client data mart(s). This is accomplished by first applying the pending client data mart updates to an internally housed representation of the remote system.
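The idea of rehearsing updates against an internal representation can be sketched as follows. The replica structure, the specific checks and the sample data are assumptions for illustration. Updates are applied to a copy of the replica, and the package is released only if every update applies cleanly.

```python
import copy


def validate_updates(replica, updates, known_products):
    """Apply updates to a copy of the replica and report any that fail."""
    trial = copy.deepcopy(replica)
    errors = []
    for update in updates:
        key = (update["product_id"], update["date"])
        if update["product_id"] not in known_products:
            errors.append(f"unknown product: {update['product_id']}")
        elif update["price"] is None:
            errors.append(f"missing price for {key}")
        else:
            trial[key] = update["price"]
    # The package is released only if every update applied cleanly.
    return (not errors), errors, trial


# Hypothetical replica of one client data mart, keyed by (product, date).
replica = {("NATGAS.FRONT_MONTH", "2008-04-07"): 9.75}
good = [{"product_id": "NATGAS.FRONT_MONTH", "date": "2008-04-08", "price": 9.81}]
bad = [{"product_id": "UNKNOWN.PRODUCT", "date": "2008-04-08", "price": 1.0}]

ok, errors, new_state = validate_updates(replica, good, {"NATGAS.FRONT_MONTH"})
rejected_ok, reject_errors, _ = validate_updates(replica, bad, {"NATGAS.FRONT_MONTH"})
```

Because the trial runs on a copy, a failed package leaves both the replica and the remote client data mart unchanged.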


The distribution stage manages the delivery of client-specific packages in order to efficiently transmit only information required by the client data mart(s) based on predetermined client agreements.
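Building client-specific packages amounts to filtering the validated records by what each client has agreed to receive. The sketch below assumes a hypothetical `entitlements` mapping from client name to a set of product identifiers; the client names and records are illustrative.

```python
def build_packages(records, entitlements):
    """Return one package per client holding only the products that client receives."""
    return {
        client: [record for record in records if record["product_id"] in products]
        for client, products in entitlements.items()
    }


# Illustrative records and client agreements.
records = [
    {"product_id": "NATGAS.FRONT_MONTH", "date": "2008-04-07", "price": 9.75},
    {"product_id": "FX.EURUSD", "date": "2008-04-07", "price": 1.57},
]
entitlements = {
    "gas_desk": {"NATGAS.FRONT_MONTH"},
    "fx_desk": {"FX.EURUSD"},
    "risk": {"NATGAS.FRONT_MONTH", "FX.EURUSD"},
}
packages = build_packages(records, entitlements)
```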


Once information has been distributed, the archival step relocates information to a predefined efficient storage facility and reallocates resources to optimize distribution of new information, while maintaining an easy and expeditious retrieval of archived data.


Client and external applications consume consistent, correct and timely information in a variety of downstream decision support systems. These systems include: risk management, portfolio management, fundamental analysis, historical analysis, statistical research, forecasting and trading.


MDM technology provides only a portion of a holistic MDM solution. Governance expands the MDM scope to the formal orchestration of people and processes, promoting data management and use as an enterprise asset. Governance plays a critical role in the journey toward successful MDM. Under data governance, data stewards establish and validate data quality rules, monitor data for compliance and ensure the quality of all data from acquisition through delivery. This is accomplished by handling data exceptions and managing communications with vendors and internal business stakeholders. Data stewards report to an executive-sponsored data governance council responsible for the strategic direction, processes and resources of data quality across the organization.



  1. Zachary Simecek. "Data Vendor Comparison." Vendor Comparison.pdf. Logical Information Machines, Inc., April 7, 2008.
