The objective of a data strategy is to provide general guidance and direction, or a mindset to be applied during design, implementation and tuning. The strategy's context should be at the industry level, and its objectives should be at the shop level.

Terminology varies wildly at the shop level, so I will use the industry standards of the Object Management Group (OMG). Their Business Motivation Model forms the rough basis for my vocabulary and the set of relationship concepts for devising a data strategy. Influenced by this, I consider a strategy to be something that sits above the business rule and directive levels, is implemented by tactics, and represents an essential course of action. In short, a strategy supports a mission statement, and I'm applying it at the enterprise level. Usually, we devise a strategy as a generic output shaped and refined by increasingly specific use cases and directives developed over time.

The scope of a data strategy has at least three critical parts. A data strategy is more than a modeled data inventory: it is a managerial function, and as such, it needs to address quality and provide a prescription for going forward. It can be a method for establishing a single source of truth for tracking business data and metadata. It sets quality standards for timeliness, accuracy, completeness, pertinence, reproducibility and precision, and it is a prerequisite for good business flow and for data scorecarding or quality assessment.
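Scorecarding against these dimensions can be made concrete with a weighted rollup. The sketch below is a minimal illustration only: the six dimension names come from the paragraph above, while the score_dataset helper, the weights and the example scores are hypothetical and would be set by shop-level priorities.

```python
# Minimal data-quality scorecard sketch; the six dimensions come from the
# strategy text, but the weights, scores and helper are illustrative only.
QUALITY_DIMENSIONS = (
    "timeliness", "accuracy", "completeness",
    "pertinence", "reproducibility", "precision",
)

def score_dataset(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension scores, each in 0.0-1.0."""
    missing = set(QUALITY_DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    total = sum(weights[d] for d in QUALITY_DIMENSIONS)
    return sum(scores[d] * weights[d] for d in QUALITY_DIMENSIONS) / total

# Hypothetical shop-level priorities: this shop weights timeliness over
# precision, making the strategy's tradeoffs explicit and measurable.
weights = {"timeliness": 3, "accuracy": 2, "completeness": 2,
           "pertinence": 1, "reproducibility": 1, "precision": 1}
scores = {"timeliness": 0.9, "accuracy": 0.8, "completeness": 0.7,
          "pertinence": 1.0, "reproducibility": 0.6, "precision": 0.95}
print(f"scorecard: {score_dataset(scores, weights):.2f}")  # prints ~0.82
```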

The data strategy needs to define priorities at the shop level because some of these data qualities are mutually exclusive. The quality needs of the industry must be considered as part of the data strategy context because they will enable or limit the ability to reuse data and services. A strategy can help manage exceptions that become folklore and prevent orphan directives, tactics and business rules. It needs to be a template that can be applied when new functions are generated.

The data strategy regulates the purpose of and access to system of record data stores, single source of truth data stores, business rule and data governance enforcement stores, data warehouses and data marts. It prescribes design, population, extension, maintenance and storage considerations. The data strategy should cover data hosting, residency, OLTP and OLAP design, DBMS or application referential integrity enforcement, data timeliness, recoverability, static and derived data retention, synchronization or subscription, compression, latency, security, partitioning, distribution and access methods, as well as the number of hops allowed before, and the timing of, data scrubbing or MDM enforcement.
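One way to make these prescriptions concrete is a per-store template the strategy fills in for each system of record, warehouse or mart. The sketch below is a hypothetical Python rendering: every field name paraphrases a consideration from the list above rather than following any established schema, and the sample values are invented.

```python
from dataclasses import dataclass, field

# Hypothetical template for the per-store prescriptions the strategy
# regulates; field names paraphrase the considerations listed above.
@dataclass
class DataStoreStrategy:
    name: str
    role: str                        # "system of record", "SSOT", "warehouse", "mart"
    hosting: str                     # e.g. "on-prem" or "cloud"
    residency: str                   # jurisdiction the data must stay in
    workload: str                    # "OLTP" or "OLAP"
    ri_enforcement: str              # "DBMS" or "application"
    retention_days: int              # static/derived data retention
    recoverability_rpo_minutes: int  # recovery point objective
    scrub_before_hops: int           # hops allowed before scrubbing/MDM enforcement
    access_methods: list = field(default_factory=list)

# Invented example entry for one system of record.
orders_sor = DataStoreStrategy(
    name="orders", role="system of record", hosting="on-prem",
    residency="EU", workload="OLTP", ri_enforcement="DBMS",
    retention_days=2555, recoverability_rpo_minutes=15,
    scrub_before_hops=1, access_methods=["SQL", "service API"],
)
```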

Who is involved and what skills are needed to devise a data strategy could fill several separate articles; only key, high-level descriptions follow. The database administrator will have to support many of the data storage standards defined in the data strategy. Even if these are the domain of the DBA, a formal data strategy should explicitly include them to help manage the inverse relationship between data quantity and data quality. The enterprise architect skills required include application inventory generation, metadata management, data transport protocols and balancing best-of-breed against commodity applications. Regulating the retention and decommissioning of systems with limited interoperability is a key overlap with these folks. Project managers have a stake in data quality and reusability, which are inversely related to the risk of failure and the cost of application development. Institutional folklore and documentation tradeoffs are a function of staff turnover, which overlaps the project manager's domain.

Since a clearly defined strategy based on rules and good standards adoption is directly related to intuitiveness, speed of development and lower maintenance cost, the ROI from reuse will sell a data strategy, if the cost accounting system can measure it. Unfortunately, most can't, and without that measurement, strategic needs go unaddressed. If your shop fixes project cost and duration, the only variable remaining is scope. ERP administrators promote interoperability by using a common data strategy across many programs. Services using a vendor-specified standard need to be encapsulated. This standardization of data across many processes is one of the key ERP selling points. If you are lucky, the vendor supports industry standards. In any event, the data strategy needs to be implemented over a larger scope than an ERP covers.

An industry best practice is for the shop standard to encapsulate and interface with the vendor standards. Coupling the shop and vendor standards as a directive in support of best-of-breed systems is frequently done. As the industry overtakes best-of-breed enhancements, the ROI of this exception is reduced to a legacy burden. This is another key value indicator (KVI) the data strategist needs financial observers to be aware of and track.

Industry standards groups and other external sources, aside from the obvious impact that regulators have, should define the framework the shop uses because the industry level largely sets the data requirements. Several competing cross-vendor standards bodies can be considered for frameworks. They include but are not limited to: ISO, ANSI, OMG, The Open Group and W3C.  

A capacity planner's skill set is a key requirement for identifying the workload mix as part of the current assessment and future requirements. A good data strategy will show the nature of the I/O requirements and be able to classify them in terms of benchmarks like TPC scores.
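As a sketch of that classification step, the hypothetical function below buckets a store's I/O profile so it can be compared to a published benchmark family (TPC-C for OLTP-style work, TPC-H for decision support). The thresholds are illustrative assumptions, not standard values.

```python
# Hypothetical workload classifier for the current-assessment step:
# bucket a store's I/O profile so it can be compared against a
# published benchmark family. Thresholds are illustrative only.
def classify_workload(reads_per_sec: float, writes_per_sec: float,
                      avg_rows_per_read: float) -> str:
    write_ratio = writes_per_sec / max(reads_per_sec + writes_per_sec, 1e-9)
    if write_ratio > 0.3 and avg_rows_per_read < 100:
        return "OLTP-like (compare against TPC-C class results)"
    if avg_rows_per_read >= 100_000:
        return "OLAP-like (compare against TPC-H class results)"
    return "mixed (needs its own benchmark mix)"

# Invented profile: write-heavy, small reads -> OLTP-like.
print(classify_workload(reads_per_sec=900, writes_per_sec=600,
                        avg_rows_per_read=3))
```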

Data modeling on any critical path should run concurrently with project feasibility and high-level design. Modeling files and a metadata database need to be part of a formal change control process. Project quality assurance or acceptance testing needs to require an as-built model to feed the next project. Standards enforcement, which aids intuitiveness and consistency in tuning configurations, data type domains and data availability qualities, must be properly documented and communicated. Consistent model notation and integration with the data dictionary, BPM or other application models are the domain of the modeler and should be coordinated as part of the data strategy. Programmers and the data strategist need to agree on volume estimates, constraints, RI enforcement locations and data definitions, along with types of keys, unique keys, summarization, filtering or subsetting and context, as well as the impact of encapsulation and morphing of content.

Along with the DBAs, the strategist needs a standard indexing schema for structured and unstructured data. Lawyers and security staff must agree with the data strategy, the CRUD analysis, and the group and role definitions. Subject matter experts from the business own the data inventory and its meanings. They define suitability for use but not the at-rest design or transfer protocols. The use cases they develop need to map one to one to KVIs in the requirements and testing, and the strategy has to support the sum of the KVIs with limited directive exceptions.
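The CRUD analysis that lawyers and security staff sign off on reduces to a role-by-entity matrix. Below is a minimal sketch; the roles, entities and permissions are invented for illustration.

```python
# Hypothetical CRUD matrix: (role, entity) -> permitted operations.
CRUD = {
    ("order_clerk", "order"):    {"C", "R", "U"},
    ("order_clerk", "customer"): {"R"},
    ("auditor",     "order"):    {"R"},
    ("dba",         "order"):    {"C", "R", "U", "D"},
}

def allowed(role: str, entity: str, op: str) -> bool:
    """Check one operation (C, R, U or D) against the matrix."""
    return op in CRUD.get((role, entity), set())

assert allowed("order_clerk", "order", "U")
assert not allowed("auditor", "order", "D")
```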

Devising the data strategy can be an ongoing iterative process based on the scientific method, or it can be a de facto standard developed by the summation of defaults and missing values. Devising the initial strategy can be a top-down or bottom-up process, and it can reflect tactical or strategic thinking. The ongoing maintenance or shaping of the data strategy is performed by creating directives and evaluating their impact on the strategy.

Observation of the KVIs associated with the directives over time shapes the data strategy (formal or folklore). Bottom-up default summarization creating a folklore strategy is probably how your data entry, I/O, traffic management, backup and recovery, and error handling guidelines evolved for all but your ERP systems. This is the current industry mode for devising a data strategy in house: a bottom-up summation of business rules and constraints from tactical projects that generate distributed systems and use a management-by-exception approach to directive generation, resulting in an undocumented de facto data strategy.

This is close to the worst-case scenario because tactical solutions tend to block the flexibility needed to deliver enterprise data in a timely and agile manner. Since this is also the basis of the current assessment for any proposed changes, tactical or bottom-up shops that use best-of-breed software are starting off two steps back: their strategy is an undocumented one, plus the exceptions to it.

The best-case scenario for devising a data strategy is a meeting of top-down objectives, based on decomposition of the vision and mission statements, with a bottom-up summation of business rules and constraints, periodically reviewed to limit active directives and best-of-breed exceptions to seven (plus or minus two) in number. A good data strategy promotes interoperability, data quality and reuse based on an industry standard context with limited exceptions. The cost savings of interoperability and reuse sell it.
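That periodic review can be made mechanical. The sketch below is a hypothetical illustration; the directive names are invented, and only the seven-plus-or-minus-two bound comes from the text.

```python
# Hypothetical periodic review: flag a strategy whose active directives
# and best-of-breed exceptions have drifted past seven plus or minus two.
LOWER, UPPER = 5, 9  # seven plus or minus two

active_directives = [
    "encapsulate the ERP vendor schema behind the shop standard",
    "DBMS-enforced RI for systems of record",
    "EU residency for customer data",
]

count = len(active_directives)
if count > UPPER:
    print(f"{count} active directives: consolidate or retire some")
elif count < LOWER:
    print(f"{count} active directives: the strategy may carry undocumented folklore")
else:
    print(f"{count} active directives: within the 7 +/- 2 bound")
```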

Where and when you devise the strategy changes the outcome. Data strategy objectives and the exact scope and design (point to point or hub and spoke) depend on tradeoffs between time to market and short-term cost optimization on the one hand, and data interoperability, ease of reuse and long-term cost optimization on the other.

Shops implementing ERP systems will have vendor-driven objectives with limited rollout requirements and will need to perform a business process redesign (BPR), so data matching will only be approximate. Shops that need to support an enterprise service bus or SOA will have requirements that apply to all systems in the shop and need to spend extra time on service and data catalogs and semantics. Shops seeking migration to the cloud will need industry standard frameworks, tight SLAs for low latency and language translation support. Shops seeking low-power, fault-tolerant, AI net or Neurogrid probabilistic system support will need special metadata standardization, advanced semantic parsing and archiving, as well as real-time ETL support to reconcile data without a single source of truth or a real-time ODS.

In summary, when devising a data strategy, it is better to scale out rather than up. Identifying any additional interfaces will bring a wealth of potential metadata and KVIs to the forefront. A strategy is by definition high level, and its goals must be clear, consistent and concise. Bigger is not better. Meeting a top-down, iterative, strategic approach that supports the vision and mission statements with a bottom-up, iterative, tactical summation of business rules and goals, one that has only a few current directive-level exceptions and a fixed number of use-case-inspired KVIs, is a good way to devise a data strategy.
