Enterprise Information Architecture


In the book The Art of War for Executives, Donald G. Krause offers this interpretation: “Sun Tzu notes, superior commanders succeed in situations where ordinary people fail because they obtain more timely information and use it more quickly.” For metadata professionals, this observation is increasingly relevant as more and more of the business seeks integration and federation, alignment with business goals and strategies, and agility - the ability to respond both quickly and accurately to change. Industry analysts and IT professionals are focusing less on individual problems where metadata management plays a role and more on metadata management as an overall strategy that benefits many aspects of the whole organization.


Metadata management is key to the future health of our database systems. Metadata management provides us with the information necessary for impact analysis to help reduce the time, cost and risk associated with change. Reducing the impact of change gives organizations the agility they need. Metadata management also provides the traceability needed to support regulatory compliance initiatives. By being able to trace all aspects of the information from source to transformation to destination to report, organizations can ensure that no hidden pitfalls exist that could become punishable offences under the regulations. Additionally, in the case of an audit, organizations can have the auditor in and out of the way with minimal impact on business continuity because the documentation available is complete and accurate. Metadata management has proven to be a mandatory element of modern architectures, including service-oriented architectures (SOA), by providing a repository of assets aligned to process, making it easy to track what, where, when, how and who is responsible for fulfilling service level agreements (SLAs) within the organization’s overall method of operation.


Data models drive metadata collection, definition and maintenance. Taking an analogy from business intelligence (BI), models are the transactional systems for our metadata management, while the metadata repository acts as the warehouse. Models provide the user interface for the day-to-day transactions, the inserts, updates and deletions of metadata elements from the data dictionary to the physical implementation design details. Modeling aligned with repositories ensures that the most up-to-date, accurate metadata is known and provides the basis for the analytics we will perform. Questions around the impact of a proposed change, the number of instances of a given data concept, what publications and subscriptions of data exist and more are answered by querying the repository in much the same way we would query an analytics server about the knowledge of our business transactions.
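To make the warehouse analogy concrete, here is a minimal sketch of the kind of question a repository can answer. It assumes a toy, in-memory repository with a single hypothetical table, physical_element, that records which physical column implements which data concept. The schema, system names and column names are illustrative only and do not reflect any particular repository product.

```python
import sqlite3


def build_repository() -> sqlite3.Connection:
    """Create a toy in-memory metadata repository (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE physical_element ("
        " concept TEXT, system TEXT, table_name TEXT, column_name TEXT)"
    )
    # Illustrative rows: each one says "this column implements this concept".
    rows = [
        ("Customer Name", "order_entry", "customer", "cust_name"),
        ("Customer Name", "billing", "account", "holder_name"),
        ("Customer Name", "sales_mart", "dim_customer", "customer_name"),
        ("Order Date", "order_entry", "orders", "order_dt"),
    ]
    conn.executemany("INSERT INTO physical_element VALUES (?, ?, ?, ?)", rows)
    return conn


def count_instances(conn: sqlite3.Connection, concept: str) -> int:
    """Number of physical implementations of a given data concept."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM physical_element WHERE concept = ?", (concept,)
    ).fetchone()
    return count


def where_used(conn: sqlite3.Connection, concept: str) -> list:
    """Every (system, table, column) that implements the concept."""
    cursor = conn.execute(
        "SELECT system, table_name, column_name FROM physical_element"
        " WHERE concept = ? ORDER BY system",
        (concept,),
    )
    return cursor.fetchall()


repo = build_repository()
print(count_instances(repo, "Customer Name"))
print(where_used(repo, "Order Date"))
```

A real repository would hold far more than one table, but the pattern is the same: the models write the rows, and the analysis is a query.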


Shifting from Modeling to Architecture


Alignment, agility and architecture are the goals. Metadata is the key, and models feed the metadata repositories used to achieve the goals. However, models also provide abstraction to simplify complexity, increase understanding through visual representations and provide governance to increase consistency and reusability throughout the organization. As the need to align business and IT increases, the need for more levels of abstraction increases. A greater need to establish standard practices, procedures and tooling results from the fundamental shift from workgroup opportunistic physical data modeling into departmental and enterprise-wide strategic information engineering.


Enterprise information architecture starts with the high-level business definitions and descriptions, setting standards for data throughout the organization. Standard, common metadata definitions are key to establishing proper communication, including the concept of standard formats for messages between processes. Conceptual data models and canonical data models are the standard representation of this highest level of abstraction, while business process and process orchestration language models provide the context for data. From there, designers add detail to the architectural elements aligned with relational concepts in logical data models and final refinements are made in relational database management system (RDBMS)-specific physical data models, representing the instances of data. While the conceptual and logical data models can be used to define the data heritage from standard definition to implementation definition, answering questions like “where used,” data used in practice also comes from transformation and movement from one or more sources to one or more destinations. It is here that data lineage comes into play - the mapping between source elements together with the definition of any transformation, simple or complex, that make up the path of physical data through an integration, federation or replication server.
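As a rough illustration of what a data lineage record might hold, the sketch below stores each mapping as one or more source elements, a target element and a plain-text description of the transformation, then walks backward from a target to list every step that produced it. The LineageMapping class, the element names and the transformations are all hypothetical, and the sketch assumes each target is produced by exactly one mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageMapping:
    """One hop of lineage: sources -> target, plus how the data was changed."""
    sources: tuple
    target: str
    transformation: str


# Hypothetical lineage for a customer name on its way to a data mart.
MAPPINGS = [
    LineageMapping(
        sources=("crm.contact.first_name", "crm.contact.last_name"),
        target="staging.customer.full_name",
        transformation="concatenate with a space",
    ),
    LineageMapping(
        sources=("staging.customer.full_name",),
        target="sales_mart.dim_customer.customer_name",
        transformation="trim and upper-case",
    ),
]


def trace_upstream(target: str, mappings: list) -> list:
    """Return the mappings that feed `target`, nearest step first."""
    # Assumption: at most one mapping produces any given target element.
    by_target = {m.target: m for m in mappings}
    path = []
    seen = set()
    pending = [target]
    while pending:
        element = pending.pop(0)
        mapping = by_target.get(element)
        if mapping is None or element in seen:
            continue  # an original source, or an element already visited
        seen.add(element)
        path.append(mapping)
        pending.extend(mapping.sources)
    return path


for step in trace_upstream("sales_mart.dim_customer.customer_name", MAPPINGS):
    print(step.target, "<-", step.sources, "via", step.transformation)
```

Keeping the transformation alongside the end points is what lets the same record answer both "where did this come from" and "what was done to it along the way."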


Conceptual Data Models


Representing the business architect’s view and the business owner’s perspective, the conceptual data model is independent of both the storage architecture and the implementation technology. As such, it not only maps the core business concepts and the relationships between them, it also serves as a single, enterprise-wide standard for the common data definitions and descriptions. It supplies the semantics for the business, the core language that can be mapped to the aliases and synonyms used at departmental and divisional levels. In principle, conceptual data models as a single construct for the enterprise provide a single source of truth on which all instantiations are based, increasing the consistency of use for common data elements across systems, easing integration and federation, and increasing BI system success.


Canonical Data Models


Conceptual data models can map to any storage architecture, including hierarchical, relational and object-oriented. One practice that has been driven mainly from system integration projects that center on messaging is the development of a canonical data model based on XML. XML, hierarchical by nature, is also the format of choice for defining message construction. Canonical data models are meant to be exactly that - the single true core definition of data constructs so that everyone needing to use the data uses it in the same way. Conceptual data models and canonical data models are theoretically the same, while in practice we see canonical data models as abstractions of XML schemas. Either way, these models are only representing the core data definitions and key attributes.


Business Process Models


Where conceptual and canonical data models help define and describe the core data definition as a static structure, business process models describe the flow of control. As a process is followed, information is used (created, accessed, modified, produced, consumed or destroyed), and that information has a definition. As a case for a separate discussion, we can prove that the organization of data as it is used throughout a process model is fundamentally different than how it is organized in conceptual data models. In short, data in a business process model cannot be directly mapped 1:1 to a conceptual data model, but there is a correlation that we do need to track.


Data definitions in a process can still be mapped to conceptual data model elements, but both need their independent representations as well. This adds some complexity to the overall analysis and creates an even greater need to invest in the conceptual data model to unify the multiple processes that the data supports. This is the intersection between process and data definitions.


Logical Models versus Conceptual Data Models


Are logical data models just like conceptual data models? Can we not use logical models to satisfy the need for a business view of the data and have a faster, more direct way to implement the database? If we had only one RDBMS platform and one RDBMS instance implementing the data, then we probably could. Logical data models imply an organization architecture, and in nearly all cases today, a relational data model complete with foreign keys and other details that require training, understanding and explanation from a business user perspective. Logical data models also tend to be closer to the development of an application, centered on a project or system and as a result, contain context-specific attributes and other noncore elements that cannot be shared generically across all systems and projects in the enterprise. This dependence on context makes it nearly impossible for a logical data model to serve as a single source of truth in defining the data structures as a standard. To get the true business glossary, one conceptual data model can abstract multiple logical and/or physical models and, as such, provide a common denominator.


Data Heritage and Mapping


All of this builds up to one thing - defining and mapping business processes to make better decisions. We can use this knowledge about our information systems to make informed decisions if we 1) define and describe business processes and map the data definitions to conceptual data models, 2) transform the conceptual data models into logical and/or physical data models and 3) store all this metadata together in a single repository. We can decide how to roll out changes and determine the right changes to support the business. However, to make sure we have the complete picture, all the mappings need to be managed and stored with the data definitions. We have already captured the process to data map and the information telling us the logical and conceptual elements from which each physical element is derived. What is missing are the details about the data itself, and how it has been moved and transformed between systems.


Some of the implementation of data will be in our transactional systems. Other implementations will be in reporting systems and analytics servers. For example, the idea of a customer in an order entry context is transactional, but it is analytic in a data mart. We want to know that there is a dependency between the data mart and the transactional context, and structure alone does not give us that knowledge. For that, we will look to create mappings between physical models. Simple mappings (documenting the end points of any transformation or federation process) give us the basic dependencies and enough to know where there will be impact when a change is proposed, but it is not enough to determine if we have handled the data correctly compared to standards or regulations.
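The simple end-point mappings described above are already enough for a basic impact analysis. The sketch below treats each mapping as a (source, destination) pair and follows the pairs downstream from a changed element to collect everything that depends on it, directly or indirectly. The mapping list and the impacted_by function are hypothetical names chosen for illustration, not part of any modeling tool.

```python
from collections import deque

# Hypothetical end-point mappings between physical models: (source, destination).
SIMPLE_MAPPINGS = [
    ("order_entry.customer", "staging.customer"),
    ("staging.customer", "sales_mart.dim_customer"),
    ("staging.customer", "billing.account"),
    ("order_entry.orders", "sales_mart.fact_orders"),
]


def impacted_by(changed: str, mappings: list) -> set:
    """Every element downstream of `changed`, found by breadth-first search."""
    # Index the mappings so each source lists its direct dependents.
    downstream = {}
    for source, destination in mappings:
        downstream.setdefault(source, []).append(destination)

    impacted = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in downstream.get(current, []):
            if dependent not in impacted:  # guards against revisiting and cycles
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


print(sorted(impacted_by("order_entry.customer", SIMPLE_MAPPINGS)))
```

Note what the sketch does not capture: it shows that the data mart depends on the order entry table, but says nothing about how the data was transformed on the way, which is why end points alone cannot demonstrate compliance.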


The details of a transformation or federation are found in the systems that implement the data movement, but these are not directly expressed in data models. Some tools can help by reading the catalogs of these engines and writing metadata into a repository shared with other (modeling) metadata. Some tools offer visualization of these processes, and many require customization. In the end, if we know the sources, the destinations and the path, and we store that information together with the conceptual, logical and physical data models, we have the complete depth and breadth of the enterprise information architecture collected and collated into something very useful.


Impact Analysis and Change Management


Business agility is defined as the ability to react quickly and accurately to change. If we need to change a business process to achieve one of the business goals, we can use the metadata collected from our enterprise information architecture to report on all the data elements, where they are implemented and how they are related, quickly and accurately, leaving no element uncovered. Without this knowledge base at our disposal, this analysis can take months and can still be full of omissions. We will change one system and forget that there is a subscription from it that depends on a specific structure remaining stable. We can forget to account for other business processes that reference the changed data, and cause serious business disruption. The time, cost and risk associated with change cannot be properly assessed, and as a result, we do not make informed decisions.


Alignment is defined as the ability to ensure the work that we do is consistent with the needs of the enterprise. The enterprise information architecture helps everyone communicate and collaborate together on future development. As new business processes are defined, the process managers, data architects and development managers can all look at their respective representations, or models, independently and use the traceability to ensure that everyone is working together on the common goal.


Architecture is defined as high-level planning that shows the overall shape of things to come. Architects design or redesign the overall environment. For our enterprise information architecture, the documentation and definition of what is current for today is really only half the picture. Now, new analysis models can be developed that define the desired future state of the enterprise processes and data definitions, and traceability can be used to roadmap how to get there from here.


Alignment, Agility and Architecture


Knowledge is a source of power. The power is in the ability to recommend change, drive change and succeed with change. Whether the change is an opportunistic response to new requirements, regulations or market conditions, or whether it is a planned redesign, we need this power to succeed. Data models centered around projects and database implementations only serve to silo our knowledge and limit our ability to have accurate and complete visibility into the real impact of any change. Data modeling is moving beyond the database and into the entire data definition, from core concepts in conceptual data models through to all physical instances and to dependencies defined through process and data mappings. This enterprise-wide, higher level of abstraction unifies information architecture, aligning not only business with IT, but also IT departments with one another. Alignment improves the assessment of the time, cost and risk associated with change and streamlines the implementation of change, allowing organizations to respond faster and more accurately. Enterprise information architecture facilitates alignment and agility, allowing us to be superior commanders of our metadata and of the business as a whole.
