During the course of my 24 years in the world of data architecture, I have witnessed a definite decline in people's understanding of what data architecture is and why it's important. This spells trouble for companies in the long run – not just because everyone increasingly relies on information, but also because business value and competitive advantage are created from that information. If you don't have the data architecture right, it's hard to get anything else right.

In this column, I want to discuss the scope of data architecture. In subsequent columns I will address planning and design issues that the data architect is expected to solve.

What is Data Architecture?

In the IT world, "architecture" refers to an orderly arrangement of parts. Thus, data architecture is an orderly arrangement of parts to do four things: organize, store, access and move data. Of course, a typical solution involving corporate data does a lot more than four things, but every action performed by the solution fits into one of these four categories. Think of these categories as the four suits that make up the data architecture "deck."

Organize is the "heart" of the deck. The heart refers to logical data designs that define data elements, group them into records and define the relationships between records. In today's world, there are three broad approaches to organizing data: flat files (often hardly organized at all), normalized designs and dimensional designs. In determining how data should be organized for a production system, the data architect works through a series of design levels, including conceptual data models, logical data models and physical database designs. The key idea behind normalization is to organize data so that the design endures despite changes to business processes. However, today's emphasis on analytic systems is revealing the need for greater usability, which leads to dimensional designs.
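As a rough illustration of the difference between normalized and dimensional designs, the sketch below stores the same sales data both ways in an in-memory SQLite database. The table and column names (region, customer, sale, dim_customer, fact_sale) are hypothetical and chosen only for this example; a real design would come out of the modeling levels described above.

```python
import sqlite3

# Hypothetical schema for illustration only; none of these names come from the column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each fact is stored once, and region is its own table.
    CREATE TABLE region (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        region_id INTEGER REFERENCES region(region_id)
    );
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        amount REAL
    );
    INSERT INTO region VALUES (1, 'East'), (2, 'West');
    INSERT INTO customer VALUES (10, 'Acme', 1), (11, 'Bolt', 2), (12, 'Core', 1);
    INSERT INTO sale VALUES (100, 10, 50.0), (101, 11, 20.0), (102, 12, 30.0);

    -- Dimensional: region is flattened into the customer dimension,
    -- so analytic queries need fewer joins.
    CREATE TABLE dim_customer AS
        SELECT c.customer_id AS customer_key, c.name, r.region_name
        FROM customer c JOIN region r ON r.region_id = c.region_id;
    CREATE TABLE fact_sale AS
        SELECT customer_id AS customer_key, amount FROM sale;
""")

# Sales by region against the normalized design: two joins.
normalized_totals = conn.execute("""
    SELECT r.region_name, SUM(s.amount)
    FROM sale s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN region r ON r.region_id = c.region_id
    GROUP BY r.region_name
    ORDER BY r.region_name
""").fetchall()

# The same question against the dimensional design: one join.
dimensional_totals = conn.execute("""
    SELECT d.region_name, SUM(f.amount)
    FROM fact_sale f
    JOIN dim_customer d ON d.customer_key = f.customer_key
    GROUP BY d.region_name
    ORDER BY d.region_name
""").fetchall()
```

Both queries return the same totals. The trade-off is that the dimensional copy repeats the region name on every customer row, which is easier to query but must be rebuilt or updated when the source data changes.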

Storage, symbolized by the spade, provides the underpinning for the architecture. It refers to physical data designs that define data location. This includes physical designs to address data distribution across replicated or partitioned data stores – as well as across geographical locations, multiple servers and multiple disk storage devices. It includes designs for archiving and retrieval of older data.
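One common way to distribute data across partitioned stores is to hash a key and use the result to pick a partition. The sketch below assumes records are dictionaries with a "customer_id" field and that partitions are plain in-memory lists standing in for separate servers or disks; both are assumptions made for illustration, not a prescribed design.

```python
import hashlib

def pick_partition(key: str, partition_count: int) -> int:
    """Map a key to a partition number in a stable, repeatable way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap to the partition count.
    return int.from_bytes(digest[:8], "big") % partition_count

def distribute(records, partition_count):
    """Place each record in the partition chosen by its (hypothetical) customer_id."""
    partitions = [[] for _ in range(partition_count)]
    for record in records:
        index = pick_partition(record["customer_id"], partition_count)
        partitions[index].append(record)
    return partitions

# Example: four records spread over three stand-in stores.
sample_records = [{"customer_id": f"C{n}", "amount": n * 10} for n in range(4)]
stores = distribute(sample_records, 3)
```

Because the hash is deterministic, any program that knows the key and the partition count can find the record again without a lookup table. Changing the partition count moves most keys, which is one reason the physical design deserves an architect's attention.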

Access is the jewel of data architecture, thus represented by the diamond. It refers to the parts of operational systems that govern how data is accessed. In the earliest days, every computer program accessed data directly. However, more complex applications introduced a need to isolate functional logic from technical logic. Because no individual programmer could be an expert in every aspect of programming, this separation allowed highly skilled programmers to focus on their own areas of expertise, whether functional or technical.

This separation led to an access layer, sometimes called I/O modules and sometimes called DBAMs (database access modules). Today this layer may be a component called information services or something to that effect. ODBC (Open Database Connectivity), for example, can be considered an information service. Whatever you call it, the data architect must provide guidance as to whether the access layer of the architecture will be thick or thin, smart or dumb, functional or strictly technical, secure or accessible. This aspect of data architecture is also concerned with application performance: whether data accesses are designed to deliver the performance levels the application requires.
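To make the idea of an access layer concrete, here is a minimal sketch of a thin, strictly technical one. The class name CustomerAccess, the customer table and the credit-limit rule are all hypothetical; the point is only that the business rule contains no SQL and the access module contains no business rule.

```python
import sqlite3
from typing import Optional

class CustomerAccess:
    """Technical logic: the only place that knows SQL and the storage engine.

    Hypothetical access module for illustration; not an API from the column.
    """

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer ("
            "customer_id INTEGER PRIMARY KEY, name TEXT, credit_limit REAL)"
        )

    def save(self, customer_id: int, name: str, credit_limit: float) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO customer VALUES (?, ?, ?)",
            (customer_id, name, credit_limit),
        )

    def credit_limit(self, customer_id: int) -> Optional[float]:
        row = self._conn.execute(
            "SELECT credit_limit FROM customer WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        return None if row is None else row[0]

def can_place_order(access: CustomerAccess, customer_id: int, order_total: float) -> bool:
    """Functional logic: a business rule that never touches SQL directly."""
    limit = access.credit_limit(customer_id)
    return limit is not None and order_total <= limit

# Example usage with an in-memory database.
access = CustomerAccess(sqlite3.connect(":memory:"))
access.save(1, "Acme", 500.0)
order_allowed = can_place_order(access, 1, 200.0)
```

This layer is "thin and dumb" in the sense used above. Moving the credit check into CustomerAccess would make it thicker and smarter, which is exactly the kind of choice the data architect is asked to guide.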

Move is the gathering of data into locations where it best serves the user community – represented by the club in this metaphor. "Move" is the aspect of data architecture that addresses processes and techniques for taking data that's in one place and putting it in another place. This encompasses decisions regarding batch data movement and file transfer, data replication and/or synchronization, and EAI (enterprise application integration) versus ETL (extract, transform and load).
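For readers less familiar with ETL, the following sketch shows the three steps on a tiny batch. It assumes the source is CSV text with "name" and "amount" columns and the target is a SQLite table called sales; those names, and the cleanup rules in the transform step, are invented for this example.

```python
import csv
import io
import sqlite3

def extract(csv_text: str) -> list:
    """Extract: read raw rows from the (hypothetical) CSV source."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows: list) -> list:
    """Transform: trim and title-case names, convert amounts, drop rows with no name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # illustrative rule: a row without a name is rejected
        cleaned.append((name.title(), float(row["amount"])))
    return cleaned

def load(conn: sqlite3.Connection, rows: list) -> int:
    """Load: write the cleaned rows to the target table and return its row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]

# Example batch run: one source row has no name and is dropped.
source_text = "name,amount\n acme ,10.5\n,3\nbolt,2\n"
target = sqlite3.connect(":memory:")
cleaned_rows = transform(extract(source_text))
loaded_count = load(target, cleaned_rows)
```

This is batch movement in its simplest form. Replication, synchronization and EAI address the same question of getting data from one place to another, but with different timing and coupling between the systems involved.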

Significance for Today

Why is data architecture so important for 21st century business solutions? DM Review readers certainly know the value of data modeling and database design, but what is the value in thinking of the other suits as integral parts of data architecture?

In today's information society, we all participate in the drive to turn knowledge into power. We want to turn data into information and then information into results. To do that effectively, it is not enough just to organize the data. We also need to store it, move it and access it in the best ways possible.

In fact, in today's business solutions, any one of the four suits – organize, store, access and move – can be the distinctive and differentiating factor in bringing real value to the solution. Any one of the four can be the ultimate "trump" suit.

The data architect needs to be conversant with all four suits, ready to produce the trump suit of the day.

In the coming months, I will address the use of these four suits for meeting the most important needs of our business and technology solutions today. Next month I will discuss enterprise data strategy.
