Currently, most companies with international lines of business, global offices or expatriated operations are embarking on international data warehouse projects, both broad and narrow in scope. Global data warehouse structures, with amalgamations of heterogeneous systems and databases on different platforms spread around the world, are now the norm. With Web-based data movement, mining and analysis, data boundaries have virtually dissolved. The world of data has become much smaller as many of Earth's remote areas log onto the Internet for business intelligence purposes. Entities that have expanded into truly intercontinental businesses, with complex 24x7 globally aware data, run the gamut from international manufacturing conglomerates to financial firms to telecommunication providers. Unfortunately, many worldly warehouses and reporting repositories still struggle to support quality global analysis, research, data consolidation, executive reporting and other types of data mining, whether approached from a core business, product or customer-oriented paradigm. The simultaneous distribution and publishing of data to autonomous and far-reaching locations is usually fraught with formidable difficulties. It is important to realize that the problems are not limited to bandwidth and language.
International data warehouse requirements can be extremely diverse. For instance, it must be decided early whether the data warehouse will be primarily for high-level decision support reporting or for detailed historical data mining and exploration (optimized for statistical or actuarial analysis and drill-through/drill-down inquiries). You must know fully which elements will drive warehouse reporting and data dissections - what the patterns of analysis will be. In other words: Will your global data warehouse or repository be geared toward customers, products, financials or other components? There is no off-the-shelf model, database or application that matches 100 percent the warehousing objectives of international businesses and their data distribution needs. Global repositories will all have to be built so that they reflect how information and data are used in the company. An effective international data warehouse will need to reflect and reinforce the core values of the organization itself. Thus, it is a good idea to get a general understanding of some of the pitfalls, problems and possibilities of international data warehousing before coming face to face with them at crunch time.
An eye must always be kept on data latency issues; data is commonly created in one location and then synchronized or replicated to numerous locations throughout the world. The more geographically diverse the systems and resources, the more elaborate the complications. Quality and performance controls are a must when trying to keep data up to date and consistent across countless cities and countries. Although more frequent refresh and replication cycles will shrink latency and waiting periods for data, they will also use more network bandwidth, requiring increased monitoring and performance-tuning tasks. Rugged scheduling logic and checkpoints will be required in order for a round-the-world user base to receive measures and dimensions that are consistent across their organization's divisions.
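One way to picture such a checkpoint is as a gate that holds back publication until every regional feed has refreshed. The following is only a minimal sketch in Python; the region names, the timestamps and the ready_to_publish helper are hypothetical and not taken from any particular product.

```python
from datetime import datetime, timezone


def ready_to_publish(last_loaded, checkpoint):
    """Return (ok, lagging) for a publish checkpoint.

    last_loaded maps a region name to the UTC time its latest refresh finished.
    A region is lagging if its latest refresh finished before the checkpoint.
    """
    lagging = sorted(
        region for region, loaded_at in last_loaded.items()
        if loaded_at < checkpoint
    )
    return (not lagging, lagging)


# Hypothetical feeds: all times are stored in UTC so they compare directly.
checkpoint = datetime(2024, 3, 1, 0, 0, tzinfo=timezone.utc)
last_loaded = {
    "london": datetime(2024, 3, 1, 0, 20, tzinfo=timezone.utc),
    "tokyo": datetime(2024, 3, 1, 0, 5, tzinfo=timezone.utc),
    "sao_paulo": datetime(2024, 2, 29, 22, 40, tzinfo=timezone.utc),
}

ok, lagging = ready_to_publish(last_loaded, checkpoint)
print(ok, lagging)
```

In this sketch the Sao Paulo feed has not refreshed since the checkpoint, so publication would be held rather than showing some divisions newer measures than others.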
An efficacious universal warehouse will bequeath to every global office an iterative feedback loop that tracks the actions, trends and whims of a company's foreign and local customers. Be it billing, shipping, return authorizations, marketing or other segments - all information from day-to-day business operations will relate back to the customer.
People behave differently (eating habits, hygiene standards, commuting trends, banking preferences, etc.) throughout the world. However, the international data warehouse should have data elements that are common throughout global locations - ones that track the same granularity, habits, behavior and components of customers. In other words, the same behaviors should be tracked in the same way everywhere. This is important in order to effectively spot customer trends and differences by locality. Only cross-country reciprocity and parallel congruency of data will give you a true picture of an entire customer base, helping you create strategies for targeted marketing pushes, speeding discovery of cross-selling opportunities, and helping you win untapped markets. Today everything is cross-country, from airlines/vacation travel to online dating to MP3 downloads. With an integrated cross-country viewpoint, your organization will start to understand why customers behave the way they do.
If you want to capture true global demographic trends and conduct serious business intelligence (BI), avoid making the mistake of having one warehouse per country or continental region (stovepipe). The goal is to have integrated data from around the world. Robust product lines will always straddle two or more continents, time zones (see Time Zone Issues section), currencies, regulations, etc. For example, a single ocean cruise excursion may encapsulate all of these characteristics in a single day's journey! It is vital to the spirit and architecture of the international data warehouse that shared global data is channeled into a primary repository. From here, all interested parties can be methodically provided with valuable data (via data marts aggregated along country lines and so on) for everything from high-level analysis to customer calls.
Time Zone Issues
International data warehouses require thorough and carefully planned time zone management because most enterprises span multiple zones. As data is synchronized, scrubbed, transformed, distributed and shared, data elements will invariably get out of phase with respect to time. As physical distances increase, problems with real-time and batch synchronization can increase exponentially, meaning that time zone problems need to be addressed in distribution schedules, data models, data storage and replication/integration plans.
Time stamping strategies are often the best way to overcome time zone processing differences, because they track when a transaction occurred or became valid/invalid. It is not uncommon to use three or more time stamps in order to track data movement from the main repository back to the source systems of record. For example, consider the following time stamp fields that may occur in a warehouse fact table that models global transactions: add_timestamp will capture the local date/time (with time zone) at which the transaction was added to the main warehouse; batch_timestamp will capture the date/time (with time zone) at which the batch that loads the main warehouse started; and source_add_timestamp will contain the date/time (with time zone) at which the transaction of record took place in the source system. This sort of approach can be extended, scaled up or scaled down. Most financial measures should have multiple time stamps, or multiple surrogate foreign or primary keys that connect back to a verbose DATE/TIME dimension table. This table will track many things beyond simple date and time constructs. Holidays, Julian dates, days of the week and financial quarters can all be included in the mix. You want to avoid complicated SQL commands when navigating through layers of time, so implement most of this time stamping and date calculation logic during ETL processing, not during end-user queries.
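The three time stamps and the DATE/TIME dimension key described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed design: the TransactionFact and date_dimension_row names are hypothetical, the date key is derived from the UTC date of the source transaction, the financial quarter is assumed to follow the calendar year, and "Julian date" is taken to mean the ordinal day of the year.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone

UTC = timezone.utc
# Japan observes no daylight saving time, so a fixed offset is safe here.
JST = timezone(timedelta(hours=9), "JST")


@dataclass(frozen=True)
class TransactionFact:
    transaction_id: int
    source_add_timestamp: datetime  # when the transaction occurred at the source
    batch_timestamp: datetime       # when the loading batch started
    add_timestamp: datetime         # when the row was added to the warehouse
    date_key: int                   # surrogate key into the DATE/TIME dimension


def date_dimension_row(d: date) -> dict:
    """Precompute calendar attributes once, during ETL (assumed layout)."""
    return {
        "date_key": d.year * 10000 + d.month * 100 + d.day,
        "day_of_week": d.strftime("%A"),
        "julian_day": d.timetuple().tm_yday,  # ordinal day of the year
        "quarter": (d.month - 1) // 3 + 1,    # assumes fiscal year = calendar year
    }


def load_fact(transaction_id, source_ts, batch_ts, add_ts) -> TransactionFact:
    # Reject naive datetimes: a time stamp without a zone is ambiguous.
    for ts in (source_ts, batch_ts, add_ts):
        if ts.tzinfo is None:
            raise ValueError("time stamps must carry a time zone")
    key = date_dimension_row(source_ts.astimezone(UTC).date())["date_key"]
    return TransactionFact(transaction_id, source_ts, batch_ts, add_ts, key)


# A sale recorded in Tokyo on the morning of March 1 local time.
source_ts = datetime(2024, 3, 1, 8, 30, tzinfo=JST)
batch_ts = datetime(2024, 3, 1, 1, 0, tzinfo=UTC)
add_ts = datetime(2024, 3, 1, 1, 12, tzinfo=UTC)

fact = load_fact(1001, source_ts, batch_ts, add_ts)
latency = fact.add_timestamp - fact.source_add_timestamp
print(fact.date_key, latency)
```

Because every value carries its zone, the latency between the source event and the warehouse load can be computed by simple subtraction, and the Tokyo morning sale lands on the previous UTC calendar day without any special handling in end-user SQL.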
This is complicated, and this article covers only the tip of the iceberg. International time zones are terribly problematic. There are countless geographic regions with distinct time zone rules; even in the U.S., most of Arizona does not observe daylight saving time, and parts of Indiana did not observe it until 2006! Today, many companies that want to cut offshore risks have implemented "near-shore" alternatives in places such as Canada, where customer service remains in the same language and time zone as corporate headquarters.
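The effect of differing daylight saving rules is easy to see by asking a time zone database for UTC offsets at two points in the year. The sketch below uses Python's standard zoneinfo module, which relies on IANA time zone data being available on the system (or through the tzdata package); the utc_offset_hours helper and the sample dates are illustrative choices.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def utc_offset_hours(tz_name: str, when_utc: datetime) -> float:
    """Offset from UTC, in hours, that tz_name applies at the given instant."""
    offset = when_utc.astimezone(ZoneInfo(tz_name)).utcoffset()
    return offset.total_seconds() / 3600


winter = datetime(2024, 1, 15, 18, 0, tzinfo=timezone.utc)
summer = datetime(2024, 7, 15, 18, 0, tzinfo=timezone.utc)

for zone in ("America/Phoenix", "America/New_York", "Asia/Tokyo"):
    print(zone, utc_offset_hours(zone, winter), utc_offset_hours(zone, summer))
```

Phoenix and Tokyo keep the same offset all year, while New York moves by an hour between the two dates. This is one reason to store zone names or UTC values rather than hard-coded hour differences between offices.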
Financial data will be a common denominator for your organization at every global branch or office. Money will always be a common measure to all corporations worldwide, and tracking the historical movement of money will prove to be a major challenge. A truly global data warehouse will require careful attention, translation, measurement and adjustment of many different currencies in both data and budgetary realms. Therefore, a data steward is required - one who has the power and means to enforce all enterprise data standards, locally and internationally. Most high-level executives will want to see reports that track profitability of products and services across warehouses, business divisions, suppliers, customer demographics and more. Once international demographics and global supply chains are introduced, old ways of slicing and dicing financial-oriented data may become irrelevant. An effective data warehouse project leader will be able to spot potential caveats, such as new layers of financial measures or dimensions, and instruct warehouse modelers and developers as to the appropriate actions required to manage these kinds of changes.
Because currency exchange rates fluctuate daily, and often minute to minute, clean and easy apples-to-apples comparisons of U.S. dollars to Euros or Yen may not be possible, especially with systems that deal with data on an intraday basis. Tracking the profitability of products in varying markets will fall short of expectations unless data stores and currency tables that contain detailed exchange rates and valuation dates are properly integrated into the general warehouse or operational data store. Users will want to see both the transaction currency and the "home currency" for each transaction. Many currencies will be tracked against other currencies - the simplest case being home currency versus the single currency of the trade/deal/transaction - using parallel fields for each denomination in the appropriate warehouse tables. Thus, if a transaction took place in Japan (in Yen), multiple fields that represent the event would carry both U.S. dollar and Yen denominations, reflecting up-to-date or restated exchange rates. Be aware, however, that the location of the transaction does not always unequivocally define the currency of the transaction. Many financial events such as currency swaps and spots will fall into this category, making it more laborious to correctly portray the financial picture of your business.
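The parallel-field approach can be sketched as a small Python routine that looks up a rate by valuation date and emits both denominations for a fact row. The rates, the field names and the choice of U.S. dollars as home currency are all hypothetical; a real warehouse would read rates from its own currency tables.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rates: units of home currency (USD) per one unit of foreign currency.
RATES = {
    ("JPY", date(2024, 3, 1)): Decimal("0.0067"),
    ("JPY", date(2024, 3, 4)): Decimal("0.0066"),
    ("EUR", date(2024, 3, 1)): Decimal("1.0800"),
}


def rate_on(currency: str, valuation_date: date):
    """Return (rate_date, rate) for the latest rate on or before valuation_date."""
    candidates = [
        (rate_date, rate)
        for (code, rate_date), rate in RATES.items()
        if code == currency and rate_date <= valuation_date
    ]
    if not candidates:
        raise LookupError(f"no {currency} rate on or before {valuation_date}")
    return max(candidates, key=lambda pair: pair[0])


def to_fact_fields(amount: Decimal, currency: str, valuation_date: date,
                   home: str = "USD") -> dict:
    """Build parallel transaction-currency and home-currency fields."""
    rate_date, rate = rate_on(currency, valuation_date)
    home_amount = (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {
        "txn_amount": amount,
        "txn_currency": currency,
        "home_amount": home_amount,
        "home_currency": home,
        "exchange_rate": rate,
        "rate_valuation_date": rate_date,
    }


original = to_fact_fields(Decimal("1500000"), "JPY", date(2024, 3, 2))
restated = to_fact_fields(Decimal("1500000"), "JPY", date(2024, 3, 4))
print(original["home_amount"], restated["home_amount"])
```

Storing the rate and its valuation date alongside the converted amount lets a report show whether a figure uses the rate in effect at transaction time or a later restatement. Note that the currency is passed in explicitly rather than inferred from the transaction's location.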
There is another project management issue concerning currency that is often overlooked. Once an agreement is made about which profit center (or whose budget) will fund the international data warehouse, managing and tracking the costs of the warehouse project may be done in multiple disparate currencies. This scenario can sometimes turn ugly. Your team may complete the specified work under budget or be way over budget depending on the currency "pegged" to your individual project components. Exchange rates are volatile and can move in either direction quickly. On large projects, it may be a good idea to limit risk by hedging your project budget - procuring futures or forward contracts in the currency of relevance. Also, make sure the accounts payable department knows the exact exchange rates when bills and expenses for the warehouse project are paid to vendors locally and abroad.
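A small worked example shows how a budget variance can come from the exchange rate alone. The figures below are invented for illustration (a vendor invoice of 500,000 Euros, a budgeted rate of 1.08 U.S. dollars per Euro and a forward rate of 1.085), and the sketch ignores any fees or margin attached to the forward contract.

```python
from decimal import Decimal


def home_cost(foreign_amount: Decimal, rate: Decimal) -> Decimal:
    """Cost in home currency at the given rate, rounded to cents."""
    return (foreign_amount * rate).quantize(Decimal("0.01"))


invoice_eur = Decimal("500000")   # hypothetical vendor invoice
budget_rate = Decimal("1.08")     # USD per EUR assumed when the budget was set
forward_rate = Decimal("1.085")   # USD per EUR locked in by a forward contract

budget_usd = home_cost(invoice_eur, budget_rate)
hedged_usd = home_cost(invoice_eur, forward_rate)

# Variance against budget for several possible spot rates on the payment date.
variances = {}
for spot in (Decimal("1.00"), Decimal("1.08"), Decimal("1.15")):
    unhedged_usd = home_cost(invoice_eur, spot)
    variances[spot] = (unhedged_usd - budget_usd, hedged_usd - budget_usd)
    print(spot, unhedged_usd - budget_usd, hedged_usd - budget_usd)
```

With these assumed numbers the unhedged cost swings from 40,000 dollars under budget to 35,000 dollars over, while the hedged cost stays a fixed 2,500 dollars over budget whichever way the rate moves.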
Cultural and Country Conundrums
Be aware of cultural and language inconsistencies and barriers; they will constantly affect how your multinational data warehouse is configured, managed, implemented and maintained. Language and cultural differences vary widely from place to place. You will sometimes be carefully managing collaborations where lack of intellectual property laws, limited English language skills, and different legal and regulatory environments are recognized project risks. Unique permutations of job roles (from liaison to translator) will exist; you must quickly come to know your data audience and the language ability of users in each data stream. Hence, one can see the wisdom of a sophisticated meta data and data dictionary plan. Will your home office be able to effectively communicate data with the rest of the world and vice versa? To make these projects successful, there must always be a home base presence on site that can steer service level agreements and make sure ISO 9000/9001 certifications and standards for measuring quality in all areas are in place and being met.
There will be predicaments to face while maintaining enterprise-wide data quality even in a single language. In systems that are unable to recognize international characters, the inclusion of non-native data can give you unexpected results and undermine existing data integrity throughout the enterprise. Extensive requirements for ETL transformations may be the norm in situations where intense language barriers and substantially heterogeneous systems implementations exist. Tools should support the full Unicode standard character set, which covers the characters of most of the world's written languages in a single scheme. Both Unicode and legacy double-byte character sets must be understood. If your data is sourced in Asia and the Middle East (where multiple character encodings are prevalent and localized language and data requirements vary dramatically), multibyte character support is necessary. Data warehouses must support the Unicode standard and allow for cultural variations in the data.
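The kind of damage described above can be reproduced in a few lines. This Python sketch decodes source bytes with their declared encoding and normalizes the result to Unicode NFC, then shows what happens when the wrong encoding is assumed. The clean_text helper and the sample values are hypothetical; Shift_JIS stands in for a legacy double-byte encoding.

```python
import unicodedata


def clean_text(raw: bytes, source_encoding: str) -> str:
    """Decode with the declared source encoding and normalize to NFC."""
    return unicodedata.normalize("NFC", raw.decode(source_encoding))


# "Tokyo" in Japanese, as a legacy double-byte source system might deliver it.
tokyo_text = "\u6771\u4eac"
tokyo = clean_text(tokyo_text.encode("shift_jis"), "shift_jis")

# "Sao Paulo" with a tilde built from a base letter plus a combining mark.
decomposed = "Sa\u0303o Paulo".encode("utf-8")
sao_paulo = clean_text(decomposed, "utf-8")

# Decoding UTF-8 bytes with the wrong single-byte codec silently corrupts the text.
mojibake = "S\u00e3o Paulo".encode("utf-8").decode("latin-1")

print(tokyo == tokyo_text, sao_paulo == "S\u00e3o Paulo", mojibake)
```

The wrong-codec case raises no error; it simply stores two junk characters where one accented letter belonged. That is why the source encoding should be declared per feed during ETL rather than guessed.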
Language and semantic issues will abound in applications (user GUI and programming logic), models and data elements. Financial analysts, manufacturing resources, distribution and retail employees will all be talking differently about the same products, methods, customers and other concepts. They will be using different measurements to describe product units. The manufacturing people will want to see the universe in pallets, distribution may wish to view data along the lines of various sized boxes, and retail clerks only see things in individual pieces that they can scan. Firm-wide data standards and nomenclature, enforced by an international data steward, must be established. Showing quantified product facts in a single standard unit of measure will be important. One possibility is to build data marts with the local unit of measure from conversion formulas built into fact tables residing in the data warehouse. It is important that applications have a consistent way to convert shared data into specific and idiosyncratic perspectives. The key is to give the users the ability to share the same core information, and then let different business units (marketing, customer service, operations, etc.) use that information - but in different ways - with different patterns of analysis, slicing and dicing.
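The unit-of-measure conversion can be sketched as follows: facts are held in one standard unit, and each data mart converts to its local unit with a shared formula. The packaging ratios below (24 pieces per box, 40 boxes per pallet) and the convert helper are invented for illustration; real ratios would vary by product and would live in a conversion table.

```python
from fractions import Fraction

# Hypothetical packaging hierarchy for one product, expressed in pieces.
PIECES_PER_UNIT = {
    "piece": 1,
    "box": 24,
    "pallet": 24 * 40,
}


def convert(quantity, from_unit: str, to_unit: str) -> Fraction:
    """Convert a quantity between units by way of the standard unit (pieces)."""
    pieces = Fraction(quantity) * PIECES_PER_UNIT[from_unit]
    return pieces / PIECES_PER_UNIT[to_unit]


# Manufacturing reports 3 pallets; the warehouse fact stores pieces.
fact_pieces = convert(3, "pallet", "piece")

# Each data mart presents the same fact in its own unit.
print(fact_pieces, convert(fact_pieces, "piece", "box"), convert(fact_pieces, "piece", "pallet"))
```

Exact fractions are used so that a partial pallet or box is not lost to rounding when quantities are converted back and forth between business units.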
You will also have to deal with corporate political issues - autonomous global offices may not eagerly follow corporate standards or best practices. These locations may not want to allow direct access to "their" data. When shared data definitions are proposed, problems could ensue and standards may never get promoted. There may have to be layers of meta data - centralized meta data and local meta data. This will impact implemented architectures, as important yet basic units of measurements will differ.
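Layered meta data can be pictured as a lookup that consults an office's local definitions first and falls back to the centralized ones. The sketch below is one possible arrangement, not a recommendation from any standard; the entries, the office name and the rule that local entries take precedence are all assumptions made for illustration.

```python
from collections import ChainMap

# Centralized meta data: the enterprise-wide definitions.
CENTRAL_METADATA = {
    "net_revenue": {"definition": "Invoiced sales less returns", "unit": "USD"},
    "unit_volume": {"definition": "Quantity shipped", "unit": "piece"},
}

# Local meta data: per-office entries that differ from the central ones.
LOCAL_METADATA = {
    "tokyo": {
        "unit_volume": {"definition": "Quantity shipped", "unit": "box"},
    },
}


def metadata_for(office: str) -> ChainMap:
    """Local entries are consulted first; anything missing falls back to central."""
    return ChainMap(LOCAL_METADATA.get(office, {}), CENTRAL_METADATA)


tokyo = metadata_for("tokyo")
london = metadata_for("london")
print(tokyo["unit_volume"]["unit"], tokyo["net_revenue"]["unit"], london["unit_volume"]["unit"])
```

Keeping the two layers separate makes every local deviation visible to the data steward, because each override is an explicit entry rather than a silent edit to the shared definition.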
If your data warehouse has infrastructure components that are far offshore - in a less developed country - you will have to make sure you can quickly procure network and systems hardware and software in a jam. Will what you need be available as soon as possible and priced competitively? All licenses will need to be valid (legal) and current in order to get continual just-in-time support. Also, political considerations must be noted. In many countries, there tend to be headaches with hardware and software consistency. Prices and support for both software and hardware may fluctuate greatly due to price and import restrictions, and changing laws and trade agreements can affect the availability of software modules and hardware.
You want a solid return on the investment in your international data warehouse. Often, you can build on top of existing systems, incorporating a mix of legacy systems and components into the shiny and new consolidated data warehouse infrastructure, offering (within budget) flexible real-time data mining and analysis to user communities around the planet. With proper analysis, design, implementation and risk management of the global warehouse, you should have a system that will not only provide the organization with sound data on all aspects of its business, but one that will help shape critical enterprise decisions and future core company values.