RFID technology is conceptually easy to understand. So is data warehousing, and we know there's a lot to data warehousing. RFID is a tagging technology that uses electronic transmitters, or backscatter devices, that (usually) carry a standard code representing the item each is attached to - whether that is a shirt, an automobile, a cow, a bird, a computer, a pallet of goods or a passport. The tags are read by readers, which are usually in fixed locations.
Hence, the combination of the item, the location, the time of the read and sometimes other factors, such as temperature, provides the meaningful information a company seeks from the use of RFID. The clear advantage is that, through well-placed readers and inference, the company knows exactly, or nearly exactly, where the items are at any point in time. However, that information is useless unless - and until - the company has access to it and the business knowledge to process it.
Setting up the infrastructure is its own discipline. Take, for example, a grocery store. Store chains could issue shopper cards linked to a credit card to create a personalized experience as well as line-free checkout: the shopper simply passes the reader at the end of the shopping trip. It would be a mistake to think that reader saturation is necessary for effective RFID applications. Readers are needed only at key locations such as entrances, exits and pickup areas to enable this experience. The many setup factors include the practical problems of "overlap" (multiple readers picking up a tag) and "nonresponse" (no reader picking up a tag). Overlap is more prevalent in shipping and receiving areas, where readers are concentrated around heavy pallet traffic.
Tags can be passive, semipassive or active, as I discussed in my October and November 2006 columns. There are tag reader choices and antenna design decisions to make as well. Tag generation and printing accuracy are not to be taken lightly either; RFID tag generation and printing is a major industry in itself.
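To make the two setup problems concrete, here is a minimal sketch of how a site survey might flag them. The function name, the tag and reader IDs, and the idea of comparing reads against a list of tags known to have passed through are all assumptions made for illustration.

```python
from collections import defaultdict


def classify_reads(expected_tags, reads):
    """Return (overlap, nonresponse) for one time window.

    expected_tags: tag IDs known to have passed through the area.
    reads: (tag_id, reader_id) pairs collected in that window.
    overlap maps a tag to the readers that saw it when there were 2 or more.
    nonresponse lists expected tags that no reader saw.
    """
    readers_by_tag = defaultdict(set)
    for tag_id, reader_id in reads:
        readers_by_tag[tag_id].add(reader_id)

    overlap = {
        tag: sorted(readers)
        for tag, readers in readers_by_tag.items()
        if len(readers) > 1
    }
    nonresponse = sorted(set(expected_tags) - set(readers_by_tag))
    return overlap, nonresponse


# Hypothetical dock-door window: three pallets expected, two readers installed.
expected = ["PALLET-1", "PALLET-2", "PALLET-3"]
window_reads = [
    ("PALLET-1", "dock-A"),
    ("PALLET-1", "dock-B"),  # same tag picked up by a neighboring reader
    ("PALLET-2", "dock-A"),
]
overlap, nonresponse = classify_reads(expected, window_reads)
print(overlap)      # {'PALLET-1': ['dock-A', 'dock-B']}
print(nonresponse)  # ['PALLET-3']
```

A persistently high overlap count suggests readers are placed too close together or powered too high; persistent nonresponse suggests a coverage gap.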
Once we are past the setup, it's time to deal with the iceberg under the tip - and that is the information onslaught. The amount of information uptake into a corporation when RFID is implemented can be unprecedented. The largest data stores in the world soon will be in manufacturing and will comprise mostly item movement data.
Consider 150,000 items on a retailer's store shelves and a reasonable number of readers checking the major item movement conditions: expected traffic comes to about two gigabytes of database management system storage per day per store. A 1,000-store chain would therefore generate about two terabytes per day, even though the individual records are fairly small. Writebacks have yet to emerge commercially, but they stand as another potential data exploder.
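The arithmetic behind these figures is simple enough to check. The sketch below uses the two numbers given above; the decimal conversion (1,000 GB per TB) and the yearly extrapolation at a constant rate with nothing summarized or purged are my assumptions.

```python
GB_PER_STORE_PER_DAY = 2  # figure from the text
STORES = 1_000            # figure from the text
GB_PER_TB = 1_000         # assumption: decimal units


def chain_volume_tb(days):
    """Raw chain-wide volume in terabytes over the given number of days."""
    return GB_PER_STORE_PER_DAY * STORES * days / GB_PER_TB


daily_tb = chain_volume_tb(1)
yearly_tb = chain_volume_tb(365)
print(daily_tb)   # 2.0
print(yearly_tb)  # 730.0
```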
The Electronic Product Code Information Services (EPCIS) standard was recently approved. EPCIS is a way for partners to share information using EPC as the product master. In keeping with the spirit of item-level tracking, EPCIS is more granular than its predecessor, GDSN. EPCIS supports both push and pull models, including, in some cases, open queries for supply chain partners. It paves the way for a further data explosion if the retailer decides to track its stock keeping units (SKUs) before they reach the retailer's distribution center, because that tracking occurs through the manufacturer's EPCIS network. Size estimates for the information that the RFID information architecture must handle then grow sharply.
The SKUs would be placed into cartons, pallets and lots, each of which would be tagged as well. This brings me to the four pillars in the information architecture to support RFID - master data, operational business intelligence (BI), concentration and analytics.
Figure 1: Do you envision your company offering RFID products/services in the next three years?
Master Data
EPC is an emerging standard for assigning the codes that RFID readers read. Each code consists of 96 bits that uniquely identify the item. It is broken out like a UPC, with a category and subcategory, but there is also room for a serial number to uniquely identify each item. The existing specification for this code will account for almost every physical asset on the planet worth tagging for the foreseeable future. It's important to understand that any reader can read any tag unless cryptography has been applied to the tag.
By any criterion for deciding whether to adopt an external standard, EPC qualifies as the RFID product master in a master data management (MDM) environment; this is where RFID data meets MDM. The EPC code space is huge, however, and what would be more interesting is a scaled-back version that includes just the products the company deals with.
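As an illustration of how a 96-bit code can carry a classification plus a serial number, here is a minimal sketch that packs and unpacks four fields. The field names and widths are my assumption, modeled on the general identifier (GID-96) layout as I understand it: an 8-bit header, a 28-bit manager number, a 24-bit object class and a 36-bit serial. The manager and object class loosely play the roles of the category and subcategory described above. Other EPC schemes divide the bits differently, so treat this as a teaching aid rather than a conforming encoder.

```python
# Field widths in bits, most significant field first (assumed layout).
FIELDS = (("header", 8), ("manager", 28), ("object_class", 24), ("serial", 36))
assert sum(width for _, width in FIELDS) == 96


def pack_epc(**values):
    """Pack the named fields into one 96-bit integer."""
    code = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        code = (code << width) | value
    return code


def unpack_epc(code):
    """Split a 96-bit integer back into its named fields."""
    result = {}
    for name, width in reversed(FIELDS):
        result[name] = code & ((1 << width) - 1)
        code >>= width
    return result


# Hypothetical values for one tagged item.
tag = pack_epc(header=0x35, manager=12345, object_class=678, serial=9_000_001)
print(f"{tag:024X}")    # 24 hex digits = 96 bits
print(unpack_epc(tag))
```

A scaled-back product master would then need to hold only the manager and object class values the company actually trades in, with serial numbers arriving on the reads themselves.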
Operational Business Intelligence
In my regular column "Building Business Intelligence," I have been writing about operational BI. RFID information management is an excellent modern example of the need for it.
In an RFID context, this is where operational actions are processed in real time. Most reads will be normal business and will not require much intelligence. For example, at a ski resort, a tag read may compare the tag number to the valid tags issued that day and open the gate if the number is valid. The American Express Blue Card will send an authorization transaction for the card number to its authorization system. A store might tackle the operational issues of theft, shrinkage, temperature sensitivity of certain products, replenishment, etc. Customs may check a person, via a passport, against a terrorist list.
This pillar will require intelligent deduplication of the streaming location data as well as intelligent matching of relationship data, such as which truck was carrying which lots, based on proximate timestamps at the same reader location.
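Both steps can be sketched in a few lines. In the example below, the `Read` record, the five-second deduplication window, the sixty-second matching gap and the "TRUCK-" tag prefix are all assumptions chosen for illustration; a production system would tune them per reader location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    tag_id: str
    reader_id: str
    ts: float  # seconds since an arbitrary epoch


def deduplicate(reads, window=5.0):
    """Collapse bursts: drop a read if the same tag was seen at the same
    reader no more than `window` seconds earlier."""
    last_seen = {}
    kept = []
    for r in sorted(reads, key=lambda r: r.ts):
        key = (r.tag_id, r.reader_id)
        if key not in last_seen or r.ts - last_seen[key] > window:
            kept.append(r)
        last_seen[key] = r.ts
    return kept


def match_lots_to_trucks(reads, is_truck, max_gap=60.0):
    """Pair each lot with the truck read closest in time at the same reader."""
    trucks = [r for r in reads if is_truck(r.tag_id)]
    pairs = {}
    for lot in (r for r in reads if not is_truck(r.tag_id)):
        candidates = [
            t for t in trucks
            if t.reader_id == lot.reader_id and abs(t.ts - lot.ts) <= max_gap
        ]
        if candidates:
            nearest = min(candidates, key=lambda t: abs(t.ts - lot.ts))
            pairs[lot.tag_id] = nearest.tag_id
    return pairs


raw = [
    Read("TRUCK-7", "dock-A", 100.0),
    Read("TRUCK-7", "dock-A", 101.0),  # duplicate burst
    Read("LOT-1", "dock-A", 110.0),
    Read("LOT-1", "dock-A", 112.0),    # duplicate burst
    Read("LOT-2", "dock-A", 130.0),
    Read("TRUCK-9", "dock-B", 115.0),
    Read("LOT-3", "dock-B", 400.0),    # too long after TRUCK-9 to match
]
clean = deduplicate(raw)
pairs = match_lots_to_trucks(clean, lambda tag: tag.startswith("TRUCK-"))
print(len(clean))  # 5
print(pairs)       # {'LOT-1': 'TRUCK-7', 'LOT-2': 'TRUCK-7'}
```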
Some actions may require other data and, of course, that data must be available to the operational action. The key here is limiting the analytics to what will process in real time, considering the data flow issues if other tag reads would actually be part of the analytics. For example, sounding an alarm when a product reaches the "out door" reader without having passed the "register" reader requires tracking recent register reads in an operational database for SKU comparisons. This recognition must occur before the shopper or thief gets too far beyond the store doors.
This database functions similarly to business activity monitoring (BAM) databases. BAM databases contain temporary data on a high-speed retrieval infrastructure. Some of these databases contain a vast array of database triggers that spawn actions related to incorrect shipments, items and other actionable alerts. Others will be fronted by dashboards with drill-down and other controls, which add latency but allow users to monitor and control business processes in real time.
For some RFID implementations, this operational BI is all there is. There's enough ROI right there. However, other implementations will want to go much further and concentrate the data in the data warehouse to enable analytics - and possibly pass it on to enterprise resource planning (ERP) and other systems.
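The exit alarm can be sketched as a small in-memory store of recent register reads. The class name, the one-hour retention period and the choice to key on tag ID rather than SKU are assumptions for illustration; an in-memory dictionary stands in for the operational database.

```python
class ExitMonitor:
    """Tracks recent register reads and decides whether an exit read
    should trigger the alarm."""

    def __init__(self, retention_seconds=3600.0):
        self.retention = retention_seconds
        self.paid = {}  # tag_id -> timestamp of the register read

    def register_read(self, tag_id, ts):
        self.paid[tag_id] = ts

    def exit_read(self, tag_id, ts):
        """Return True if the alarm should sound for this exit read."""
        self._expire(ts)
        return tag_id not in self.paid

    def _expire(self, now):
        # Keep the store small so lookups stay fast: drop old register reads.
        cutoff = now - self.retention
        self.paid = {t: ts for t, ts in self.paid.items() if ts >= cutoff}


monitor = ExitMonitor()
monitor.register_read("TAG-1", ts=0.0)
print(monitor.exit_read("TAG-1", ts=120.0))  # False: paid two minutes ago
print(monitor.exit_read("TAG-2", ts=125.0))  # True: never seen at the register
```

Expiring old register reads is what keeps this check within real-time bounds; the full read history belongs downstream, not in the store that guards the door.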
Concentration
The reader-to-data-store ratio could be hundreds to one. This would be on the extreme upper end of the concentration processes occurring today. Many data warehouses have been built with a single ERP system as the primary source. Multistore corporations currently receiving point-of-sale data from 100+ stores will understand the rigor involved in the RFID extract, transform and load (ETL) process.
However, before pulling the juggernaut of data into the data warehouse, the cost of managing that data needs to be weighed against its analytical value. It is quite likely that the data will need to be summarized.
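One common form of summarization is to roll individual reads up to a coarser grain before loading. The sketch below counts reads per reader, SKU and hour; that grain, and the sample reader and SKU names, are assumptions, and the right grain depends on which analytical questions the warehouse must answer.

```python
from collections import Counter
from datetime import datetime


def summarize(reads):
    """Count reads per (reader_id, sku, hour).

    reads: iterable of (reader_id, sku, timestamp) tuples.
    """
    counts = Counter()
    for reader_id, sku, ts in reads:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[(reader_id, sku, hour)] += 1
    return counts


detail = [
    ("shelf-12", "SKU-100", datetime(2007, 3, 1, 9, 5)),
    ("shelf-12", "SKU-100", datetime(2007, 3, 1, 9, 40)),
    ("shelf-12", "SKU-100", datetime(2007, 3, 1, 10, 2)),
    ("dock-A", "SKU-200", datetime(2007, 3, 1, 9, 59)),
]
summary = summarize(detail)
print(len(detail), "reads ->", len(summary), "summary rows")
# 4 reads -> 3 summary rows
```

The saving grows with read frequency: a shelf reader polling the same item every few seconds collapses to one row per hour.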
RFID concentration is where the big data decisions are made. After operational BI is applied to the RFID transaction, a decision must be made about the value of the data in a post-operational, analytical data warehouse. Recalling the data volumes previously mentioned, many current data warehouses trying to harness detail data would be beset with scalability issues. There must be further need for the information beyond the operational BI layer. I estimate that adding all RFID data to a non-RFID data warehouse in the retail industry will enlarge that data warehouse by at least fivefold.
In such an environment, there will always be missing information in the data warehouse. RFID data warehouses must be supported with strong metadata describing this missing data to its users in a reader-day exception format. Catch-up mechanisms for acquiring missing data when it becomes available must also be put into place.
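A reader-day exception list can be derived by comparing the readers expected to report against what has actually been loaded. In this sketch, the function name and the representation of loaded data as a set of (reader, day) pairs are assumptions for illustration.

```python
from datetime import date, timedelta


def reader_day_exceptions(expected_readers, loaded, start, end):
    """List (reader_id, day) pairs with no loaded data, start to end inclusive.

    loaded: set of (reader_id, day) pairs already in the warehouse.
    """
    missing = []
    day = start
    while day <= end:
        for reader in sorted(expected_readers):
            if (reader, day) not in loaded:
                missing.append((reader, day))
        day += timedelta(days=1)
    return missing


d1, d2 = date(2007, 3, 1), date(2007, 3, 2)
loaded = {("dock-A", d1), ("dock-B", d1), ("dock-A", d2)}
gaps = reader_day_exceptions({"dock-A", "dock-B"}, loaded, d1, d2)
print(gaps)  # [('dock-B', datetime.date(2007, 3, 2))]

# Catch-up: once the late data arrives and is loaded, the exception clears.
loaded.add(("dock-B", d2))
print(reader_day_exceptions({"dock-A", "dock-B"}, loaded, d1, d2))  # []
```

Publishing this list alongside query results tells users which totals are incomplete, and rerunning it after each catch-up load keeps the metadata current.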
Analytics
One manufacturing data warehouse now being built will comprise location data almost solely for the purpose of discovering flaws in the design process. It needs 500TB of storage, but by running a seemingly mundane tracer through the data, its early flaw detection will more than pay for the cost of that storage.
This is an example of the nature of the data warehouse not only in RFID applications, but also in many future applications where operational BI intercepts requirements. Long-term historical storage of clean information for regulatory, compliance and data mining requirements will nevertheless continue to increase. The data warehouse will need to perform when called upon for queries, mining, strategic dashboarding and reporting, although its concurrency requirements may not need to scale.
By taking a perspective on RFID data well beyond the capabilities of operational BI, a company can perform a much deeper analysis of trends. The longer-term, larger benefits will eventually accrue from smart business decision-making: looking at product (and customer) movement patterns and learning to intervene appropriately in business processes.
Information management is the foundation upon which RFID will fulfill, or fail to fulfill, its promise. Unprecedented data volumes will need to flow appropriately through the pillars of master data, operational BI, concentration and analytics. Only when RFID information management is sound can RFID data properly flow to real-time operational actions and downstream analytics and enable the true success businesses are seeking with RFID.