Implemention Technology
Business Intelligence
Data Administration and Acquisition
Warehouse Engine
  Compaq Alpha 4100

CalFarm is the largest provider of insurance to small and medium-sized farms in the state of California. The company provides small business, automobile, homeowner and health insurance coverage to farmer and non-farmer alike. Recently acquired by Nationwide Insurance of Columbus, Ohio, CalFarm writes about $230 million in premiums each year. In 1996, the company decided to improve the information available within its organization by initiating a data warehousing project.

CalFarm wanted the ability to delve more deeply into the underlying economics of its business, because in the insurance business you don't know how much your product costs when you sell it. You only find out several years later when loss information becomes available. Actuaries look at claims for prior years, so they can predict what future losses will be. That's how they determine reserves and appropriate pricing for current policies. In addition, CalFarm wanted a more effective way to evaluate the business done by both captive and independent agents and to provide better information for external and regulatory reporting requirements.

To take care of CalFarm's information needs, our team built a data warehouse. It includes all premium, billing and loss transactions in CalFarm's property and casualty lines since the current mainframe policy entry system was implemented from 1985 to 1987. That's about 40 to 50 gigabytes of raw data, which is over 100 gigabytes when indices, summaries and downstream data marts are included.

From the outset, our idea was to give users point-and-click access to the data. We didn't want users writing programs with various utilities and languages to access information from the mainframe. We wanted to go beyond just making the information available to making it easily available. This meant moving to a LAN environment.

After evaluating many products and solutions, we chose BrioQuery (Brio Technology) as the query tool and Oracle8 as the database engine. Windows NT was chosen as the database platform because of the need for dedicated processing capability, decreased DASD and administration costs, and the targeted size for the data warehouse. A Compaq Alpha 4100 was chosen as the initial hardware platform.

When our team began transferring the data from the company's mainframe systems to servers on the LAN, we immediately ran into several problems. Often we found something unexpected in the mainframe data that had to be corrected before we loaded it into the Oracle database. For example, we combine data from two systems ­ one system uses a one-digit form code and the other a two-digit code. Instead of waiting for mainframe programming changes to be implemented to make the form codes consistent, we wanted to make these kinds of changes on the LAN with the data we had so we could load it faster and deliver it very quickly. Afterward we could alert the mainframe team to make appropriate programming corrections the next time they ran the data warehouse staging jobs.

Unfortunately, we didn't have the right tools to do this work easily. We would have had to develop C programs, which would be very time-consuming. We then remembered SyncSort (Syncsort Inc., Woodcliff Lake, New Jersey), a sorting utility that provided both speed and an excellent set of utility functions for similar tasks on the mainframe. We found that SyncSort provided mainframe-strength performance and features on both Windows NT and Alpha NT, and also included an easy-to-use GUI. SyncSort proved to be the perfect tool for a wide variety of data warehouses tasks We use it to:

  • Manipulate data for special tasks such as making form codes consistent and adding fields to restrict access to employee-owned policies.
  • Sort load data to improve load performance. It is more efficient to sort the data on the LAN servers instead of within DB2 on the mainframe.
  • Sort extracted data on NT to free up the mainframe while still getting mainframe-strength sort performance.
  • Sort on the join keys to provide better warehouse performance.
  • Copy and merge data to reduce resource usage. SyncSort's copy capability lets us manipulate data without sorting and merge lets us avoid resorting.
  • Exploit parallel processing. SyncSort can run many sorts simultaneously with no degradation in performance.

Once the data warehouse was up and running, CalFarm found more applications for it than expected. That's primarily because CalFarm implemented new front-end systems over the past year, moving from an old mainframe-based system to a newer client/server version. The legacy systems did all their own downstream reporting; but the new systems don't feed the old, so CalFarm now has two separate data sources.
Although CalFarm could have forced the new data into the old reporting system, it made more sense to use the data warehouse for reporting. Since the data warehouse was reading both sources of data, the warehouse became not only an offline system against which people ran queries, but also the company's primary reporting mechanism. The data warehouse is now becoming the system of record, doing scheduled LAN and mainframe-based financial and claims reports that go out on a daily, weekly, monthly, quarterly and annual basis.

In addition to standard reports created by IS, CalFarm has a pilot group of twenty users, primarily actuaries, who use BrioQuery from their desktops to run queries against the data warehouse. They use the information to answer numerous business questions, perform analysis tasks, build "what-if" scenarios including profit center planning and forecasting models, and establish premium rates and the amount of reserves that the company needs to cover expected losses.

The data warehouse is also opening new opportunities for CalFarm to improve its position with agents who sell its policies. Our warehouse team is beginning to provide CalFarm agents with access to the warehouse using an extranet. Agents can run queries on individual policies, look at claims and retrieve driver and vehicle information on an automobile policy or specific rating information on homeowner policies. Our target is to provide agents with full billing information as well, so when customers call, the agents can have all the billing and payment information at their fingertips.

In fact, its data warehouse is an integral part of CalFarm's future plans. Internally, it provides CalFarm with a better understanding of its customers and the marketplace, giving it the ability to make better decisions and manage its agents better. The data warehouse also makes it easier to do business with CalFarm. If CalFarm can provide agents with better information easily, the agents will sell CalFarm's products instead of those of another company. CalFarm's data warehouse puts it ahead of the curve of its competition.

Warehouse Tips

  • One of the best ways to reap the rewards of a data warehouse quickly is to build the warehouse in a modular fashion or in related data marts. However, these components need to be developed using an integrated corporate data warehouse architecture.
  • Providing warehouse data to end users through easy-to-use graphical tools in a LAN environment allows the speediest, most convenient access.
  • Try to make the warehouse development team as autonomous as possible, so you can deliver data quickly and efficiently. For example, we use tools such as Brio to optimize the query process and SyncSort to alter data on the LAN quickly, postponing permanent changes on the mainframe until they can be scheduled efficiently.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access