|Business Intelligence |
|Data Mart |
|Warehouse Engine |
Compaq 4100 Server Cluster
Dell 6100 Web Servers
At MCI WorldCom, a worldwide telecommunications provider, increasingly intensive competition in the post-deregulation marketplace makes it imperative for us to find better ways of serving our customers. Part of this process involves identifying trends in the international call detail records (CDRs). To accomplish this, we required at least 180 days of information accessible for reporting. In six hours, information must be gathered, enhanced, appended to previous data, stored and managed and the resulting query environment made available to nationwide users in a manner allowing for complex queries to be executed in seconds or minutes.
According to Darel Stokes, senior software architect, "Relational technology has made great strides in data retrieval performance, but the fastest relational databases in the world are still too slow to scan the vast amount of International Traffic Data Warehouse (ITDW) data and retrieve the necessary information in an acceptable amount of time." Stokes continued, "After realizing this, we abandoned the idea of a relational database and began searching for a technology that would meet our needs."
We created a data warehouse, (ITDW), with two repositories. The first, a conventional relational database, would store all CDRs in their enhanced form and be searched at a much slower rate than the higher performing data mart. In the second, summarized CDR data representing the switch traffic would be used as the main reporting mechanism and would bear the brunt of the reporting work.
The application involved receiving CDRs from multiple sources and verifying, reformatting and enhancing the data prior to loading it simultaneously into the ITDW for archival purposes and into the QueryObject data mart for near-time analysis.
The ITDW consists of four components:
Data harvester receives CDRs from the switch infrastructure, enhances them into load files for the data warehouse and data mart, and stores a few weeks of data for retrieval should the mart or warehouse platforms fail. It consists of two Compaq 4100s configured in an available server cluster, redundant RAID controllers and twin path disk drives configured in RAID5 sets.
Cube creation engine a Compaq 4100 running the QueryObject cube build process. Its main duty is to generate cubes from summarized data received from the data harvester processors. It hosts the sort/merge functions during report generation prior to moving to Microsoft Excel for final report preparation.
Cache server a four CPU, 512MB Windows NT 4.0 Server hosting distributed common object model (DCOM) objects that perform remote data object (RDO) communication to the QueryObject Systems (QOS) ODBC driver. This processor hosts a large disk farm that contains fractal cubes created by the cube creation engine.
Web servers host the Java front end allowing customers to interact with the ITDW system. These servers also contain DCOM objects that receive the report descriptions from the Java interface, schedule work, process communication to the cache server via DCOM and output the final report to the user's directory in Excel format. The Web servers are Dell 6100s running Windows NT 4.0 and Internet Information Server 4.0.
Our data warehouse runs Oracle as its database engine for storing CDRs. QueryObject System runs as a parallel analytical data store feeding current data to users for timely analysis. Intranet-accessed reports from the system are generated using a Java-based query tool that speaks directly to the QueryObject open ODBC interface, enabling the user to pull and sort the data required for analysis.
We evaluated many technologies in our search. Most had limiting factors that disqualified them from our solution set until we found QueryObject Systems. The critical features that the fractal cube provided us were:
High-speed data access. Query solutions designed around the cube are orders of magnitude faster than the same solution modeled on an RDBMS platform. MCI WorldCom challenged our best relational database designers to implement a schema to compete with the cube. We tested our solutions against the cube and found that none of the relational solutions came within hours of the completion time of the cube queries.
Flexibility. The cube is accessible from UNIX and Windows NT (server and workstation).
Economy of size. The fractal cube allows the ITDW to store weeks of data in 33 percent of the space required by a normalized relational model. (A relational model was built to substantiate this claim).
Support. QueryObject Systems provided excellent pre-sales support rivaled only by their post-sales technical support. According to Stokes, "In my seventeen years of computing experience, I've never seen better customer support and speed of problem resolution."
The fractal cube also has inherent limiting factors: the cube is read-only. This is not a technology to be used as a transaction-processing engine a non-issue for us, as we required read-only data. The solution set of answers is limited to a billion billion answers. This seems large, but becomes a design constraint because all possible answers are calculated during the cube creation process.
QueryObject System allows ITDW to support hundreds of millions of new CDRs per month. Captured daily, they are processed and merged with previously captured data, and the resulting aggregate cube is hosted on a Web server for analysis. The initial QueryObject-based system was specified to handle just six months of data, but it worked so well that the specification was revised upward and is now running at over 14 months of data, providing a detailed historic view of network usage trends.
Ad hoc, on-demand reports containing hundreds of complex queries are generated, typically in less than three minutes, over a corporate intranet. Web query speed is consistent, whether the question asked involves a single day's data or the full 14-month aggregate. Prior to installing the QO-based system, users had online access to only one week's data and query times could run as much as 20 minutes. Data from previous weeks was only accessible through an overnight query process.
The benefits have been tremendous and include improvement of load balancing, better analysis of network usage, improved fraud detection, next-day analysis of routing changes, more timely attention to network issues and better trending due to larger data samples.
All this has been achieved with a system that runs on ordinary hardware and provides access to the analytical environment through an industry-standard interface. The first QueryObject data mart was implemented by just six dedicated MCI WorldCom personnel in less than two months. Several more QOS data marts have since been constructed, providing alternate views of data. "The QueryObject data mart empowered MCI to make better business decisions based on timely, responsive access to all the relevant business information," said Stokes. "The bottom line is the exponential increase in productivity and corresponding return on investment."
Stokes recommends the following guidelines:
- Pick reliable vendors who want to partner in a successful venture, not just sell hardware and/or software.
- Don't be afraid to look at the unconventional solution when "accepted practices" are not sufficient.
- Prototype the solutions as much as possible.
- Know your data! (If no one on your staff understands your data, find a "domain expert" who can provide insight.)
- Proliferate your business knowledge to the entire team. (Nothing is more frustrating than requiring business knowledge that can only be attained from someone who has called in sick.)
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access