BACKGROUND: The Year 2000 U.S. decennial census will survey approximately 270 million Americans in more than 100 million households. Responsibility for tabulating the results, which means producing 640 statistical tables, some of them for as many as 10 million geographic areas, in a period of six months, falls to the Census Bureau's Data Access and Dissemination System (DADS). The job is complicated by the as-yet-unresolved controversy over application of statistical adjustment techniques to correct undercount and double-counting problems. For the Census 2000 dress rehearsal currently underway, DADS must prepare to perform both adjusted and unadjusted tabulations.
PLATFORMS: Dress rehearsal tabulations are performed on a dual-processor Compaq ProLiant running Windows NT. This server has a RAID disk configuration, one gigabyte of main memory and a backup machine dedicated to analysis. Tabulations will likely be moved to the data-preparation and output-processing machine, an IBM RS/6000 SP running AIX, for the Year 2000 census.
PROBLEM SOLVED: The Census Bureau chose a unique off-the-shelf package, SuperCROSS from Space-Time Research (STR) of Melbourne, Australia, as the Census 2000 analytical tabulation engine. The choice saved an estimated year of development work, which allowed DADS to focus on data quality concerns rather than coding.
PRODUCT FUNCTIONALITY: SuperCROSS is the core of an integrated suite, the SuperSTAR system, that also includes the SuperMAP geographic information system, SuperCHART for analytical graphing, SuperTABLE for dissemination of summarized data and the SuperMART Builder. SuperCROSS belongs to the OLAP category of analytical tools with an underlying multidimensional data model. As befits a tool designed for analyzing demographic statistics, geography and time have special meanings to the system, which also permits multiple hierarchies in a given dimension. Formatted analytical tables are composed via a graphical interface that allows fields (dimensions from the source database, which may be recoded according to user criteria or transformed with user-defined functions) to be placed in the row, column and wafer axes of a table; nested and composite dimensions are permissible. Tabulations can be run interactively or via a production module that creates a job file comprising the cross product of one or more tables with one or more hierarchical dimensions, potentially representing thousands of statistical tables. The client/server monitor manages job submission to a remote server, which can protect access to record-level data; smaller jobs against nonprotected data can be run locally on the user's desktop.
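The tabulation model described above can be illustrated with a minimal Python sketch. It is not SuperCROSS code and does not reflect the product's actual interface or job-file format; the records, field names, the recode_age bands, the table names and the helper functions tabulate and expand_job are all hypothetical, chosen only to show a recoded field placed on a row axis, plain fields on the column and wafer axes, and a production job built as the cross product of tables and geographic areas.

```python
from collections import Counter
from itertools import product

# Hypothetical record-level data; field names are illustrative only.
RECORDS = [
    {"state": "MD", "age": 34, "sex": "F"},
    {"state": "MD", "age": 71, "sex": "M"},
    {"state": "VA", "age": 12, "sex": "F"},
    {"state": "VA", "age": 45, "sex": "M"},
    {"state": "MD", "age": 8, "sex": "M"},
]


def recode_age(age):
    """User-defined recode: collapse single years of age into three bands."""
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"


def tabulate(records, row, column, wafer):
    """Count records per (wafer, row, column) cell.

    Each axis argument is a function mapping a record to a category label,
    so a recode or any other transformation can be applied on the fly.
    """
    counts = Counter()
    for rec in records:
        counts[(wafer(rec), row(rec), column(rec))] += 1
    return counts


def expand_job(table_names, geographies):
    """Build a job list as the cross product of tables and geographic areas."""
    return list(product(table_names, geographies))


# Age band (recoded) by sex, with one wafer (layer) per state.
cells = tabulate(
    RECORDS,
    row=lambda r: recode_age(r["age"]),
    column=lambda r: r["sex"],
    wafer=lambda r: r["state"],
)

# Two hypothetical table definitions crossed with three areas gives six runs.
jobs = expand_job(["age_by_sex", "age_by_state"], ["US", "MD", "VA"])
```

Passing each axis as a function keeps the recode separate from the counting step, which mirrors the idea that the same source field can be recoded differently for different tables. The cross product grows multiplicatively, which is how a few table definitions and a deep geographic hierarchy can expand into thousands of tables.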
STRENGTHS: SuperCROSS is a proven solution that has been used, albeit for smaller-scale problems, at a number of national statistical offices. The data model and GUI are well-suited for analysis of demographic data.
WEAKNESSES: SuperCROSS currently runs only on Windows and NT, which offer limited scalability, security and manageability. This shortcoming will disappear with the late 1999 delivery of SuperSTAR II, which will run on leading UNIX platforms in addition to Windows. This next-generation release will also offer a Web interface and a scripting interface.
SELECTION CRITERIA: The essential criterion was the ability to accurately compute a very large volume of statistical tables in a very tight time frame. SuperCROSS' performance in evaluation trials far exceeded that of the leading statistical analysis engine and the leading RDBMS, even when those products were running on large multiprocessor systems and programmed by expert consultants. SuperCROSS provided a unique advantage in being the only available tool with a suitable, built-in GUI.
DELIVERABLES: Building around SuperCROSS, the lead DADS team was able to deliver a documented processing system for production of Census 2000 dress rehearsal data products on schedule and under budget. Load testing and redeployment to a UNIX platform are planned for 1999 in anticipation of Census 2000 data availability in the fall of 2000.
VENDOR SUPPORT: The vendor, Space-Time Research, is a small company located literally on the other side of the world from the United States. The company's size and location have positive and negative aspects. STR is nimble and responsive, and questions can often be answered overnight. Conversely, support during the U.S. workday won't be available until the anticipated opening of a U.S. office and conclusion of a support agreement with a U.S. company.
DOCUMENTATION: STR provides comprehensive printed and electronic (PDF) documentation. New software releases are often available for Web download.