Continue in 2 seconds

Scalable I/O, Part I: Disk Striping

Published
  • May 01 1998, 1:00am EDT

Denormalization and advanced indexing techniques are highly effective for improving the performance of your data warehouse and other applications. But, regardless of how effectively you use these techniques, you will still need to frequently execute large and complex queries. To get good query response times, you need (among other things) high performance and scalability out of your I/O subsystem. By using scalable and parallel I/O techniques, you can push your system components (disks, controllers, CPUs and memory) to the limits of their capacity, thereby fully utilizing your hardware's capabilities.

I/O and Database Scan Threads

One of the key factors in a database I/O discussion is the disk reader process, commonly referred to as a "scan thread" or "scan process," which is responsible for getting data off of a disk and sending it to a CPU. It is important to realize that at any given point in time, a scan thread can only read from a single disk and can only ship that data to a single CPU. In addition, each disk can only serve a single request from a single scan thread at a time. Therefore, if we have four CPUs in our system, unless we have four scan threads and at least four disk drives, all four CPUs will not be able to run simultaneously--at least one would be idle.

Our goal is simple: keep data flowing to all CPUs at all times. If we were to put all the data on one disk, then only one scan thread could be active at a time. Therefore, we need to spread the data out across multiple disks. There are two complementary techniques for doing this: striping and database partitioning. This article will cover striping; next month I'll cover partitioning.

The Basics of Striping

The process of taking a pool of data and evenly dividing it up across a set of drives is called "striping." To illustrate, let's look at a data warehouse that stores information pertaining to foreign automobile sales for the United States. Assume that our platform has four CPUs. Also, for simplicity's sake, assume our warehouse only has information on Mercedes, Porsche, BMW and Volvo, and that it has roughly an equivalent amount of information on each type of car, all of which is stored in a single table called "Car_Sales." If we were to put all the information on a single disk drive (assuming for a minute that it would fit), then only one scan process would be able to read the table at a time. The optimal solution is to spread the data (that is, stripe the data) across at least four disks as shown in Figure 1.

Now, when we query Car_Sales, we can use four scan processes, and each process will read one-fourth of the total table, completing a query in approximately one-fourth of the time it would have taken to scan the table without parallelism. We can continue extending this principle even further by adding more CPUs, striping the data over more disk drives and using more scan processes.

Striping can be performed by either the hardware, the operating system or the database. Each one has its benefits and drawbacks.

Hardware Striping

This method of disk striping involves purchasing specialized intelligent disk array technology which includes additional hardware that automatically handles striping the data across the multiple disks in the disk array. To the rest of the system, this disk array usually looks like a single (albeit very fast) disk drive that has the ability to simultaneously handle multiple I/O requests. The striping is usually done in a round- robin fashion, which means that chunks of data (usually 32K to 64K each) are distributed to disk drives similar to the way a card dealer deals out a deck of cards. The benefit of using this technique is that data is spread evenly over many physical devices, balancing the I/O load across all the disks. This, therefore, minimizes the risk of having disk "hot spots" which occur when data is requested from some drives much more frequently than others. Another advantage is that these intelligent disk arrays also automatically handle various RAID levels.

However, the hardware striping solution is usually the most expensive method of achieving disk striping. Also, the resulting parallelism is not necessarily what you would hope for. In general, to maximize I/O throughput, you always want to have the disk head move smoothly across the disk in one continuous motion, streaming the data back to the system as it goes. Unfortunately, because the database is unaware of how the data is actually striped (remember, with hardware striping the striping is intended to be transparent to the system), the I/O requests issued by a single scan thread will almost always reference data that is on multiple drives. This problem affects each scan thread, so all the disks in the array are constantly satisfying requests from multiple threads. The disk heads will have to constantly seek back and forth, significantly lowering I/O performance.

Operating System Striping

Operating system striping introduces the concept of a "logical volume group," which (similar to hardware striping) appears as a single disk device to the database. However, it actually consists of pieces from multiple physical disk drives logically grouped together to give the appearance of one physical device. The data is distributed across the pieces of the various disks in a round- robin fashion. As with hardware striping, this approach removes hot spots, and since there is no special hardware required, this solution is cheaper. However, more CPU resources are needed to manage the logical volume group. Also, it suffers from the same head-seeking problem I discussed earlier, since a scan thread can only issue a request to the logical volume group, not to a specific disk within that group (see Figure 2).

Database Striping

Of the three striping techniques available, database striping is the easiest to employ and offers the best performance when there is a smaller number of concurrent users (than in typical OLTP applications) running parallel queries. Using this technique, a database table is divided into a number of sections (called "fragments" or "extents"), and each section is assigned to a specific drive. The database then has the ability to assign a single scan process to a single fragment/extent and, as a result, be assured that the head seek movement will be minimized (because the scans are sequential and the disk head does not have to be repositioned elsewhere on the disk platter to service another request).

The downside of database striping lies in the fact that each fragment/extent is a separate operating system file. Therefore, there will be many more data files to manage compared with operating system striping or hardware striping, where the four separate sections would be treated and managed as a single file. It's simply a tradeoff between performance and ease of maintenance.

Conclusion

A scalable application must be thought of as a "performance chain." All components in the chain must be scalable for the entire application to be scalable. If any component is not scalable, then a bottleneck exists in your performance chain, and your application as a whole will have limited scalability. As a major component of any application, the I/O subsystem must also be scalable. Striping is one of the most effective techniques for removing disk hot spots, and it's required if you want to be able to take advantage of I/O parallelism and have a highly scalable I/O subsystem.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access