I remember going with my mother to Steinback's Department Store in Red Bank, New Jersey, sometime around 1965. It was the first time I heard the word "computer." The context was a billing error. And the computer was the excuse for that billing error. More, I remember the clerk telling my mother that he could not fix the error at that time; he would have to send the information along to the computer, clearly somewhere else, and we would have to wait. I wondered what this "computer" was.
Fast-forward 40+ years and I doubt we can find many eight-year-olds that don't know what a computer is. At the same time, few if any of today's eight-year-olds would have any notion of waiting for a computer. We are simply accustomed to computers serving us all the time, when and where we need them. By and large, computers are highly available and highly reliable.
This column focuses on high availability in databases, specifically in SQL Server 2005. When we planned "Yukon," we took a long look at the reasons databases become unavailable or seem to become unavailable. That's a subtle distinction. Certain pathologies exist that will lead a user to think the database is not responding and, therefore, unavailable; yet the database administrator who checks on the database will find it running. Of course, if the database is running but appears not to be, the user will be unhappy just the same.
SQL Server 2005 implements several key features that increase availability of the database. This column discusses new availability features. Clustering is not included because while much improved in 2005, it already existed in 2000. The first and perhaps the most important high-availability feature is database mirroring. SQL Server 2005 Enterprise Edition and Standard Edition implement mirroring. In concept, mirroring is simple and builds on a feature we had in 2000, log shipping. When two servers are mirrored, one acts as the principle while the other acts as the mirror. The application and its users see one server, with one name. A third server, called the witness, keeps an eye on the whole arrangement. As the principle server accepts and handles transactions, it makes sure the mirror gets exact copies. That is, it ships log records to the mirror. If the principle becomes unavailable for any reason, the mirror takes over with exactly the same data. Strictly speaking, the witness is optional. When present, it confirms for the mirror that a failure has happened, allowing the mirror to automatically take on the role of principle. Since the (former) mirror has all the transactions of the principle, no work is lost. All transactions committed on the principle before the failure will appear on the (former) mirror. Your database administrator or systems administrator can fix the former principle and bring it back online. No user will ever notice the failure.
SQL Server 2005 supports two types of mirroring, synchronous and asynchronous. In both modes, the principle hardens (writes to disk) its transaction log records and sends them to the mirror. In synchronous mode, the principle waits for acknowledgement from the mirror before moving on. The mirror hardens its version of the log records and sends the acknowledgment back to the principle. This is very safe. But because it adds time and motion to every transaction, synchronous mirroring is slower than no mirroring. If you can trade off a little safety, look at asynchronous mirroring. In this mode, the principle gets its transactions over to the mirror and moves on with its workload. There is virtually no performance penalty. However, there are small windows during which a failure of the principle will lead to loss of data. Asynchronous mirroring is especially useful when the principle and mirror are far apart.
I have spoken about mirroring with many audiences. In 2005, an audience member asked if the principle, mirror and witness can run on the same physical machine. Most of the audience laughed. But the question illustrates an important point about SQL Server 2005. The person who asked the question was in a sense saying that he trusts hardware more than software. It is possible to buy "bulletproof" hardware.
You can buy machines with redundant processors, memory, storage (SAN, for example) and even power supplies - for a price. Database mirroring in SQL Server 2005 provides bulletproof availability on commodity hardware at a far lower price. It is another example of enterprise capability at affordable prices.
Another feature of SQL Server 2005, online operations, addresses cases where you formerly took a database offline to perform maintenance tasks, for example, to rebuild an index. With online indexing, your database remains available while SQL Server works in the background to perform the indexing task. Users will experience some degradation, generally on the order of a few percent. This is far less than the degradation of taking a table or whole database offline. In the old days, we could do maintenance at night or over the weekend. Of course, global companies have no "night," and increasingly, there is no weekend either. Online indexing is available only in SQL Server 2005 Enterprise Edition.
An example of a database appearing to be unavailable when in fact it is running is the classic case of readers blocking writers and writers blocking readers. Typically database engines use record locking to control concurrency. If users are to see consistent results during queries, the database has to lock rows to prevent other transactions from changing data before these queries complete. A long-running read transaction (query) can block a write transaction. In SQL Server 2005, we added snapshot isolation. Using this isolation mode, SQL Server maintains transaction-specific row versions in tempdb. You may insert or change rows in the table while my read transaction is open. Before returning rows to me, the database engine compares row versions and only shows me the versions that were current when I started my transaction. This is very useful for databases that support reporting and ad hoc analysis applications.
Last, let's look at a feature called database snapshots. This is like CTRL-Z (undo) for databases. In the event of a user or application error, administrators may want to restore a previous state of the database. Copying the database with any frequency is too expensive to consider - unless you can make that snapshot very quickly. SQL Server 2005 implements "copy-on-write" when making database snapshots. Initially a snapshot contains no data. Once the snapshot is taken, SQL Server makes copies of data pages before it changes those pages. Over time, as more pages change, the snapshot gets larger.
When the administrator wants to restore a snapshot, the engine simply restores the changed pages. Copy-on-write makes it reasonable to take frequent snapshots, providing an extra measure of database availability.
I want to share a quick story. The moral is that high availability requires your DBA's involvement. No matter how robust and sophisticated your database platform is, your administrator's practices have at least as much to do with the availability you experience as the software does. I was visiting a customer several years ago. They were using SQL Server 7.0 at the time. I made some point about customers' experience with high availability. Right then the IT director slapped the table with his palm and said, in essence, that we could forget five nines, he wasn't seeing five sixes. We were shocked that any customer would experience only two-thirds availability and asked to see the physical server. They led us to it. We found a desktop computer, on its side, under the administrator's desk. At the moment we walked up, he was playing MS Golf ... on the database machine.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access