Today we resume with Day 25 of the series and I want to discuss improving performance of backups and restores. The good news is that what works for one generally works for the other. Most actions you take to speed up backups will also speed up restores. Double bonus.
If you missed any of the earlier posts in my DR series, you can check them out here:
- 31 Days of Disaster Recovery
- Does DBCC Automatically Use Existing Snapshot?
- Protection From Restoring a Backup of a Contained Database
- Determining Files to Restore Database
- Back That Thang Up
- Dealing With Corruption in a Nonclustered Index
- Dealing With Corruption in Allocation Pages
- Writing SLAs for Disaster Recovery
- Resolutions for All DBAs
- Use All the Checksums
- Monitoring for Corruption Errors
- Converting LSN Formats
- Extreme Disaster Recovery Training
- Standard Backup Scripts
- Fixing a Corrupt Tempdb
- Running DBCC CheckTable in Parallel Jobs
- Disaster Recovery Gems From Around The Net
- When are Checksums Written to a Page
- How to CHECKDB Like a Boss
- How Much Log Can a Backup Log
- The Case of the Backups That Wouldn’t Restore
- Who Deleted That Data?
- Which DBCC CHECK Commands Update Last Known Good DBCC
- Restoring Differential Backups With New Files
- Handling Corruption in a Clustered Index
Adjust the Backup Buffers
I mentioned this first tip earlier in the series when I posted some DR gems from around the net. Nic Cain (blog|@SirSQL) has written a process (Automated Backup Tuning) to automate determining the best buffer settings for backups. This is great because the only real way to figure out the optimal setting for any given server is through trial and error. Nic’s process automates the trial part of it and lays out the results so you can easily pick out the optimal settings.
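To give a sense of what the tuning process exercises, the buffer settings can also be set manually on a backup command. The values below are illustrative placeholders, not recommendations; the optimal numbers for your server can only be found by testing:

```sql
-- Illustrative values only: the right settings vary per server and must
-- be found by trial and error (which Nic's process automates).
BACKUP DATABASE MyDatabase
TO DISK = N'X:\Backups\MyDatabase.bak'
WITH BUFFERCOUNT = 50,         -- number of I/O buffers used by the backup
     MAXTRANSFERSIZE = 2097152; -- bytes per transfer unit (2 MB is the max)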
Skip Creating Database Files
When you run a restore, the first step of the restore process is to create the database files at the same sizes they were when the database was backed up. There are a couple of ways to skip the creation of some or all of the database files, and for a very large database, this can save you a great deal of time in the restore process.
The first way is to make sure you have Instant File Initialization (IFI) enabled. Normally, when SQL Server creates a database file, it has to fill it with zeroes (i.e., zero it out) to mark the limits of the file. IFI is a feature that allows SQL Server to mark the limits of the files without filling them. This means that to create a 100 GB file, for example, it does not have to write 100 GB of zeroes. The file creation process is almost instantaneous. Unfortunately, transaction log files cannot be instantly initialized, so this only allows us to skip the creation of the data files. This feature is enabled at the OS level by granting Perform Volume Maintenance Tasks rights to the SQL Server service account via Local Security Policy Editor or Group Policy Editor.
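If you want to verify that IFI is actually in effect for the service account, newer versions of SQL Server (2016 SP1 and later) expose it in a DMV:

```sql
-- Requires SQL Server 2016 SP1 or later.
-- 'Y' means data files can skip zeroing; log files are always zeroed.
SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services;
```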
The other option works for data and log files. If the database files for the database you are restoring already exist, the restore will reuse them. If the files exist and are the right sizes, then you are basically skipping the file creation process. I’ve seen people delete the existing database before starting the restore instead of restoring over the top of it, without realizing that they are prolonging the process. This is particularly helpful when you need to restore a backup of a very large database. If I have to restore a large backup on a new server, the copy process is going to take a long time. While it is copying, I will create an empty version of the database with the exact file specifications of the one I’m going to restore. Once the copy process finishes, the creation of the database is probably finished as well, and I can skip the file creation step by restoring over the database I just created.
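A minimal sketch of that pattern, with placeholder names, paths, and sizes: pre-create the database with matching file specs while the backup file copies, then restore over it with REPLACE.

```sql
-- Pre-create the database with the same file layout and sizes as the
-- database in the backup (names and sizes here are placeholders).
CREATE DATABASE BigDB
ON PRIMARY (NAME = N'BigDB_Data', FILENAME = N'D:\Data\BigDB.mdf', SIZE = 100GB)
LOG ON (NAME = N'BigDB_Log', FILENAME = N'L:\Logs\BigDB_log.ldf', SIZE = 20GB);

-- Once the backup file has finished copying, restore over the existing
-- files; matching files are reused instead of recreated.
RESTORE DATABASE BigDB
FROM DISK = N'X:\Backups\BigDB.bak'
WITH REPLACE, RECOVERY;
```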
Multiple Files on Multiple Dedicated Drives
I can’t stress enough that this recommendation has 3 parts. 1) Multiple files on 2) multiple 3) dedicated drives. You won’t see much if any improvement by simply having multiple files if they are on the same drive. You get 1 backup thread per LUN or mountpoint, not per file. If you write the backup to multiple files on the same drive or even different drives on the same LUN, you only get a single backup thread. No performance boost. The only benefit to this is manageability of moving around and storing smaller files or if you will be able to use multiple dedicated drives for the restore.
A common mistake I see people make is that rather than getting multiple LUNs for backups, they will put one of the backup files on the backup drive and one on a drive that holds database files. If you are trying to write a backup file to a drive that has an active database on it, it will affect performance for the activity of the database as well as the backup. I have seen many cases where this was even slower than writing to a single backup file. It is very important that the LUNs/drives be dedicated to backups only. I have seen directly proportional improvements in backup time by adding more dedicated backup drives. I used to manage a VLDB that took 6 hours to back up onto a single drive. When we added a 2nd dedicated backup drive, backup time dropped in half, to 3 hours. And when we added a 3rd dedicated backup drive, the time dropped to 2 hours.
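Striping a backup across files on separate dedicated drives looks like this (the drive letters are assumed to be separate LUNs or mount points dedicated to backups):

```sql
-- Each file sits on its own dedicated LUN/mount point, so the backup
-- gets one thread per drive. Multiple files on the SAME drive would
-- still share a single backup thread.
BACKUP DATABASE BigDB
TO DISK = N'X:\Backups\BigDB_1.bak',
   DISK = N'Y:\Backups\BigDB_2.bak',
   DISK = N'Z:\Backups\BigDB_3.bak'
WITH COMPRESSION;
```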
Other Tips
- Use Compression: Unless you are using Transparent Data Encryption (TDE) for the database, you should be using backup compression if it is available. Either use SQL Server's native compression or a 3rd-party compression tool. Either way, this is a must-have.
- Crank up the SAN Throughput: People love to talk about the speed of the drives in their SAN or the RAID type, but more often than not, I see SANs bottlenecking on throughput long before they reach the limits of the drives. Work with your SAN admin to increase the queue depth and the number of paths to the SAN. Rule of thumb: more SAN paths (multi-pathing) = more throughput.
- Use Differential/Partial backups: Not all backups have to be full backups. Figure out a mixture of full, differential/partial, and log backups that allows you to meet your SLAs for recovery time and recovery point.
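A simple mixed schedule might look like the following. The frequencies and paths are illustrative; set yours from your actual recovery point and recovery time objectives:

```sql
-- Weekly full backup.
BACKUP DATABASE BigDB
TO DISK = N'X:\Backups\BigDB_full.bak' WITH COMPRESSION;

-- Nightly differential: captures only changes since the last full.
BACKUP DATABASE BigDB
TO DISK = N'X:\Backups\BigDB_diff.bak' WITH DIFFERENTIAL, COMPRESSION;

-- Frequent log backups (e.g., every 15 minutes) to meet the recovery
-- point objective; requires the full or bulk-logged recovery model.
BACKUP LOG BigDB
TO DISK = N'X:\Backups\BigDB_log.trn' WITH COMPRESSION;
```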
Warwick
Robert, your comment: “You get 1 backup thread per LUN or mountpoint, not per file.” Is that not dependent on the type of storage attached? I.e., if it's a SAN and the SAN consists of 10 underlying disks, and the presented mount point has multiple backup files placed onto it, are you saying this still only gets 1 thread and will not perform any better than 1 backup file?
Thanks
SQLSoldier
No, SQL Server is ignorant of the type of storage you have under the covers. When I say “backup thread,” that's a thread sending data to the IO subsystem and in no way signifies what happens in the IO subsystem once it gets the IO. The IO subsystem can absolutely affect the speed of the backup if it can't write the data as fast as it receives it.
Good question, Warwick. Thanks for asking.
Boris Hristov
Can you please clarify this again, because there are some typos in your post…
So, no matter how many drives are “sitting” behind the MP, you are still going to work with one thread?
What will happen if you start 2 database backups that reside on the same MP behind which there are 10 disks?
SQLSoldier
The number of disks has no effect on the number of backup threads you get. SQL Server has no knowledge of the number of disks behind the storage. You get one backup thread per LUN or MP.
Yes, of course, if you start 2 backups running at the same time, each one gets its own backup thread. They don’t share one.