Why is Your SQL Server Slow? All-for-one and One-for-all

I have one more thing I want to talk about regarding the conversation I had recently with my friends and former colleagues at Idera about the things that slow down SQL Server. Another “quick performance wins” you can take advantage of is segregating the types of files SQL Server uses. If you install SQL Server by just clicking NEXT until you get to the end, you end up with everything running on the same drive. Different files have different types of workloads and combining the workloads not only has the potential to very easily overload the I/O of the drive by simply driving too much work, but it makes all reads and writes essentially random. And of course, random reads and writes are slower than sequential reads and writes.

This conversation led to a whitepaper on 5 things that are making your SQL Servers slow and can be downloaded here: 5 Reasons Your SQL Server is Slow . Watch Idera’s Resource Central or follow @Idera_Software on Twitter to be the first to hear whenever a new whitepaper is released. Also, be sure to check out the other posts I made on this topic:

File Segregation

Most DBAs know that you need to segregate database data files from database log files. Hopefully, most DBAs also know that that tempdb files should be segregated to its own dedicated drive. However, there are many who overlook segregating backups to a dedicated drive. It is a common issue I see too frequently. I find backups on the data, log, and tempdb drives all the time. Usually, when I come across this, it is because someone on a forum is complaining either about how long it takes to back up a not very large database or query performance while the database is being backed up.

The answer is simple, if you are backing up to the same drive that the process is reading from, backup performance is going to suffer because you’re asking a single drive to do too much at one time. Similarly, if the the backup is being written to a drive that at the same time there is heavy activity in the database, even if it’s not the same database that is being backed up, query performance can suffer due to I/O latency because again, you are asking a single drive to do to much work. This is equally true if you are restoring a backup on drive that also hosts database files.

But it doesn’t stop at database files, tempdb files, and backup files. There are other scenarios where things may need to be segregated out even more. Specifically, large full-text indexes, distribution database for replication, and the snapshot files for replication. In some rare cases, you may even need to put msdb onto a dedicated drive.

Prior to SQL Server 2008, full-text indexes were saved in catalogs that were stored separate from the database files. For SQL 2005 and earlier, large catalogs were frequently placed on dedicated drives. Even though full-text indexes are part of the database now and not separate files, you may still need to segregate the indexes onto dedicated files. Normal full-text operations do not generally require segregation, but performing a full crawl to populate or repopulate a full-text index can cause extreme I/O contention on a drive. If the index is small and the crawl is quick, the short degradation may be acceptable. If the index is very large and the crawl takes a very large time, the databases using that drive may become unusable.

When your database hosts a replication distributor, I recommend always putting the distribution database on a dedicate drive of its own. The distributor very often has a very heavy workload and it can easily consume all of the I/O resources of a drive. If you have a lot of publications using snapshots, for initialization or for snapshot replication, or very large snapshots, the snapshot activity can in some cases cause high I/O contention. I don’t put the snapshot on a dedicated drive by default, but if I do have a very heavily used snapshot load, I will monitor it for I/O contention and move it to a dedicated drive if I deem it is needed.

Another case where I normally don’t need to provide a dedicated drive, but in some rare cases, it may be warranted is for msdb. In some rare cases, msdb can be under an extremely heavy load with things like monitoring and tracking jobs, heavy usage of database mail and service broker, etc. It is a rare case, but if you have heavy workloads that hit msdb, it is something you may need to watch.

Summary

When you have slowness in your systems, it is never a bad idea to check the I/O load on your drives. The dynamic management view (DMV) sys.dm_io_virtual_file_stats() provides you with some very useful historic information for the files in your server. Below is 3 of my favorite ways to look at these stats, one by file, one by drive letter, and one by database.

-- IO stats by file
Select DBName = DB_NAME(FS.database_id),
    IsSnapshotOf = DB_NAME(D.source_database_id),
    LogicalName = MF.name,
    FilePathName = MF.physical_name,
    Filetype = MF.type_desc,
    FileState = MF.state_desc,
    NumberOfReads = FS.num_of_reads,
    ReadStall_ms = FS.io_stall_read_ms,
    NumberOfWrites = FS.num_of_writes,
    WriteStall_ms = FS.io_stall_write_ms,
    TotalStall = FS.io_stall,
    AvgReadTransfer_ms = Case When FS.num_of_reads = 0 Then 0
                        Else FS.io_stall_read_ms/FS.num_of_reads
                        End,
    AvgWriteTransfer_ms = Case When FS.num_of_writes = 0 Then 0
                        Else FS.io_stall_write_ms/FS.num_of_writes
                        End,
    AvgStall_ms = Case When FS.num_of_reads +  FS.num_of_writes = 0 Then 0
                        Else FS.io_stall/(FS.num_of_reads + FS.num_of_writes)
                        End,
    BytesPerRead = Case When FS.num_of_reads = 0 Then 0
                        Else FS.num_of_bytes_read/FS.num_of_reads
                        End,
    BytesPerWrite = Case When FS.num_of_writes = 0 Then 0
                        Else FS.num_of_bytes_written/FS.num_of_writes
                        End
From sys.dm_io_virtual_file_stats(null, null) FS
Inner Join sys.master_files MF On MF.database_id = FS.database_id
    And MF.file_id = FS.file_id
Left Join sys.databases D On D.database_id = FS.database_id;
Go

-- I/O stats per drive letter
Select DriveLetter = Left(MF.physical_name, 1),
    NumberOfReads = SUM(FS.num_of_reads),
    ReadStall_ms = SUM(FS.io_stall_read_ms),
    NumberOfWrites = SUM(FS.num_of_writes),
    WriteStall_ms = SUM(FS.io_stall_write_ms),
    TotalStall = SUM(FS.io_stall),
    AvgReadTransfer_ms = Case When SUM(FS.num_of_reads) = 0 Then 0
                        Else SUM(FS.io_stall_read_ms)/SUM(FS.num_of_reads)
                        End,
    AvgWriteTransfer_ms = Case When SUM(FS.num_of_writes) = 0 Then 0
                        Else SUM(FS.io_stall_write_ms)/SUM(FS.num_of_writes)
                        End,
    AvgStall_ms = Case When SUM(FS.num_of_reads + FS.num_of_writes) = 0 Then 0
                        Else SUM(FS.io_stall)/SUM(FS.num_of_reads + FS.num_of_writes)
                        End,
    BytesPerRead = Case When SUM(FS.num_of_reads) = 0 Then 0
                        Else SUM(FS.num_of_bytes_read)/SUM(FS.num_of_reads)
                        End,
    BytesPerWrite = Case When SUM(FS.num_of_writes) = 0 Then 0
                        Else SUM(FS.num_of_bytes_written)/SUM(FS.num_of_writes)
                        End
From sys.dm_io_virtual_file_stats(null, null) FS
Inner Join sys.master_files MF On MF.database_id = FS.database_id
    And MF.file_id = FS.file_id
Group By Left(MF.physical_name, 1);
Go

-- I/O stats per database
Select DBName = DB_NAME(FS.database_id),
    IsSnapshotOf = MIN(DB_NAME(D.source_database_id)),
    NumberOfReads = SUM(FS.num_of_reads),
    ReadStall_ms = SUM(FS.io_stall_read_ms),
    NumberOfWrites = SUM(FS.num_of_writes),
    WriteStall_ms = SUM(FS.io_stall_write_ms),
    TotalStall = SUM(FS.io_stall),
    AvgReadTransfer_ms = Case When SUM(FS.num_of_reads) = 0 Then 0
                        Else SUM(FS.io_stall_read_ms)/SUM(FS.num_of_reads)
                        End,
    AvgWriteTransfer_ms = Case When SUM(FS.num_of_writes) = 0 Then 0
                        Else SUM(FS.io_stall_write_ms)/SUM(FS.num_of_writes)
                        End,
    AvgStall_ms = Case When SUM(FS.num_of_reads + FS.num_of_writes) = 0 Then 0
                        Else SUM(FS.io_stall)/SUM(FS.num_of_reads + FS.num_of_writes)
                        End,
    BytesPerRead = Case When SUM(FS.num_of_reads) = 0 Then 0
                        Else SUM(FS.num_of_bytes_read)/SUM(FS.num_of_reads)
                        End,
    BytesPerWrite = Case When SUM(FS.num_of_writes) = 0 Then 0
                        Else SUM(FS.num_of_bytes_written)/SUM(FS.num_of_writes)
                        End
From sys.dm_io_virtual_file_stats(null, null) FS
Inner Join sys.master_files MF On MF.database_id = FS.database_id
    And MF.file_id = FS.file_id
Left Join sys.databases D On D.database_id = FS.database_id
Group By FS.database_id
Order By DBName;
Go

If you want to get a good idea of the current IO statistics for your system, simply capture the stats at time regular intervals and then calculate the difference between the values in the ranges. Set up a SQL job to log the results to a table and then you can compare the delta in the captures later to determine where your hot spots are and at what times.