70-450: IO Throughput

Today I’m getting back to studying for the 70-450. To follow up the last few articles I’m still in the first bullet point from the official guide.

  • Designing a SQL Server Instance and a Database Solution (14 percent)
    • Design for CPU, memory, and storage capacity requirements.
      • This objective may include but is not limited to: RAID, calculating table size, IO throughput, transaction per second, data compression, non-uniform memory access (NUMA), tempdb capacity

    Today’s topic is IO throughput. One of the ways you can maximize throughput on your databases is to implement multiple Files and Filegroups. At this point in your DBA career you should know that by default, each database has two files a data file (*.mdf) and a log file(*.ldf). You know that each file is only accessible to one Database.

    What you need to know is what you can do with filegroups as well. Each filegroup is a collection of one or more data files (*.ndf). By implementing filegroups, you can separate your database objects (including tables and partitions of tables) so that you separate the IO operations. You all know that the more you can spread out those operations the faster your IO becomes, right?

    When you implement a filegroup with more than one file, the SQL Storage Engine will fill those data files in such a way that they grow at equivalent rates. So if you have one file with 2GB and a file with 1GB, more data will be written to the larger file (or whichever one has the most free space) until they have approximately the same amount of free space. Then SQL writes to them equally, until there isn’t enough free space to complete a write. Then SQL will expand each file one at a time, to allow more data to be stored.

    While the data will be written to one file, then the next, then the next (serialized), the reads can come from more than one file in the file group at a time. You then have software based striping, rather than RAID striping. The good news is this splitting gives you a boost in reads, even if all the files are on the same disk. But, if you were to move the files to separate discs…you’d seen an even bigger performance gain, not only in reads, but in writes too!

    If you implement multiple filegroups, you can move heavily used tables and objects into their own filegroup, and then optimize the files under that filegroup to maximize performance. You could then group low usage objects into a shared filegroup, and get the best of both worlds.

    While this article is mostly about read and write performance gains, I want to tell you that when you implement multiple files and filegroups, you can backup and restore files and filegroups separately. This could mean a performance boost during a disaster recovery situation. And that’s nothing to sneeze at. Imagine being able to bring back just the single 2 GB file you need in order to get up and running again, rather than having to restore the entire terabyte database!

    The recommendations

    While the MSDN has one set of recommendations, mine are a little different.

    1. Start with a single data file and log file.
    2. When you start seeing high IO waits across all objects, consider adding multiple files to your primary filegroup. I usually max out 1 file per core. Others will suggest up to two files per core. Before going to 2 files per core, test it, see if the additional files help or hurt performance.
    3. When you start seeing high IO wait for a few IO objects, consider adding additional filegroups (with a single file first), relocate the high use objects to one or more filegroups. If a single file – filegroup is insufficient, then add multiple files. Again, test to see how many files you should add.
    4. If you have just one table that’s pulling all your IO waits, put it in a filegroup by itself. Add multiple files. If that’s still not enough, move the files to separate drives (or LUNs). If that’s still not enough, consider partitioning if you’re dealing with very large objects that you could come up with a partitioning scheme to cover.
    5. One and only one transaction log file per database. Adding more does nothing. Logged transactions are simple written round-robin to those extra log files. Don’t put the transaction log file on the same disk as the data. If you start seeing transaction log waits, separate your log files so that you use separate disks (or LUNs) for each log file. While extreme, you can see a great performance boost when a log file is by itself on a drive.

    Summary

    Learning how to manipulate your database with multiple files and filegroups gives you a very powerful weapon when it comes to performance tuning. Before jumping to making changes to your file structure, make sure you’ve got your database design in check.

    If you’re doing high transactional volume, make sure your tables are structured for it. Don’t use huge, wide tables for transactional. Don’t join on varchars. Optimize the design for the work you’re doing.

    Next, make sure your indexes are right. If your IO waits are coming from the fact you have to update 20 non-clustered indexes every time you insert, update, or delete… you’re not going to solve any problems by changing the file structure.

    When all else fails, and you’re trying to resolve IO waits… then, consider the changing up the file structure.

    If you have any questions, send them in. I’m here to help you study for the 70-450, and in any other way I can!

    References

    Using Files and Filegroups
    Files and Filegroups Architecture

Leave a Reply

Your email address will not be published. Required fields are marked *