70-450: Calculating Table Size

In my previous three posts I covered different RAID solutions, and when I use each type of RAID. That’s just the first part of the first section of the 70-450.

Designing a SQL Server Instance and a Database Solution (14 percent)

• Design for CPU, memory, and storage capacity requirements.
• This objective may include but is not limited to: RAID, calculating table size, IO throughput, transaction per second, data compression, non-uniform memory access (NUMA), tempdb capacity

Let’s move on to the next topic, calculating table size. I’ll admit, when I get questions about table size it’s usually how large is the table now, and how has it grown over time. That way we can make the call on whether or not we need to move the database to a new disk with more space, or add a data file to the database, or simply allow the database more room to grow.

When I did some digging into the MSDN, it looks like they want you to be able to calculate the table sizes based on table designs. So after you’ve decided what columns you’ll need, what data types those columns will be, whether or not those columns are nullable, and if you’ll use a heap or a clustered index, and if you’ll include any non-clustered indexes.

After some googling, I found a book that tries to cover this topic: SQL Server 2008 Administration: Real-World Skills for MCITP Certification. The explanation in the book is a great mass of text explaining how to do the calculations.

Turning to the MSDN, I found A little more help here. Again, it’s still a mass of text(s). The only way I could think to make this estimation make more sense was to build out an excel spreadsheet.

The problem is you’re not going to be able to take that spreadsheet in with you on an exam. I suggest you build your own spreadsheet based on the calculations give on the three pages: Estimating the Size of a Heap, , and Estimating the Size of a Nonclustered Index.

I have to admit this estimation is a bit detailed compared to what I’ve had to do in the past. Most of the time, I have to anticipate the volume of rows a table will hold, multiply that by the row size, then double that number to give myself room to grow.

The extra details I learned while exploring these calculations include:

• null bitmaps – is a part of a row that if a column is null or not. It’s size is 2 + ((Number of columns + 7) / 8).
• Microsoft suggests estimating the size of a variable length field as 2 + (Number of variable Columns * 2) + max variable length. I usually just added up the max length of all columns, rather than doing this calculation, I wanted more buffer in my estimates.
• when calculating row size in bytes, add 4 for the row header.
• When calculating clustered index space requirements, don’t forget to include information about the index you’re making (remember: clustered indexes are the physical order of the table, you have to store that information somewhere!)
• Calculating non-clustered indexes is similar to calculating a clustered index, but you only count included and indexed columns.

This material is worth reviewing, but only for the exam. In production, you’d use the excel spreadsheet, or another tool to actually build out your size estimates.

Next time, We’ll hit IO throughput. You’ll see me referencing the RAID configurations I told you about last time, as well as adding filegroups, files and other slight of hand in order to maximize throughput. Until then, if you have any questions, send them in. I’m here to help!

By Shannon Lowder

Shannon Lowder is the Database Engineer you've been looking for! Look no further for expertise in: Business Analysis to gather the business requirements for the database; Database Architecting to design the logical design of the database; Database Development to actually build the objects needed by the business logic; finally, Database Administration to keep the database running in top form, and making sure there is a disaster recovery plan.