Let’s look at the following code using your copy of AdventureWorks.
USE AdventureWorks GO SELECT COUNT(*) AS countStar , COUNT(ALL EmployeeID) AS CountAllEmployeeID , COUNT(DISTINCT EmployeeID) as CountDistinctEmployeeID FROM HumanResources.JobCandidate
You should get three counts
countStar CountAllEmployeeID CountDistinctEmployeeID 13 2 2
You need to understand the differences in these counts.
COUNT(*) returns the number of items in a group. This includes NULL values and duplicates. Basically this is a count of the number of rows in your table. The only time this can become tricky is if you couple this count with NOLOCK, and someone else is busy loading or deleting data from the table you’re counting.
If you use this with NOLOCK, you have to understand that you’re seeing a count of all records, even those that are not (and may never be) committed.
COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values. This shows you a count of records that have that expression defined. If you see our example, you’ll see only 2 out of the 13 are non NULL values.
COUNT(DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values. Again, referring to our example, the 2 non NULL values are different.
Oh, and if you start counting <i>lots</i> of rows, (more than 2^31-1) COUNT produces an error. Use COUNT_BIG instead.
This may seem simple, but learning these variations can save you from writing out complex WHERE clauses, or counting a sub query. They are defined for a reason!
As always, if you have any questions…Send them in. I’m here to help!