UNION, INTERSECT, and EXCEPT

Earlier today I was discussing methods you can use to differences in data sets between two tables or views.  My colleague was discussing the usual method

SELECT
   columName(s)
FROM table1 t1 (READCOMMITTED)
LEFT JOIN table2 (READCOMMITTED)
   ON t1.sharedKey = t2.sharedKey
WHERE
   t2.sharedKey IS NULL

While this is a perfectly cromulent way of finding out what records are missing from table2, I mentioned the EXCEPT operator and got a questioning look, so I took the time to explain UNION, INTERSECT, and EXCEPT to him.  Now, I’d like to share this with you.

UNION

The UNION operator allows you to combine the output of two SELECT statements into one result set.  There are a few catches that you need to know about.  First, the number of columns must be the same between the result sets.  Second, the data types for the columns returned must me the same, or able to be converted to the same.  Finally you should also know that UNION will return only a distinct record set, if you want to keep your duplicate rows, you must use UNION ALL instead.

SELECT
  colmn1, column2
FROM testing.dbo.table1 t1 (READCOMMITTED)
UNION --ALL --all is optional
SELECT
   colmn1, column2
FROM testing.dbo.table1 t1 (READCOMMITTED)

INTERSECT

Just like the UNION operator, INTERSECT allows you to join two data sets into one.  The difference is, The INTERSECT operator will only return records from the first table or view where the records exist in the second table or view.  This can be used like an INNER JOIN, except each column of the first table is compared to the columns of the second table.  This can allow you to perform quicker comparisons between the two data sets.

I will issue a word of caution, don’t use SELECT * with the INTERSECT or EXCEPT function, if the columns are in different orders, or have slightly different data types on the columns, you may get results different than you’d expect.  Since the difference will be seen by SQL, it may ignore records based on a precision change, and you wouldn’t want that, right?

SELECT
  colmn1, column2
FROM testing.dbo.table1 t1 (READCOMMITTED)
INTERSECT
SELECT
   colmn1, column2
FROM testing.dbo.table1 t1 (READCOMMITTED)

EXCEPT

Finally, there’s EXCEPT, this would be similar to a LEFT JOIN with a WHERE clause that looks for NULL values in the second table.  This is the solution I introduced my colleague to.  By running a SELECT on both tables, and including all the key columns, he could quickly look for records that were missing from the first table. Again, please keep in mind SELECT * is a bad idea when using EXCEPT.  Explicitly calling column names will save you much grief in the long haul.

SELECT
  colmn1, column2
FROM testing.dbo.table1 t1 (READCOMMITTED)
EXCEPT
SELECT
   colmn1, column2
FROM testing.dbo.table1 t1 (READCOMMITTED)

As always, if you have any questions, send them in!  I look forward to helping you add more tools to your SQL tool belt!

References

UNION

EXCEPT and INTERSECT

Tags:

2 Responses to “UNION, INTERSECT, and EXCEPT”

  1. WP Themes 20100726 at 03:00 #

    Good dispatch and this post helped me alot in my college assignement. Thanks you as your information.

Trackbacks/Pingbacks

  1. SQL201-The Syllabus | Shannon Lowder - 20110811

    [...] INTERSECT, UNION, EXCEPT –Combine and compare result sets [...]

Leave a Reply

Notify me of followup comments via e-mail. You can also subscribe without commenting.