SQL 101 – JOIN

Overview

I want to cover a fundamental topic for SQL programming.  This is the topic you should start studying right after you have a basic understanding of what a relational database, what tables are, and that tables can be related to each other.  If you’re a fan of Duct tape, then Joins will become the duct tape of your SQL tool belt.

To download a script that will give you a copy of all the tables used as examples in this post, click here.

When two tables are defined that have a column in common, then you can write a query that will tell you the information in the one table that relates to information in the other table.  There are three basic types of joins:

  • INNER – show only the records where there is a match in both tables.
  • OUTER – show the records from one table, even if there isn’t a match in the other table.
  • CROSS – Show all the records of the first table, matching each record of the first table to each record of the second table.

Take a minute to process those options.  You’ll find that if two pieces of data are truly related, any relationship you can think of will fall into one of these three categories.  In future lessons I’ll go into much greater detail about when you would want to use each of these three joins.  Also, in later posts, I’ll introduce you to performance implications of each of these joins.  But for now, I’m going to give you examples of each of these types.  You will start to learn how to use them, and you may even begin to think about when you would want to use each of these in your own programming.

INNER JOIN

Whenever you’re looking to get data from one table, where there’s matching data in the other, you want to use an INNER JOIN.  Let’s revisit a problem from the last post.  In our previous post we wanted to see all the data in the productSale table, where there was a match in the product table.  Here is the code we used

SELECT *
FROM productSale
WHERE
productName IN (
SELECT
productName
FROM products
WHERE
price <= '1.00')
buyer            productName    purchaseDate            qtypurchased    pricePaid
Shannon Lowder    paper        2000-01-01 00:00:00.000    2                1.00
Shannon Lowder    pencil        2000-01-05 00:00:00.000    1                0.25
Shannon Lowder    pencil        2000-01-07 00:00:00.000    1                0.25

This same query can now be re-written as an INNER JOIN and show us the same records.

SELECT productSale.*
FROM productSale
INNER JOIN products
on productSale.productName = products.productName
WHERE
products.price <= 1.00
buyer            productName    purchaseDate            qtypurchased    pricePaid
Shannon Lowder    paper        2000-01-01 00:00:00.000    2                1.00
Shannon Lowder    pencil        2000-01-05 00:00:00.000    1                0.25
Shannon Lowder    pencil        2000-01-07 00:00:00.000    1                0.25

In small record sets there won’t be a difference in the performance of these two queries.  But when you start dealing with hundreds of thousands of rows.  You’ll find there is a difference in performance.

Another good point to take from this example is there is almost always more than one way to write a query, and get the same results.  When discussing queries with other developers, keep that in mind.  New techniques will come to you all the time, you have to learn to take the strengths of these new techniques and compare them with the way you’re doing things now.  You never know when a difference of method will teach you a better method for doing something.

OUTER JOIN

While INNER JOIN will show you records where there is a match, OUTER JOIN will show you data, even if there isn’t a match.  There are three variations of the OUTER JOIN in Microsoft SQL.  LEFT, RIGHT, and FULL OUTER joins will combine one table’s data with another, and give you three slightly different versions of the data.

LEFT JOIN

A LEFT JOIN will show you all the data from the first (or LEFT) table, even if that record doesn’t have a match in the second (or RIGHT) table.

SELECT *
FROM leftTable
LEFT JOIN rightTable
on leftTable.value = rightTable.value
id    value    id      value
1    a        NULL  NULL
2    b        10      b
3    C        20      c

Notice that there is a record for “a” in leftTable, but there isn’t a record for “a” in rightTable.

RIGHT JOIN

A RIGHT JOIN will show you all the data from the second (or RIGHT) table, even if that record doesn’t have a match in the first (or LEFT) table.

SELECT *
FROM leftTable
RIGHT JOIN rightTable
on leftTable.value = rightTable.value
id        value    id        value
2        b        10        b
3        C        20        c
NULL    NULL    30        d

Notice there is a record for “d” in rightTable, but there isn’t a record for “d” in leftTable.

FULL OUTER JOIN

A FULL OUTER JOIN will show you all the records of the first (or LEFT) table and all the records of the second (or RIGHT) table.

SELECT *
FROM leftTable
FULL OUTER JOIN rightTable
on leftTable.value = rightTable.value
id        value    id        value
1        a        NULL    NULL
2        b        10        b
3        C        20        c
NULL    NULL    30        d

With the FULL OUTER JOIN you see all the data from both tables, even where that value is missing from the other.

OUTER JOIN can be a very useful tool when you’re trying to locate missing data.  This will be a reoccurring task you’ll face as a database administrator.  Practice with these joins until you know the difference by memory.  It will serve you well!

CROSS JOIN

The CROSS JOIN is the least used query.  At least in my opinion, it is.  In a CROSS JOIN, ach record of the first table is joined to each record of the second table.  This join is also called a cross-product.  The only time I’ve found it useful to use a CROSS JOIN is to create a lot of data quickly for test scenarios.

The reason is simple, if you have 10 test records, and you want 1000, simply CROSS JOIN the 10 records to a table with 100, and INSERT the results into a new table.  Let’s look at a CROSS JOIN now.

SELECT *
FROM leftTable
CROSS JOIN rightTable
id    value    id    value
1    a        10    b
2    b        10    b
3    C        10    b
1    a        20    c
2    b        20    c
3    C        20    c
1    a        30    d
2    b        30    d
3    C        30    d

I’d also like to point out a query style that can cause you problems.  If you see a query like the following, try to rewrite it as an OUTER or INNER JOIN.

SELECT *
FROM leftTable, rightTable

Occasionally you’ll see this style join (referred to Microsoft style joins) in code, and the person who wrote it, thinks it’s an INNER JOIN.  The problem is, just this part of the query is actually a CROSS JOIN.  It will save you a lot of effort and grief to use explicitly defined joins, rather than not knowing for certain what JOIN you’ve written.

Otherwise you might get duplicates, and not understand why.

Conclusion

Adding a JOIN to your query will dramatically increase the types of data you can pull from your database.  It’s the foundation of relationships in relational database management systems.  Learn when to use each type of JOIN.  More importantly, learn when you don’t want to use a certain JOIN.

Master using it, and you can have this.

As always, if you have any questions, send them in!  I’m here to help!

Previous: Sub QueriesNext: INSERT

Tags: , ,

11 Responses to “SQL 101 – JOIN”

  1. RPhillips 20060128 at 12:09 #

    Thank you! This was very helpful.I needed to delete from a joining table with a two-field primary key so I couldn’t use my normal method since there was no single unique field:DELETE FROM table WHERE field IN (SELECT…)Instead I used:DELETE JFROM Users Uinner join LinkingTable J on U.id = J.U_idinner join Groups G on J.G_id = G.idWHERE G.Name = ‘Whatever’and U.Name not in (‘Exclude list’)

  2. Some Guy 20060128 at 13:09 #

    Just thought I’d point out something to any relative newbies like me trying to use the where clause subquery method with more than one join.I’m using MS Access for my particular problem. Haven’t tried it in grown up T-SQL but trying to make the link in one go with a concatenation made the wonderful Access hang:WHERE (A & B & C) not in (SELECT (A & B & C)FROM TheOtherTable)(Obviously for & read + in T-SQL)I’m not totally certain this would have any similar trouble in Transact-SQL but for those using Access get rid of the second table from your design view then in SQL you’ll need separate subqueries for each join like this:WHEREA not in (SELECT A FROM TheOtherTable)and B not in (SELECT B FROM TheOtherTable)and C not in (SELECT C FROM TheOtherTable)There’s probably a better way but it’s the best I’ve found so far.

  3. RPhillips 20060128 at 17:09 #

    Thank you! This was very helpful.I needed to delete from a joining table with a two-field primary key so I couldn’t use my normal method since there was no single unique field:DELETE FROM table WHERE field IN (SELECT…)Instead I used:DELETE JFROM Users Uinner join LinkingTable J on U.id = J.U_idinner join Groups G on J.G_id = G.idWHERE G.Name = ‘Whatever’and U.Name not in (‘Exclude list’)

  4. RPhillips 20060129 at 03:09 #

    Thank you! This was very helpful.I needed to delete from a joining table with a two-field primary key so I couldn’t use my normal method since there was no single unique field:DELETE FROM table WHERE field IN (SELECT…)Instead I used:DELETE JFROM Users Uinner join LinkingTable J on U.id = J.U_idinner join Groups G on J.G_id = G.idWHERE G.Name = ‘Whatever’and U.Name not in (‘Exclude list’)

  5. Muhammed 20060201 at 01:09 #

    Thanks for this article got help from it.

Trackbacks/Pingbacks

  1. Tweets that mention SQL 101 – JOIN | ToyboxCreations -- Topsy.com - 20110109

    [...] This post was mentioned on Twitter by Jose Chinchilla. Jose Chinchilla said: RT @shannonlowder: There's still time to sign up for my SQL 101– JOINs on 1/13 @ 6pm. Let me know if you want in! http://ow.ly/3zi01 [...]

  2. SQL 101 – INSERT | Shannon Lowder - 20110124

    [...] Previous: JOIN [...]

  3. SQL 101 | Shannon Lowder - 20110124

    [...] JOIN — using the relationship part of relational database system [...]

  4. SQL 102 – Foreign Keys | Shannon Lowder - 20110203

    [...] products table.  In the productSales table, productName is the foreign key.  If you recall the JOIN article I wrote, you’ll recall we wrote a query that combined these two tables in order to [...]

  5. SQL 201 – Sub Queries | Shannon Lowder - 20110209

    [...] Next: JOIN [...]

  6. SQL 201 – Combining Query Results | Shannon Lowder - 20110209

    [...] Previous: JOIN [...]

Leave a Reply

Notify me of followup comments via e-mail. You can also subscribe without commenting.