Skip to content

shannonlowder.com

Menu
  • About
  • Biml Interrogator Demo
  • Latest Posts
Menu

SQL 103 – Relating Your Tables

Posted on October 5, 2011October 5, 2011 by slowder

Last time we were discussing how to build normalized tables.  We designed 3 tables: Customer, Address and PhoneNumber.  The Address and PhoneNumber tables are related to the Customer table, but how do we relate two tables?

 

Primary Keys

When we talk about relating one table to another, we’re really talking about relating a row in one table to a row in another  table.  So before we can go relating those rows, we have to be able to identify those rows to the table in which they belong.  When we want to identify a row to the table, we’re talking about primary keys.

We have to choose a column or columns that will uniquely identify a row in a table.  Sometimes this column will be one or more of the columns you’re storing.  That’s the case in the PhoneNumber table.  Each phone number is unique.  If we entered the number in the table twice, it wouldn’t mean two different phone numbers would it?

But sometimes, there really isn’t anything in the data that uniquely identifies the row to the table.  That’s what we have in the Customer table.  Think about it, if we have two rows with John, Smith, you really don’t know if that’s one person or two, right?  In the case where you don’t have a uniquely identifying attribute, or column, you have to add your own column that you’ll use to uniquely identify each row.

So, let’s add primary keys to our Customer and Address tables.  For both of these tables, we’re going to use an integer to uniquely identify each row to the table.  That integer cannot be re-used.  Once it’s assigned to a row, that’s the only row that will ever be represented by that integer.  If you’re more familiar with SQL Server, you know we’re talking about an IDENTITY() column.

For the Customer table, we’re going to add a column called CustomerID.  In Address, we’ll add a column called AddressID.  These are going to be the primary keys in their respective tables.

We’re not going to make any changes to our PhoneNumber table at this time, for now, we’re going to use the PhoneNumber column as the primary key in that table.

 

Foreign Keys

Now that we have a way to uniquely identify each row to the table, we can relate a row in one table, to a row in the other table.  To do this, we’re going to add a column to the table and store a copy of the primary key for the row we want to relate to.  Let’s start by relating an phone number to a customer.  We want to design this relationship in such a way that each phone number can belong to only one customer.

This kind of relationship is called a 1 to 1 relationship.

To make this relationship work, we need to save a copy of the CustomerID that the phone number row will “belong” to.  So let’s add the column CustomerID to the PhoneNumber table.

But look back at the data we’re going to get in the source file.  We’re going to receive home, work, and cell numbers for both the customer and their spouse.  In reality, we need a 1-to-many relationship.

Guess what!

We can use the design as we’ve made it to support a 1-to-many.  For each number we want to enter, we just add a row to the PhoneNumber table, changing the phonetype to reflect the type of phone number being stored.  Then for each CustomerID you can have one or more phone numbers on file.

You could also consider this as a relationship from phone numbers to customers.  If you did, then this relationship would appear to be a many-to-1 relationship.  Both are valid, it’s all a matter of perspective.  The further you get into database design, the more you’ll find the answer is “it depends.”

 

Many-to-Many and Mapping Tables

An additional relationship type I want to cover with you today is a many-to-many relationship.  Consider the following: a couple will both share the same address.  That’s a many-to-1.  But, a customer may have a billing address and a mailing address, right?  That’s a 1-to-many.  Both are true.  That’s a many-to-many relationship.  When you want to model that kind of relationship, you’ll often resort to a mapping table.

A mapping table is a very narrow table, often only including two foreign keys.  It allows you to list all the relationship between records in two different tables.  Let’s look at the source file. Notice how there is only one address per record, but there are two names?

We’re going to want to be able to show a relationship between both the customer and spouse and the address.  That means for each entry in the customer table (one for customer and one for spouse) we’ll have a single address.  So in our mapping table we’ll have one entry with the customer’ CustomerID mapped to the AddressID, and then we’ll have the spouse’s CustomerID and that same AddressID.  Let’s design that table now.

Self or Recursive Reference

There is one final relationship type I want to cover.  It’s when one row in a table references another row in that same table.  It’s often known as a recursive reference.  It’s really useful when you’re relating two entities of the same type.  In our example we are sent two customers in each row, a customer and his or her spouse.  Both are customers.  But we need a way of relating one customer to another.  That’s where the self reference comes in.

We’re going to add a foreign key to the Customer table that references CustomerID.  We’re going to call it SpouseID.  This column will hold the CustomerID for the current Customer’s spouse.  Bonus points if you ask yourself would both customer records reference the other customerID!

 

 Summary

Relating your tables takes little more than using a column to store data from another column.  This is the most duplication that you should have in a normalized database design.  If you start copying columns of text from one table to another, you may be heading towards de-normalization.  Just remember, the column that identifies a row to the it’s table is the primary key, has to be unique.  The column that identifies a row in another table is the foreign key, it doesn’t have to be unique.  You can also reference another row in the same table.  This is just the beginning, mastering referential integrity takes time, and you’re just taking your first steps down that path now.  Practice your designs, and if you have any questions about your designs, let me know.  I’m here to help!

 

 

 

 

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recent Posts

  • A New File Interrogator
  • Using Generative AI in Data Engineering
  • Getting started with Microsoft Fabric
  • Docker-based Spark
  • Network Infrastructure Updates

Recent Comments

  1. slowder on Data Engineering for Databricks
  2. Alex Ott on Data Engineering for Databricks

Archives

  • July 2023
  • June 2023
  • March 2023
  • February 2023
  • January 2023
  • December 2022
  • November 2022
  • October 2022
  • October 2018
  • August 2018
  • May 2018
  • February 2018
  • January 2018
  • November 2017
  • October 2017
  • September 2017
  • August 2017
  • June 2017
  • March 2017
  • February 2014
  • January 2014
  • December 2013
  • November 2013
  • October 2013
  • August 2013
  • July 2013
  • June 2013
  • February 2013
  • January 2013
  • August 2012
  • June 2012
  • May 2012
  • April 2012
  • March 2012
  • February 2012
  • January 2012
  • December 2011
  • November 2011
  • October 2011
  • September 2011
  • August 2011
  • July 2011
  • June 2011
  • May 2011
  • April 2011
  • March 2011
  • February 2011
  • January 2011
  • December 2010
  • November 2010
  • October 2010
  • September 2010
  • August 2010
  • July 2010
  • June 2010
  • May 2010
  • April 2010
  • March 2010
  • January 2010
  • December 2009
  • November 2009
  • October 2009
  • September 2009
  • August 2009
  • July 2009
  • June 2009
  • May 2009
  • April 2009
  • March 2009
  • February 2009
  • January 2009
  • December 2008
  • November 2008
  • October 2008
  • September 2008
  • August 2008
  • July 2008
  • June 2008
  • May 2008
  • April 2008
  • March 2008
  • February 2008
  • January 2008
  • November 2007
  • October 2007
  • September 2007
  • August 2007
  • July 2007
  • June 2007
  • May 2007
  • April 2007
  • March 2007
  • February 2007
  • January 2007
  • December 2006
  • November 2006
  • October 2006
  • September 2006
  • August 2006
  • July 2006
  • June 2006
  • May 2006
  • April 2006
  • March 2006
  • February 2006
  • January 2006
  • December 2005
  • November 2005
  • October 2005
  • September 2005
  • August 2005
  • July 2005
  • June 2005
  • May 2005
  • April 2005
  • March 2005
  • February 2005
  • January 2005
  • November 2004
  • September 2004
  • August 2004
  • July 2004
  • April 2004
  • March 2004
  • June 2002

Categories

  • Career Development
  • Data Engineering
  • Data Science
  • Infrastructure
  • Microsoft SQL
  • Modern Data Estate
  • Personal
  • Random Technology
  • uncategorized
© 2025 shannonlowder.com | Powered by Minimalist Blog WordPress Theme