Thursday, July 7, 2016

SQL Joins - Basics Part 1

This topic takes me back to my first SQL Server database class I had in college. Back then it was extremely difficult subject for me. Partly because I have worked very little with databases and secondary, because of the way it was presented. Taking all that into account let's try to attack it in most basic manner. SQL JOIN in its basic form is just a way to connect two tables together. There are several types of JOINs that can be defined: LEFT, RIGHT, FULL, INNER, CROSS. There are time and place for each one, but most of the time I can get away with using only LEFT and INNER.

Its is great to talk about JOINs, but I'm sure that there are other people who like me don't learn but hearing, they learn by seeing and trying it them self. To do that, we need to create two basic tables and populate it with some data.

First, let us create two tables to store data for our tests. Since the concept of student and teacher is most easily understood, I'm creating Teacher table and Student table with a key to link back to Teacher Table.

CREATE TABLE [dbo].[Student]
(
 [ID] [int] NULL
 ,[FullName] [varchar](50) NULL
 ,[TeacherID] [int] NULL
)

GO

CREATE TABLE [dbo].[Teacher]
(
 [ID] [int] NULL
 ,[FullName] [varchar](50) NULL
)
Now to load some data, I generated insert script with random names from the following website.
 
INSERT INTO Teacher (ID, FullName)
VALUES  (1, 'Roy Chad')
  ,(3, 'Dudley Goddard')
  ,(7, 'Raphael Philander')
  ,(5, 'Sonnie Davin')

INSERT INTO Student (ID, FullName,TeacherID)
VALUES  (1,'Jepson Balfour',1)
  ,(2,'Milburn Brett',1)
  ,(4,'Clinton Schuyler',3)
  ,(5,'Norbert Kemp',3)
  ,(7,'Meriwether Kennedy',7)
  ,(8,'Braith Cornelius',7)
  ,(10,'Dion Hayden',9)

Now that we got some data loaded into tables let the JOINing commence. First will look at 'INNER' JOIN

SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
INNER JOIN Student s
        ON s.TeacherID = t.ID
INNER JOIN throws out all records that do not match. In our case Teacher "Sonnie Davin" and Student "Dion Hayden" do not appear in result because there was no way to connect those records.

TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Roy Chad Milburn Brett
Dudley Goddard Clinton Schuylerr
Dudley Goddard Norbert Kemp
Raphael Philander Meriwether Kennedy
Raphael Philander Braith Cornelius

LEFT OUTER JOIN, returns all records from main table and attempts to match records from secondary table.

SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
LEFT OUTER JOIN Student s
        ON s.TeacherID = t.ID
As seen in this result list, Teacher "Sonnie Davin" appears in the list but does not have any student records associated. LEFT JOIN is most usefull when ever you not sure if you all records from main table are matched in secondary table. Almost every time I write query I use LEFT JOIN the first I'm JOINing tables together to know for sure that I'm not excluding any records that I wanted to include.

TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Roy Chad Milburn Brett
Dudley Goddard Clinton Schuylerr
Dudley Goddard Norbert Kemp
Raphael Philander Meriwether Kennedy
Raphael Philander Braith Cornelius
Sonnie Davin NULL

RIGHT OUTER JOIN returns all records from RIGHT table and matches records from left table. The following two queries can be written as RIGHT or LEFT join with same results.

SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
RIGHT OUTER JOIN Student s
        ON s.TeacherID = t.ID

SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Student s
LEFT OUTER JOIN Teacher t
        ON s.TeacherID = t.ID
In this result set because we reversed the condition of LEFT JOIN above, we can see record from Student table "Dion Hayden" who is not matched with a teacher record.

TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Roy Chad Milburn Brett
Dudley Goddard Clinton Schuylerr
Dudley Goddard Norbert Kemp
Raphael Philander Meriwether Kennedy
Raphael Philander Braith Cornelius
NULL Dion Hayden

FULL OUTER JOIN attempts to match records from both tables just like INNER JOIN and also returns all additional records that do not match from ether table.

SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
FULL OUTER JOIN Student s
        ON s.TeacherID = t.ID
In the this result set we have all records that we had from INNER JOIN, plus extra record from LEFT JOIN and one record from RIGHT JOIN. Teacher "Sonnie Davin" appears in result set without student and Student "Dion Hayden" also appears in result set without teach. This join is most usefull when need to get all records no matter if match exists or not.

TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Roy Chad Milburn Brett
Dudley Goddard Clinton Schuylerr
Dudley Goddard Norbert Kemp
Raphael Philander Meriwether Kennedy
Raphael Philander Braith Cornelius
Sonnie Davin NULL
NULL Dion Hayden

CROSS JOIN, returns all records from main table and matches them to every record from secondary table.

SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
CROSS JOIN Student s
What we get back is the following result set that has every record from both table. Since no condition has be specified every record appears in result set. Number of records in result set will always be equal to Number of records in TableA multiple by number of record in TableB. In our case we had 4 Teachers and 7 Students therefore we got 28 records in result. But what if we had 4 million teachers and 70 million students. This JOIN has worst performance and will require most memory to complete the operation.

TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Roy Chad Milburn Brett
Roy Chad Clinton Schuylerr
Roy Chad Norbert Kemp
Roy Chad Meriwether Kennedy
Roy Chad Braith Cornelius
Roy Chad Dion Hayden
Dudley Goddard Jepson Balfour
Dudley Goddard Milburn Brett
Dudley Goddard Clinton Schuylerr
Dudley Goddard Norbert Kemp
Dudley Goddard Meriwether Kennedy
Dudley Goddard Braith Cornelius
Dudley Goddard Dion Hayden
Raphael Philander Jepson Balfour
Raphael Philander Milburn Brett
Raphael Philander Clinton Schuylerr
Raphael Philander Norbert Kemp
Raphael Philander Meriwether Kennedy
Raphael Philander Braith Cornelius
Raphael Philander Dion Hayden
Sonnie Davin Jepson Balfour
Sonnie Davin Milburn Brett
Sonnie Davin Clinton Schuylerr
Sonnie Davin Norbert Kemp
Sonnie Davin Meriwether Kennedy
Sonnie Davin Braith Cornelius
Sonnie Davin Dion Hayden

This concludes the first look at logical JOIN operations. Just like I mentioned in beginning of the post, just about every time I write query it ends up being LEFT or INNER JOIN. Do I ever had to use others? Yes, but with specific business cases.

Please leave a comment with questions and/or feedback on post.

No comments :

Post a Comment