Tuesday, August 30, 2016

SQL Joins - Basics Part 2

Having already covered the basics of SQL Join syntax, now it is time to jump into a little more advanced stuff. Early in my career, I remember being confused about the difference in results when the same clause is placed in JOIN instead of WHERE clause. This post is aimed to clarify those questions and few others around ON and WHERE clauses. For demonstration I will use table built and populated in SQL Joins - Basics Part 1

First, let us look at basic LEFT join from Part 1. All rows are returned from Teacher table and all matching rows from Student table are displayed.
 
SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
LEFT OUTER JOIN Student s
        ON s.TeacherID = t.ID
TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Roy Chad Milburn Brett
Dudley Goddard Clinton Schuylerr
Dudley Goddard Norbert Kemp
Raphael Philander Meriwether Kennedy
Raphael Philander Braith Cornelius
Sonnie Davin NULL

Now we introduce simple WHERE clause. By adding condition to WHERE clause we are now restricting entire results set to specific condition, any rows that do not satisfy that condition are excluded from results. Therefore we only end up with one row seen below.
 
SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
LEFT OUTER JOIN Student s
        ON s.TeacherID = t.ID
WHERE s.FullName = 'Jepson Balfour'
TeacherFullName StudentFullName
Roy Chad Jepson Balfour

So what happens when you move same clause to the ON clause? Ok, lets test it!
 
SELECT  t.FullName AS TeacherFullName
       ,s.FullName AS StudentFullName
FROM    Teacher t
LEFT OUTER JOIN Student s
        ON s.TeacherID = t.ID
  and s.FullName = 'Jepson Balfour'
TeacherFullName StudentFullName
Roy Chad Jepson Balfour
Dudley Goddard NULL
Raphael Philander NULL
Sonnie Davin NULL
What happened? Result set looks nothing like first or second example. When I first did that I was thoroughly confused. To explain the result lets run another query.
 
SELECT s.FullName
FROM dbo.Student as s
WHERE s.FullName = 'Jepson Balfour'
StudentFullName
Jepson Balfour

Even though last two queries look different in reality they are placing exactly same restriction on Student table. On clause on Student table became where clause that restricts results to only rows that specific that critirea. Since Teacher table is not joined with INNER join it is not restricted by what happens to Student therefore we see all Teachers displyaed but only one of them actually showing a student.
Conclusion:
The ON clause is a powerfull way to change your result set exactly to what you need it to be, but if used without understanding of what happens to the data it can produce unpredicted result set. Each statement placed in ON clause will be evaulated prior to WHERE clause. This goes back to understanding order of operations in SQL Server. Below are just few of the operations listed in correct order. By knowing and understanding the order of operations in SQL Server it helps to understand why queries behaved the way that they did above. Each one was evaluated by SQL Server in correct order which produced correct output based on that structure.
  1. FROM
  2. ON
  3. WHERE
  4. SELECT
As always feel free to leave comments, questions, etc. For next post I will answer common question of TOP clause.

No comments :

Post a Comment