Tuesday, April 11, 2017

THROW versus RAISERROR

Error handling is just as important as verifying that your code runs and produces correct results. For developers, error handling is last chance to prepare code to fail gracefully instead of just blowing up and having to pick up pieces later.

When it comes to SQL Server, there are few options for error handling. Basics of error handling are wrapping the code that is likely to fail inside the TRY block and then dealing with failures inside the CATCH block.

Let's look at a most basic scenario below.
 
SELECT 1 / 0

SELECT 1
The first statement generates the error division by 0, but the second statement still executed as nothing is controlling what happens when errors encountered. In some cases, this is perfectly fine, but in others, we want to stop the execution of our script in the event of encountered errors. We might even want to roll back the transaction to prevent partial data change.

Now if we take the same code and place it inside TRY block we get different behavior. In this case, the first statement generates the error but the second statement is not run.
 
BEGIN TRY
SELECT 1 / 0

SELECT 1
END TRY
BEGIN CATCH

END CATCH

Now that we covered the basics let's look at controlling a flow of the code through the use of RAISERROR and THROW. In some cases, we need to fail code based on business rules which pass syntax test. For example, if our procedure is responsible for withdrawing money from an account, we might have a business rule that prevents going into negative. For the first example, we are just going to try to withdraw money.
 
CREATE PROCEDURE dbo.WithdrawMoney
    @AccountId INT
  , @Amount    DECIMAL(18, 2)
AS
    BEGIN
        UPDATE  dbo.Accounts
        SET Amount = Amount - @Amount
        WHERE   AccountID = @AccountId
    END
In this example we can keep withdrawing money until arighmetic overflow error will be thrown for creating large negative number. Since we want to avoid negative balance we are going to add code to where clause.
 
CREATE PROCEDURE dbo.WithdrawMoney
    @AccountId INT
  , @Amount    DECIMAL(18, 2)
AS
    BEGIN
        UPDATE  dbo.Accounts
        SET Amount = Amount - @Amount
        WHERE   AccountID              = @AccountId
                AND (Amount - @Amount) > 0
    END
Now, whenever you try to withdraw money that results in negative amount update will "fail." This failure is only reflected by the number of rows affected. In this instance, it is zero. Now we need something in front end application that reads the number of rows affected by stored procedure to understand what happened.

Another way of handling this would be to run a check first and to generate the error that is sent back to front end. So we write two stored procedures one using RAISERROR and another one using THROW and expect the same result.
 
CREATE PROCEDURE dbo.WithdrawMoney_RAISERROR
    @AccountId INT
  , @Amount    DECIMAL(18, 2)
AS
    BEGIN
        IF EXISTS
            (
                SELECT  1
                FROM    dbo.Accounts
                WHERE   AccountID              = @AccountId
                        AND (Amount - @Amount) > 0
            )
            RAISERROR( 'Insufficient Amount in account to perform withdrawal', 16, 1 )

        UPDATE  dbo.Accounts
        SET Amount = Amount - @Amount
        WHERE   AccountID = @AccountId
    END
GO
CREATE PROCEDURE dbo.WithdrawMoney_THROW
    @AccountId INT
  , @Amount    DECIMAL(18, 2)
AS
    BEGIN
        IF EXISTS
            (
                SELECT  1
                FROM    dbo.Accounts
                WHERE   AccountID              = @AccountId
                        AND (Amount - @Amount) > 0
            )
            THROW 51000, 'Insufficient Amount in account to perform withdrawal', 1

        UPDATE  dbo.Accounts
        SET Amount = Amount - @Amount
        WHERE   AccountID = @AccountId
    END
Upon executing the first procedure, we get the error message back to the front end, but after checking balance, we find that money withdrawn from the account, but in the case of the second procedure, the same error returned to the front end but money still there.

Now we begin to scratch our head trying to figure out why we lost the money even though we got errors in both cases. The truth behind is the fact that RAISERROR does not stop the execution of code if it is outside of TRY CATCH block. To get same behavior out of RAISERROR, we would need to rewrite procedure to look something like following example.
 
CREATE PROCEDURE dbo.WithdrawMoney_RAISERROR
    @AccountId INT
  , @Amount    DECIMAL(18, 2)
AS
    BEGIN
        BEGIN TRY
            IF EXISTS
                (
                    SELECT  1
                    FROM    dbo.Accounts
                    WHERE   AccountID              = @AccountId
                            AND (Amount - @Amount) > 0
                )
                RAISERROR( 'Insufficient Amount in account to perform withdrawal', 16, 1 )


            UPDATE  dbo.Accounts
            SET Amount = Amount - @Amount
            WHERE   AccountID = @AccountId
        END TRY
        BEGIN CATCH
            DECLARE @ErrorMessage  VARCHAR(4000)
                  , @ErrorSeverity INT
                  , @ErrorState    INT

            SELECT  @ErrorMessage  = ERROR_MESSAGE()
                  , @ErrorSeverity = ERROR_SEVERITY()
                  , @ErrorState    = ERROR_STATE()

            RAISERROR( @ErrorMessage, @ErrorSeverity, @ErrorState )
        END CATCH
    END
Now we added another complexity to the procedure because if we wrap RAISERROR in TRY CATCH block, we suppress the error and the front end left clueless as to what happened inside stored procedure. Therefore we need to "rethrow" it by catching the information about it and doing another RAISERROR. If we choose to do the same with THROW, it is a whole lot simpler. Just look at the next example.
 
CREATE PROCEDURE dbo.WithdrawMoney_THROW
    @AccountId INT
  , @Amount    DECIMAL(18, 2)
AS
    BEGIN
        BEGIN TRY
            IF EXISTS
                (
                    SELECT  1
                    FROM    dbo.Accounts
                    WHERE   AccountID              = @AccountId
                            AND (Amount - @Amount) > 0
                )
                THROW 51000, 'Insufficient Amount in account to perform withdrawal', 1


            UPDATE  dbo.Accounts
            SET Amount = Amount - @Amount
            WHERE   AccountID = @AccountId
        END TRY
        BEGIN CATCH
            THROW
        END CATCH
    END
In this case, we simply add THROW to CATCH block which acts as "RETHROW." Unfortunately for me, I write most of the code for SQL Server 2008 R2 which did not have THROW command. It is only available starting with SQL Server 2012. Therefore I have to use RAISERROR and always remember to wrap it inside TRY CATCH block to ensure that it always stops execution when I want it to do so.

Be sure to leave a comment and thank you so much for reading.

Tuesday, March 21, 2017

What kind of JOIN syntax is this?

I am not often puzzled when looking at TSQL syntax, but this one time I could not figure out the syntax. Syntax looked similar to the example below.
 
SELECT  t.*
      , s.*
FROM    Teacher AS t
LEFT OUTER JOIN dbo.Student AS s
JOIN dbo.Grades AS gr
    ON s.ID = gr.StudentID
       AND  gr.GradeLetter = 'A'
    ON s.TeacherID = t.ID


My first gut reaction was that this code is broken and would not run. To my amazement code ran just fine. Now came the hard part, which was to figure out what the code was doing because I have never seen this syntax before. Since I did not understand what I was looking at I could not BING "weird join syntax" to get an answer. As a developer, I learned long time ago to break down code into smallest possible chunks to get the answer.

After I have figured out the relationship between tables, I was able to understand what query was doing. To be able to read query better it can be rewritten in the following way.
 
SELECT  t.*
      , s.*
FROM    Teacher AS t
LEFT OUTER JOIN
            (
                SELECT  s.*
                      , gr.*
                FROM    dbo.students AS s
                JOIN dbo.Grades AS gr
                    ON s.ID                = gr.StudentID
                       AND  gr.GradeLetter = 'A'
            ) AS s
    ON s.TeacherID = t.ID
This code is much easier to read and to understand. Students filtered by Grades and we are only interested in Students who have "A" grade. Therefore we get all records from Teacher table but only get Students who have good grades. If we would try to rewrite the query in the following matter, we would only get Teachers that have Students with perfect grades as we specified inner join; therefore, it takes precedence over the left outer join.
 
SELECT  t.*
      , s.*
FROM    Teacher AS t
LEFT OUTER JOIN dbo.Student AS s
    ON s.TeacherID         = t.ID
JOIN dbo.Grades AS gr
    ON s.ID                = gr.StudentID
       AND  gr.GradeLetter = 'A'
Now that I knew what code was doing I wanted to know performance because I do not care how the code looks as long as it does not perform fast it is useless. The execution plans of the two queries are almost identical with the only difference being that when I rewrote the original query as sub-query, I got additional operation added to the plan. Physical joins were identical and execution time was almost exact.


I could not believe I learned something new after writing T-SQL for many years. All of the sudden I got another tool in my toolbelt to write better code. Now if I ever need to filter tables that should not effect result I can write with this new syntax. To be honest, I have not taken this to the extreme to see how many JOIN operations I can nest before it becomes unreadable. However, I am sure it would not take many before someone else could not figure out what was going on and would have to break it down piece by piece to see it clear.

Just imagine having to support code that looks like example 1 below instead of example 2. Even though the code in the following case produces same execution plan, it becomes hard to read and therefore hard to maintain, and if you have junior developers who are not strong in SQL, they are more likely to make mistakes.
 
--Example 1:
SELECT  t.FullName AS TeacherName
      , s.FullName AS StudentName
      , gr.Class
FROM    dbo.Teacher AS t
JOIN dbo.Student AS s
JOIN dbo.Grades AS gr
JOIN dbo.Class AS c
JOIN dbo.Room AS r
    ON c.ID = gr.ClassID
    ON r.ClassID = c.ID
    ON s.ID = gr.StudentID
    ON s.TeacherID = t.ID

--Example 2:
SELECT  t.FullName AS TeacherName
      , s.FullName AS StudentName
      , gr.Class
FROM  dbo.Teacher AS t
JOIN dbo.Student AS s
    ON t.ID = s.TeacherID
JOIN dbo.Grades AS gr
    ON s.ID = gr.StudentID
JOIN dbo.Class AS c
    ON c.ID = r.ClassID
JOIN dbo.Room AS r
    ON c.ID = gr.ClassID
P.S. I still don't know if this JOIN syntax has special name, but at least I now know what it does.

Thursday, December 29, 2016

Date Dimension aka Calendar Table

Before we get into discussing how to create it date dimension and how to use it, first let's talk about what it is and why do we need it. Depending on who you talk to, people can refer to this concept as "Calendar table" or "Date Dimension," which is usually found in Data Warehouse. No matter how it is called, at the end of the day, it is a table in SQL Server which is populated with different date/calendar related information to help speed up SQL queries which require specific parts of dates.

In my case, I have created it to be able to aggregate data by quarters, years and month. Depending on how large your requirements are it will add additional complexity to building it. Since I don't care about holidays (for now at least), I will not be creating holiday schedule which can be complicated to populate.

Now onto writing some SQL. Step 1: Create a table and add "covered" Nonclustered index that I need for my queries. Depending on your needs you will need to create additional indexes and/or modify the one I'm creating.
 
CREATE TABLE [dbo].[DateDimension]
    (
        [DateKey]             [INT]         NOT NULL
      , [Date]                [DATE]        NOT NULL
      , [Day]                 [TINYINT]     NOT NULL
      , [DaySuffix]           [CHAR](2)     NOT NULL
      , [Weekday]             [TINYINT]     NOT NULL
      , [WeekDayName]         [VARCHAR](10) NOT NULL
      , [IsWeekend]           [BIT]         NOT NULL
      , [DOWInMonth]          [TINYINT]     NOT NULL
      , [DayOfYear]           [SMALLINT]    NOT NULL
      , [WeekOfMonth]         [TINYINT]     NOT NULL
      , [WeekOfYear]          [TINYINT]     NOT NULL
      , [Month]               [TINYINT]     NOT NULL
      , [MonthName]           [VARCHAR](10) NOT NULL
      , [Quarter]             [TINYINT]     NOT NULL
      , [QuarterName]         [VARCHAR](6)  NOT NULL
      , [Year]                [INT]         NOT NULL
      , [MMYYYY]              [CHAR](6)     NOT NULL
      , [MonthYear]           [CHAR](7)     NOT NULL
      , [FirstDayOfMonth]     [DATE]        NOT NULL
      , [LastDayOfMonth]      [DATE]        NOT NULL
      , [FirstDayOfQuarter]   [DATE]        NOT NULL
      , [LastDayOfQuarter]    [DATE]        NOT NULL
      , [FirstDayOfYear]      [DATE]        NOT NULL
      , [LastDayOfYear]       [DATE]        NOT NULL
      , [FirstDayOfNextMonth] [DATE]        NOT NULL
      , [FirstDayOfNextYear]  [DATE]        NOT NULL ,
    )

GO

ALTER TABLE dbo.DateDimension
ADD CONSTRAINT PK_DateDimension
    PRIMARY KEY CLUSTERED (DateKey)

IF NOT EXISTS
    (
        SELECT  1
        FROM    sys.indexes AS i
        WHERE   i.name = 'IX_DateDimension_Date'
    )
    CREATE NONCLUSTERED INDEX [IX_DateDimension_Date]
        ON dbo.DateDimension ([Date])
        INCLUDE ([Month], [Quarter], [Year])
GO


The reason why I chose to include Month, Quarter and Year columns in my index is based on my requirements to aggregate data by those columns. Originally I did not have any columns included, which produced Key Lookup. Now to populate table I have taken some suggestions from Aaron's script, link at the bottom of the post. In my case, I'm truncating table and limiting it to 300 years which is more than plenty for my scenario.

 
TRUNCATE TABLE dbo.DateDimension

DECLARE @StartDate     DATE = '01/01/1900'
      , @NumberOfYears INT  = 300;

DECLARE @CutoffDate DATE = DATEADD( YEAR, @NumberOfYears, @StartDate );

-- prevent set or regional settings from interfering with 
-- interpretation of dates / literals
SET DATEFIRST 7;
SET DATEFORMAT MDY;
SET LANGUAGE US_ENGLISH;
-- Thanks to Aaron for this hint

WITH Pass0 AS
    (
        SELECT  1 AS C
        UNION ALL
        SELECT  1
    )
   , Pass1 AS
    (
        SELECT  1 AS C
        FROM    Pass0 AS A
              , Pass0 AS B
    )
   , Pass2 AS
    (
        SELECT  1 AS C
        FROM    Pass1 AS A
              , Pass1 AS B
    )
   , Pass3 AS
    (
        SELECT  1 AS C
        FROM    Pass2 AS A
              , Pass2 AS B
    )
   , Pass4 AS
    (
        SELECT  1 AS C
        FROM    Pass3 AS A
              , Pass3 AS B
    )
   , Pass5 AS
    (
        SELECT  1 AS C
        FROM    Pass4 AS A
              , Pass4 AS B
    )
   , Pass6 AS
    (
        SELECT TOP (DATEDIFF( DAY, @StartDate, @CutoffDate ))
            rn = ROW_NUMBER() OVER (ORDER BY Pass5.C)
        FROM    Pass5
    )
   , Pass7 AS
    (
        SELECT  a.Date
              , [day]          = DATEPART( DAY, a.Date )
              , [week]         = DATEPART( WEEK, a.Date )
              , [month]        = DATEPART( MONTH, a.Date )
              , [quarter]      = DATEPART( QUARTER, a.Date )
              , [year]         = DATEPART( YEAR, a.Date )
              , [DayOfWeek]    = DATEPART( WEEKDAY, a.Date )
              , [FirstOfMonth] = CONVERT( DATE, DATEADD( MONTH, DATEDIFF( MONTH, 0, a.Date ), 0 ))
              , FirstOfYear    = CONVERT( DATE, DATEADD( YEAR, DATEDIFF( YEAR, 0, a.Date ), 0 ))
              , [MonthName]    = DATENAME( MONTH, a.Date )
              , Style101       = CONVERT( CHAR(10), a.Date, 101 )
              , Style112       = CONVERT( CHAR(8), a.Date, 112 )
        FROM
            (
                SELECT  [Date] = DATEADD( DAY, Pass6.rn - 1, @StartDate )
                FROM    Pass6
            ) a
    )
INSERT  dbo.DateDimension WITH (TABLOCKX)
SELECT  DateKey             = CONVERT( INT, CONVERT( CHAR(8), pass7.Date, 112 ))
      , [Date]              = pass7.Date
      , [Day]               = CONVERT( TINYINT, pass7.day )
      , DaySuffix           = CONVERT(   CHAR(2)
                                       , CASE
                                             WHEN pass7.day / 10 = 1
                                                 THEN 'th'
                                         ELSE CASE RIGHT(pass7.day, 1)
                                                  WHEN '1'
                                                      THEN 'st'
                                                  WHEN '2'
                                                      THEN 'nd'
                                                  WHEN '3'
                                                      THEN 'rd'
                                              ELSE 'th'
                                              END
                                         END
                                     )
      , [Weekday]           = CONVERT( TINYINT, pass7.DayOfWeek )
      , [WeekDayName]       = CONVERT( VARCHAR(10), DATENAME( WEEKDAY, pass7.Date ))
      , [IsWeekend]         = CONVERT(   BIT
                                       , CASE
                                             WHEN pass7.DayOfWeek IN (
                                                                         1, 7
                                                                     )
                                                 THEN 1
                                         ELSE 0
                                         END
                                     )
      , [DOWInMonth]        = CONVERT(   TINYINT
                                       , ROW_NUMBER() OVER (PARTITION BY pass7.FirstOfMonth
                                                                       , pass7.DayOfWeek
                                                            ORDER BY pass7.Date
                                                           )
                                     )
      , [DayOfYear]         = CONVERT( SMALLINT, DATEPART( DAYOFYEAR, pass7.Date ))
      , WeekOfMonth         = CONVERT(   TINYINT
                                       , DENSE_RANK() OVER (PARTITION BY pass7.year
                                                                       , pass7.month
                                                            ORDER BY pass7.week
                                                           )
                                     )
      , WeekOfYear          = CONVERT( TINYINT, pass7.week )
      , [Month]             = CONVERT( TINYINT, pass7.month )
      , [MonthName]         = CONVERT( VARCHAR(10), pass7.MonthName )
      , [Quarter]           = CONVERT( TINYINT, pass7.quarter )
      , QuarterName         = CONVERT(   VARCHAR(6)
                                       , CASE pass7.quarter
                                             WHEN 1
                                                 THEN 'First'
                                             WHEN 2
                                                 THEN 'Second'
                                             WHEN 3
                                                 THEN 'Third'
                                             WHEN 4
                                                 THEN 'Fourth'
                                         END
                                     )
      , [Year]              = pass7.year
      , MMYYYY              = CONVERT( CHAR(6), LEFT(pass7.Style101, 2) + LEFT(pass7.Style112, 4))
      , MonthYear           = CONVERT( CHAR(7), LEFT(pass7.MonthName, 3) + LEFT(pass7.Style112, 4))
      , FirstDayOfMonth     = pass7.FirstOfMonth
      , LastDayOfMonth      = MAX( pass7.Date ) OVER (PARTITION BY pass7.year
                                                                 , pass7.month
                                                     )
      , FirstDayOfQuarter   = MIN( pass7.Date ) OVER (PARTITION BY pass7.year
                                                                 , pass7.quarter
                                                     )
      , LastDayOfQuarter    = MAX( pass7.Date ) OVER (PARTITION BY pass7.year
                                                                 , pass7.quarter
                                                     )
      , FirstDayOfYear      = pass7.FirstOfYear
      , LastDayOfYear       = MAX( pass7.Date ) OVER (PARTITION BY pass7.year)
      , FirstDayOfNextMonth = DATEADD( MONTH, 1, pass7.FirstOfMonth )
      , FirstDayOfNextYear  = DATEADD( YEAR, 1, pass7.FirstOfYear )
FROM    Pass7


Few things to consider in the script. Depending on requirements and local setting, part of the script would need to be adjusted to reflect correct data. For example, Sunday could the first day of the week in some countries and last day of the week in other coutnries. Same goes for date formats, in United States we have Month/Day/Year back home in Russia I always wrote Day/Month/Year which could create large issues when trying to compare data stored in one format to date dimension that is stored in a different format. That is why the following two commands are so important SET DATEFIRST 7 and SET DATEFORMAT MDY;

Now that table is populated you too can run aggregate queries without having to generate data at run time. Enjoy!

References:
Bertrand, Aaron. "Creating a Date Dimension or Calendar Table in SQL Server." Creating a Date Dimension or Calendar Table in SQL Server. N.p., 20 Oct. 2015. Web. 29 Dec. 2016. <https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/>.