SQL Server Logical Query Processing

Recently, while reviewing one of the stored procedure performance, I suggested a couple of changes to a developer to improve the performance of the SP. One of the suggestions was to replace DISTINCT with GROUP BY.

A few days later, he came back to me with a performance matrix to show that there was no performance improvement when he replaced the DISTINCT with GROUP BY.

I wished I could have informed him the reasons for replacing it during our initial discussion. But, it happened for a good purpose because the performance matrix double confirmed that we were not going to see performance improvement in our scenario by changing the clause.

Below are the two reasons to ask him to replace the DISTINCT with GROUP BY;

  • Logical Query Processing order
  • In some cases, GROUPBY performs better than DISTINCT.

Logical Query Processing or Binding Order

The “Logical Query Processing” or “Logical Binding Order” of a SELECT statement defines how a query should be processed and final result achieved. The following list shows the order in which the clauses are logically processed by SQL Server. 

8. SELECT
9. DISTINCT
11. TOP
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH CUBE/ROLLUP
7. HAVING
10. ORDER BY
12. OFFSET/FETCH

Let’s arrange the order to make it easier to read.

1.  FROM
2.  ON
3.  JOIN
4.  WHERE
5.  GROUP BY
6.  WITH CUBE/ROLLUP
7.  HAVING
8.  SELECT
9.  DISTINCT
10. ORDER BY
11. TOP
12. OFFSET/FETCH

Demonstration

Let’s use the below sample query to understand it.

  • If you look at the “ORDER BY” (order by salesperson) of the inner query, it is using alias “Salesperson” not “s.[Salesprson Key]” actual column name.  If I execute the above query, it runs successfully.
USE WideWorldImportersDW
GO
SELECT * FROM (
			SELECT DISTINCT TOP 10 s.[Salesperson Key] AS Salesperson, sum(s.profit) AS profit
			FROM fact.Sale s 
			INNER JOIN  fact.Sale s1 ON s1.[sale key] = s.[Sale Key]
			INNER JOIN Dimension.Employee e ON s.[Salesperson Key] = e.[Employee Key]
			WHERE e.[Is Salesperson] =1
			GROUP BY s. [Salesperson Key]
			HAVING COUNT (s.[Salesperson Key])>1
			ORDER BY Salesperson 
		) A 
ORDER BY a.[Salesperson]
OFFSET 7 ROWS;

Here is the result of the query.

ReasonIf you follow the above logical processing order, the “ORDER BY” comes after “SELECT” clause.

  • Let’s use the alias name “Salesperson” in the WHERE clause (Salesperson =15) to filter out the data.
USE WideWorldImportersDW
GO
SELECT * FROM (
			SELECT DISTINCT TOP 10 s.[Salesperson Key] AS Salesperson, sum(s.profit) AS profit
			FROM fact.Sale s 
			INNER JOIN  fact.Sale s1 ON s1.[sale key] = s.[Sale Key]
			INNER JOIN Dimension.Employee e ON s.[Salesperson Key] = e.[Employee Key]
			WHERE e.[Is Salesperson] =1
			AND Salesperson = 15
			GROUP BY s. [Salesperson Key]
			HAVING COUNT (s.[Salesperson Key])>1
			ORDER BY Salesperson 
		) A 
ORDER BY a.[Salesperson]
OFFSET 7 ROWS;

The query failed with below error. 

Msg 207, Level 16, State 1, Line 49
Invalid column name 'Salesperson'.

ReasonIn the logical processing order, the “WHERE” comes before “SELECT” clause that’s why it got failed.

  • Now, let’s use the alias name “Salesperson” in the HAVING clause (HAVING COUNT (Salesperson)>1).
USE WideWorldImportersDW
GO
SELECT * FROM (
			SELECT DISTINCT TOP 10 s.[Salesperson Key] AS Salesperson, sum(s.profit) AS profit
			FROM fact.Sale s 
			INNER JOIN  fact.Sale s1 ON s1.[sale key] = s.[Sale Key]
			INNER JOIN Dimension.Employee e ON s.[Salesperson Key] = e.[Employee Key]
			WHERE e.[Is Salesperson] =1
			GROUP BY s. [Salesperson Key]
			HAVING COUNT (Salesperson)>1
			ORDER BY Salesperson 
		) A 
ORDER BY a.[Salesperson]
OFFSET 7 ROWS;

The query again failed with below error. 

Msg 207, Level 16, State 1, Line 49
Invalid column name 'Salesperson'.

ReasonThe “HAVING” clause comes before “SELECT” clause.

Happy Learning!!

Leave a comment

Your email address will not be published. Required fields are marked *