Common Table Expressions (CTE) were introduced into SQL Server 2005. The primary purpose was to perform recursive queries – a big plus feature to SQL Server. However they can also be used for many other purposes too.
A problem I was recently presented with was to take a common GROUP BY query that showed sales by category and also show the top selling item within that category.
Here is a sample schema and data we are working with:
CREATE TABLE Products( ProductID int NOT NULL PRIMARY KEY, ProductName varchar(100) NOT NULL, CategoryID int NOT NULL); INSERT INTO Products SELECT 1, 'Red Car', 1 UNION ALL SELECT 2, 'Green Car', 1 UNION ALL SELECT 3, 'Black Car', 1 UNION ALL SELECT 4, 'Red Truck', 2 UNION ALL SELECT 5, 'White Truck', 2; CREATE TABLE ProductCategories( CategoryID int NOT NULL PRIMARY KEY, CategoryName varchar(100) NOT NULL); INSERT INTO ProductCategories SELECT 1, 'Cars' UNION ALL SELECT 2, 'Trucks'; CREATE TABLE SalesOrderDetails( DetailID int IDENTITY(1,1) NOT NULL PRIMARY KEY, ProductID int NOT NULL, Qty int NOT NULL); INSERT INTO SalesOrderDetails SELECT 1, 5 UNION ALL SELECT 2, 4 UNION ALL SELECT 2, 2 UNION ALL SELECT 2, 6 UNION ALL SELECT 1, 1 UNION ALL SELECT 3, 4 UNION ALL SELECT 4, 3 UNION ALL SELECT 4, 2 UNION ALL SELECT 5, 1 UNION ALL SELECT 5, 8 UNION ALL SELECT 5, 1;
Historically this query would be done by creating a temp table or at least a derived table, performing the GROUP BY and doing a sub query to get the top item. This would work and performance would be fine on a small set of data. However it involves several statements and the sub query is limited to one column being returned.
Using 3 of the newer features we can write the query in one statement as follows:
WITH CTE AS( SELECT P.CategoryID, ProductName, SUM(D.Qty) Qty, ROW_NUMBER() OVER (PARTITION BY P.CategoryID ORDER BY SUM(D.Qty) DESC) as [RANK] FROM Products P JOIN SalesOrderDetails D ON D.ProductID = P.ProductID GROUP BY P.CategoryID, ProductName) SELECT C.CategoryName, SUM(CTE.Qty) TotalUnitSales, T.ProductName TopSellingProduct FROM CTE JOIN ProductCategories C ON C.CategoryID = CTE.CategoryID CROSS APPLY (SELECT ProductName FROM CTE PR WHERE PR.[Rank] = 1 AND PR.CategoryID = CTE.CategoryID) T GROUP BY C.CategoryName, T.ProductName;
Using the CTE we can get the total sales per product, we also rank the sales using the Row_Number feature over each category. Running just the CTE portion of the statement gives you:
The second part of the query aggregates the sales per category. And then lastly we use CROSS APPLY to select the top selling product for each category. This gives us the final resultset of:
And happy days we have our results as expected. This is a very simple example of what could be done but hopefully this gives you the general idea.