Powered By Blogger

Sunday, May 13, 2012

Pivot two or more columns in SQL Server 2005


Results.jpg

Introduction

MSDN states that you cannot pivot two columns, this is correct. If you want two columns with the same column header, then you cannot achieve it short of a manual massage or a reporting tool.
However, you can pivot two or more columns with a little organization. In this mini article, I will show how to pivot two fields. From the Adventureworks database, I will display the number of male and female hires per year per department, as in the above image.

Background

A developer often runs across a requirement that the tool, be it a language, or SQL or third party tool, is not designed to accomplish. In my opinion, the skill of the developer is in bending the environment to meet the requirements. Sometimes, this is not possible, but a good developer should be able to achieve most business requirements.
Most examples of the pivot use static column labels; this is rarely, if ever a real world solution. In every single case where I have used a pivot to solve a requirement, I have had to use dynamic column headers. This requires you to use dynamic SQL, the bane of all developers' lives.
Normally, I would use a table variable to prepare my data; however, you cannot pass a table variable to dynamic SQL. You can, however, reference a temp table, and that is what has been used in this example.
Some of the SQL demonstrated in this snippet:
  • Using a temporary table to get a result set into dynamic SQL
  • Cleaning up temp tables
  • Table variables
  • Case statement to do conditional aggregation
  • Dynamic SQL
  • Concatenation of rows in a single Select statement
  • How to construct a pivot select

Using the code

You need to have the Adventureworks sample database installed. The download includes three files pivot01/2/3. These demonstrate the stages of the development of the snippet.
When constructing a pivot statement, you need three types of information in your table. The row axis, in this case the department labels, the column axis, the year/gender labels, and the data to populate the body of the pivot, in this case, the count of male/females by their hire year.
Preparing the data: We need to get a table with the following results. This is achieved with Pivot01.sql. Note the use of the CASE statement to aggregate only a single gender in each of the union selects.
Also, we need to differentiate the column labels for each year/gender.
--Create a temp table because we cannot pass a table var into dynamic SQL
CREATE TABLE #TblX(Department VARCHAR(100),Staff INT, Yr VARCHAR(10))

--Insert the male records Note the case statement for the gender and the label creation
INSERT [#TblX]    
SELECT 
    D.Department,
    -- we only need the male staff in this column
    SUM(CASE E.Gender WHEN 'M' THEN 1 ELSE 0 END) Staff, 
    -- set the may year label
    CONVERT(CHAR(4),DATEPART(yy,E.HireDate)) + '-M' Yr 
FROM HumanResources.Employee E
    INNER JOIN HumanResources.vEmployeeDepartment D ON D.EmployeeID = E.EmployeeID
GROUP BY 
    D.Department,
    DATEPART(yy,E.HireDate)

UNION    
-- Now get the female staff, changing the label
SELECT 
    D.Department,
    SUM(CASE E.Gender WHEN 'F' THEN 1 ELSE 0 END) Staff,
    -- set the female year label
    CONVERT(CHAR(4),DATEPART(yy,E.HireDate)) + '-F' Yr 
FROM HumanResources.Employee E
    INNER JOIN HumanResources.vEmployeeDepartment D ON D.EmployeeID = E.EmployeeID
GROUP BY 
    D.Department,
    DATEPART(yy,E.HireDate)

Organizing the column labels

This is done in Pivot02.SQL. What we need is a CSV string of all the labels (year/gender) to be pivoted. This is one of the reasons I use a table variable, I do not need to go back to the disk again to get the column labels.
Because I cannot use Select Distinct and the string concatenation, I have split it into two operations and used another table variable to get the distinct year/gender column labels.
--We now need to build a CSV list of labels to use
DECLARE @TblY TABLE(Yr CHAR(6))
INSERT @TblY 
SELECT  DISTINCT Yr FROM [#TblX]
    
--Get the dynamic list of years (DISTINCT and the concatenation does not work)
SELECT @List = ISNULL(@List,'') + CASE 
    WHEN ISNULL(@List,'') = '' THEN '[' + yr + ']' 
    ELSE ',[' + Yr + ']' end
FROM @TblY
ORDER BY Yr
We now have all the bits ready to put together into a pivot select.
So, the inner query which will supply the data is:
Select Department, Yr, Staff FROM #TblX F
Prepare_data.jpg
This we need to wrap with the start and end parts of the pivot. The start supplies the column information, so it isDepartment and @List. The end supplies the aggregation and the pivot. In this case, the aggregation has already be done, so we simply use MAX() to display the staff count, tell it the column to pivot, and supply the column values - @List.
--Now to build the pivot query
Set @SQL = 'Select Department, ' + @List + char(13) 
--Inside query
Set @SQL = @SQL + 'From (Select Department, Yr, Staff ' + char(13) 
Set @SQL = @SQL + 'FROM #TblX F) P ' + char(13) 
--Pivot 
Set @SQL = @SQL + 'Pivot (Max(Staff) For Yr In ('  + @List + ')) as Pvt' + char(13) 


Print @SQL
Exec (@SQL)
DROP TABLE [#TblX]

Points of interest

I hate the gotcha where you cannot use a variable in dynamic SQL. I find the concatenation technique endlessly useful. I just wish I knew who originated it to thank them.
Adding additional columns to be pivoted is simply a matter of organizing the labels and rows in the temp table. E.g., you could go to sales, and pivot sales/target/performance for each sales person per year.
Of course, the business is going to come back and complain that they do not like the NULLs and want them replaced by 0. This can be achieved by a cross join of year/gender and departments and have your initial preparation select update the temp table instead of inserting.

No comments: