I was asked today how to select distinct columns while ordering by another column that does not

appear in the column output list. For example:

select distinct [name]
from master..syscolumns
order by xtype

If you try to run this code, you will get the error message:

Server: Msg 145, Level 15, State 1, Line 1
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

One way to get around this is to place the ORDER BY in a derived table, used as the source for

the DISTINCT query. eg:

select distinct [name]
from (
select [name]

from master..syscolumns order by xtype
) as dt

However you will get this error:

Server: Msg 1033, Level 15, State 1, Line 5
The ORDER BY clause is invalid in views, inline functions, derived tables, and subqueries, unless

TOP is also specified.

But what if you don’t want to limit your results to just the X TOP rows? What if you

want all the rows?

Well not only can TOP be used to limit the number of rows a query returns, it can be used to

instruct (some what redundantly) that you want all the rows: TOP 100 PERCENT.

select distinct [name]
from (
select top 100 percent [name]
from master..syscolumns order by xtype
) as dt

And there you have it! Distinct results ordered by a column that is not output!

Here’s the SHOWPLAN_TEXT that shows the sort takes place:

  |--Hash Match(Aggregate, HASH:([syscolumns].[name]), RESIDUAL:([syscolumns].[name]=[syscolumns].[name]))
       |--Sort(ORDER BY:([syscolumns].[xtype] ASC))
            |--Clustered Index Scan(OBJECT:([master].[dbo].[syscolumns].[syscolumns]))
No tags for this post.