Friday, November 5, 2010

T-SQL Select any row, but only one per key value

A colleague of mine asked me for help with a database query. The table in question where of the normal type with an id column and several columns of data. However, the id column was not holding unique id's but the same id could be used several times, with different data for each occurrence. The task was to select only one row per id and it could be any one of the available.

My example table looked like this


The first column, t1, is holding the to-be distinct id:s. t2 and t3 is random data.

A common solution to this problem seems to be to use cursors or temporary tables. I couldn't see why it shouldn't be possible to do such a select without using them, so after some thinking, I came up with the following

SELECT t1, t2, t3
FROM (
SELECT t1, t2, t3, ROW_NUMBER() OVER (PARTITION BY t1 ORDER BY t1) rowrank
FROM Table_1
) temp_a
WHERE rowrank <= 1

Basically we select over the rows creating a ranking value for each occurrence of the id. Then we select from this data set only the rows with the rank of 1, giving us just the first occurence per id. Hence we get this output


So for every occurrence of an id in the key column t1, only one row will be selected. With modification of the inner query, a choice can be made of which one of the rows that is deemed more interesting than the others (but in this case, it didn't matter).

No comments:

Post a Comment