SQL SELECT DISTINCT and GROUP BY problem
I need to return only 1 row per specific database entry. For e.g. if I have:
ID col1 col2
1 1 A
2 1 B
3 1 C
4 2 D
5 3 E
6 4 F
7 4 G
in MySQL I can run a query
SELECT DISTINCT col1, col2 FROM table GROUP BY col1
and I will get ->
ID col1 col2
1 1 A
4 2 D
5 3 E
6 4 F
which is what I want, but if I run the same query in SQL Server I get an error..
So, basically I need to return only ONE (or FIRST) "col1" AND its "col2" from each row in the table..
What would be the correct syntax for SQL Server?
Thank y开发者_开发知识库ou for your time!
Andrej
EDIT:
the complete query that works in mysql is ->
SELECT DISTINCT list_order, category_name, category_id
FROM `jos_vm_category`
WHERE `category_publish` = 'Y'
GROUP BY list_order
so, FOR EACH "list_order" number i want to return the category_name and category_id from that row, and ignore every other row with the same "list_order" number
Another approach is to use the ROW_NUMBER()
function in T-SQL, which allows you to "partition" your data by something - like your col1
here - and get incremental numbers for each of those data partitions. If you want only the first entry for each "partition", just query it this way (using a Common Table Expression - CTE):
;WITH Distincts AS
(
SELECT ID, col1, col2,
ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY ID) AS 'RowNo'
FROM dbo.Table
)
SELECT
ID, col1, col2
FROM
Distincts
WHERE
RowNo = 1
This gives me the output you're looking for.
You can get all rows corresponding to the first instance of a value in col1 as follows
SELECT col1, col2
FROM table t1
where not exists
(select 1
from table t2
where t2.col1 = t1.col1 and
t2.id < t1.id)
EDIT: posting a comparison of the queries
EDIT2: Posting the comparison of actual query execution plan, with about 2000 rows of data randomly generated, and with a id being a identity column with a unique index. A second index on id and col1.
Turns out Marc's query is faster after all, at least with this data!
精彩评论