Getting the first of a GROUP BY clause in SQL
I'm trying to implement single-column regionalization for a Rails application and I'm running into some major headaches with a complex SQL need. For this system, a region can be represented by a country code (e.g. us), a continent code that is uppercase (e.g. NA), or it can be NULL, indicating the "default" information. I need to group these items by some relevant information such as a foreign key (we'll call it external_id).
Given a country and its continent, I need to be able to select only the most specific region available. So if records exist with the country code, I select them. If not, I want the records with the continent code. If not that, I want records with a NULL code so I can receive the default values.
So far I've figured that I may be able to use a generated CASE statement to get an arbitrary sort order. Something like this:
SELECT *, CASE region
WHEN 'us' THEN 1
WHEN 'NA' THEN 2
ELSE 3
END AS region_sort
FROM my_table
WHERE region IN ('us','NA') OR region IS NULL
GROUP BY external_id
ORDER BY region_sort
The problem is that without an aggregate function, the actual data returned by the GROUP BY for a given row seems to be untameable. How can I massage this query to make it return only the first record of the region_sort ordered groups?
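For illustration, here is a minimal sketch of one way the CASE ranking could be combined with an aggregate so that each external_id keeps only its most specific rows. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it is self-contained. The value column, the sample rows, and the most_specific_rows helper are assumptions made up for the example, and other databases may need different syntax.

```python
import sqlite3

# Hypothetical sample data; the "value" column is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (external_id INTEGER, region TEXT, value TEXT);
    INSERT INTO my_table VALUES
        (1, 'us', 'us value'),
        (1, 'NA', 'continent value'),
        (1, NULL, 'default value'),
        (2, 'NA', 'continent value'),
        (2, NULL, 'default value'),
        (3, NULL, 'default value'),
        (3, 'fr', 'unrelated country');
""")

# Rank every candidate row, find the best (lowest) rank per external_id,
# then keep only the rows that carry that best rank.
QUERY = """
WITH ranked AS (
    SELECT external_id, region, value,
           CASE region
               WHEN :country THEN 1
               WHEN :continent THEN 2
               ELSE 3
           END AS region_sort
    FROM my_table
    WHERE region IN (:country, :continent) OR region IS NULL
)
SELECT r.external_id, r.region, r.value
FROM ranked AS r
JOIN (
    SELECT external_id, MIN(region_sort) AS best
    FROM ranked
    GROUP BY external_id
) AS b
  ON b.external_id = r.external_id AND b.best = r.region_sort
ORDER BY r.external_id
"""


def most_specific_rows(conn, country, continent):
    """Return, per external_id, the rows for the most specific region present."""
    params = {"country": country, "continent": continent}
    return conn.execute(QUERY, params).fetchall()


rows = most_specific_rows(conn, "us", "NA")
for row in rows:
    print(row)
```

If several rows share the best rank for an external_id, this sketch returns all of them, so it covers the "first group of records" reading rather than exactly one record.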
Only the first record or only the first group of records? It's not even clear from what you've written whether there is more than one record or not.
In any case, it seems you are bending over backwards to do this in one query, but the database structure is not optimized for this. If there are just 3 levels and you want the most specific, why not just:
SELECT * FROM my_table WHERE region = 'us' GROUP BY external_id
If that returns something, then you stop; otherwise you run 1 or 2 more queries conditionally.
I could be wrong, but my instinct says that will yield much better overall performance, though I suppose it depends on the particulars of your DB.
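A rough sketch of that fallback, assuming a Python client and an in-memory SQLite table so it can run on its own. The value column, the sample rows, and the rows_for_first_matching_region helper are hypothetical, and I've left out the GROUP BY since the sketch simply returns every row at the first level that matches.

```python
import sqlite3

# Hypothetical sample data with no 'us' rows, so the lookup has to fall back.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (external_id INTEGER, region TEXT, value TEXT);
    INSERT INTO my_table VALUES
        (1, 'NA', 'continent value'),
        (1, NULL, 'default value'),
        (2, NULL, 'default value');
""")

BASE = "SELECT external_id, region, value FROM my_table WHERE "
ORDER = " ORDER BY external_id"


def rows_for_first_matching_region(conn, country, continent):
    """Try the country, then the continent, then the NULL default.

    Stops at the first query that returns any rows.
    """
    attempts = [
        (BASE + "region = ?" + ORDER, (country,)),
        (BASE + "region = ?" + ORDER, (continent,)),
        (BASE + "region IS NULL" + ORDER, ()),
    ]
    for sql, params in attempts:
        rows = conn.execute(sql, params).fetchall()
        if rows:
            return rows
    return []


print(rows_for_first_matching_region(conn, "us", "NA"))
```

One thing to watch: written this way, the level is chosen once for the whole table, so an external_id that only has a default row is skipped whenever any other external_id matches the country or continent. If the fallback has to happen per external_id, each query would also need to filter on that key.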