How I do a select a limited number of rows with the same attribute?
I have a table of URLs and one attribute is the domain.
Supposing I have 100 URLs from Google, 100 from Facebook, 100 from Ebay and the same for others domains, but I want to retrieve the first 30 URLs from Google, 30 from Faceboo开发者_高级运维k, 30 from Ebay and 30 from the others limiting to a max of 500 URLs.
How I do this?
The following SQL resolves my case, but the urls are out of order because the row_number don't follow the orders. I think this SQL need some improvement.
SELECT url,row_number FROM(
SELECT url,row_number() OVER (PARTITION BY domain) FROM website
WHERE domain IN
(SELECT DISTINCT domain FROM link)
) AS links
WHERE row_number <= 10
LIMIT 25
How about something like this:
SELECT url FROM link WHERE domain='Google' LIMIT 30
UNION
SELECT url FROM link WHERE domain='Facebook' LIMIT 30
UNION
SELECT ...
etc.
My solution in PL/pgSQL based. I found a better way, that you don't need additional temporary table (using INSERT
to that table in function's body), that is you can use that function to get result directly:
CREATE OR REPLACE FUNCTION getURLs(singleLimit integer, totalLimit integer)
RETURNS SETOF RECORD AS $$
DECLARE
dom text;
nrOfDomains integer;
i integer;
lim integer;
remainder integer;
BEGIN
nrOfDomains := totalLimit/singleLimit; -- integer division (truncate)
remainder := totalLimit%singleLimit;
IF remainder <> 0 THEN
nrOfDomains := nrOfDomains + 1;
END IF;
i := 1;
FOR dom IN SELECT DISTINCT domain FROM website LIMIT nrOfDomains
LOOP
IF i = nrOfDomains AND remainder <> 0 THEN
lim := remainder;
ELSE
lim := singleLimit;
END IF;
RETURN QUERY SELECT * FROM website WHERE domain = dom LIMIT lim;
i := i + 1;
END LOOP;
RETURN;
END $$
LANGUAGE 'plpgsql';
Here is some test-drive:
postgres=> CREATE TABLE website(url text, domain text);
CREATE TABLE
postgres=> INSERT INTO website
SELECT 'http://' || d.column1 ||'/' || n, d.column1
FROM generate_series(1, 100) n CROSS JOIN
(VALUES ('google'), ('facebook'), ('ebay')) d;
INSERT 0 300
postgres=> SELECT * FROM getURLs(10, 25) website(url text, domain text);
Result:
url | domain
--------------------+----------
http://google/1 | google
http://google/2 | google
http://google/3 | google
http://google/4 | google
http://google/5 | google
http://google/6 | google
http://google/7 | google
http://google/8 | google
http://google/9 | google
http://google/10 | google
http://facebook/1 | facebook
http://facebook/2 | facebook
http://facebook/3 | facebook
http://facebook/4 | facebook
http://facebook/5 | facebook
http://facebook/6 | facebook
http://facebook/7 | facebook
http://facebook/8 | facebook
http://facebook/9 | facebook
http://facebook/10 | facebook
http://ebay/1 | ebay
http://ebay/2 | ebay
http://ebay/3 | ebay
http://ebay/4 | ebay
http://ebay/5 | ebay
(25 rows)
精彩评论