开发者

SQL: select random row from table where the ID of the row isn't in another table?

I've been looking at fast ways to select a random row from a table and have found the following site: http://74.125.77.132/search?q=cache:http://jan.kneschke.de/projects/mysql/order-by-rand/&hl=en&strip=1

What I want to do is to select a random url from my table 'urls' that I DON'T have in my other table 'urlinfo'.The query I am using now selects a random url from 'urls' but I need it modified to only return a random url that is NOT in开发者_StackOverflow中文版 the 'urlinfo' table.

Heres the query:

SELECT url 
FROM urls JOIN (SELECT CEIL(RAND() * (SELECT MAX(urlid)
                                     FROM urls
                                     )
                           ) AS urlid 
               ) AS r2 USING(urlid);

And the two tables:

CREATE TABLE urls (
 urlid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 url VARCHAR(255) NOT NULL
) ENGINE=INNODB;


CREATE TABLE urlinfo (
 urlid  INT NOT NULL PRIMARY KEY,
 urlinfo VARCHAR(10000),
 FOREIGN KEY (urlid) REFERENCES urls (urlid)
   ) ENGINE=INNODB;


How about working from this random solution:

SELECT TOP 1 * FROM urls
WHERE (SELECT COUNT(*) FROM urlinfo WHERE urlid = urls.urlid) = 0
 ORDER BY NEWID()


You could use where not exists to exclude rows that are in the other table. For a random row, one option is a order by rand() with a limit 1:

SELECT url
FROM urls
WHERE NOT EXISTS (
    SELECT *
    FROM urlinfo ui
    WHERE ui.urlid = urls.urlid
)
ORDER BY RAND()
LIMIT 1


You need to first do a left outer join to get the set of records in 'urls' that are not in 'urlinfo', then pick a random record from that set.

SELECT * FROM urls
LEFT OUTER JOIN urlinfo
ON urls.urlid = urlinfo.urlid
WHERE urlinfo.urlid IS null

Now pick a random row from this set - you can do something like

SELECT newUrls.url
FROM (    
      SELECT urls.urlid, urls.url FROM urls
      LEFT OUTER JOIN urlinfo
      ON urls.urlid = urlinfo.urlid
      WHERE urlinfo.urlid IS null
     ) as newUrls
WHERE urls.urlid >= RAND() * (SELECT MAX(urlid) FROM urls) LIMIT 1

However, this will only work if the urlids in urlinfo are roughly randomly distributed across the range of possible values.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜