Optimization of SQL query regarding pair comparisons

I'm working on a pair comparison site where a user loads a list of films and grades from another site. My site then picks two random movies and matches them against each other; the user selects the better of the two and a new pair is loaded. Over time this gives a complete list of the user's movies ordered from best to worst.

The database contains three tables:

fm_film_data - this contains all imported movies

fm_film_data(id int(11), 
             imdb_id varchar(10), 
             tmdb_id varchar(10), 
             title varchar(255),     
             original_title varchar(255),    
             year year(4),
             director text,
             description text,
             poster_url varchar(255))

fm_films - this contains all information related to a user, what movies the user has seen, what grades the user has given, as well as information about each film's wins/losses for that user.

fm_films(id int(11),
         user_id int(11),
         film_id int(11),
         grade int(11),  
         wins int(11),   
         losses int(11))

fm_log - this contains records of every duel that has occurred.

fm_log(id int(11),
       user_id int(11),
       winner int(11),
       loser int(11))

To pick a pair to show the user, I've created a MySQL query that checks the log and picks, at random, a pair that has not been played yet.

SELECT pair.id1, pair.id2 
FROM
    (SELECT part1.id AS id1, part2.id AS id2 
    FROM fm_films AS part1, fm_films AS part2 
    WHERE part1.id <> part2.id 
        AND part1.user_id = [!!USERID!!] 
        AND part2.user_id = [!!USERID!!]) 
AS pair
LEFT JOIN
    (SELECT winner AS id1, loser AS id2 
    FROM fm_log
    WHERE fm_log.user_id = [!!USERID!!]
    UNION
    SELECT loser AS id1, winner AS id2 
    FROM fm_log
    WHERE fm_log.user_id = [!!USERID!!])
AS log
ON pair.id1 = log.id1 AND pair.id2 = log.id2
WHERE log.id1 IS NULL
ORDER BY RAND()
LIMIT 1

This query takes some time to load, about 6 seconds in our tests with two users with about 800 grades each.

I'm looking for a way to optimize this while still making sure each duel appears only once.

The server runs MySQL version 5.0.90-community.


I think you are better off creating a stored procedure/function that returns a pair as soon as it finds a valid one.

Make sure there are proper indexes:

  • fm_films.user_id (try including the film_id also)
  • fm_log.user_id (try including the winner and loser)
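As a rough sketch, the two indexes suggested above could be created like this. The index names are made up for illustration, and the column order is an assumption based on how the queries in this thread filter (user_id first, then the film columns):

```sql
-- Lets MySQL find all of one user's rows in fm_films without a full table scan.
-- idx_films_user_film is a hypothetical name.
ALTER TABLE fm_films ADD INDEX idx_films_user_film (user_id, film_id);

-- Supports the "has this user already played this duel?" lookup on fm_log.
-- idx_log_user_winner_loser is a hypothetical name.
ALTER TABLE fm_log ADD INDEX idx_log_user_winner_loser (user_id, winner, loser);
```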

DELIMITER $$

DROP PROCEDURE IF EXISTS spu_findPair$$

CREATE PROCEDURE spu_findPair
(
    IN vUserID INT
)
BEGIN
    DECLARE done BOOLEAN DEFAULT FALSE;
    DECLARE vLastFilmID INT;
    DECLARE vCurFilmID INT;
    DECLARE cUserFilms CURSOR FOR
        SELECT id
        FROM fm_films
        WHERE user_id  = vUserID
        ORDER BY RAND();
    DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=TRUE;
    OPEN cUserFilms;
    ufLoop: LOOP
        FETCH cUserFilms INTO vCurFilmID;
        IF done THEN
            CLOSE cUserFilms;
            LEAVE ufLoop;
        END IF;
        IF vLastFilmID IS NOT NULL THEN
            IF NOT EXISTS
                (
                    SELECT 1
                    FROM fm_log
                    WHERE user_id  = vUserID
                        AND ((winner = vCurFilmID AND loser = vLastFilmID) OR (winner = vLastFilmID AND loser = vCurFilmID))
                ) THEN

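                CLOSE cUserFilms;
                # output the pair before leaving; statements after LEAVE never run
                SELECT vLastFilmID, vCurFilmID;
                LEAVE ufLoop;
            END IF;
        END IF;
        # remember this film so the next fetched film is compared against it
        SET vLastFilmID = vCurFilmID;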
    END LOOP;

END$$

DELIMITER ;
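
A minimal usage sketch, assuming the procedure has been created as above; the user id 42 is only a placeholder:

```sql
-- 42 is a placeholder; pass the id of the user who needs a new duel.
CALL spu_findPair(42);
```

The intent is that the call yields a single row holding the two fm_films ids of an unplayed pair. Since only films that are adjacent in the shuffled order are compared, a call can come back without a row even though unplayed pairs remain, so the caller may want to retry in that case.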


Have you tried applying any indexes to the tables?

The user_id columns would be a good start. The id field, which is also used in the WHERE clause, is another column that might be worth indexing. Benchmark to make sure the added indexes do result in speedups and do not slow down other code (e.g. insertions).

However, I have found that simple indexes on short tables like these can still result in some huge speed ups when they apply to fields in the WHERE clauses of SELECT and UPDATE statements.
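
One way to check whether an index is actually being used is MySQL's EXPLAIN. A small sketch, with 42 as a placeholder user id:

```sql
-- Shows the access plan for the per-user lookup; 42 is a placeholder user id.
EXPLAIN
SELECT id
FROM fm_films
WHERE user_id = 42;
```

If the key column of the output is NULL and the type column is ALL, the query is scanning the whole table rather than using an index on user_id.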
