Related video algorithm MySql using genre
I've looked around for info on an efficient 'related videos' algorithm but i'm struggling to get well ordered, accurate results
I get given the 'genre' as a pipe-delimited string. eg: |Action|Sci-Fi|Thriller|
$genre = explode("|", $row['genre']);
if (count($genre) == 3) {
$sql = "SELECT title FROM `movie` WHERE genre LIKE '%$genre[1]%' LIMIT 0,8";
} else {
$sql = "SELECT title FROM `movie` WHERE (genre LIKE '%$genre[1]%' AND genre LIKE '%$genre[2]%') UNION SELECT title FROM `movie` WHERE (genre LIKE '%$genre[1]%' 开发者_StackOverflowOR genre LIKE '%$genre[2]%') LIMIT 0,10";
}
$related = mysql_query($sql);
Then I basically explode it and do a manual, inefficient search for genre matches depending on genre count. The results are poor and returns anything that is semi related.
This code makes me want to gag! It works but I hate it and I know its uber lame. Any tips to improve the SQL and getting richer results?
Move the mappings of genres to movies into a new table movie_genres
with columns movie
and genre
.
This allows you to do this:
$genres = explode('|', trim($row['genre'], '|'));
$sql = "SELECT `movie`, COUNT(*) AS hits
FROM `movie_genres`
WHERE `genre` IN ('" . join("', '", $genres) . "')
GROUP BY `movie`
ORDER BY `hits` DESC
LIMIT 8";
You have to make sure to prevent SQL injection, though.
The extra table is also a good idea, because your database schema is not normalized. Especially Chris Date's fourth condition of the first normal form is violated:
Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
精彩评论