Efficiently selecting from many-to-many relation in H2

2023-03-27 02:35 Q&A author:

I'm using H2, and I have a database of books (table Entries) and authors (table Persons), connected through a many-to-many relationship, itself stored in a table Authorship. The database is fairly large (900'000+ persons and 2.5M+ books).

I'm trying to efficiently select the list of all books authored by at least one author whose name matches a pattern (LIKE '%pattern%'). The trick here is that the pattern should severely restrict the number of matching authors, and each author has a reasonably small number of associated books.

I tried two queries:

SELECT p.*, e.title FROM (SELECT * FROM Persons WHERE name LIKE '%pattern%')  AS p
  INNER JOIN Authorship AS au ON au.authorId = p.id
  INNER JOIN Entries AS e ON e.id = au.entryId;

and:

SELECT p.*, e.title FROM Persons AS p
  INNER JOIN Authorship AS au ON au.authorId = p.id
  INNER JOIN Entries AS e ON e.id = au.entryId
WHERE p.name LIKE '%pattern%';

I expected the first one to be much faster, as I'm joining a much smaller (sub)table of authors, but both take equally long. So long, in fact, that I can manually decompose the query into three selects and find the result I want faster.

When I try to EXPLAIN the queries, I observe that indeed they are very similar (a full join on the tables and only then a WHERE clause), so my question is: how can I achieve a fast select, that relies on the fact that the filter on authors should result in a much smaller join with the other two tables?
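For reference, a minimal sketch of how the plans can be inspected in H2, using the second query from above. EXPLAIN only prints the plan; EXPLAIN ANALYZE also executes the statement and reports how many rows were scanned per table, which shows whether the name filter is applied before or after the joins.

```sql
-- Plan only: shows which index or table scan H2 chooses for each table.
EXPLAIN
SELECT p.*, e.title FROM Persons AS p
  INNER JOIN Authorship AS au ON au.authorId = p.id
  INNER JOIN Entries AS e ON e.id = au.entryId
WHERE p.name LIKE '%pattern%';

-- Executes the query as well and adds per-table scan counts to the plan.
EXPLAIN ANALYZE
SELECT p.*, e.title FROM Persons AS p
  INNER JOIN Authorship AS au ON au.authorId = p.id
  INNER JOIN Entries AS e ON e.id = au.entryId
WHERE p.name LIKE '%pattern%';
```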

Note that I tried the same queries with MySQL and got results in line with what I expected (selecting first is much faster).

Thank you.

OK, here is something that finally worked for me.

Instead of running the query:

SELECT p.*, e.title FROM (SELECT * FROM Persons WHERE name LIKE '%pattern%') AS p
  INNER JOIN Authorship AS au ON au.authorId = p.id
  INNER JOIN Entries AS e ON e.id = au.entryId;

...I ran:

SELECT title FROM Entries e WHERE id IN (
  SELECT entryId FROM Authorship WHERE authorId IN (
    SELECT id FROM Persons WHERE name LIKE '%pattern%'
  )
)

It's not exactly the same query, because now I don't get the author id as a column in the result, but that does what I wanted: take advantage of the fact that the pattern restricts the number of authors to a very small value to search only through a small number of entries.
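Two practical notes on this rewrite, offered as a sketch rather than a tested recipe. First, the nested form returns each entry once, whereas the join returns one row per matching author and entry pair. Second, the lookup only stays cheap if Authorship can be probed by authorId; the statement below assumes no such index exists yet, and the index name is hypothetical. The final query is one way the author columns might be brought back while keeping the IN filter; whether H2 plans it as well as the nested form has not been checked here and is worth confirming with EXPLAIN.

```sql
-- Assumption: Authorship has no index starting with authorId yet.
-- idx_authorship_author is a made-up name; IF NOT EXISTS is H2 syntax.
CREATE INDEX IF NOT EXISTS idx_authorship_author
  ON Authorship(authorId, entryId);

-- Variant that keeps the IN filter on the small author set
-- but also returns the author columns (one row per author/entry pair).
SELECT p.*, e.title
FROM Authorship AS au
  INNER JOIN Persons AS p ON p.id = au.authorId
  INNER JOIN Entries AS e ON e.id = au.entryId
WHERE au.authorId IN (
  SELECT id FROM Persons WHERE name LIKE '%pattern%'
);
```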

What is interesting is that this worked great with H2 (much, much faster than the join), but with MySQL it is terribly slow. (This has nothing to do with the LIKE '%pattern%' part, see comments in other answers.) I suppose queries are optimized differently.

SELECT * FROM Persons WHERE name LIKE '%pattern%' will always be slow on a table of 900,000+ rows, whatever you do: because the pattern '%pattern%' starts with a %, MySQL cannot use an index and has to do a full table scan. You should look into full-text indexes and functions.
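As a rough illustration of the full-text suggestion in MySQL, assuming the Persons table from the question and a storage engine that supports FULLTEXT indexes (the index name ft_persons_name is hypothetical). Note that full-text search matches whole words, not arbitrary substrings, so it is not a drop-in replacement for LIKE '%pattern%'.

```sql
-- Assumption: MySQL, with Persons.name being a CHAR/VARCHAR/TEXT column.
ALTER TABLE Persons ADD FULLTEXT INDEX ft_persons_name (name);

-- Finds persons whose name contains the word 'pattern'.
SELECT id, name
FROM Persons
WHERE MATCH(name) AGAINST('pattern' IN NATURAL LANGUAGE MODE);
```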

Well, since the LIKE condition starts with a wildcard, it will result in a full table scan, which is always slow; no internal caching can take place.

If you want to do full-text searches, MySQL is not your best bet. Look into other software (Solr, for instance) to solve this kind of problem.
