Implementing a find-or-insert for one-to-many tables

Posted 2023-02-12 21:38

I have 2 tables, tracklist and track, where a tracklist has many tracks. At some point I will receive user input that refers to a list of tracks, and I need to either create that tracklist or return an existing one (this is because tracklists are meant to be entirely transparent to users).

My naive solution to this was to find all tracklists with n tracks, and join track against tracklist n times, checking each join against the user input data. For example, with 2 tracks:

SELECT tracklist.id FROM tracklist
  JOIN track t1 ON tracklist.id = t1.tracklist
  JOIN track_name tn1 ON t1.name = tn1.id
  JOIN track t2 ON tracklist.id = t2.tracklist
  JOIN track_name tn2 ON t2.name = tn2.id
 WHERE tracklist.track_count = 2
   AND (t1.position = 1 AND tn1.name = 'Pancakes' AND t1.artist_credit = 42 AND t1.recording = 1)
   AND (t2.position = 2 AND tn2.name = 'Waffles' AND t2.artist_credit = 9001 AND t2.recording = 2)

However, this really doesn't scale well to large tracklists. My very rudimentary timing shows this can take >500ms for 10-track tracklists, and ~7s for tracklists with 100 tracks. While the latter is an edge case, whatever algorithm I use needs to scale at least that far.

I'm stuck for other solutions, however. The only other thing I can think of is to select all tracklists with n tracks, together with all their tracks, and then do the comparison in application code. However, I'd really like to keep this on the database server if I can.

Here is the schema I am working with:

CREATE TABLE track
(
    id                  SERIAL,
    recording           INTEGER NOT NULL, -- references recording.id
    tracklist           INTEGER NOT NULL, -- references tracklist.id
    position            INTEGER NOT NULL,
    name                INTEGER NOT NULL, -- references track_name.id
    artist_credit       INTEGER NOT NULL, -- references artist_credit.id
    length              INTEGER CHECK (length IS NULL OR length > 0),
    edits_pending       INTEGER NOT NULL DEFAULT 0,
    last_updated        TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE TABLE track_name (
    id                  SERIAL,
    name                VARCHAR NOT NULL
);

CREATE TABLE tracklist
(
    id                  SERIAL,
    track_count         INTEGER NOT NULL DEFAULT 0,
    last_updated        TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

Any suggestions?

  -- Returns a row only when all matched tracks share exactly one tracklist.
  SELECT MIN(t1.tracklist) AS tracklist
    FROM track t1
    WHERE
      (
        (t1.id='test1.id')
        OR
        (t1.id='test2.id')

        ......
        OR
        (t1.id='testn.id')
      )
    HAVING COUNT(DISTINCT t1.tracklist) = 1;

  -- This is OK if you have the track ids for this query.
  -- If you do not then you need to replace each of the t1.id='testm.id' statements
  -- with:
  --      t1.recording='testm.recording' AND
  --      t1.tracklist='testm.tracklist' AND
  --      t1.position='testm.position' AND
  --      t1.name='testm.name' AND
  --      t1.artist_credit='testm.artist_credit' AND
  --      t1.length='testm.length' AND
  --      t1.edits_pending='testm.edits_pending' AND
  --      t1.last_updated='testm.last_updated'

I may not have the syntax exactly right, and I have had no opportunity to test it, so here is a written description of what I am trying to achieve:

I build up a query that returns the tracks you have. Once I have that query, I check whether the tracklists for these tracks are all the same. If they are, i.e. there is only one tracklist in the result, then this is the tracklist you require. If there are no tracklists in the result, or there is more than one, then the set of tracks you have does not correspond to any single existing tracklist, so you need to create a new one. This query does not deal with the actual creation, if it proves necessary. I am not sure how it will deal with degenerate cases: there are no tracks at all in the query, or there are no tracklists listed for any of the tracks.
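
The find-then-create flow described above can be sketched end to end. This is only a rough illustration, not a tested PostgreSQL solution: it uses Python's built-in sqlite3 module as a stand-in database so that it runs anywhere, and it trims the schema to the columns that matter. The names `wanted`, `FIND`, `name_id` and `find_or_insert_tracklist` are hypothetical, and the UNIQUE constraints are assumptions that the original schema does not declare. Instead of one join per track, the input rows are loaded into a temporary table, joined once, and the matches are counted per tracklist. An empty input is rejected up front, which covers the degenerate case mentioned above.

```python
import sqlite3

# Trimmed-down schema; the UNIQUE constraints are assumptions, not part of the original.
SCHEMA = """
CREATE TABLE track_name (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE tracklist  (id INTEGER PRIMARY KEY, track_count INTEGER NOT NULL DEFAULT 0);
CREATE TABLE track (
    id            INTEGER PRIMARY KEY,
    recording     INTEGER NOT NULL,
    tracklist     INTEGER NOT NULL REFERENCES tracklist(id),
    position      INTEGER NOT NULL,
    name          INTEGER NOT NULL REFERENCES track_name(id),
    artist_credit INTEGER NOT NULL,
    UNIQUE (tracklist, position)
);
-- Hypothetical scratch table holding the user's input, one row per position.
CREATE TEMP TABLE wanted (
    position INTEGER PRIMARY KEY, name TEXT, artist_credit INTEGER, recording INTEGER
);
"""

# A tracklist matches when it has n tracks and all n input rows match one of them.
FIND = """
SELECT tl.id
  FROM tracklist tl
  JOIN track t       ON t.tracklist = tl.id
  JOIN track_name tn ON tn.id = t.name
  JOIN wanted w      ON w.position = t.position
                    AND w.name = tn.name
                    AND w.artist_credit = t.artist_credit
                    AND w.recording = t.recording
 WHERE tl.track_count = :n
 GROUP BY tl.id
HAVING COUNT(*) = :n
 ORDER BY tl.id
 LIMIT 1
"""


def name_id(conn, name):
    """Return the track_name id for name, inserting the name if it is new."""
    row = conn.execute("SELECT id FROM track_name WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]
    return conn.execute("INSERT INTO track_name (name) VALUES (?)", (name,)).lastrowid


def find_or_insert_tracklist(conn, tracks):
    """tracks: ordered list of (name, artist_credit, recording) tuples."""
    if not tracks:
        raise ValueError("an empty list of tracks cannot identify a tracklist")
    n = len(tracks)
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM wanted")
        conn.executemany(
            "INSERT INTO wanted VALUES (?, ?, ?, ?)",
            [(pos, *t) for pos, t in enumerate(tracks, start=1)],
        )
        row = conn.execute(FIND, {"n": n}).fetchone()
        if row:
            return row[0]  # an identical tracklist already exists
        tracklist_id = conn.execute(
            "INSERT INTO tracklist (track_count) VALUES (?)", (n,)
        ).lastrowid
        for pos, (name, artist_credit, recording) in enumerate(tracks, start=1):
            conn.execute(
                "INSERT INTO track (recording, tracklist, position, name, artist_credit)"
                " VALUES (?, ?, ?, ?, ?)",
                (recording, tracklist_id, pos, name_id(conn, name), artist_credit),
            )
        return tracklist_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
pair = [("Pancakes", 42, 1), ("Waffles", 9001, 2)]
first = find_or_insert_tracklist(conn, pair)       # creates a new tracklist
again = find_or_insert_tracklist(conn, pair)       # finds the same tracklist
other = find_or_insert_tracklist(conn, pair[:1])   # a prefix is a different tracklist
```

The count-based match is only correct if track_count is kept accurate and each position occurs once per tracklist, which is why the sketch adds the UNIQUE constraint. On PostgreSQL the same query shape should carry over, but two sessions could still insert the same tracklist concurrently, so some form of locking would probably be needed on top of this.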
