开发者

NULL-keys for key/value table

(Using Oracle)

I have a table with key/value pairs like this:

create table MESSAGE_INDEX
(
  KEY               VARCHAR2(256) not null,
  VALUE             VARCHAR2(4000) not null,
  MESSAGE_ID        NUMBER not null
)

I now want to find all the messages where key = 'someKey' and value is 'val1', 'val2' or 'val3' - OR value is null in which case there will be no entry in the table at all. This is to save space; there would be a large number of keys with null values if I stored them all.

I think this works:

SELECT message_id
FROM message_index idx
WHERE ((key = 'someKey' AND value IN ('val1', 'val2', 'val3'))
      OR NOT EXISTS (SELECT 1 FROM message_index WHERE key = 'someKey'
      AND idx.message_id = message_id))

But is is extremely slow. Takes 8 seconds with 700K records in message_index and there will be many more records and more search criteria when moving outside of my test environment.

Primary key is key, value, message_id:

  add constraint PK_KEY_VALUE primary key (KEY, VALUE, MESSAGE_ID)

And I added another index for message_id, to speed up searching for missing keys:

create index IDX_MESSAGE_ID on MESSAGE_INDEX (MESSAGE_ID)

I will be doing several of these key/value lookups in every search, not just one as shown above. So far I am doing them nested, where output id's of one level is the input to the next. E.g.:

SELECT message_id from message_index
WHERE (key/value compare)
AND message_id IN
  (
    SELECT ... and开发者_如何学编程 so on
  )

What can I do to speed this up?


"What can I do to speed this up?"

Use a normalized data model rather than a key-value store. Reconstructing the (especially optional) attributes of a message is going to be a continual performance bugbear.


If you have a key that all messages are guaranteed to have:

SELECT  message_id
FROM    message_index mi
WHERE   mi.key = 'GuaranteedKey'
        AND mi.message_id IN
        (
        SELECT  message_id
        FROM    message_index mk
        WHERE   mk.key = 'someKey'
                AND mk.value IN (1, 2, 3)
        )
UNION ALL
SELECT  message_id
FROM    message_index mi
WHERE   mi.key = 'GuaranteedKey'
        AND mi.message_id NOT IN
        (
        SELECT  message_id
        FROM    message_index mk
        WHERE   mk.key = 'someKey'
        )

If you don't:

WITH    mi AS
        (
        SELECT  DISTINCT message_id
        FROM    message_index
        )
SELECT  message_id
FROM    mi
WHERE   mi.message_id IN
        (
        SELECT  message_id
        FROM    message_index mk
        WHERE   mk.key = 'someKey'
                AND mk.value IN (1, 2, 3)
        )
UNION ALL
SELECT  message_id
FROM    mi
WHERE   mi.message_id NOT IN
        (
        SELECT  message_id
        FROM    message_index mk
        WHERE   mk.key = 'someKey'
        )


To speed this up you convert subselects into joins, so your query would become something like this:

SELECT idx.message_id
FROM message_index idx
LEFT JOIN message_index idx2 ON idx2.message_id = idx.message_id AND idx2.key = 'someKey'
WHERE (idx.key = 'someKey' AND idx.value IN ('val1', 'val2', 'val3'))
   OR idx2.message_id IS NULL


I'm not sure your second filter is what you are looking for. Basically the subquery:

(SELECT 1
   FROM message_index
  WHERE key = 'someKey'
    AND idx.message_id = message_id)

won't contain rows only if there is no key 'someKey' for that message_id in the table.

If this is really what you want, and since all columns are NOT NULL, you could rewrite the query with a NOT IN (that will probably be optimized into an HASH ANTI-JOIN):

SELECT message_id
  FROM message_index idx
 WHERE (key = 'someKey' AND VALUE IN ('val1', 'val2', 'val3'))
UNION ALL
SELECT message_id
  FROM message_index    
 WHERE message_id NOT IN (SELECT message_id 
                            FROM message_index 
                           WHERE key = 'someKey');
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜