开发者

How to efficiently structure query for parent-child tables with last child selected

I have two tables, which reflect data about cards handed to clients.

The first table is a list of all clients. The client has a card, number, which is the first card handed to the client.

card:
+----------+-------------+
| card_id  | Name, etc   |
+----------+-------------+
| 123123   | First Client|
| 123124   | 2nd   Client|
+----------+-------------+

The second table is a history file, which contains the card_id as a foreign key. When a card is replaced (eg, lost, stolen, expired), the card table is unchanged, but an entry is created in the history file.

card_history:
+----------+-------------+--------------+----------------+
| card_id  | new_card_id | Date_created | date_replaced  |
+----------+-------------+--------------+----------------+
| 123123   | 123123      | 2010-01-01   | 0000-00-00     |
| 123123   | 123789      | 0000-00-00   | 2010-01-31     |
| 123123   | 123790      | 0000-00-00   | 2010-02-15     |
+----------+-------------+--------------+----------------+

Here you can see a new card was issues on 2010-01-01, and two replacement was made on 2010-01-31 and 2010-02-15.

I need a generate a report (using mysql) that reads like this.

Name,           old_card, new_card, date_issued
------------------------------------------------
"First Client", 123123,   123790,   2010-02-15

My current query is too slow.

I tried two approaches:

.1. Created a single query with a join

 SELECT ...
        FROM card
   LEFT JOIN card_history ON card.card_id = card_history.card_id
         AND ( select ....)

but I can't seem to get the subquery right.

.2. Created a view

   CREATE VIEW v1 as
      SELECT MAX(GREATER(Date_created, date_replaced) as date_issued
        FROM card_history
       GROUP BY card_id

The view works, but is very slow (about 40 seconds per lookup). Both tables are large with about 2 million records. I have indexes on the card_ids.

How can I structure my query so that I can efficiently extract the required data?

UPDATE

Firstly, I omitted to mention that the cards are not issued in any order, they are handed out randomly, which makes the solutions provided handy, but not correct for my case.

Secondly, I've been awed by the wonders of derived table, and have the current query, which derives a join between the two tables, and tries to get the record we seek (last updated entry in card_history table).

SELECT * FROM 
(
   SELECT card.card_id, card.name,
          card_history.new_card_id,
          card_history.date_created, card_history.date_replaced, GREATEST(card_history.date_created, card_history.date_replaced) AS    last_date
          FROM card
LEFT JOIN card_history ON card.card_id  = card_history.card_id 
O开发者_C百科RDER BY last_date DESC

) AS B
;

However, I have a vague suspicion that my query is picking the derived table row at random.

What I need is to extract the new_card_id and last date either issued or replaced.

Does my query cut it?


use test
DROP TABLE IF EXISTS card;
DROP TABLE IF EXISTS card_history;

CREATE TABLE card ( card_id INT NOT NULL PRIMARY KEY,name VARCHAR(255) );

CREATE TABLE card_history
(card_id INT NOT NULL,new_card_id INT NOT NULL,
date_created date,date_replaced date,
PRIMARY KEY (card_id,new_card_id),
KEY (card_id,date_replaced));

INSERT INTO card VALUES (123123,'First Client'),(123124,'2ndClient');
INSERT INTO card_history VALUES
(123123,123123,'2010-01-01','0000-00-00'),
(123123,123789,'0000-00-00','2010-01-31'),
(123123,123790,'0000-00-00','2010-02-15');

.

SELECT AA.name,AA.card_id old_card,
(SELECT MAX(new_card_id)
 FROM card_history BB
 WHERE BB.card_id=AA.card_id) new_card,
(SELECT MAX(GREATEST(date_created,date_replaced))
 FROM card_history BB
 WHERE BB.card_id=AA.card_id) date_issued
FROM (SELECT A.name,B.card_id
      FROM card A
      INNER JOIN card_history B USING (card_id)
      WHERE B.card_id=B.new_card_id) AA;

+--------------+----------+----------+-------------+
| name         | old_card | new_card | date_issued |
+--------------+----------+----------+-------------+
| First Client |   123123 |   123790 | 2010-02-15  |
+--------------+----------+----------+-------------+

If you want to exclude rows where no new cards were issued, do this one :

SELECT *
FROM
  (SELECT AA.name,AA.card_id old_card,
     (SELECT MAX(new_card_id)
      FROM card_history BB
      WHERE BB.card_id=AA.card_id) new_card,
     (SELECT MAX(GREATEST(date_created,date_replaced))
      FROM card_history BB
      WHERE BB.card_id=AA.card_id) date_issued
   FROM (
      SELECT A.name,B.card_id
      FROM card A
      INNER JOIN card_history B USING (card_id)
      WHERE B.card_id=B.new_card_id) AA
) AAA
WHERE old_card <> new_card;

It worked for me !!! Give it a Try !!!

UPDATE

I looked at my EXPLAIN plan and I did not like what I saw.

Please make sure card_history's primary key is (card_id,new_card_id)
I also added and index on (card_id,date_replaced)

Try this one instead; this has a slightly better EXPLAIN plan.

SELECT AA.name,AA.card_id old_card,
   (SELECT MAX(new_card_id)
    FROM card_history BB
    WHERE BB.card_id=AA.card_id) new_card,
   (SELECT MAX(date_replaced)
    FROM card_history BB
    WHERE BB.card_id=AA.card_id) date_issued
FROM (
  SELECT A.name,B.card_id
  FROM card A
  INNER JOIN card_history B USING (card_id)
  WHERE B.card_id=B.new_card_id) AA;

If you want the report to show new cards, stick with my first query.

Have Fun !!!

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜