Taking two tables, 1-to-many, how can I filter down the many table and then join ALL the matches in the 1 table?

I apologize in advance for the length. The solution may well be trivial; I just wanted to be as informative as I could.

The Tables

I have two tables of note: items and products, which are in a 1-to-many relationship. One item can have multiple products, which are variations in color and material. Brand is an external category table that doesn't have too much of a part to play in this select statement.

So an item is, for example, a specific shoe, e.g. a "park avenue" shoe. A product is, for example, merlot burnished calfskin. And the brand would just be Allen Edmonds. Overall you get an Allen Edmonds park avenue shoe in merlot burnished calfskin.

Missing results in a "show almost everything" search

Someone decided to create a manual flag to associate the default color and material with a shoe, so that when you search, each type of shoe only shows up once, and when you click on it you can find its other colors and materials. That's fine, but some shoes have no default material and color set. As an unfortunate result, those without at least one default set don't show up in the search.

Current Select Statement

Here is the current select, which filters out everything that doesn't have a default manually set:

SELECT DISTINCT items.ItemId
     , items.Name
     , items.BrandCategoryId
     , items.CatalogPage
     , items.GenderId
     , items.PriceRetail
     , items.PriceSell
     , items.PriceHold
     , items.Descr
     , items.FlagStatus as ItemFlagStatus
     , products.ImagetnURL
     , products.FlagDefault
     ,  products.ProductId
     , products.Code as ProductCode
     , products.Name as ProductName
     , brands.Name as BrandName 
FROM items
   , products
   , brands 
WHERE items.ItemId = products.ItemId
  AND items.BrandCode = brands.Code
  AND items.FlagStatus != 'U'
  AND products.FlagStatus != 'U' 
  AND products.FlagDefault = 'Y';

Not my choice of code. I suspect that the "DISTINCT" part of that statement is a bad idea, but I'm not exactly clear how to get rid of it.

The big problem I'm having right now, though, is the final line

AND products.FlagDefault = 'Y'

that filters out everything that doesn't have at least one manual default set.

Edit: Here's an explain for the query:

+----+-------------+----------+--------+-----------------------------------------------------------+---------+---------+-------------------------+-------+--------------------------------+
| id | select_type | table    | type   | possible_keys                                             | key     | key_len | ref                     | rows  | Extra                          |
+----+-------------+----------+--------+-----------------------------------------------------------+---------+---------+-------------------------+-------+--------------------------------+
|  1 | SIMPLE      | brands   | ALL    | NULL                                                      | NULL    | NULL    | NULL                    |    38 | Using temporary                |
|  1 | SIMPLE      | products | ALL    | FlagStatus,FlagStatus_2,FlagStatus_3,flagstatusanddefault | NULL    | NULL    | NULL                    | 16329 | Using where; Using join buffer |
|  1 | SIMPLE      | items    | eq_ref | PRIMARY,BrandCode,FlagStatus,FlagStatus_2,FlagStatus_3    | PRIMARY | 4       | sherman.products.ItemId |     1 | Using where                    |
+----+-------------+----------+--------+-----------------------------------------------------------+---------+---------+-------------------------+-------+--------------------------------+
3 rows in set (0.01 sec)

And here is a describe on products, items, and brands:

mysql> describe products;
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Field       | Type         | Null | Key | Default           | Extra                       |
+-------------+--------------+------+-----+-------------------+-----------------------------+
| ProductId   | int(11)      | NO   | PRI | NULL              | auto_increment              |
| ItemId      | int(11)      | YES  |     | NULL              |                             |
| Code        | varchar(15)  | YES  | MUL | NULL              |                             |
| Name        | varchar(100) | YES  |     | NULL              |                             |
| MaterialId  | int(11)      | YES  | MUL | NULL              |                             |
| PriceRetail | decimal(6,2) | YES  |     | NULL              |                             |
| PriceSell   | decimal(6,2) | YES  |     | NULL              |                             |
| PriceHold   | decimal(6,2) | YES  |     | NULL              |                             |
| Cost        | decimal(6,2) | YES  |     | NULL              |                             |
| FlagDefault | char(1)      | NO   |     | N                 |                             |
| FlagStatus  | char(1)      | YES  | MUL | NULL              |                             |
| ImagetnURL  | varchar(50)  | YES  |     | NULL              |                             |
| ImagefsURL  | varchar(50)  | YES  |     | NULL              |                             |
| ImagelsURL  | varchar(50)  | YES  |     | NULL              |                             |
| DateStatus  | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| DateCreated | timestamp    | YES  |     | NULL              |                             |
+-------------+--------------+------+-----+-------------------+-----------------------------+
16 rows in set (0.02 sec)

mysql> describe items
    -> ;
+-----------------+--------------+------+-----+-------------------+-----------------------------+
| Field           | Type         | Null | Key | Default           | Extra                       |
+-----------------+--------------+------+-----+-------------------+-----------------------------+
| ItemId          | int(11)      | NO   | PRI | NULL              | auto_increment              |
| Code            | varchar(25)  | YES  |     | NULL              |                             |
| Name            | varchar(100) | YES  | MUL | NULL              |                             |
| BrandCode       | char(2)      | YES  | MUL | NULL              |                             |
| CatalogPage     | int(3)       | YES  |     | NULL              |                             |
| BrandCategoryId | int(11)      | YES  |     | NULL              |                             |
| TypeId          | int(11)      | YES  | MUL | NULL              |                             |
| StyleId         | int(11)      | YES  | MUL | NULL              |                             |
| GenderId        | int(11)      | YES  | MUL | NULL              |                             |
| PriceRetail     | decimal(6,2) | YES  |     | NULL              |                             |
| PriceSell       | decimal(6,2) | YES  |     | NULL              |                             |
| PriceHold       | decimal(6,2) | YES  |     | NULL              |                             |
| Cost            | decimal(6,2) | YES  |     | NULL              |                             |
| PriceNote       | longtext     | YES  |     | NULL              |                             |
| FlagTaxable     | char(1)      | YES  |     | NULL              |                             |
| FlagStatus      | char(1)      | YES  | MUL | NULL              |                             |
| FlagFeatured    | char(1)      | YES  |     | NULL              |                             |
| MaintFlagStatus | char(1)      | YES  |     | NULL              |                             |
| Descr           | longtext     | YES  |     | NULL              |                             |
| DescrNote       | longtext     | YES  |     | NULL              |                             |
| ImagetnURL      | varchar(50)  | YES  |     | NULL              |                             |
| ImagefsURL      | varchar(50)  | YES  |     | NULL              |                             |
| ImagelsURL      | varchar(50)  | YES  |     | NULL              |                             |
| DateCreated     | date         | NO   |     | 0000-00-00        |                             |
| DateStatus      | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-----------------+--------------+------+-----+-------------------+-----------------------------+
25 rows in set (0.00 sec)

mysql> describe brands;
+--------------+------------------+------+-----+-------------------+-----------------------------+
| Field        | Type             | Null | Key | Default           | Extra                       |
+--------------+------------------+------+-----+-------------------+-----------------------------+
| BrandId      | int(11) unsigned | NO   | PRI | NULL              | auto_increment              |
| Code         | varchar(6)       | YES  |     | NULL              |                             |
| PriceCode    | varchar(4)       | YES  |     | NULL              |                             |
| Name         | varchar(50)      | YES  |     | NULL              |                             |
| WebsiteURL   | varchar(50)      | YES  |     | NULL              |                             |
| LogoURL      | varchar(50)      | YES  |     | NULL              |                             |
| LogoTopURL   | varchar(50)      | YES  |     | NULL              |                             |
| BrandURL     | varchar(50)      | YES  |     | NULL              |                             |
| Descr        | longtext         | YES  |     | NULL              |                             |
| DescrShort   | longtext         | YES  |     | NULL              |                             |
| BeltDescr    | longtext         | YES  |     | NULL              |                             |
| ImageURL     | varchar(50)      | YES  |     | NULL              |                             |
| SaleImageURL | varchar(50)      | YES  |     | NULL              |                             |
| SaleCode     | varchar(6)       | YES  |     | NULL              |                             |
| SaleDateBeg  | date             | YES  |     | NULL              |                             |
| SaleDateEnd  | date             | YES  |     | NULL              |                             |
| FlagStatus   | char(1)          | YES  |     | NULL              |                             |
| DateStatus   | timestamp        | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| DateCreated  | timestamp        | YES  |     | NULL              |                             |
+--------------+------------------+------+-----+-------------------+-----------------------------+
19 rows in set (0.00 sec)

Possibilities that I am exploring

Subselect that grinds everything to a halt

I have a select statement that might work in a perfect, zero-execution-time world: it selects the first product for each item, ordered by the FlagDefault field, e.g.:

  AND products.productid =
    (select productid
     from products
     where products.itemid = items.itemid
       AND products.FlagStatus != 'U'
     order by FlagDefault = 'Y' desc  -- flagged defaults sort first
            , productid               -- deterministic tie-break
     limit 1);

replacing the check for a manually toggled default with an id that's only ordered by default, even if it's not toggled, and only takes the first result.

That statement grinds to a halt, and actually causes other use on the site to put mysql statements into deadlock (I suppose because reading of those tables is making them unavailable elsewhere).

Join that makes sure one table is distinct and not the next?

One way to get around it that might work is doing a:

select distinct ItemId from products ORDER BY default

And then fetching the rest of the data for just those ItemIds. But I'm not sure how to make that happen in a single statement, or how to join against a select distinct. I also expect that making that select "distinct" in the first place isn't ideal, since it selects more than is needed and then cuts the rows down afterwards, but I don't have a better way of determining distinctness.

Advice?

In general, the select statement could use a lot of improvement, and specifically I could really use some advice on how to filter down the results for the most specific table and only -then- join upstream to the table that is the "one" in the one to many relationship.


Remove from WHERE:

AND products.FlagStatus != 'U'
AND products.FlagDefault = 'Y'

Add to FROM:

(
   (SELECT ProductId
    FROM products
    WHERE FlagStatus != 'U' AND FlagDefault = 'Y')
UNION
   (SELECT MIN(ProductId)
    FROM products
    WHERE FlagStatus != 'U'
    GROUP BY ItemId
    HAVING MAX(FlagDefault) != 'Y')
) AS defaults

Add to WHERE:

AND defaults.ProductId = products.ProductId

I'm using the term "non-hidden" for rows which have FlagStatus != 'U', since I'm assuming that's what the flag is for.

The first SELECT gives the ProductId of all default products, and the second one gives a ProductId for each item without a default product. Hidden products are filtered out by both, so if a default product has been hidden, a non-default product is displayed instead. Combined with UNION, you get a ProductId for every item that has some non-hidden product.

I'm assuming FlagDefault can only have values 'Y' or 'N'. The second query filters out the items having a default product by using MAX(FlagDefault), which works because 'Y' > 'N'.

By joining this to the products table of the original query, instead of filtering with FlagDefault, you should get the same results as the original, except you also get one row for every item which does not have a default product.
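
Putting the three changes above together, the full statement might look roughly like the sketch below. It assumes the schema shown in the question and has not been run against the real data. The only change beyond the three steps is that the comma joins are written as explicit JOINs, which is a stylistic choice and not part of the original query.

```sql
-- Sketch only: the original query with the FlagDefault filter replaced
-- by a join against a derived table holding one ProductId per item.
SELECT DISTINCT items.ItemId
     , items.Name
     , items.BrandCategoryId
     , items.CatalogPage
     , items.GenderId
     , items.PriceRetail
     , items.PriceSell
     , items.PriceHold
     , items.Descr
     , items.FlagStatus AS ItemFlagStatus
     , products.ImagetnURL
     , products.FlagDefault
     , products.ProductId
     , products.Code AS ProductCode
     , products.Name AS ProductName
     , brands.Name AS BrandName
FROM items
JOIN products ON products.ItemId = items.ItemId
JOIN brands   ON brands.Code = items.BrandCode
JOIN (
       -- every non-hidden product flagged as the default
       SELECT ProductId
       FROM products
       WHERE FlagStatus != 'U' AND FlagDefault = 'Y'
       UNION
       -- lowest ProductId of each item that has no non-hidden default
       SELECT MIN(ProductId)
       FROM products
       WHERE FlagStatus != 'U'
       GROUP BY ItemId
       HAVING MAX(FlagDefault) != 'Y'
     ) AS defaults ON defaults.ProductId = products.ProductId
WHERE items.FlagStatus != 'U';
```

Since ProductId is in the select list and is the primary key of products, DISTINCT can only remove duplicates caused by repeated brand codes. If brands.Code is unique in practice, it is probably safe to drop.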

I've tested this query, but I haven't tested it with your original one since I don't have any meaningful data (read: your data) to test it against. This one works, so the combination should also work. For the same reason, I don't have any real numbers about performance - and I'm not an expert on query performance, either (more like a newbie). However, from what I've heard, subqueries in the WHERE clause are supposed to be bad for the performance, but in the FROM clause they should be okay. So, test it, I hope it's fast enough and fits the job.

As others mentioned, if you haven't got an index for the products.ItemId and BrandCode columns, you should definitely add them. You should also consider if requiring every item to have one hand-picked default would be okay, or maybe ditching the hand-picked defaults and always using random ones. Another thing to consider is if you really need the data from a product when there is no default - could you live without the image url, product name (use the item name?) and product code for those products?
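
For the indexes, the DESCRIBE output in the question shows that items.BrandCode already has a key, while products.ItemId and brands.Code do not, and the EXPLAIN shows a full scan of products. A minimal sketch of adding the missing ones follows; the index names are made up for illustration, and with only 38 rows in brands the second index may make little difference.

```sql
-- Index names are hypothetical; pick whatever fits your naming scheme.
ALTER TABLE products ADD INDEX idx_products_itemid (ItemId);
ALTER TABLE brands   ADD INDEX idx_brands_code (Code);
```

Running EXPLAIN on the query again afterwards should show whether the optimizer actually picks them up.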

Edit: One more possibility: You could change products.FlagDefault to items.DefaultProductId. That way it'd be easier to find out if an item has a default product and it enforces only one default product per item.
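
A rough sketch of what that change could look like. The DefaultProductId column is hypothetical (it is not in the tables shown in the question), and the migration assumes that when several products of one item are flagged 'Y', keeping the lowest ProductId is acceptable.

```sql
-- Hypothetical new column; NULL means "no hand-picked default".
ALTER TABLE items ADD COLUMN DefaultProductId INT NULL;

-- Copy the existing hand-picked defaults across, one per item.
UPDATE items
JOIN ( SELECT ItemId, MIN(ProductId) AS ProductId
       FROM products
       WHERE FlagDefault = 'Y'
       GROUP BY ItemId ) AS d ON d.ItemId = items.ItemId
SET items.DefaultProductId = d.ProductId;

-- Items without a default still come back, with NULL product columns.
SELECT items.ItemId
     , items.Name
     , products.ProductId
     , products.Name AS ProductName
FROM items
LEFT JOIN products ON products.ProductId = items.DefaultProductId
WHERE items.FlagStatus != 'U';
```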


SELECT
        items.ItemId,
        items.Name,
        items.BrandCategoryId,
        items.CatalogPage,
        items.GenderId,
        items.PriceRetail,
        items.PriceSell,
        items.PriceHold,
        items.Descr,
        items.FlagStatus as ItemFlagStatus,
        T3.ImagetnURL,
        T3.FlagDefault,
        T3.ProductId,
        T3.Code as ProductCode,
        T3.Name as ProductName,
        brands.Name as BrandName 
FROM    items INNER JOIN
        (
            SELECT DISTINCT
                T1.ItemId,
                T1.ImagetnURL,
                T1.FlagDefault,
                T1.ProductId,
                T1.Code,
                T1.Name,
                T1.FlagStatus
            FROM
                -- assumed intent: T2 looks for a default product of the same item
                products AS T1 LEFT JOIN
                products AS T2 ON T1.ItemId = T2.ItemId
                    AND T2.FlagDefault = 'Y'
            -- keep the defaults, plus every product of an item that has no default
            -- (so an item without a default appears once per product, not once)
            WHERE T1.FlagDefault = 'Y' OR T2.ProductId IS NULL
        ) AS T3 ON items.ItemId = T3.ItemId INNER JOIN 
        brands ON items.BrandCode = brands.Code
WHERE   items.FlagStatus != 'U'
        AND T3.FlagStatus != 'U'


I'm not sure I understand fully the FlagStatus and the FlagDefault. For an item with no default, do all its products have products.FlagDefault != 'Y' ?

If yes, can you try this? It will (hopefully) return all items, with NULLs in the product fields for items with no default:

SELECT items.ItemId
     , items.Name
     , items.BrandCategoryId
     , items.CatalogPage
     , items.GenderId
     , items.PriceRetail
     , items.PriceSell
     , items.PriceHold
     , items.Descr
     , items.FlagStatus as ItemFlagStatus
     , products.ImagetnURL
     , products.FlagDefault
     , products.ProductId
     , products.Code as ProductCode
     , products.Name as ProductName
     , brands.Name as BrandName 
FROM items
  LEFT JOIN
     products 
    ON items.ItemId = products.ItemId
    AND products.FlagDefault = 'Y'
  JOIN
     brands 
    ON items.BrandCode = brands.Code
;

The LEFT JOIN:

  LEFT JOIN
     products 
    ON items.ItemId = products.ItemId
    AND products.FlagDefault = 'Y'

is equivalent to:

  LEFT JOIN
    ( SELECT *
      FROM products
       WHERE products.FlagDefault = 'Y'
    ) AS p
    ON items.ItemId = p.ItemId

So it does what you ask: it "filters down the results for the most specific table and only -then- joins upstream to ..."

With LEFT JOINs, the result can differ depending on whether you put a filtering condition in the ON clause or later, after all the JOINs, in the WHERE clause.
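
To make that difference concrete, here is a small sketch against the same two tables, trimmed to two columns for readability:

```sql
-- Condition in ON: items without a default product are kept,
-- with NULL in the product columns.
SELECT items.ItemId, products.ProductId
FROM items
LEFT JOIN products
       ON products.ItemId = items.ItemId
      AND products.FlagDefault = 'Y';

-- Condition in WHERE: the NULL-extended rows fail the test and are
-- dropped, so the LEFT JOIN behaves like an INNER JOIN.
SELECT items.ItemId, products.ProductId
FROM items
LEFT JOIN products
       ON products.ItemId = items.ItemId
WHERE products.FlagDefault = 'Y';
```

The second form is effectively what the original query does, which is why items without a default disappear.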


I am not sure about performance as you did not post table structure and sizes or explain plan, but how about a UNION between your first query (items with default products) and a query which fetches one product per item, only for items with no default product?

It's a bit long, but give it a shot - let me know if it gets you the correct data and how long it takes...

(SELECT items.ItemId
     , items.Name
     , items.BrandCategoryId
     , items.CatalogPage
     , items.GenderId
     , items.PriceRetail
     , items.PriceSell
     , items.PriceHold
     , items.Descr
     , items.FlagStatus as ItemFlagStatus
     , products.ImagetnURL
     , products.FlagDefault
     , products.ProductId
     , products.Code as ProductCode
     , products.Name as ProductName
     , brands.Name as BrandName
FROM items
     JOIN products ON items.ItemId = products.ItemId
     JOIN brands ON items.BrandCode = brands.Code
WHERE items.FlagStatus != 'U'
  AND products.FlagStatus != 'U'
  AND products.FlagDefault = 'Y'
GROUP BY items.ItemId)
UNION
(SELECT items.ItemId
     , items.Name
     , items.BrandCategoryId
     , items.CatalogPage
     , items.GenderId
     , items.PriceRetail
     , items.PriceSell
     , items.PriceHold
     , items.Descr
     , items.FlagStatus as ItemFlagStatus
     , products.ImagetnURL
     , products.FlagDefault
     , products.ProductId
     , products.Code as ProductCode
     , products.Name as ProductName
     , brands.Name as BrandName
FROM items
     JOIN products ON items.ItemId = products.ItemId
     JOIN brands ON items.BrandCode = brands.Code
WHERE items.FlagStatus != 'U'
  AND products.FlagStatus != 'U'
  AND products.FlagDefault != 'Y'
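  AND items.ItemId NOT IN 
      (SELECT DISTINCT itemId 
       FROM products 
       WHERE products.FlagDefault = 'Y'
         AND products.FlagStatus != 'U'   -- a hidden default should not count
         AND products.ItemId IS NOT NULL) -- a NULL in the list would make NOT IN match nothing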
GROUP BY items.ItemId)