MySQL JOIN with LIMIT 1 on joined table
I want to join two tables, but only get 1 record of table2 per record on table1
For example:
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
This would get me all records in products
, which is not what I want. I want 1 [the first] product per category (I have a sort
column in the products field).
How do I go about doing tha开发者_JAVA百科t?
I like more another approach described in a similar question: https://stackoverflow.com/a/11885521/2215679
This approach is better especially in case if you need to show more than one field in SELECT. To avoid Error Code: 1241. Operand should contain 1 column(s)
or double sub-select for each column.
For your situation the Query should looks like (this query also will work in PostgresQL and it is pretty fast, see my update below):
SELECT
c.id,
c.title,
p.id AS product_id,
p.title AS product_title
FROM categories AS c
JOIN products AS p ON
p.id = ( --- the PRIMARY KEY
SELECT p1.id FROM products AS p1
WHERE c.id=p1.category_id
ORDER BY p1.id LIMIT 1
)
PS. I did the performance test of the query vs other proposed here, and this query is the best option yet!
UPDATE (2022-07-20, PostgresSQL)
I'm not working with mySQL for a while already, so, I decided to test the performance of my solution (which actually works perfect in both MySQL and PostgresQL) with solution provided by @Gravy in PostgresQL v.12.9.
For that I decided to create a dummy tables and data with 100 categories and 100000 products. You can check the code on this gist
I run my query above and it took only 13ms to run.
After I slightly modified (for postgres) the query from @Gravy:
SELECT
id,
category_title,
(array_agg(product_title))[1]
FROM
(SELECT c.id, c.title AS category_title, p.id AS product_id, p.title AS product_title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
ORDER BY c.id ASC) AS a
GROUP BY id, category_title;
and run it too. It took more than 150ms in my machine. Which is >10x times slower.
In defense of @gravy's solution, I agree with n+1 problem. But, in this particular case, usually the number of products is way larger than categories. So, running through each category is way less expensive than running through each product as in @Gravy's query.
By the way, if your table has 1mln products with 100 categories, the speed of my query is still the same (between 9-17ms), but the query from [@Gravy] takes more than 2 seconds to run
In resume, at this moment, my query is the most performant and optimal solution for the current task.
Feel free to comment.
Accepted answer by @goggin13 looks wrong. Other solutions provided to-date will work, but suffer from the n+1 problem and as such, suffer a performance hit.
n+1 problem: If there are 100 categories, then we would have to do 1 select to get the categories, then for each of the 100 categories returned, we would need to do a select to get the products in that category. So 101 SELECT queries would be performed.
My alternative solution solves the n+1 problem and consequently should be significantly more performant as only 2 selects are being performed.
SELECT
*
FROM
(SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
ORDER BY c.id ASC) AS a
GROUP BY id;
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
GROUP BY c.id
This will return the first data in products (equals limit 1)
What about this?
SELECT c.id, c.title, (SELECT id from products AS p
WHERE c.id = p.category_id
ORDER BY ...
LIMIT 1)
FROM categories AS c;
The With clause would do the trick. Something like this:
WITH SELECTION AS (SELECT id FROM products LIMIT 1)
SELECT a.id, c.id, c.title FROM selection a JOIN categories c ON (c.id = a.id);
SELECT
c.id,
c.title,
p.id AS product_id,
p.title AS product_title
FROM categories AS c
JOIN products AS p ON
p.id = (
SELECT MIN(p1.id) FROM products AS p1
WHERE c.id=p1.category_id
)
Using MIN or MAX in a subquery will make your query run much faster.
Assuming you want product with MIN()
imial value in sort
column, it would look something like this.
SELECT
c.id, c.title, p.id AS product_id, p.title
FROM
categories AS c
INNER JOIN (
SELECT
p.id, p.category_id, p.title
FROM
products AS p
CROSS JOIN (
SELECT p.category_id, MIN(sort) AS sort
FROM products
GROUP BY category_id
) AS sq USING (category_id)
) AS p ON c.id = p.category_id
Another example with 3 nested tables: 1/ User 2/ UserRoleCompanie 3/ Companie
- 1 user has many UserRoleCompanie.
- 1 UserRoleCompanie has 1 user and 1 Company
- 1 Companie has many UserRoleCompanie
SELECT
u.id as userId,
u.firstName,
u.lastName,
u.email,
urc.id ,
urc.companieRole,
c.id as companieId,
c.name as companieName
FROM User as u
JOIN UserRoleCompanie as urc ON u.id = urc.userId
AND urc.id = (
SELECT urc2.id
FROM UserRoleCompanie urc2
JOIN Companie ON urc2.companieId = Companie.id
AND urc2.userId = u.id
AND Companie.isPersonal = false
order by Companie.createdAt DESC
limit 1
)
LEFT JOIN Companie as c ON urc.companieId = c.id
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| userId | firstName | lastName | email | id | companieRole | companieId | companieName |
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| cjjt9s9iw037f0748raxmnnde | henry | pierrot | henry@gmail.com | cjtuflye81dwt0748e4hnkiv0 | OWNER | cjtuflye71dws0748r7vtuqmg | leclerc |
In my opinion, this is the best answer (making it general):
SELECT
TB1.Id
FROM Table1 AS TB1
INNER JOIN Table2 AS TB2 ON (TB1.Id = TB2.Id_TB1)
AND TB2.Id = (
SELECT Id
FROM Table2
WHERE TB1.Id = Id_TB1
ORDER BY Table2.Id DESC
LIMIT 1
)
When using postgres you can use the DISTINCT ON
syntex to limit the number of columns returned from either table.
Here is a sample of the code:
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN (
SELECT DISTINCT ON(p1.id) id, p1.title, p1.category_id
FROM products p1
) p ON (c.id = p.category_id)
The trick is not to join directly on the table with multiple occurrences of the id, rather, first create a table with only a single occurrence for each id
Replace the tables with yours:
SELECT * FROM works w
LEFT JOIN
(SELECT photoPath, photoUrl, videoUrl FROM workmedias LIMIT 1) AS wm ON wm.idWork = w.idWork
I would try something like this:
SELECT C.*,
(SELECT P.id, P.title
FROM products as P
WHERE P.category_id = C.id
LIMIT 1)
FROM categories C
精彩评论