SQL query help with non-unique duplicates
I can't think through this one. I have this query:
SELECT
p.person_id,
p.first_nm,
p.last_nm,
pu.purchase_dt,
pr.sku,
pr.description,
a.address_type_id,
a.city_cd,
a.state_cd,
a.postal_cd
FROM
person p
INNER JOIN address a ON p.person_id = a.person_id
INNER JOIN purchase pu ON pu.person_id = p.person_id
INNER JOIN product pr ON pr.product_id = pu.product_id
Simple enough - I just need to get the information for customers that we've shipped returns to. However, because of the addressType table
AddressType
address_type_id address_type_desc
------------------------------------
1 Home
2 Shipping
some customers have multiple addresses in the address table, creating non-unique 开发者_如何学Pythonduplicate entries like this.
1,Smith, John, 12/01/2009, A12345, Purple Widget, 1, Anywhere, CA, 12345
1,Smith, John, 12/01/2009, A12345, Purple Widget, 2, Somewhere, ID, 54321
I'd like to get the query to return just one row/person and return the home address if available otherwise, return the shipping address.
This seems simple enough, and maybe it's just my cold, but this is causing me to scratch my head somewhat.
you want to change your join so it returns the min(addressID) instead of all of them:
INNER JOIN address a ON p.person_id = a.person_id
inner join (select person_id, min(address_type_id) as min_addr
from address group by person_id) a_min
on a.person_id = a_min.person_id and a.address_type_id = a_min.min_addr
SELECT
p.person_id,
p.first_nm,
p.last_nm,
pu.purchase_dt,
pr.sku,
pr.description,
COALESCE(ha.address_type_id, sa.address_type_id) AS address_type_id
CASE WHEN ha.address_type_id IS NOT NULL THEN ha.city_cd ELSE sa.city_cd END AS city_cd,
CASE WHEN ha.address_type_id IS NOT NULL THEN ha.state_cd ELSE sa.state_cd END AS state_cd,
CASE WHEN ha.address_type_id IS NOT NULL THEN ha.postal_cd ELSE sa.postal_cd END AS postal_cd
FROM
person p
LEFT JOIN address ha ON p.person_id = ha.person_id AND ha.address_type_id = 1
LEFT JOIN address sa ON p.person_id = sa.person_id AND sa.address_type_id = 2
INNER JOIN purchase pu ON pu.person_id = p.person_id
INNER JOIN product pr ON pr.product_id = pu.product_id
If SQL Server, or other version with common table expressions (CTE), you could do the following. The CTE adds a row-number column that is grouped by person and ordered by the address_type_id. The main query is altered to return number 1 row for each person from the CTE.
WITH cte AS
(
SELECT
a.person_id,
a.address_type_id,
a.city_cd,
a.state_cd,
a.postal_cd,
ROW_NUMBER() over (PARTITION BY person_id ORDER BY address_type_id) AS sequence
FROM address a
INNER JOIN AddressType at ON a.address_type_id = at.address_type_id
)
SELECT
p.person_id,
p.first_nm,
p.last_nm,
pu.purchase_dt,
pr.sku,
pr.description,
a.address_type_id,
a.city_cd,
a.state_cd,
a.postal_cd
FROM
person p
INNER JOIN cte a ON p.person_id = a.person_id
INNER JOIN purchase pu ON pu.person_id = p.person_id
INNER JOIN product pr ON pr.product_id = pu.product_id
WHERE
a.sequence = 1
By the way, if you have person records that have no addresses, you might want to change the INNER JOIN to an OUTER JOIN on the addresses table (cte in my answer). This may also be appropriate for joins to purchase and product if your requirements indicate so.
精彩评论