SQL group by LIKE pattern

2022-12-16 03:12 问答作者：

I have the following table: what i want to do is group: group by firstname and address on this sort of data

| firstname | address
+-----------+----------
| alex      | 338 s. michigan 
| mark      | finger
| alex      | 338 south michigan
| ted       | port

But the group by will not return similar values of the address field.

Is there a pattern I can apply to the group by? I want to use the LIKE pattern system . for example :

select firstname, address from t group by firstname, '%'+SUBSTRING(address,0,5)+'%'

but that query doesn't seem to work.

Any ideas? Thanks!

if the substring can appear in any position in another address, you can do a join like:

select a.id, a.addr, b.id as b_id, b.addr as b_addr from t a, t b where
    b.addr like concat('%', substr(a.addr, 0, 5), '%') and
    b.id <> a.id;

will return all records whose first 5 address chars appear anywhere inside another address.

or you can just ignore the stuff after char 5 with a group:

select firstname, substr(addr, 0, 5) from t group by firstname,
  substr(addr, 0, 5);

you might want to use a tool to normalize postal addresses, such as:

https://webgis.usc.edu/Services/AddressNormalization/Default.aspx

(free up to 2500 records)

If you want to group by the first 5 characters of the address too, you can do this:

select firstname, MAX(address) AS Address
from t 
group by firstname, SUBSTRING(address,0,5)

Is that what you want?

继续阅读：group-by sql

精彩评论