Matching similarly ending strings with SQL sort/group by
I have a massive table of emails and would like to sort by domain (and count up the number of addresses in each domain).
Example output:
@gmail.com = 1000
@aol.com = 790
@hotmail.com = 550
@somethingweird.com = 2
The regex would need to match everything from the "@" to the final character in the string.
Any ideas how I could do this?
If you can change your design, you may try changing the way you store email addresses in the db, or add an additional column for the domain. With an index on it, this will perform much better than having to do a table scan through your whole table to generate the list of groupings.
If it's massive, then you need a scalable solution.
Add a computed column (or a separate domain column) that holds the part of the email address from the @ onward, and index that.
Then it's a simple COUNT(*) ... GROUP BY on that column.
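A minimal sketch of the separate-column approach, run against an in-memory SQLite database from Python so it is easy to try. The table name `users`, the column names, the index name, and the sample rows are all made up for illustration; on another database the string functions (`instr`, `substr`) and the computed-column syntax will likely differ. Lowercasing the domain is an extra assumption so that `@GMAIL.com` and `@gmail.com` land in the same group.

```python
import sqlite3

# Hypothetical schema: a plain "domain" column stored next to the email.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL, domain TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@gmail.com",), ("b@gmail.com",), ("c@aol.com",), ("d@GMAIL.com",)],
)

# Fill the domain column once: everything from the first '@' onward, lowercased.
conn.execute("UPDATE users SET domain = lower(substr(email, instr(email, '@')))")

# Index the stored domain so the grouping does not depend on parsing every email.
conn.execute("CREATE INDEX idx_users_domain ON users (domain)")

# The "simple COUNT ... GROUP BY", largest domains first.
domain_counts = conn.execute(
    "SELECT domain, COUNT(*) AS n FROM users GROUP BY domain ORDER BY n DESC, domain"
).fetchall()

for domain, n in domain_counts:
    print(f"{domain} = {n}")
```

In a real table the domain column would also have to be kept in sync on insert and update (a trigger or a true computed column would do that), which this sketch leaves out.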
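If you use Oracle, you can group on the expression directly: GROUP BY regexp_substr(mail_column, '@.*'). Note that this still has to evaluate the regex for every row, so on a massive table it will scan the whole table unless you back it with a function-based index.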
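For anyone without Oracle at hand, the same group-on-an-expression idea can be tried in SQLite from Python. This is a swapped-in equivalent, not the Oracle query itself: SQLite has no built-in `regexp_substr`, so the sketch uses `substr` and `instr` to take everything from the first "@" onward. The table name `mails` and the sample rows are assumptions; `mail_column` is the column name used in the answer above.

```python
import sqlite3

# Hypothetical table; only the column name mail_column comes from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mails (mail_column TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO mails VALUES (?)",
    [("x@gmail.com",), ("y@gmail.com",), ("z@hotmail.com",)],
)

# Group directly on the computed domain; no extra column is stored.
# substr/instr stand in for Oracle's regexp_substr(mail_column, '@.*').
rows = conn.execute(
    """
    SELECT substr(mail_column, instr(mail_column, '@')) AS domain,
           COUNT(*) AS n
    FROM mails
    GROUP BY domain
    ORDER BY n DESC, domain
    """
).fetchall()

for domain, n in rows:
    print(f"{domain} = {n}")
```

Because the domain is computed per row at query time, this is convenient for a one-off count but does not avoid the full scan that the stored, indexed column is meant to address.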