counting surnames in a range a-d, e-h, i-l.... with MySQL

2023-04-11 23:33 问答作者：

I have a table of surnames and I want to count the number of surnames in each alphabetic range of A-D or E-H etc.

I came up with the following query, which works, and I woul开发者_JAVA技巧d like to hear people's opinions on it and perhaps ways to do it better.

select count(*) FROM people 
group by surname REGEXP '^[a-d].*', 
         surname REGEXP '^[e-h].*', 
         surname REGEXP '^[i-l].*', 
         surname REGEXP '^[m-p].*', 
         surname REGEXP '^[q-t].*', 
         surname REGEXP '^[u-z].*';

Here's the best way to achieve this (using regex anyway):

select
    sum(surname REGEXP '^[a-dA-D].*') as ad_count,
    sum(surname REGEXP '^[e-hE-H].*') as eh_count,
    sum(surname REGEXP '^[i-lI-L].*') as il_count,
    sum(surname REGEXP '^[m-pM-P].*') as mp_count,
    sum(surname REGEXP '^[q-tQ-T].*') as qd_count,
    sum(surname REGEXP '^[u-zU-Z].*') as uz_count
from people

This elegant brevity works due to the fact that in mysql, true is 1 and false is 0, thus sum(some condition) is the count of how many time it's true.

btw, I added uppercase to your regex.

You would get better performance by selecting from an inner select that does the work of calculating the group more efficiently (eg by using a case on substr(surname, 1, 1)), then summing on tests against that calculated vale value.

Regex is overkill and completely not needed here.

Perhaps something like this, using basic string algebra:

SELECT
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'a' AND 'd' THEN 1 ELSE 0 END) AS `SUM_a-d`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'e' AND 'h' THEN 1 ELSE 0 END) AS `SUM_e-h`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'i' AND 'l' THEN 1 ELSE 0 END) AS `SUM_i-l`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'm' AND 'p' THEN 1 ELSE 0 END) AS `SUM_m-p`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'q' AND 't' THEN 1 ELSE 0 END) AS `SUM_q-t`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'u' AND 'z' THEN 1 ELSE 0 END) AS `SUM_u-z`
FROM `people`

You can make the query a bit more explicit like so:

SELECT 
  SUM(CASE WHEN  surname REGEXP '^[a-d].*' THEN 1 ELSE 0 END) AS a_d_count
  ,SUM(CASE WHEN surname REGEXP '^[e-h].*' THEN 1 ELSE 0 END) AS e_h_count
  ,SUM(CASE WHEN surname REGEXP '^[i-l].*' THEN 1 ELSE 0 END) AS i_l_count
  ,SUM(CASE WHEN surname REGEXP '^[m-p].*' THEN 1 ELSE 0 END) AS m_p_count
  ,SUM(CASE WHEN surname REGEXP '^[q-t].*' THEN 1 ELSE 0 END) AS q_t_count
  ,SUM(CASE WHEN surname REGEXP '^[u-z].*' THEN 1 ELSE 0 END) AS u_z_count
FROM (SELECT surname FROM people ORDER BY surname ASC) p

Avoiding regexes and conditionals, you might do this:

SELECT CONCAT(LEFT(UPPER(surname),1), '-', CHAR(ASCII(UPPER(surname))+3)) AS r, 
  count(id) 
FROM people
GROUP BY ROUND((ASCII(UPPER(surname)-65)/4),0);

This sets your ranges to 4 letters long, which does mean that the last range is'yz', but you can fiddle that around with a bit more math.

counting surnames in a range a-d, e-h, i-l.... with MySQL

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？