Contiguous sequential numbers in MySQL

2023-04-04 09:37

I have the following table and data.

CREATE TABLE county_zip_code (
  id_county INT UNSIGNED,
  from_zip_code INT UNSIGNED,
  to_zip_code INT UNSIGNED
);

INSERT INTO county_zip_code 
(id_county, from_zip_code, to_zip_code)
    VALUES
(12,          2580288,          0),
(12,          2580289,          0),
(12,          2580290,          0),
(12,          2580291,          0),
(12,          2580292,          0),
(15,          8670418,          0),
(15,          8670420,          0),
(15,          8670430,          0),
(16,          7600070,          0),
(16,          7600071,          0),
(16,          7600072,          0),
(16,          7600073,          0)
;

I would like to create a new table called "county_zip_code1" with the contiguous sequences replaced with their endpoints:

id_county | from_zip_code | to_zip_code
    12      2580288         2580292
    15      8670418         0
    15      8670420         0
    15      8670430         0
    16      7600070         7600073

How can I do this?

I've read over some of the answers to the SO question "Evaluate sequential pairs of rows in SQLite", but this kind of SQL is too advanced for me.

It's a little messy to do in SQL. It might be better to handle with a program.

Step back from SQL and relational databases for a second. To make things simple, start with a set of numbers S. Say you wish to find all contiguous sequences in S; how would you go about it? If you could sort and loop (such as in a program), you'd check whether adjacent values in the sorted sequence differ by 1. If you're limited to set operations, you could perform a similar task by taking adjacent pairs from the cross product:

C = {(n,n+1) : n ∈ S ∧ n+1 ∈ S}
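Both ideas can be sketched in a few lines of Python. This is only an illustration: the function names `adjacent_pairs` and `runs` and the sample set are made up here, not part of the question.

```python
def adjacent_pairs(s):
    """Set-based version: every (n, n+1) where both values are in s."""
    return {(n, n + 1) for n in s if n + 1 in s}


def runs(values):
    """Sort-and-loop version: return (start, end) for each maximal run."""
    result = []
    for n in sorted(set(values)):
        if result and n == result[-1][1] + 1:
            # n extends the current run, so move its endpoint forward.
            result[-1] = (result[-1][0], n)
        else:
            # n starts a new run (a lone value is a run of length one).
            result.append((n, n))
    return result


# Hypothetical sample values, loosely based on county 15.
S = {8670418, 8670420, 8670424, 8670425, 8670426, 8670430}
print(sorted(adjacent_pairs(S)))
print(runs(S))
```

The loop yields the endpoints directly, while the set of pairs still has to be chained together, which is where the transitive closure comes in.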

To get the endpoints, you could take the transitive closure of C. However, transitive closures in SQL are inelegant; they require a procedural approach, rather than the declarative approach SQL normally affords.

To find a contiguous sequence in SQL, you can perform a self-join on a column with the next value in the sequence. An inner join will filter out the lone items since they won't have a next value. A MIN and MAX will get you the endpoints of the sequence, which also requires grouping by the county ID.

SELECT czc.id_county, 
       MIN(czc.from_zip_code) AS from_zip_code,
       MAX(czc_n.from_zip_code) AS to_zip_code
  FROM county_zip_code AS czc
    JOIN county_zip_code AS czc_n 
      ON     czc.id_county = czc_n.id_county 
         AND czc.from_zip_code = czc_n.from_zip_code-1
  GROUP BY czc.id_county

Note that this solution doesn't cover all cases. If a county has several disjoint sequences, grouping by county ID alone merges them into a single row that runs from the smallest start to the largest end. Add the following to the sample data:

INSERT INTO county_zip_code
   VALUES
(15, 8670424, 0),
(15, 8670425, 0),
(15, 8670426, 0),

(15, 8670450, 0),
(15, 8670451, 0)
;

and the query will result in:

+-----------+---------------+-------------+
| id_county | from_zip_code | to_zip_code |
+-----------+---------------+-------------+
|        12 |       2580288 |     2580292 |
|        15 |       8670424 |     8670451 |
|        16 |       7600070 |     7600073 |
+-----------+---------------+-------------+
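To try the merging behaviour without a MySQL server, the same join can be run from Python. This sketch assumes SQLite (via the standard `sqlite3` module) as a stand-in for MySQL, with plain INTEGER columns instead of INT UNSIGNED; the `zips` dictionary is just a compact way to load the sample rows plus the extra county 15 values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE county_zip_code ("
    "id_county INTEGER, from_zip_code INTEGER, to_zip_code INTEGER)"
)

# Sample data from the question plus the extra county 15 rows.
zips = {
    12: range(2580288, 2580293),
    15: [8670418, 8670420, 8670424, 8670425, 8670426,
         8670430, 8670450, 8670451],
    16: range(7600070, 7600074),
}
conn.executemany(
    "INSERT INTO county_zip_code VALUES (?, ?, 0)",
    [(county, z) for county, values in zips.items() for z in values],
)

# Same self-join as above; ORDER BY added only for stable output.
merged = conn.execute(
    """
    SELECT czc.id_county,
           MIN(czc.from_zip_code) AS from_zip_code,
           MAX(czc_n.from_zip_code) AS to_zip_code
      FROM county_zip_code AS czc
        JOIN county_zip_code AS czc_n
          ON     czc.id_county = czc_n.id_county
             AND czc.from_zip_code = czc_n.from_zip_code - 1
      GROUP BY czc.id_county
      ORDER BY czc.id_county
    """
).fetchall()
print(merged)
```

County 15 comes back as one row even though its two runs are not adjacent, because MIN and MAX are taken over every joined pair in the county.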

Getting the lone items is a little more involved. From the set S, you want items that have no previous or next value:

{(n-1, n, n+1) : n ∈ S ∧ n-1 ∉ S ∧ n+1 ∉ S}
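As a rough Python equivalent of that set (the name `lone_items` and the sample values are hypothetical), only the middle element n is kept:

```python
def lone_items(s):
    """Values of s with neither a predecessor nor a successor in s."""
    return {n for n in s if n - 1 not in s and n + 1 not in s}


# Hypothetical sample values, loosely based on county 15.
S = {8670418, 8670420, 8670424, 8670425, 8670426, 8670430}
print(sorted(lone_items(S)))
```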

In SQL, again you use a self join, but you select the items that have no previous or next value. Here, you need partial (left or right) joins to get those items ("a ∉ A" can be translated to a null value in joined tables).

SELECT czc.id_county, 
       czc.from_zip_code AS from_zip_code,
       NULL AS to_zip_code
  FROM county_zip_code AS czc
    LEFT JOIN county_zip_code AS czc_p
      ON     czc.id_county = czc_p.id_county 
         AND czc.from_zip_code = czc_p.from_zip_code+1
    LEFT JOIN county_zip_code AS czc_n
      ON     czc.id_county = czc_n.id_county 
         AND czc.from_zip_code = czc_n.from_zip_code-1
WHERE czc_p.from_zip_code IS NULL AND czc_n.from_zip_code IS NULL

Take the union of the two queries and (if desired) sort.

(
  SELECT czc.id_county, 
         MIN(czc.from_zip_code) AS from_zip_code,
         MAX(czc_n.from_zip_code) AS to_zip_code
    FROM county_zip_code AS czc
      JOIN county_zip_code AS czc_n 
        ON     czc.id_county = czc_n.id_county 
           AND czc.from_zip_code = czc_n.from_zip_code-1
    GROUP BY czc.id_county
) UNION (
  SELECT czc.id_county, 
         czc.from_zip_code AS from_zip_code,
         NULL AS to_zip_code
    FROM county_zip_code AS czc
      LEFT JOIN county_zip_code AS czc_p
        ON     czc.id_county = czc_p.id_county 
           AND czc.from_zip_code = czc_p.from_zip_code+1
      LEFT JOIN county_zip_code AS czc_n
        ON     czc.id_county = czc_n.id_county 
           AND czc.from_zip_code = czc_n.from_zip_code-1
  WHERE czc_p.from_zip_code IS NULL AND czc_n.from_zip_code IS NULL
)
ORDER BY id_county, from_zip_code
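If you do handle it with a program instead, a single pass over the sorted rows is enough, and it keeps disjoint sequences within a county apart. The following is a minimal sketch, not a drop-in solution: `collapse` is a made-up name, the rows are assumed to have been fetched already as (id_county, from_zip_code) pairs, and lone items get a to_zip_code of 0 to match the output asked for in the question.

```python
def collapse(rows):
    """Collapse (id_county, zip_code) pairs into (id_county, from, to) rows.

    Contiguous zip codes within a county become one row with both
    endpoints; a lone zip code keeps to = 0, as in the question.
    """
    result = []
    for county, zip_code in sorted(set(rows)):
        if result:
            last_county, start, end = result[-1]
            if county == last_county and zip_code == end + 1:
                # Same county and next number: extend the current run.
                result[-1] = (county, start, zip_code)
                continue
        # Otherwise start a new run of length one.
        result.append((county, zip_code, zip_code))
    # Runs of length one are lone items; report them with to = 0.
    return [(c, s, e if e != s else 0) for c, s, e in result]


rows = (
    [(12, z) for z in range(2580288, 2580293)]
    + [(15, z) for z in (8670418, 8670420, 8670430)]
    + [(16, z) for z in range(7600070, 7600074)]
)
for row in collapse(rows):
    print(row)
```

The resulting tuples could then be inserted into county_zip_code1 with an ordinary parameterized INSERT.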

Try this:

CREATE TABLE county_zip_code1 AS
SELECT id_county, MIN(from_zip_code) AS from_zip_code, MAX(from_zip_code) to_zip_code
  FROM county_zip_code
GROUP BY id_county
