开发者

Determine if a couple of zip codes are contiguous

My customer has sales regions where each sales region consists of a list of zip codes. The regions are pretty big and could be easier stored in format like :

Region consists of zip code range from 00602 to 10012 and 20020 to 30020.

How can I get from a list of zip codes to a list of such zip code ranges?

Consider the following data

--This would be my list of all available zip codes in us:

CREATE TABLE [Zip](
    [Zip] [nvarchar](20) ,
    [State] [nvarchar](50) ,
)

--This would be the Sales Region List

CREATE TABLE [dbo].[SalesRegion](
    [AreaCode] [nvarchar](50) 
) 

--This would be the original large list Zip Codes for the SalesRegions

CREATE TABLE [dbo].[EnteredZip](
    [Zip] [nvarchar](20) ,
    [AreaCode] [nvarchar](50) 
) 

--This is where I would like to store the Zip Code Ranges

CREATE TABLE [dbo].[SearchableZip](
    [StartZip] [nvarchar](20) ,
    [EndZip] [nvarchar](20) ,
    [AreaCode] [nvarchar](50) 
) 

--Here is my sample Data:

--Some Zip Codes in US
insert into dbo.Zip (Zip,[State]) values ('00501'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00544'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00601'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00602'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00603'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00604'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00605'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00606'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00610'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00611'   ,'PR')
insert into dbo.Zip (Zip,[State]) values ('00612'   ,'PR')


--Some Sales Regions

Insert Into dbo.SalesRegion ( AreaCode ) values('Area1')
Insert Into dbo.SalesRegion ( AreaCode ) values('Area2')
Insert Into dbo.SalesRegion ( AreaCode ) values('Area3')


--The zip codes of the Sales Regions
insert Into EnteredZip (Zip,AreaCode) values ('00544' , 'Area1')
insert Into EnteredZip (Zip,AreaCode) values ('00601' , 'Area1')
insert Into EnteredZip (Zip,AreaCode) values ('00602' , 'Area1')

insert Into EnteredZip (Zip,AreaCode) values ('00604' , 'Area2')
insert Into EnteredZip (Zip,AreaCode) values ('00606' , 'Area2')

insert Into EnteredZip (Zip,AreaCode) values ('00501' , 'Area3')
insert Into EnteredZip (Zip,AreaCode) values ('00544' , 'Area3')
insert Into EnteredZip (Zip,AreaCode) values ('00601' , 'Area3')
insert Into EnteredZip (Zip开发者_开发知识库,AreaCode) values ('00602' , 'Area3')
insert Into EnteredZip (Zip,AreaCode) values ('00603' , 'Area3')
insert Into EnteredZip (Zip,AreaCode) values ('00604' , 'Area3')

insert Into EnteredZip (Zip,AreaCode) values ('00610' , 'Area3')
insert Into EnteredZip (Zip,AreaCode) values ('00611' , 'Area3')
insert Into EnteredZip (Zip,AreaCode) values ('00612' , 'Area3')

Would result in this entries in the table SearchableZip

AreaCode             StartZip             EndZip
-------------------- -------------------- -------------------------
Area1                00544                00602
Area2                00604                00604
Area2                00606                00606
Area3                00501                00604
Area3                00610                00612

Is it possible to create SearchableZip with a sql script?

EDIT

I fixed the table declaration and output data


Yes, it is possible to obtain the ranges from the list with a single query. For that you will use a CTE, ranking, and a bit of grey matter:

WITH ranked AS (
  SELECT
    Zip,
    AreaCode,
    ZipGroup = CAST(Zip AS int)
             - ROW_NUMBER() OVER (PARTITION BY AreaCode ORDER BY Zip)
  FROM EnteredZip
)
SELECT
  StartZip = MIN(Zip),
  EndZip   = MAX(Zip),
  AreaCode
FROM ranked
GROUP BY AreaCode, ZipGroup

Output:

StartZip             EndZip               AreaCode
-------------------- -------------------- -------------------------
00544                00544                Area1
00601                00602                Area1
00604                00604                Area2
00606                00606                Area2
00501                00501                Area3
00544                00544                Area3
00601                00604                Area3
00610                00612                Area3

This output doesn't match yours, but it does match the source data.


UPDATE

If the Zip table is a reference table for determining the contiguity of a list of zip codes, then the above solution should be modified like this:

WITH ZipRanked AS (
  SELECT
    Zip,
    State,
    ZipRank = ROW_NUMBER() OVER (PARTITION BY State ORDER BY Zip)
  FROM Zip
),
EnteredZipRanked AS (
  SELECT
    e.Zip,
    e.AreaCode,
    ZipGroup = z.ZipRank
             - ROW_NUMBER() OVER (PARTITION BY e.AreaCode ORDER BY e.Zip)
  FROM EnteredZip e
    INNER JOIN ZipRanked z ON e.Zip = z.Zip
)
SELECT
  StartZip = MIN(Zip),
  EndZip   = MAX(Zip),
  AreaCode
FROM EnteredZipRanked
GROUP BY AreaCode, ZipGroup


First, I need to tell you that what I think you're planning to do strikes me as a bad idea. The table EnteredZip is just fine for storing which area a ZIP code belongs to. (As long as you put a PRIMARY KEY constraint on ZIP.)

It looks like this is roughly where you're aiming,

select areacode, min(zip), max(zip)
from enteredzip
group by areacode
order by areacode

but it doesn't match your output. Frankly, your sample output doesn't make sense to me.

There's only one row for Area1, but the ZIP codes aren't contiguous. There are two rows for Area2, but each has a single ZIP code. There are two rows for Area3, but the ZIPs aren't contiguous.

Wait . . .

Does contiguous mean you inserted a blank line between the INSERT statements in the sample data?

If that's the case, then you need to store more data. You have to identify which ZIP codes should be considered contiguous, and store those facts in a table. (Also, you left out a blank line in Area2.)

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜