
USPS validation returns a concatenated address line; how can I split it back into two lines?

We send Address Line1 and Address Line2 to USPS for validation.

After validation, USPS returns both address lines concatenated into Address Line1.

For instance:

AddressLine1: 20 ROOSEVELT AVE
AddressLine2: apt# 22

After validation it concatenates both the address lines:

AddressLine1: 20 Roosevelt Ave Apt# 209
AddressLine2: null

I want to split the validated Address Line1 back into two lines. How can I do this?


The USPS validation is reformatting the text beyond just concatenating the two lines. I don't know what kind of reformatting may be involved for different types of addresses, but in your example the only differences seem to be that the text has changed from upper to mixed case and the apartment number has changed. I have no suggestions on how to handle a change to the information itself (like the number changing), but if only the upper/lower case changes you can do something like the following:

// you specified both Java AND JavaScript; I've picked JavaScript

var originalLine1 = "...",
    originalLine2 = "...";

// somehow call USPS validation to set the following:
var validatedLine1 = "...",
    validatedLine2 = "...",
    validationPassed = true; // placeholder: set to true or false from the validation result

// now, did validation pass?
if (validationPassed) {
  // if we can match the old line 1 with the left-hand side
  // of the new line 1, and we're not going to be overwriting
  // a non-null value in the new line 2 then split the new line 1
  if (validatedLine2 === null &&
      originalLine1.toLowerCase()
        === validatedLine1.slice(0, originalLine1.length).toLowerCase()) {

    // trim() drops the space that separated the two parts
    validatedLine2 = validatedLine1.slice(originalLine1.length).trim();
    validatedLine1 = validatedLine1.slice(0, originalLine1.length);
  }
  // do something with the results
}

Having said that, what is the purpose of calling the USPS validation? If it modifies the text but otherwise passes validation, maybe you should just use the modified version, since presumably that follows USPS's addressing standards.


The reason USPS concatenated the unit information from the AddressLine2 field you submitted is that it actually belongs on AddressLine1 (according to their specifications). AddressLine2 is meant only for extraneous information that could help a mail carrier deliver the mail (see USPS Publication 28).

If you would like the secondary information (apartment, unit, etc.) split into a separate field, you would be best served by using a service that leverages official USPS data to verify and parse the address into its various components as well as the composed delivery line.

I'm a software developer for SmartyStreets, an address verification company that provides just such a service via an API. Our REST/JSON endpoint provides both the separate address components as well as the full delivery line. This would allow you to group the data in whatever way suits your business needs.


Just make sure you keep a copy of the object before sending it off for validation.

Then, when you get the validated object back, you can copy the address information from the old object into the newly returned object.

Edit

I mistakenly said to "copy" the old information, despite the fact that copying is not what you would want.

I'm not sure how involved the validation gets (e.g., whether it does more than just handle capitalization). However, if we assume that every word from the original addresses is mapped to a word in the new address, then a simple idea would be to copy the addresses word by word.

In your example, the original AddressLine1 has three words in it. So you can read three words from the new AddressLine1 and keep them. The remaining two words can then be copied into the new AddressLine2. This could be easily achieved by using a Scanner on the String. For the more adept, I'm sure there is a word-based regex pattern that could be used, but I'm not so good with those things.
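The word-by-word idea can be sketched in JavaScript (the language used in the earlier answer) instead of a Java Scanner. This is only a minimal sketch: `splitByWordCount` is a hypothetical helper name, and it assumes the validation never merges or splits words, which will not hold for every address.

```javascript
// Hypothetical helper: keep as many words on line 1 as the original
// line 1 had, and move any remaining words to line 2.
// Assumes validation preserves the number of words in line 1.
function splitByWordCount(originalLine1, validatedLine1) {
  const originalCount = originalLine1.trim().split(/\s+/).length;
  const validatedWords = validatedLine1.trim().split(/\s+/);

  // Nothing left over for line 2: return the validated line unchanged.
  if (validatedWords.length <= originalCount) {
    return { line1: validatedWords.join(" "), line2: null };
  }

  return {
    line1: validatedWords.slice(0, originalCount).join(" "),
    line2: validatedWords.slice(originalCount).join(" "),
  };
}

const result = splitByWordCount("20 ROOSEVELT AVE", "20 Roosevelt Ave Apt# 209");
console.log(result.line1); // "20 Roosevelt Ave"
console.log(result.line2); // "Apt# 209"
```

Splitting on whitespace rather than comparing characters means a change in case or in the apartment number does not affect where the line is cut.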


I would search for the street suffix (AVE in this case) to split() it back into two lines. It will not be perfect, but it would be about as correct as possible given the requirements.

You can get the list of recognized USPS street suffixes here:

http://www.usps.com/ncsc/lookups/abbr_suffix.txt

Note that this list also covers abbreviations, and since it is their list, they are likely using it during validation (there is a high likelihood the returned address will be changed to one of these standard formats).

I don't like the idea of word counts. I've lived on several streets that would cause a problem, for example Meadowcrest Dr vs. Meadow Crest Dr. I think this is a perfect example of the kind of change the validation routine would make.

Once you have that, I think it's pretty simple from there. Let me know if you need more info on the idea.
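A rough JavaScript sketch of the suffix-based split follows. `STREET_SUFFIXES` and `splitAtStreetSuffix` are hypothetical names, and the array is only a small illustrative subset; in practice it would be loaded from the full USPS suffix list linked above.

```javascript
// Illustrative subset only; the real USPS suffix list is much longer.
const STREET_SUFFIXES = ["AVE", "AVENUE", "BLVD", "DR", "DRIVE", "LN", "RD", "ST", "STREET"];

// Hypothetical helper: cut the validated line after the last word
// that looks like a street suffix.
function splitAtStreetSuffix(validatedLine1) {
  const words = validatedLine1.trim().split(/\s+/);

  // Search from the right so "20 St Marks Ave Apt 3" is cut after
  // "Ave" rather than after "St".
  for (let i = words.length - 1; i >= 0; i--) {
    const word = words[i].replace(/\.$/, "").toUpperCase();
    if (STREET_SUFFIXES.includes(word)) {
      const rest = words.slice(i + 1).join(" ");
      return {
        line1: words.slice(0, i + 1).join(" "),
        line2: rest === "" ? null : rest,
      };
    }
  }

  // No suffix recognized: leave the address on one line.
  return { line1: words.join(" "), line2: null };
}

const parts = splitAtStreetSuffix("20 Roosevelt Ave Apt# 209");
console.log(parts.line1); // "20 Roosevelt Ave"
console.log(parts.line2); // "Apt# 209"
```

This does not depend on the original input at all, so it is unaffected by changes such as Meadowcrest becoming Meadow Crest. It can still cut in the wrong place if the secondary information itself contains a word that matches a suffix.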
