
How can I use regex to solve this?

I have two strings that I need to pull data out of, but I can't seem to get it working. I wish I knew regular expressions, but unfortunately I don't. I have read some beginner tutorials, but I can't find an expression that will do what I need.

From the first string, which is delimited by an equals sign, I need to skip the first 6 characters and grab the following 9 characters. After the equals sign, I need to grab the first 4 characters, which are a day and year. Lastly for this string, I need the remaining numbers, which are a date in YYYYmmdd format.

636014034657089=130719889904

The second string seems a little more difficult because the number of spaces between the blocks of data varies, although the blocks always seem to be separated by at least a single space. Sometimes there are as many as 15 or 20 spaces separating them.

Here are two different samples that show the space difference.

!!92519 C 01 M600200BLNBRN D55420090205M1O

!!95815      A               M511195BRNBRN            D62520070906  ":%/]Q2#0*&

The data that I need out of these last two strings are:

The zip code following the 2 exclamation marks.
The single letter 'M' following that. It always appears to be in a 13 character block
The 3 numbers after the single letter
The next 3 numbers which are the person's height
The following next 3 are the person's weight
The next 3 are eye color
The next block of 3 which are the person's hair color

The last block that I need data from:

I need to get the single letter, which in the example appears to be a 'D'. Skip the next 3 numbers. The last and remaining 8 numbers are a date in YYYYmmdd format.

If someone could help me resolve this, I'd be very grateful.


For the first string you can use this regular expression:

^[0-9]{6}([0-9]{9})=([0-9]{4})([0-9]{4})([0-9]{2})([0-9]{2})$

Explanation:

^          Start of string/line
[0-9]{6}   Match the first 6 digits
([0-9]{9}) Capture the next 9 digits
=          Match an equals sign
([0-9]{4}) Capture the "day and year" (what format is this in?)
([0-9]{4}) Capture the year
([0-9]{2}) Capture the month
([0-9]{2}) Capture the day
$          End of string/line
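
As a rough illustration, here is how that first expression could be applied in Python. The question does not say which language is in use, so Python is my assumption, and parse_first and the dictionary keys are names I made up for this sketch. The pattern only checks digit counts, so the captured month and day are not validated as a real date.

```python
import re

# Pattern copied unchanged from the expression above.
FIRST = re.compile(r"^[0-9]{6}([0-9]{9})=([0-9]{4})([0-9]{4})([0-9]{2})([0-9]{2})$")

def parse_first(line):
    """Return the five captured fields as a dict, or None if the line does not match.

    The key names are illustrative labels, not something the question defines.
    """
    m = FIRST.match(line)
    if m is None:
        return None
    ident, day_year, year, month, day = m.groups()
    return {
        "ident": ident,        # the 9 digits after the 6 skipped ones
        "day_year": day_year,  # first 4 digits after '='
        "year": year,
        "month": month,
        "day": day,
    }

print(parse_first("636014034657089=130719889904"))
```

With the sample string from the question, the 9-digit group comes out as 034657089 and the 4-digit group after the equals sign as 1307.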

For the second:

^!!([0-9]{5}) +.*? +M([0-9]{3})([0-9]{3})([A-Z]{3})([A-Z]{3}) +([A-Z])[0-9]{3}([0-9]{4})([0-9]{2})([0-9]{2})

Rubular

It works in a similar way to the first. You may need to adjust it slightly if your data is not exactly in the format that the regular expression expects. You might want to replace the .*? with something more precise, but I'm not sure what, because you haven't described the format of the parts you are not interested in.
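
Here is a similar sketch for the second expression, again assuming Python. parse_second and the field names are hypothetical. I have kept the labels neutral on purpose, because I have not guessed which of the question's fields (height, weight, eye color, hair color) each group corresponds to. Note that the pattern matches the 'M' literally and does not capture it, and that it has no $ at the end, so trailing characters after the date are ignored.

```python
import re

# Pattern copied unchanged from the expression above, split over two lines for readability.
SECOND = re.compile(
    r"^!!([0-9]{5}) +.*? +M([0-9]{3})([0-9]{3})([A-Z]{3})([A-Z]{3})"
    r" +([A-Z])[0-9]{3}([0-9]{4})([0-9]{2})([0-9]{2})"
)

# Illustrative labels for the nine capture groups, in order.
FIELDS = ("zip_code", "digits_a", "digits_b", "letters_a", "letters_b",
          "letter", "year", "month", "day")

def parse_second(line):
    """Return the captured groups as a dict, or None if the line does not match."""
    m = SECOND.match(line)
    if m is None:
        return None
    return dict(zip(FIELDS, m.groups()))

samples = [
    '!!92519 C 01 M600200BLNBRN D55420090205M1O',
    '!!95815      A               M511195BRNBRN            D62520070906  ":%/]Q2#0*&',
]
for s in samples:
    print(parse_second(s))
```

Both sample strings from the question match, regardless of how many spaces separate the blocks, because each " +" accepts one or more spaces.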
