Regex to read fixed width numeric fields
I would like regex(es) that can parse right-justified numeric values in a fixed length field with optional leading whitespace. (This is essentially FORTRAN output, but there are many other tools that do this.) I know the width of the field.
Assume the field is an integer of width 5 (I5). Then the following are all conformant numeric values:
"  123"
"12345"
"-1234"
"   -1"
I can make no assumption about the previous and following fields. Thus the following is valid for I3,I5,I2:
"-121234512"
and yields the values -12, 12345 and 12.
There should be no additional code associated with the regex. I am using Java regex but I would like this to be fairly general (at least conformant with C#).
If this can be done for integers, I would also like the regex(es) for real numbers which include a decimal point, e.g. F10.3:
"   -12.123"
The regex:
(?=[ ]*-?\d+)[ \d-]{5}
matches all of your examples:
"  123"
"12345"
"-1234"
"   -1"
And chaining them in groups:
((?=[ ]*-?\d+)[ \d-]{3})((?=[ ]*-?\d+)[ \d-]{5})((?=[ ]*-?\d+)[ \d-]{2})
on the input:
-121234512
matches:
$1 = -12
$2 = 12345
$3 = 12
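For what it's worth, here is a rough Java sketch of how the chained pattern could be applied to that record. The class name `FixedWidthInts` and the method `parse` are made up for illustration, and the character class is written with the hyphen last (`[ \d-]`), which I am assuming is the safest spelling across Java and C#. The groups still carry their leading padding, so they are trimmed before being converted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedWidthInts {
    // One lookahead-guarded group per field: I3, I5, I2.
    // The hyphen is placed last in the class so it is a literal character.
    private static final Pattern FIELDS = Pattern.compile(
        "((?=[ ]*-?\\d+)[ \\d-]{3})"
        + "((?=[ ]*-?\\d+)[ \\d-]{5})"
        + "((?=[ ]*-?\\d+)[ \\d-]{2})");

    // Returns the three field values, or null if the record does not match.
    public static int[] parse(String record) {
        Matcher m = FIELDS.matcher(record);
        if (!m.matches()) {
            return null;
        }
        int[] values = new int[m.groupCount()];
        for (int i = 0; i < values.length; i++) {
            // trim() drops the leading padding before the numeric conversion
            values[i] = Integer.parseInt(m.group(i + 1).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        int[] v = parse("-121234512");
        System.out.println(v[0] + ", " + v[1] + ", " + v[2]); // -12, 12345, 12
    }
}
```

The regex only splits and validates the fields; turning each group into a number is still a separate step in whatever language hosts the pattern.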
A short explanation:
(?= # start positive look ahead
[ ]* # zero or more space
-? # an optional minus sign
\d+ # one or more digits
) # end positive look ahead
[ \d-]{5} # spaces, digits or a minus sign (hyphen last, so it is a literal), exactly 5 times
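As you can see, the lookahead forces the order of the characters at the start of the field (spaces before digits and/or minus sign, minus sign before digits). Note that the lookahead is not limited to the field width, so it does not check the whole field: a value with an embedded space such as "12 34" would still match.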
And a version for your float example might look like:
(?=[ ]*-?\d+(\.\d+)?)[ \d.-]{10}
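A minimal Java sketch of the float version, assuming the field has already been isolated as a 10-character string. The names `FixedWidthReal` and `parse` are hypothetical, and the hyphen is again written last in the character class as an assumption for portability.

```java
import java.util.regex.Pattern;

public class FixedWidthReal {
    // F10.3-style field: optional padding, optional minus sign,
    // digits with an optional fractional part, exactly 10 characters wide.
    private static final Pattern F10 = Pattern.compile(
        "(?=[ ]*-?\\d+(\\.\\d+)?)[ \\d.-]{10}");

    // Returns the parsed value, or null if the field does not match the pattern.
    public static Double parse(String field) {
        if (!F10.matcher(field).matches()) {
            return null;
        }
        // trim() removes the leading padding before the numeric conversion
        return Double.parseDouble(field.trim());
    }

    public static void main(String[] args) {
        System.out.println(parse("   -12.123"));
    }
}
```

The pattern is fairly permissive: a field such as "   1.2.3.4" also matches it, and `Double.parseDouble` would then throw a `NumberFormatException`, so stricter input may need a tighter expression.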
You can use the regex:
^(?= *-?[0-9]*$).{5}
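Because this pattern is anchored with ^ and $, it checks one isolated field rather than a field embedded in a longer record. A small Java sketch of that use, with the made-up names `SingleField` and `isValid`:

```java
import java.util.regex.Pattern;

public class SingleField {
    // Anchored I5 check: the lookahead requires spaces, then an optional
    // minus sign, then digits up to the end of the input.
    private static final Pattern I5 = Pattern.compile("^(?= *-?[0-9]*$).{5}");

    // True if the whole string is a well-formed 5-character integer field.
    public static boolean isValid(String field) {
        return I5.matcher(field).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"  123", "12345", "-1234", "   -1", "12 34"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

Since the lookahead runs to the end of the input, an embedded space as in "12 34" is rejected here. On the other hand, `[0-9]*` allows zero digits, so an all-blank field is accepted too.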