
Using regex to add leading zeroes

I would like to add a certain number of leading zeroes (say up to 3) to all numbers of a string. For example:

Input: /2009/5/song 01 of 12

Output: /2009/0005/song 0001 of 0012

What's the best way to do this with regular expressions?

Edit:

I picked the first correct answer. However, all answers are worth giving a read.


In Perl:

s/([0-9]+)/sprintf('%04d',$1)/ge;


Use something that supports a callback so you can process the match:

>>> import re
>>> r=re.compile(r'(?:^|(?<=[^0-9]))([0-9]{1,3})(?=$|[^0-9])')
>>> # count is omitted: by default re.sub replaces every match (sys.maxint is gone in Python 3)
>>> r.sub(lambda x: '%04d' % (int(x.group(1)),), 'dfbg345gf345')
'dfbg0345gf0345'
>>> r.sub(lambda x: '%04d' % (int(x.group(1)),), '1x11x111x')
'0001x0011x0111x'
>>> r.sub(lambda x: '%04d' % (int(x.group(1)),), 'x1x11x111x')
'x0001x0011x0111x'
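
The same callback idea, wrapped in a small Python 3 function. This is only a sketch: the name pad_numbers and the width parameter are made up for illustration, and it assumes width is at least 2.

```python
import re

def pad_numbers(text, width=4):
    """Zero-pad every run of fewer than `width` ASCII digits (assumes width >= 2)."""
    # The lookarounds stop the pattern from matching inside a longer run,
    # so "2009" or "12345" are left alone when width is 4.
    pattern = re.compile(r"(?<![0-9])[0-9]{1,%d}(?![0-9])" % (width - 1))
    # str.zfill pads on the left with zeros up to the requested width.
    return pattern.sub(lambda m: m.group().zfill(width), text)

print(pad_numbers("/2009/5/song 01 of 12"))  # /2009/0005/song 0001 of 0012
```

Using zfill instead of int() plus '%04d' avoids converting the match to a number at all.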


A sample:

>>> re.sub(r"(?<!\d)0*(\d{1,3})(?!\d)", r"000\1", "/2009/5/song 01 of 3")
'/2009/0005/song 0001 of 0003'

Note:

  • It only works for single-digit values (1 - 9) for now, because the replacement always prepends exactly three zeroes
  • It is not well tested yet

I can't think of a single regex that does this without callbacks for now (there might be a way to do it).

Here are two regular expressions that handle it:

>>> x = "1/2009/5/song 01 of 3 10 100 010 120 1200 abcd"
>>>
>>> x = re.sub(r"(?<!\d)0*(\d{1,3})(?!\d)", r"000\1", x)  # prepend three zeroes
>>> x
'0001/2009/0005/song 0001 of 0003 00010 000100 00010 000120 1200 abcd'
>>>
>>> re.sub(r"0+(\d{4})(?!\d)", r"\1", x)  # strip extra leading zeroes
'0001/2009/0005/song 0001 of 0003 0010 0100 0010 0120 1200 abcd'
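
The two passes can be generalized to any width. A rough sketch (pad_two_pass and width are hypothetical names): every number ends up with at least width digits and no surplus leading zeroes.

```python
import re

def pad_two_pass(text, width=4):
    # Pass 1: put width-1 zeroes in front of every run of digits.
    padded = re.sub(r"(?<!\d)(\d+)", "0" * (width - 1) + r"\g<1>", text)
    # Pass 2: strip leading zeroes again, but always keep at least `width` digits.
    return re.sub(r"(?<!\d)0+(\d{%d,})(?!\d)" % width, r"\g<1>", padded)

print(pad_two_pass("1/2009/5/song 01 of 3 10 100 010 120 1200 abcd"))
# 0001/2009/0005/song 0001 of 0003 0010 0100 0010 0120 1200 abcd
```

The \g<1> form is used in the replacement so that the group reference cannot be confused with the zeroes next to it.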


Using C#:

string result = Regex.Replace(input, @"\d+", me =>
{
    return int.Parse(me.Value).ToString("0000");
});


Another approach:

>>> x
'/2009/5/song 01 of 12'
>>> ''.join([i.zfill(4) if i.isdigit() else i for i in re.split(r"(?<!\d)(\d+)(?!\d)",x)])
'/2009/0005/song 0001 of 0012'
>>>

Or:

>>> x
'/2009/5/song 01 of 12'
>>> r=re.split(r"(?<!\d)(\d+)(?!\d)",x)
>>> ''.join(a+b.zfill(4) for a,b in zip(r[::2],r[1::2]))
'/2009/0005/song 0001 of 0012'


If your regular expression implementation does not support look-behind and/or look-ahead assertions, you can also use this regular expression:

(^|\D)(\d{1,3})(\D|$)

And replace the match with $1 + padLeft($2, 4, "0") + $3 where $1 is the match of the first group and padLeft(str, length, padding) is a function that prefixes str with padding until the length length is reached.
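
A sketch of this in Python, with pad_left standing in for the hypothetical padLeft above (both function names are illustrative). Because the pattern consumes the non-digit on each side, two short numbers separated by a single character (as in 1x11) share that character and the second one is skipped; the look-around versions do not have this problem.

```python
import re

def pad_left(s, length, padding):
    # Prefix s with `padding` (assumed to be one character) until it is `length` long.
    return padding * max(0, length - len(s)) + s

def pad_numbers_no_lookaround(text):
    # Group 1 and group 3 are the neighbouring non-digits (or the string edges);
    # group 2 is the number itself, 1 to 3 digits long.
    return re.sub(
        r"(^|\D)(\d{1,3})(\D|$)",
        lambda m: m.group(1) + pad_left(m.group(2), 4, "0") + m.group(3),
        text,
    )

print(pad_numbers_no_lookaround("/2009/5/song 01 of 12"))  # /2009/0005/song 0001 of 0012
print(pad_numbers_no_lookaround("1x11"))                   # 0001x11 (second number missed)
```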


<warning> This assumes academic interest; of course, you should use callbacks to do it clearly and correctly. </warning>

I'm able to abuse regular expressions to have two leading zeros (.NET flavor):

s = Regex.Replace(s, @".(?=\b\d\b)|(?=\b\d{1,2}\b)", "$&0");

It doesn't work if there's a number at the beginning of the string. It works by matching either the character before a single-digit number or the zero-width position before a one- or two-digit number, and appending a 0 to whatever was matched ($&0).

I had no luck expanding it to three leading zeros, and certainly not more.


The principle: use two replaces. The first adds zeros in front of the number; the second keeps only the last x places. This is how I solved the same problem in SQL.

The example: REGEXP_REPLACE(REGEXP_REPLACE(version,'.([0-9][.][0-9][.][0-9])..','\1.00000\2'),'([0-9][.][0-9][.][0-9][.]).*(.....$)','\1\2'),'.','')

This code turns the value 1.1.1.1 into 1.1.1.00001.
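
The same add-then-cut idea without regular expressions, as a Python sketch for the version example. pad_last_component is a made-up name, and the sketch assumes only the last dot-separated component needs padding.

```python
def pad_last_component(version, width=5):
    # Split off the last dot-separated component ("1.1.1.1" -> "1.1.1", ".", "1").
    head, sep, last = version.rpartition(".")
    # Step 1: add `width` zeros in front. Step 2: keep only the last `width` characters.
    # Caution: a component longer than `width` would be truncated by step 2.
    padded = ("0" * width + last)[-width:]
    return head + sep + padded

print(pad_last_component("1.1.1.1"))  # 1.1.1.00001
```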


Here is a Perl solution without callbacks or recursion. It does use Perl's e modifier, which evaluates the replacement as code instead of treating it as a plain substitution string, but this is easily adapted to other languages that lack that construct.

#!/usr/bin/perl

while (<DATA>) {
   chomp;
   print "string:\t\t\t$_\n";
# uncomment if you care about 0000000 case:
#   s/(^|[^\d])0+([\d])/\1\2/g;
#   print "now no leading zeros:\t$_\n";    
   s/(^|[^\d]{1,3})([\d]{1,3})($|[^\d]{1,3})/sprintf "%s%04i%s",$1,$i=$2,$3/ge;
   print "up to 3 leading zeros:\t$_\n";
}
print "\n";

__DATA__
/2009/5/song 01 of 12
/2010/10/song 50 of 99
/99/0/song 1 of 1000
1
01
001
0001
/001/
"02"
0000000000

Output:

string:                /2009/5/song 01 of 12
up to 3 leading zeros:  /2009/0005/song 0001 of 0012
string:                /2010/10/song 50 of 99
up to 3 leading zeros:  /2010/0010/song 0050 of 0099
string:                /99/0/song 1 of 1000
up to 3 leading zeros:  /0099/0/song 0001 of 1000
string:                1
up to 3 leading zeros:  0001
string:                01
up to 3 leading zeros:  0001
string:                001
up to 3 leading zeros:  0001
string:                0001
up to 3 leading zeros:  0001
string:                /001/
up to 3 leading zeros:  /0001/
string:                "02"
up to 3 leading zeros:  "0002"
string:                0000000000
up to 3 leading zeros:  0000000000


Combined in Xcode:

targetName=[NSString stringWithFormat:@"%05d",number];

Gives 00123 for number 123


A valid Scala program that pads every group of 1 to 3 digits to 4 digits. $$ escapes the end-of-line anchor $, because we are using StringContext (a string prefixed by s).

  (f/:(1 to 3)){case (res,i) =>
     res.replaceAll(s"""(?<=[^\\d]|^)(\\d{$i})(?=[^\\d]|$$)""", "0"*(4-i)+"$1")
  }
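
The same callback-free trick translated to Python, as a sketch: one plain substitution per run length, each with a fixed number of zeroes. pad_without_callback is a hypothetical name.

```python
import re

def pad_without_callback(text, width=4):
    for n in range(1, width):
        # Match runs of exactly n digits that are not part of a longer run.
        pattern = r"(?<!\d)(\d{%d})(?!\d)" % n
        # A run of n digits needs width - n zeroes; no callback required.
        text = re.sub(pattern, "0" * (width - n) + r"\g<1>", text)
    return text

print(pad_without_callback("/2009/5/song 01 of 12"))  # /2009/0005/song 0001 of 0012
```

Runs that have already been padded are width digits long, so later iterations leave them alone.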


C# version

        string input = "/2009/5/song 01 of 12";
        string regExPattern = @"(\/\d{4}\/)(\d+)(\/song\s+)(\d+)(\s+of\s+)(\d+)";
        string output = Regex.Replace(input, regExPattern, callback =>
        {
            string yearPrefix = callback.Groups[1].Value;
            string digit1 = int.Parse(callback.Groups[2].Value).ToString("0000");
            string songText = callback.Groups[3].Value;
            string digit2 = int.Parse(callback.Groups[4].Value).ToString("0000");
            string ofText = callback.Groups[5].Value;
            string digit3 = int.Parse(callback.Groups[6].Value).ToString("0000");
            return $"{yearPrefix}{digit1}{songText}{digit2}{ofText}{digit3}";
        });


In case anyone is interested in how to do this in R, the package stringr is helpful:

library(stringr)
input<-"/2009/5/song 01 of 12"
str_replace_all(string = input,
                pattern="((?<![0-9])[0-9]*([0-9]{1,3}))",
                replacement=function(x){str_pad(x,width=4,side="left",pad="0")})

"/2009/0005/song 0001 of 0012"

See: https://evoldyn.gitlab.io/evomics-2018/ref-sheets/R_strings.pdf
