Using sed command for range of numbers
I have a CSV file with cities and numbers:
New York , 23456
chicago, 123,456,789,889981(2-6)
phoenix 123,76(0-3)
Some numbers in the file are ranges, and I want to replace each range with the individual numbers. For example, I want to change 889981(2-6) to 8899812,8899813,8899814,8899815,8899816 and insert the result in the same line. Can I do this in sed? It needs to scan the entire file and do the replacement.
sed is not very good with arithmetic; I suppose it is not impossible, but also not very simple. My recommendation would be to use a proper scripting language, such as awk, perl, or python (if you are not familiar with any of them, perhaps Python; if you want the smallest possible memory footprint, use awk; if you already know Perl, by all means, use Perl).
perl -pe 's/(\d+)\((\d+)-(\d+)\)$/ join (",",
(join ("", $1, $2) .. join ("", $1, $3))) /ge' file
No, this is beyond what you can do with just a regular expression. You will need to add something more powerful, like perl, python or awk, or whatever you feel most at home with.
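For instance, here is a minimal Python sketch of that approach. It assumes every range looks like digits(start-end), as in the sample from the question; the names RANGE_RE and expand_ranges are made up for illustration.

```python
import re

# Hypothetical names; the pattern assumes ranges shaped like 889981(2-6).
RANGE_RE = re.compile(r"(\d+)\((\d+)-(\d+)\)")

def expand_ranges(line):
    """Replace each prefix(start-end) with prefix+start,...,prefix+end."""
    def expand(match):
        prefix = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        # Append each value to the prefix as text, joined by commas.
        return ",".join(f"{prefix}{n}" for n in range(start, end + 1))
    return RANGE_RE.sub(expand, line)

# Sample lines taken from the question.
sample = [
    "New York , 23456",
    "chicago, 123,456,789,889981(2-6)",
    "phoenix 123,76(0-3)",
]
for line in sample:
    print(expand_ranges(line))
```

Lines without a range pass through unchanged, so the same function can be applied to every line of the file.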
Requires gawk for the 3-argument match() function:
gawk '
BEGIN {OFS = FS = ","}
match($NF, /([0-9]+)\(([0-9]+)-([0-9]+)\)/, ary) {
    NF--
    for (n=ary[2]; n <= ary[3]; n++) {
        $(NF+1) = 10 * ary[1] + n
    }
}
{print}
'
I assume (based on the sample) that the range only occurs in the last comma-separated field.
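The expression 10 * ary[1] + n appends the range value arithmetically, which matches plain concatenation only while the bounds are single digits, as they are in the sample. A small Python check of that arithmetic follows. The function names are hypothetical, and the question does not say which result is wanted for multi-digit bounds.

```python
def expand_arithmetic(prefix, start, end):
    # Mirrors 10 * ary[1] + n from the gawk script above.
    return [10 * int(prefix) + n for n in range(start, end + 1)]

def expand_concat(prefix, start, end):
    # Appends the range value to the prefix as text instead.
    return [f"{prefix}{n}" for n in range(start, end + 1)]

print(expand_arithmetic("76", 0, 3))   # [760, 761, 762, 763]
print(expand_concat("76", 0, 3))       # ['760', '761', '762', '763']
print(expand_arithmetic("76", 9, 11))  # [769, 770, 771]
print(expand_concat("76", 9, 11))      # ['769', '7610', '7611']
```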
Solution using awk (@glenn jackman will probably post something that does this in less than 5 lines):
# join.awk --- join an array into a string
function join(array, start, end, sep,    result, i)
{
    if (sep == "")
        sep = " "
    else if (sep == SUBSEP) # magic value
        sep = ""
    result = array[start]
    for (i = start + 1; i <= end; i++)
        result = result sep array[i]
    return result
}
# expand e.g. "889981(2-6)" into "8899812,8899813,...,8899816"
# a, b, c, n and k are local variables
function range(input,    a, b, c, n, k) {
    split(input, a, "[()]")
    # a[1] is the start value, a[2] is the "start-stop" range
    split(a[2], b, "-")
    # b[1] is start of range, b[2] is stop of range
    n = 0
    # build each number by appending the range value to the start value
    for (k = b[1] + 0; k <= b[2] + 0; k++)
        c[++n] = a[1] k
    return join(c, 1, n, ",")
}
BEGIN { FS = OFS = "," }
# a line containing a range: expand it in place
/\([0-9]+-[0-9]+\)/ {
    for (i = 1; i <= NF; i++)
        if ($i ~ /\([0-9]+-[0-9]+\)/)
            $i = range($i)
}
{ print }
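This might work for you (GNU sed only; the e flag runs the substituted line through the shell, which must support brace expansion, e.g. bash):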
sed 's/^\(.*\)\b\([0-9]\+\)(\([0-9]\)-\([0-9]\))/echo "\1" {\2\3..\2\4}/e;s/\([0-9]\),\? \([0-9]\)/\1,\2/g' file
New York , 23456
chicago, 123,456,789,8899812,8899813,8899814,8899815,8899816
phoenix 123,760,761,762,763