Bash regex finding and replacing
I don't know if this is possible, but can you dynamically alter a find/replace? Basically I have something like this
<3 digit number> <data>
and what I want to do is if data matches the pattern
<word>:<4 digit number>
replace all instances (in the entire file) of <word>:
with that line's 3-digit number, i.e.:
020 Word
021 Word:0001
Replace with
020 021
021 0210001
Is this doable with AWK or Sed? If not, is it doable in C?
I know this isn't what you asked, but I think the best way to solve this is with a simple Perl script.
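#!/usr/bin/perl
use strict;
use warnings;
my $in  = "input.txt";
my $out = "output.txt";
# Buffer the whole file for replacing:
open(my $infile, '<', $in) or die "Cannot open $in: $!";
my @lines = <$infile>;
close($infile);
# Iterate through an unmodified copy of each line:
my @orig = @lines;
foreach (@orig) {
    # If the line matches "word:number", replace all instances in the file
    if (/^(\d{3}) (\w+):\d{4}$/) {
        my ($num, $word) = ($1, $2);
        # The colon is optional so that a bare "Word" is replaced too
        s/\b\Q$word\E\b:?/$num/g foreach @lines;
    }
}
# Open the output file for writing:
open(my $outfile, '>', $out) or die "Cannot open $out: $!";
print $outfile $_ foreach @lines;
close($outfile);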
It looks a lot longer than it really needs to be, because I made it nice and easy-to-read for you.
I hope I understood you correctly this time.
Try the following:
#file name:t
kent$ cat t
020 Word
021 Word:0001
#first we find out the replacement, 021 in this case:
kent$ v=$(grep -oP "(\d{3})(?= Word:\d{4})" t|head -n1)
#do replace by sed:
kent$ sed -r "s/Word[:]?/$v/g" t
020 021
021 0210001
# $word (the word to look for) and $file (the input file) must be set beforehand
number=$(gawk --posix '/[0-9]{3} '"${word}"':[0-9]{4}/ { print $1; exit }' "$file")
if [ -n "$number" ]; then
    sed -r "s/${word}:?/${number}/g" "$file"
fi
This awk solution takes 2 passes through your file: once to find all the Words needing replacement, and once to actually do the replacing:
gawk '
    NR == FNR {
        if (match($2, /^([^:]+):[0-9][0-9][0-9][0-9]$/, a))
            repl[a[1] ":?"] = $1
        next
    }
    {
        for (word in repl)
            if ($2 ~ word) {
                sub(word, repl[word], $2)
                break
            }
        print
    }
' filename filename > new.file
This requires gawk for the capturing parentheses (the three-argument form of match() is a gawk extension).
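If gawk is not available, the same two-pass idea can probably be done in portable awk by splitting the data field on the colon instead of using capturing parentheses. The following is only a minimal sketch, not the answer's own code: the function name replace_words is made up for illustration, it assumes whitespace-separated lines with the data in the second field, and it compares the word exactly rather than treating it as a regex.

```shell
# replace_words FILE
# Hypothetical helper (name chosen for illustration only).
# Assumes each line looks like "<3 digit number> <data>".
replace_words() {
    # The file is named twice so that awk reads it in two passes.
    awk '
    # Pass 1: for every "<word>:<4 digits>" line, remember word -> number.
    NR == FNR {
        if ($2 ~ /^[^:]+:[0-9][0-9][0-9][0-9]$/) {
            split($2, parts, ":")
            repl[parts[1]] = $1
        }
        next
    }
    # Pass 2: if the word in the data field is in the table, replace
    # "<word>" or "<word>:" with the remembered number.
    {
        n = split($2, parts, ":")
        if (parts[1] in repl)
            $2 = repl[parts[1]] (n > 1 ? parts[2] : "")
        print
    }
    ' "$1" "$1"
}
```

It would be called as replace_words filename > new.file. Because the word is looked up as an array key, it does not need any regex escaping, at the cost of only handling data fields that start with the word.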
Here's another sed solution:
# sweep the file and make a lookup table variable
lookup=$(sed -nr 's/(.*) (.*:).*/\2\1/p' <source_file |tr '\n' ' ')
# append the lookup to each line and substitute using a backreference
# N.B. the appended lookup is always stripped at the end, whether or not a line matched
sed -r "s/\$/@@${lookup}/;
s/^(... )(.*)@@.*\2:(\S*).*/\1\3/;
s/^(... )(.*:)(.*)@@.*\2(\S*).*/\1\4\3/;
s/@@.*//" <source_file