How to reference captures in bash regex replacement
How can I include the regex match in the replacement expression in BASH?
Non-working example:
#!/bin/bash
name=joshua
echo ${name//[oa]/X\1}
I expect to output jXoshuXa
with \1
being replaced by the matched character.
This doesn't actually work though and outputs jX1shuX1
instead.
Perhaps not as intuitive as sed and arguably quite obscure, but in the spirit of completeness: Bash will probably never support capture variables in the replacement (at least not in the usual fashion, since parentheses are already used for extended pattern matching). It is still possible to capture a pattern when testing with the binary operator =~, which produces an array of matches called BASH_REMATCH.
This makes the following example possible:
#!/bin/bash
name='joshua'
[[ $name =~ ([ao].*)([oa]) ]] && \
echo ${name/$BASH_REMATCH/X${BASH_REMATCH[1]}X${BASH_REMATCH[2]}}
The conditional match of the regular expression ([ao].*)([oa]) captures the following values in $BASH_REMATCH:
$ echo ${BASH_REMATCH[*]}
oshua oshu a
If the match succeeds, the ${parameter/pattern/string} expansion searches the parameter (value joshua) for the pattern oshua and replaces it with Xoshu followed by Xa. However, this only works for our example string because we know what to expect.
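To see why this depends on knowing the input, here is a small sketch that wraps the same single-match idea in a function. The name mark_first_last and the second input, banana, are made up for illustration. Only the first and last vowel of the matched span get an X; anything in between is left alone.

```bash
#!/bin/bash
# mark_first_last is a hypothetical helper, not part of the original answer.
mark_first_last() {
    local s=$1
    if [[ $s =~ ([ao].*)([oa]) ]]; then
        # Quoting the pattern makes bash treat the matched text literally
        # instead of as a glob pattern.
        s=${s/"$BASH_REMATCH"/X${BASH_REMATCH[1]}X${BASH_REMATCH[2]}}
    fi
    printf '%s\n' "$s"
}

mark_first_last joshua   # prints jXoshuXa
mark_first_last banana   # prints bXananXa (the middle a is untouched)
```

A string with fewer than two such vowels does not match at all and is printed unchanged.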
For something that behaves more like a global (match-all) regex replacement, the following example greedily matches any o or a that is not yet preceded by X and inserts an X before it, working from back to front.
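#!/bin/bash
name='joshua'
# Greedy .* finds the last o or a that is not already preceded by X.
while [[ $name =~ .*[^X]([oa]) ]]; do
    # Rebuild the matched text with X inserted before the captured character.
    name=${name/$BASH_REMATCH/${BASH_REMATCH:0:-1}X${BASH_REMATCH[1]}}
done
echo "$name"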
The first iteration changes $name to joshuXa and the second changes it to jXoshuXa; then the condition fails and the loop terminates. This example works similarly to the lookbehind expression /(?<!X)([oa])/X\1/, which only touches o or a characters that are not already prefixed with an X.
The output for both examples:
jXoshuXa
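Two caveats about the back-to-front loop, as far as I can tell: [^X] needs a character in front of each match, so an o or a at the very start of the string is never prefixed, and the negative length in ${BASH_REMATCH:0:-1} needs bash 4.2 or newer. As a sketch of an alternative, the function below (the name prefix_vowels is made up) walks the string left to right and builds the result from the capture groups, still without launching a sub-process. Unlike the loop, it does not skip characters that already follow an X.

```bash
#!/bin/bash
# prefix_vowels is a hypothetical helper name used only for this sketch.
prefix_vowels() {
    local rest=$1 out=''
    # Split into: text before the next o/a, the o/a itself, and the remainder.
    while [[ $rest =~ ^([^oa]*)([oa])(.*)$ ]]; do
        out+="${BASH_REMATCH[1]}X${BASH_REMATCH[2]}"
        rest=${BASH_REMATCH[3]}
    done
    # Whatever is left contains no o or a.
    printf '%s\n' "$out$rest"
}

prefix_vowels joshua   # prints jXoshuXa
prefix_vowels adam     # prints XadXam
```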
nJoy!
bash> name=joshua
bash> echo $name | sed 's/\([oa]\)/X\1/g'
jXoshuXa
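Because the replacement only reuses the whole match, the capture group is optional: sed accepts & for the matched text. Bash 5.2 and later do the same inside ${parameter//pattern/string} when the patsub_replacement shell option is on (it is on by default in 5.2, to my knowledge). The sketch below uses the bash form when that option exists and falls back to sed otherwise; the function name x_prefix is made up.

```bash
#!/bin/bash
# x_prefix is a hypothetical helper name used only for this sketch.
x_prefix() {
    local s=$1
    if shopt -q patsub_replacement 2>/dev/null; then
        # bash 5.2+: an unquoted & in the replacement is the matched text.
        printf '%s\n' "${s//[oa]/X&}"
    else
        # Older bash: let sed do it; & is the whole match there as well.
        printf '%s\n' "$s" | sed 's/[oa]/X&/g'
    fi
}

x_prefix joshua   # prints jXoshuXa
```

Note that & only covers the whole match; numbered groups such as \1 are still not available in bash parameter expansion.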
The question "bash string substitution: reference matched subexpressions" was marked a duplicate of this one, in spite of its requirement that:
"The code runs in a long loop, it should be a one-liner that does not launch sub-processes."
So the answer is:
If you really cannot afford launching sed in a subprocess, do not use bash! Use perl instead: its read-update-output loop will be several times faster, and the difference in syntax is small. (Well, you must not forget semicolons.)
I switched to perl, and there was only one gotcha: Unicode support was not available on one of the computers, so I had to reinstall packages.
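For illustration, the same substitution as a perl filter might look like the sketch below. The wrapper name x_prefix_perl is made up, and it assumes perl is on the PATH. The point is that one perl process handles every input line, instead of one sed process per loop iteration.

```bash
#!/bin/bash
# x_prefix_perl is a hypothetical wrapper; it assumes perl is installed.
x_prefix_perl() {
    # -p reads each line of stdin, applies the substitution, and prints it.
    perl -pe 's/([oa])/X$1/g'
}

printf '%s\n' joshua noah | x_prefix_perl
# prints:
# jXoshuXa
# nXoXah
```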