bash script regex matching
In my bash script, I have an array of filenames like
files=( "site_hello.xml" "site_test.xml" "site_live.xml" )
I need to extract the characters between the underscore and the .xml extension so that I can loop through them for use in a function.
If this were python, I might use something like
re.match("site_(.*)\.xml")
and then extract the first matched group.
Unfortunately this project needs to be in bash, so -- Ho开发者_StackOverflow社区w can I do this kind of thing in a bash script? I'm not very good with grep or sed or awk.
Something like the following should work
files2=(${files[@]#site_}) #Strip the leading site_ from each element
files3=(${files2[@]%.xml}) #Strip the trailing .xml
EDIT: After correcting those two typos, it does seem to work :)
xbraer@NO01601 ~
$ VAR=`echo "site_hello.xml" | sed -e 's/.*_\(.*\)\.xml/\1/g'`
xbraer@NO01601 ~
$ echo $VAR
hello
xbraer@NO01601 ~
$
Does this answer your question?
Just run the variables through sed in backticks (``)
I don't remember the array syntax in bash, but I guess you know that well enough yourself, if you're programming bash ;)
If it's unclear, dont hesitate to ask again. :)
I'd use cut to split the string.
for i in site_hello.xml site_test.xml site_live.xml; do echo $i | cut -d'.' -f1 | cut -d'_' -f2; done
This can also be done in awk:
for i in site_hello.xml site_test.xml site_live.xml; do echo $i | awk -F'.' '{print $1}' | awk -F'_' '{print $2}'; done
If you're using arrays, you probably should not be using bash.
A more appropriate example wold be
ls site_*.xml | sed 's/^site_//' | sed 's/\.xml$//'
This produces output consisting of the parts you wanted. Backtick or redirect as needed.
精彩评论