Sed problem creating correct regular expression
A pretty basic question, but one that I can't seem to find an answer to on Stack Overflow or elsewhere online that actually solves the issue.
I'm writing a simple bash script to batch process a bunch of files. The script is fed by a directory listing and then processes the files individually. Unfortunately, the format of each filename may vary, and that's where I'm running into trouble. Below is a sample of the filenames I am working with, along with the script.
P.S. I'm sure there is some way to do this with awk as well (or any number of Unix tools), but for now I'm focusing on sed.
Thanks in advance:
Files:
/home/acct/Foo-Bar.fl
/home/acct/Foo-1.1.fl
/home/acct/Cat-3.4-500.fl
/home/acct/DOG-BEAR-4.4-1.1.fl
/home/acct/DOG-BEAR-4.4-UPDATED.fl
I'm trying to extract the full path, filename, version number, and file prefix from each of these lines. Below is my latest attempt:
DIR_PATH="/home/acct/"
for i in `find ${DIR_PATH}`;
do
FILEPATH="$i"
FILENAME=`echo $i | sed -e "s#${DIR_PATH}##g"`
FILEPREFIX=`echo $FILENAME | sed -e "s/\(.*\)-[0-9]\+.*/\1/g"`
FILEVERSION=`echo $FILENAME | sed -e "s/.*-\([0-9]\+.*\)\.fl/\1/g"`
echo "$DIR_PATH"
echo "$FILEPATH"
echo "$FILENAME"
echo "$FILEPREFIX"
echo "$FILEVERSION"
#do something with this file now that I know what is going on with it
done
Trouble comes into play when dealing with version numbers separated by dashes and with files that have no version number. I think I've resolved all the issues with complex version numbers, but I am still struggling with the cases where no version number exists at all.
I figure I need some sort of either/or expression (or a second sed statement to do another pass), but I am not really sure how to format it.
UPDATE:
Per Axel's comment, determining the filename can be made much easier by using basename instead of trying to match the path. Also, an answer down below involved splitting the filename from the extension, also a change that I think would be worth incorporating.
I would update the script with these changes to look something like this:
FILEPATH="$i"
FILENAME=`basename $i`
FILENAMENOSUFFIX=`echo $FILENAME | sed -e "s/\(.*\)\..*/\1/g"`
FILEPREFIX=`echo $FILENAME | sed -e "s/\(.*\)-[0-9]\+.*/\1/g"`
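As a side note on the extension split: basename accepts an optional suffix argument, so the directory and a known extension can be dropped in one call. A minimal sketch, assuming every file ends in .fl as in the samples (strip_name is a hypothetical helper name, not part of the script above):

```shell
# strip_name: hypothetical helper; assumes the extension is always ".fl".
strip_name() {
    # The optional second argument to basename removes that trailing suffix.
    basename "$1" .fl
}

strip_name "/home/acct/DOG-BEAR-4.4-1.1.fl"   # prints DOG-BEAR-4.4-1.1
```

This only works when the extension is known in advance; for arbitrary extensions the sed substitution above is still needed.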
Pure Bash (except for find):
shopt -s extglob
while read -r file
do
dir=${file%/*}
name=${file##*/}
noext=${name/%.fl}
pre=${noext%%-@([0-9])*}
ver=${noext/#$pre-}
[[ ${#ver} == ${#noext} ]] && ver=
echo "Dir: $dir, Name: $name, Noext: $noext"
echo " Pre: $pre, Ver: $ver"
done < <(find "$DIR_PATH" -type f)
Output using your example filenames:
Dir: /home/acct, Name: Foo-Bar.fl, Noext: Foo-Bar
Pre: Foo-Bar, Ver:
Dir: /home/acct, Name: Foo-1.1.fl, Noext: Foo-1.1
Pre: Foo, Ver: 1.1
Dir: /home/acct, Name: Cat-3.4-500.fl, Noext: Cat-3.4-500
Pre: Cat, Ver: 3.4-500
Dir: /home/acct, Name: DOG-BEAR-4.4-1.1.fl, Noext: DOG-BEAR-4.4-1.1
Pre: DOG-BEAR, Ver: 4.4-1.1
Dir: /home/acct, Name: DOG-BEAR-4.4-UPDATED.fl, Noext: DOG-BEAR-4.4-UPDATED
Pre: DOG-BEAR, Ver: 4.4-UPDATED
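The extglob pattern isn't strictly required for this split. A rough POSIX sh variant of the same idea, using only the %, %% and # expansions, might look like the sketch below (split_name is a hypothetical helper, and the .fl extension is assumed from the samples):

```shell
# split_name: hypothetical helper that prints "prefix|version" for one path.
split_name() {
    name=${1##*/}            # drop the directory part
    noext=${name%.fl}        # drop the assumed ".fl" extension
    pre=${noext%%-[0-9]*}    # cut the longest suffix starting with "-<digit>"
    ver=${noext#"$pre"}      # what remains after the prefix (may be empty)
    ver=${ver#-}             # drop the separating dash, if any
    printf '%s|%s\n' "$pre" "$ver"
}

split_name "/home/acct/Cat-3.4-500.fl"   # prints Cat|3.4-500
split_name "/home/acct/Foo-Bar.fl"       # prints Foo-Bar|
```

Because the version is derived by removing the prefix, a name with no version ends up with an empty version and no extra length check.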
find . -type f | sed 's!\(.*\)/\([^/0-9]*\)-\([0-9][^/]*\)\.\([^./]*\)$!& \1 \2 \3 \4!'
This assumes each file is set up like this: {base}/{prefix}-{version-starts-with-number}.{extension}, where the prefix contains no digits.
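For names without a version, which that one-liner leaves unchanged, one option is two small sed calls, one per field. This is only a sketch: get_prefix and get_version are hypothetical helper names, and the input is assumed to be a bare name with the directory and extension already removed.

```shell
# get_prefix: hypothetical helper. sed picks the leftmost match, so this cuts
# at the first "-<digit>"; a name with no version passes through unchanged.
get_prefix() {
    printf '%s\n' "$1" | sed 's/-[0-9].*$//'
}

# get_version: hypothetical helper. The prefix is matched as dash-separated
# segments that do not start with a digit, so the capture begins at the first
# "-<digit>". With -n and the p flag, nothing is printed when there is no match.
get_version() {
    printf '%s\n' "$1" | sed -n 's/^[^-]*\(-[^0-9-][^-]*\)*-\([0-9].*\)$/\2/p'
}

get_prefix  "DOG-BEAR-4.4-1.1"   # prints DOG-BEAR
get_version "DOG-BEAR-4.4-1.1"   # prints 4.4-1.1
get_version "Foo-Bar"            # prints nothing
```

The -n/p combination is what gives the "either/or" behaviour: a matching name yields its version, and anything else yields an empty string rather than the untouched input.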