How to extract substring from a text file in bash?
I have lots of strings in a text file, like this:
"/home/mossen/Desktop/jeff's project/Results/FCCY.png"
"/tmp/accept/FLWS14UU.png"
"/home/tten/Desktop/.wordi/STSMLC.png"
I want to get only the file names from the string as I read the text file line by line, using a bash shell script. The file name will always end in .png and will always have the "/" in front of it. I can get each string into a var, but what is the best way to extract the filenames (FCCY.png, FLWS14UU.png, etc.) into vars? I can't count on the user having Perl, Python, etc., just the standard Unix utils such as awk and sed.
Thanks, mossen
You want basename:
$ basename /tmp/accept/FLWS14UU.png
FLWS14UU.png
basename works on one file/string at a time. If you have many strings, you will be looping over the file and calling an external command once per line.
Use awk:
$ awk -F'[/"]' '{print $(NF-1)}' file
FCCY.png
FLWS14UU.png
STSMLC.png
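A note on why that one-liner works: with -F'[/"]' every slash and every double quote is a field separator, so the closing quote leaves an empty last field and $(NF-1) is the file name. If a line has no surrounding quotes, $(NF-1) would be the parent directory instead. Here is a minimal sketch that copes with both cases, assuming one path per line. The function name extract_names and the sample lines are made up for illustration.

```shell
# Split on "/" only, take the last field, then strip a trailing quote if present.
extract_names() {
    awk -F/ '{ name = $NF; sub(/"$/, "", name); print name }'
}

# Hypothetical sample input: one quoted path and one unquoted path.
result=$(printf '%s\n' '"/tmp/accept/FLWS14UU.png"' '/tmp/accept/FCCY.png' | extract_names)
printf '%s\n' "$result"
```

It still runs a single awk process for the whole file, so it scales the same way as the original command.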
Or use the shell:
while read -r line
do
line=${line##*/}
echo "${line%\"}"
done <"file"
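Since the question asks for the names in vars, here is a slightly more defensive sketch of the same loop. It assumes one quoted path per line; the temporary file and the count variable are only there for illustration and are not part of the original answer.

```shell
# Hypothetical sample file standing in for the real list.
list=$(mktemp)
printf '%s\n' \
    "\"/home/mossen/Desktop/jeff's project/Results/FCCY.png\"" \
    '"/tmp/accept/FLWS14UU.png"' > "$list"

count=0
# IFS= preserves leading/trailing whitespace in each line;
# the || test also processes a final line that lacks a trailing newline.
while IFS= read -r line || [ -n "$line" ]; do
    name=${line##*/}    # remove everything through the last "/"
    name=${name%\"}     # remove the closing double quote, if any
    count=$((count + 1))
    printf 'entry %d: %s\n' "$count" "$name"
done < "$list"
rm -f "$list"
```

Because the loop reads from a redirection rather than a pipe, name and count are still set in the current shell after the loop ends.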
newlist=$(for file in "${list[@]}"; do basename "${file}"; done)  # assumes list is a bash array; the quotes keep paths with spaces intact
$ var="/home/mossen/Desktop/jeff's project/Results/FCCY.png"
$ file="${var##*/}"
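For anyone unfamiliar with the syntax, here is a short sketch of the related expansions. ## removes the longest matching prefix and % removes the shortest matching suffix; the dir and stem variables are extra names added here for illustration only.

```shell
var="/home/mossen/Desktop/jeff's project/Results/FCCY.png"

file=${var##*/}     # longest "*/" prefix removed   -> FCCY.png
dir=${var%/*}       # shortest "/*" suffix removed  -> /home/mossen/Desktop/jeff's project/Results
stem=${file%.png}   # ".png" suffix removed         -> FCCY

printf '%s\n' "$file" "$dir" "$stem"
```

These are POSIX parameter expansions, so they work in sh as well as bash and start no external process.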
Calling basename in a loop carries a large performance cost. It is small and unnoticeable when you're doing it on a file or two, but it adds up over hundreds of them. Here are some timing tests that show why calling basename (or any external utility) is a bad idea when a built-in shell feature can do the job. Dennis and ghostdog74 gave you the more experienced Bash answers.
Sample input files.txt (list of my pics with full path): 3749 entries
external.sh
while read -r line
do
line=`basename "${line}"`
echo "${line%\"}"
done < "files.txt"
internal.sh
while read -r line
do
line=${line##*/}
echo "${line%\"}"
done < "files.txt"
Timed results, with output redirected to /dev/null so that terminal rendering does not skew the timings:
$ time sh external.sh 1>/dev/null
real 0m4.135s
user 0m1.142s
sys 0m2.308s
$ time sh internal.sh 1>/dev/null
real 0m0.413s
user 0m0.357s
sys 0m0.021s
The output of both is identical:
$ sh external.sh | sort > result1.txt
$ sh internal.sh | sort > result2.txt
$ diff -uN result1.txt result2.txt
As the timings show (about 4.1 s versus 0.4 s of real time, roughly a tenfold difference), you really want to avoid calling external utilities when the shell's own features, such as parameter expansion, can do the same job, especially inside a loop that runs many times.
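My files.txt is not something you can download, so here is a rough, self-contained sketch for reproducing the comparison. It generates a made-up list of 200 quoted paths (the directory and file names are invented) and checks that both loops print the same thing.

```shell
# Build a hypothetical 200-line list of quoted paths (names are invented).
list=$(mktemp)
i=0
while [ "$i" -lt 200 ]; do
    printf '"/tmp/pics/dir%d/IMG%04d.png"\n' "$i" "$i"
    i=$((i + 1))
done > "$list"

# Starts one basename process per line.
external() {
    while IFS= read -r line; do
        line=$(basename "$line")
        printf '%s\n' "${line%\"}"
    done < "$list"
}

# Parameter expansion only; no new process per line.
internal() {
    while IFS= read -r line; do
        line=${line##*/}
        printf '%s\n' "${line%\"}"
    done < "$list"
}

ext_out=$(external)
int_out=$(internal)
rm -f "$list"

if [ "$ext_out" = "$int_out" ]; then
    echo "outputs match"
else
    echo "outputs differ"
fi
```

In bash you can time each function with "time external >/dev/null" and "time internal >/dev/null" before the list is removed. The absolute numbers will depend on your machine, so treat them as a relative comparison only.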