How do I Select Highest Number From Series of <string>_# File Names in Bash Script
I have a directory with files
heat1.conf
heat2.conf
...
heat<n>.conf
minimize.conf
...
other files....
I want my Bash script to be able to grab the highest number filename (so I can delete and replace it when I find an error condition).
What's the best way to accomplish this?
Please discuss the speed of your solution and why you think that it is the best approach.
If you are only listing files in the current directory, there's no need to use find with -maxdepth 1, or to use ls. Just use a for loop with shell expansion. Also, expr is an external command; if your numbers don't contain decimals, you can use Bash's own comparison.
max=-1
for file in heat*.conf
do
num=${file:4}
num=${num%.conf}
[[ $num -gt $max ]] && max=$num
done
echo "max is: $max"
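One caveat with the glob loop above: if no heat*.conf files exist, the unexpanded pattern itself enters the loop and breaks the numeric comparison. A minimal sketch hardening it with nullglob (the demo directory and sample file names are made up for illustration):

```shell
#!/bin/bash
# Sketch: the same glob loop, hardened with nullglob so a directory
# with no heat*.conf files yields max=-1 instead of an arithmetic
# error on the literal string "heat*.conf".
shopt -s nullglob              # unmatched globs expand to nothing
demo_dir=$(mktemp -d)          # hypothetical sample directory
touch "$demo_dir"/heat{1,3,10}.conf "$demo_dir"/minimize.conf
cd "$demo_dir" || exit 1

max=-1
for file in heat*.conf; do
    num=${file#heat}           # strip the "heat" prefix
    num=${num%.conf}           # strip the ".conf" suffix
    (( num > max )) && max=$num
done
echo "max is: $max"            # prints: max is: 10
```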
What about:
max=$(find . -maxdepth 1 -name 'heat[1-9]*.conf' |
      sed 's/.*heat\([0-9][0-9]*\)\.conf/\1/' |
      sort -n |
      tail -n 1)
List the possible file names; keep just the numeric part; sort the numbers numerically; select the largest (last) number.
Regarding speed: short of dropping into a scripting language like Perl (Python, Ruby, ...), this is about as good as you can get. Using find instead of ls means the list of file names is generated just once (the first version of this answer used ls, but that makes the shell generate the list of file names and then ls echo that list). The sed command is fairly simple, and generates a list of numbers which have to be sorted. You could argue that a reverse numeric sort (sort -nr) piped into sed 1q would be faster: the second sed would read less data, and sort might not generate all of its output before the SIGPIPE from sed closing its input (as it terminates).
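The reverse-sort variant described above can be sketched as follows (the demo directory and sample file names are made up for illustration; -maxdepth assumes GNU or BSD find):

```shell
#!/bin/bash
# Sketch of the "sort -nr | sed 1q" variant: reverse numeric sort,
# then keep only the first (largest) line; the SIGPIPE from sed
# closing the pipe can stop sort from emitting the rest.
demo_dir=$(mktemp -d)          # hypothetical sample directory
touch "$demo_dir"/heat{2,9,11}.conf
cd "$demo_dir" || exit 1

max=$(find . -maxdepth 1 -name 'heat[0-9]*.conf' |
      sed 's/.*heat\([0-9][0-9]*\)\.conf/\1/' |
      sort -nr |
      sed 1q)
echo "max is: $max"            # prints: max is: 11
```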
In a scripting language like Perl, you would avoid multiple processes, and the overhead of pipe communication between those processes. This would be faster, but there'd be a lot less shell scripting involved.
I came up with one solution:
highest=-1
current_dir=`pwd`
cd $my_dir
for file in $(ls heat*) ; do   # assume I've already checked for dir existence
    if [ "${file:4:$(($(expr length $file)-9))}" -gt "$highest" ]; then
        highest=${file:4:$(($(expr length $file)-9))}
    fi
done
cd $current_dir
....Okay, I took your suggestions and edited my solution to scrap the expr and pre-save the variable. Over 1000 trials, my (modified) method was on average faster than Jon's but slower than GhostDog's, though the standard deviation was relatively large.
My revised script is seen below in my trial, as are Jon and Ghost Dog's solutions...
declare -a timing
for trial in {1..1000}; do
    res1=$(date +%s.%N)
    highest=-1
    current_dir=`pwd`
    cd $my_dir
    for file in $(ls heat*) ; do
        # assume I've already checked for dir existence
        file_no=${file:4:${#file}-9}
        if [ $file_no -gt $highest ]; then
            highest=$file_no
        fi
    done
    res2=$(date +%s.%N)
    timing[$trial]=$(echo "scale=9; $res2 - $res1"|bc)
    cd $current_dir
done
average=0
#compile net result
for trial in {1..1000}; do
    current_entry=${timing[$trial]}
    average=$(echo "scale=9; (($average+$current_entry/1000.0))"|bc)
done
std_dev=0
for trial in {1..1000}; do
    current_entry=${timing[$trial]}
    std_dev=$(echo "scale=9; (($std_dev + ($current_entry-$average)*($current_entry-$average)))"|bc)
done
std_dev=$(echo "scale=9; sqrt (($std_dev/1000))"|bc)
printf "Approach 1 (Jason), AVG Elapsed Time: %.9F\n" $average
printf "STD Deviation: %.9F\n" $std_dev
for trial in {1..1000}; do
    res1=$(date +%s.%N)
    highest=-1
    current_dir=`pwd`
    cd $my_dir
    max=$(ls heat[1-9]*.conf |
          sed 's/heat\([0-9][0-9]*\)\.conf/\1/' |
          sort -n |
          tail -n 1)
    res2=$(date +%s.%N)
    timing[$trial]=$(echo "scale=9; $res2 - $res1"|bc)
    cd $current_dir
done
average=0
#compile net result
for trial in {1..1000}; do
    current_entry=${timing[$trial]}
    average=$(echo "scale=9; (($average+$current_entry/1000.0))"|bc)
done
std_dev=0
for trial in {1..1000}; do
    current_entry=${timing[$trial]}
    #echo "(($std_dev + ($current_entry-$average)*($current_entry-$average))"
    std_dev=$(echo "scale=9; (($std_dev + ($current_entry-$average)*($current_entry-$average)))"|bc)
done
std_dev=$(echo "scale=9; sqrt (($std_dev/1000))"|bc)
printf "Approach 2 (Jon), AVG Elapsed Time: %.9F\n" $average
printf "STD Deviation: %.9F\n" $std_dev
for trial in {1..1000}; do
    res1=$(date +%s.%N)
    max=-1
    current_dir=`pwd`
    cd $my_dir
    for file in heat*.conf
    do
        num=${file:4}
        num=${num%.conf}
        [[ $num -gt $max ]] && max=$num
    done
    res2=$(date +%s.%N)
    timing[$trial]=$(echo "scale=9; $res2 - $res1"|bc)
    cd $current_dir
done
average=0
#compile net result
for trial in {1..1000}; do
    current_entry=${timing[$trial]}
    average=$(echo "scale=9; (($average+$current_entry/1000.0))"|bc)
done
std_dev=0
for trial in {1..1000}; do
    current_entry=${timing[$trial]}
    #echo "(($std_dev + ($current_entry-$average)*($current_entry-$average))"
    std_dev=$(echo "scale=9; (($std_dev + ($current_entry-$average)*($current_entry-$average)))"|bc)
done
std_dev=$(echo "scale=9; sqrt (($std_dev/1000))"|bc)
printf "Approach 3 (GhostDog), AVG Elapsed Time: %.9F\n" $average
printf "STD Deviation: %.9F\n" $std_dev
... the results are:
Approach 1 (Jason), AVG Elapsed Time: 0.041418086
STD Deviation: 0.177111854
Approach 2 (Jon), AVG Elapsed Time: 0.061025972
STD Deviation: 0.212572411
Approach 3 (GhostDog), AVG Elapsed Time: 0.026292145
STD Deviation: 0.145542801
Good job GhostDog!!! And thanks to both you Jon and the commenters for your tips! :)
You can use sort --version-sort like this:
ls heat*.conf | sort -r --version-sort | head -1
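Note that --version-sort (the GNU coreutils -V option) is what makes this work: a plain reverse lexical sort would put heat9.conf after heat10.conf, while -V compares the embedded numbers. A quick illustration with made-up file names:

```shell
#!/bin/bash
# printf feeds sample names on separate lines; sort -V orders by the
# embedded number, so heat10.conf sorts above heat9.conf in the
# reversed output, unlike a plain lexical sort.
printf '%s\n' heat1.conf heat9.conf heat10.conf |
    sort -r --version-sort |
    head -n 1                  # prints: heat10.conf
```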