
Bash script for outputting list of specific log files to a text file

I am trying to create a text file that contains a listing of all log files that contain a certain string in the first line. More specifically, SAS log files.

Currently I have a simple script that will search the entire system for "*.log" files and output the entire list to a text file.

Is there a way to only output the log files that contain a certain string?

Here is the current command:

find `pwd` -name "*.log" > sas_log_list.txt

Every SAS log file contains the same string on the very first line.

This string is:

1 The SAS System

So basically I want to search a file system for log files containing the string above, and output those file names to a text file.

Thanks in advance, Jason


The hardest part of this question is searching only within the first line. The most accurate one-liner (broken across lines here for readability) I could come up with was:

find . -name '*.log'  -type f  -readable  ! -size 0 \
       -exec sed -n '1{/The SAS System/q0};q1' {} \; \
       -print

Due to the obscure nature of sed syntax, some explanation is in order:

  • The 1{...} block is evaluated for the first line only.
  • The /regex/q0 command quits with exit code 0 (success) if the regex matched (consider /^regex$/ to match the entire line against the regex).
  • If we didn't quit because of a match, the next command, q1, quits with exit code 1 (failure).

find uses that sed command as a predicate and -prints the file name only if it returned true (exit code 0). There is one small snag, however: if a file is empty (-size 0), sed exits 0 immediately without running any of its commands, since there is no input line to process. For that reason we need the ! -size 0 argument to find.
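You can check the predicate's exit codes directly; this assumes GNU sed, which accepts an exit code after q:

printf '1 The SAS System\n' > match.log      # first line matches
printf 'something else\n'   > nomatch.log    # first line does not match
: > empty.log                                # zero-byte file

sed -n '1{/The SAS System/q0};q1' match.log   ; echo "match:   $?"   # 0
sed -n '1{/The SAS System/q0};q1' nomatch.log ; echo "nomatch: $?"   # 1
sed -n '1{/The SAS System/q0};q1' empty.log   ; echo "empty:   $?"   # 0 -- hence ! -size 0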

As suggested by @Brandon Horsley, -type f will produce fewer errors, and while we're at it, let's verify that the file is -readable as well.


Unless I'm mistaken, you don't need the call to pwd. I think this will get you what you want. You can use the -l flag on grep to get the filenames rather than the matching lines.

find . -name "*.log" -exec grep -l "The SAS System" {} \; > sas_log_list.txt
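Note that grep -l matches the string on any line, not only the first, although it does stop reading each file at the first match. If your find supports terminating -exec with + instead of \;, it batches many file names into a single grep invocation, which tends to be faster on large trees:

find . -name "*.log" -exec grep -l "The SAS System" {} + > sas_log_list.txt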


find `pwd` -name "*.log" -exec grep "The SAS System" {} \;

or

find `pwd` -name "*.log" | xargs grep -il "the sas system"


bash 4

shopt -s globstar
shopt -s nullglob
for logfile in **/*.log
do
    IFS= read -r firstline < "$logfile"
    case "$firstline" in
      *"The SAS System"*) echo "$logfile" >> sas_log_list.txt ;;
    esac
done
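In case the two shopt lines are unfamiliar: globstar makes ** recurse into subdirectories, and nullglob makes an unmatched pattern expand to nothing rather than stay literal, so the loop body is skipped cleanly when no .log files exist. A quick way to see the difference:

shopt -u nullglob
for f in **/*.log; do echo "got: $f"; done   # no matches: prints the literal pattern
shopt -s nullglob
for f in **/*.log; do echo "got: $f"; done   # no matches: prints nothing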


I've attempted to make things a bit faster by reading only the first line of each file. This prints out the names of the files whose first line matches the pattern.

( IFS=$'\n' ; for f in $(find `pwd` -name "*log" -type f ) ; do 
   head -n 1 "$f" | grep -q "The SAS System" && echo "$f"
done )

UPDATE 1: Edited to handle path names containing whitespace, using one of the techniques offered by Charles Duffy (a -print0 alternative is sketched below). I couldn't use the find -exec .. + expression, as {} can't appear more than once in it. Thanks ghostdog74 and Telemachus.
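For completeness, another technique that survives whitespace (and even newlines) in path names is NUL-delimited output from find, read back with read -d '' in bash:

find . -name "*.log" -type f -print0 |
while IFS= read -r -d '' f; do
    head -n 1 "$f" | grep -q "The SAS System" && printf '%s\n' "$f"
done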

UPDATE 2: Added the full path name and the last-modified time:

( IFS=$'\n' ; for f in $(find . -name "*log" -type f ) ; do 
   head -n 1 "$f" | grep -q "The SAS System" && echo "$(readlink -f "$f")" "$(stat -c %y "$f")"
done )