How do I perform a recursive directory search for strings within files in a UNIX TRU64 environment?
Unfortunately, due to the limitations of our UNIX Tru64 environment, I am unable to use the grep -r switch to search for strings within files across multiple directories and subdirectories.
Ideally, I would like to pass two parameters. The first is the directory where I want my search to start. The second is a file containing a list of all the strings to be searched for. This list will consist of various directory path names and will include special characters:
i.e.:
/aaa/bbb/ccc /eee/dddd/ggggggg/ etc.
The purpose of this exercise is to identify all shell scripts that may contain the specific hard-coded path names in my list.
There was one example I found during my investigations that perhaps comes close, but I am not sure how to customize this to accept a file of string arguments:
eg: find etb -exec grep test {} \;
where 'etb' is the directory and 'test', a hard coded string to be searched.
This should do it:
find dir -type f -exec grep -F -f strings.txt {} \;
dir
is the directory from which searching will commence
strings.txt
is the file of strings to match, one per line
-F
means treat search strings as literal rather than regular expressions
-f strings.txt
means use the strings in strings.txt
for matching
You can add -l
to the grep switches if you just want filenames that match.
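Since the question asks for something that takes two parameters, here is a minimal sketch of a wrapper. It assumes your find supports -type f and your grep supports -F, -f and -l (all POSIX, but not verified on Tru64). The function name search_tree is made up for illustration.

```shell
# search_tree DIR STRINGS_FILE
# Lists every regular file under DIR that contains at least one of the
# literal strings in STRINGS_FILE (one string per line).
# search_tree is a hypothetical name, not part of any standard tool.
search_tree() {
    if [ "$#" -ne 2 ]; then
        echo "usage: search_tree DIR STRINGS_FILE" >&2
        return 2
    fi
    # -l prints only the names of matching files; drop it to see the lines.
    find "$1" -type f -exec grep -l -F -f "$2" {} \;
}
```

It would be called as search_tree etb strings.txt, with etb and strings.txt standing in for your own directory and list.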
Footnote:
Some people prefer a solution involving xargs, e.g.
find dir -type f -print0 | xargs -0 grep -F -f strings.txt
which is perhaps a little more robust/efficient in some cases.
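One caution: -print0 and xargs -0 started life as GNU extensions and may be missing on an older UNIX. A batching alternative, assuming your find accepts the POSIX -exec ... {} + form (untested on Tru64), hands many file names to each grep without going through a pipe. The name search_batched is hypothetical.

```shell
# search_batched DIR STRINGS_FILE
# search_batched is a hypothetical helper name.
# "{} +" makes find append as many file names as fit to one grep call,
# so names containing spaces are passed through intact.
search_batched() {
    # The extra /dev/null operand guarantees grep always sees more than one
    # file, so every matching line is prefixed with its file name.
    find "$1" -type f -exec grep -F -f "$2" /dev/null {} +
}
```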
From reading the question, I assume we cannot use GNU coreutils and that egrep is not available. I also assume that (for some reason) the system is broken and escapes do not work as expected.
Under normal situations, grep -rf patternfile.txt /some/dir/
is the way to go.
a file containing a list of all the strings to be searched
Assumptions: GNU coreutils are not available, grep -r does not work, and handling of special characters is broken.
Now, do you have a working awk? It makes life so much easier. But let's be on the safe side.
Assume: a working sed, and one of od, hexdump, or xxd (from the vim package) is available.
Let's call the file of strings patternfile.txt
1. Convert list into a regexp that grep likes
Example patternfile.txt contains
/foo/
/bar/doe/
/root/
(The example does not show any special characters, but assume they are there.) We must turn it into something like
(/foo/|/bar/doe/|/root/)
Assuming the echo -en command is not broken, and xxd, od, or hexdump is available,
Using hexdump
cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n'
Using od
cat patternfile.txt |od -A n -t x1|tr -d '\n'
and pipe it into (common for both hexdump and od)
|sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'
then pipe result into
|sed 's:^:\\(:g' |sed 's:$:\\):g'
and you have a regexp pattern that is escaped.
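The od and sed steps above can be collected into one function. This is only a sketch: the name build_hex_pattern is made up, od -v and the extra tr -s are my additions to keep the byte dump single-spaced, and it assumes your od prints two-digit hex bytes for -t x1.

```shell
# build_hex_pattern PATTERN_FILE
# build_hex_pattern is a hypothetical name for the od + sed steps above.
# Prints the \x-escaped alternation, e.g. \(\x2f\x61\|\x2f\x62\)
build_hex_pattern() {
    # Dump every byte as hex, then squeeze the dump onto one line with
    # exactly one blank in front of each byte.
    od -A n -t x1 -v "$1" | tr '\n' ' ' | tr -s ' ' |
        sed -e 's:[ ]*0a[ ]*$::' \
            -e 's: 0a:\\|:g' \
            -e 's:^[ ]*::' \
            -e 's:^: :' \
            -e 's: :\\x:g' \
            -e 's:^:\\(:' \
            -e 's:$:\\):'
    # 1st expression drops the final newline byte, 2nd turns the other
    # newlines into \|, 3rd to 5th put \x in front of each byte, and the
    # last two wrap the whole thing in \( and \).
}
```

For a file holding the two lines /a and /b this prints \(\x2f\x61\|\x2f\x62\), which still has to go through echo -en before grep can use it.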
2. Feed the escaped pattern into broken regexp
Assuming the bare minimum shell escape is available,
we use grep "$(echo -en "ESCAPED_PATTERN" )"
to do our job.
3. To sum it up
Building an escaped regexp pattern (using hexdump as an example):
grep "$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"
will escape all characters and enclose it with (|) brackets so a regexp OR match will be performed.
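A caveat worth checking on your own data: once echo -en has decoded the hex, grep sees the original bytes again, so a . or * inside a path name still acts as a regexp operator. If that matters and grep -F is not an option, one possible workaround is to backslash-escape those characters with sed and feed the result to grep -f. The helper name escape_bre is hypothetical, and this sketch covers basic regular expressions only.

```shell
# escape_bre PATTERN_FILE
# escape_bre is a hypothetical helper: it puts a backslash in front of the
# characters that are special in a POSIX basic regular expression
# ( ] [ \ . * ^ $ ), one pattern per input line.
escape_bre() {
    # Inside the bracket expression "]" comes first so it is taken literally.
    sed 's/[][\.*^$]/\\&/g' "$1"
}
```

Something like escape_bre patternfile.txt > escaped.txt followed by grep -f escaped.txt somefile would then match the strings literally.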
4. Recursive directory lookup
Under normal situations, even when grep -r
is broken, find /dir/ -exec grep PATTERN {} \;
should work.
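A small sketch of that one-grep-per-file form, assuming nothing beyond find -type f, find -exec and grep -e. The name grep_each is made up.

```shell
# grep_each DIR PATTERN
# grep_each is a hypothetical helper: it starts one grep process per file.
grep_each() {
    # With a single file operand grep leaves out the file name; the extra
    # /dev/null operand makes it print "file:line" for every match.
    # -e keeps a pattern that starts with "-" from being read as an option.
    find "$1" -type f -exec grep -e "$2" {} /dev/null \;
}
```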
Some may prefer xargs instead (unless you happen to have a buggy xargs).
We prefer the find /somedir/ -type f -print0 |xargs -0 grep -f 'patternfile.txt' approach, but since this is not available (for whatever valid reason), we need to exec grep for each file, and this is normally the wrong way.
But let's do it.
Assume: find -type f works.
Assume: xargs is broken OR not available.
First, if you have a buggy pipe, it might not handle a large number of files.
So we avoid xargs on such systems (I know, I know, let's just pretend it is broken).
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
If your shell handles large lists nicely,
for file in `cat list-of-all-file-to-search-for.txt` ; do grep REGEXP_PATTERN "$file" ;
done ;
is a nice way to get by (as long as no file name contains whitespace). Unfortunately, some systems do not like that,
and in that case, you may require
cat list-of-all-file-to-search-for.txt | split -a 4 -d -l 2000 - file-smaller-chunk.part.
to turn it into smaller chunks. Now this is for a seriously broken system.
Then a for file in file-smaller-chunk.part.* ; do for single_line in `cat "$file"` ; do grep REGEXP_PATTERN "$single_line" ; done ; done ;
should work.
A
cat filelist.txt |while read file ; do grep REGEXP_PATTERN "$file" ; done ;
may be used as a workaround on some systems.
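A slightly more careful version of that loop, as a sketch: it assumes a POSIX shell whose read understands -r (a very old Bourne shell may not), and grep_listed is a made-up name. File names containing a newline are still out of reach for any line-based list.

```shell
# grep_listed PATTERN LIST_FILE
# grep_listed is a hypothetical helper; LIST_FILE holds one path per line.
grep_listed() {
    # IFS= keeps leading and trailing blanks in the name, and -r stops
    # read from eating backslashes.
    while IFS= read -r file; do
        # /dev/null forces grep to print the file name with each match.
        grep -e "$1" "$file" /dev/null
    done < "$2"
}
```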
What if my shell does not handle quotes?
You may have to escape the file list beforehand.
It can be done much more nicely in awk, perl, whatever, but since we restrict ourselves to sed, let's do it.
We assume 0x27, the ' code, will actually work.
cat list-of-all-file-to-search-for.txt |sed 's@['\'']@'\''\\'\'\''@g'|sed 's:^:'\'':g'|sed 's:$:'\'':g'
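The same quoting written as a function, as a sketch only. The name quote_lines is hypothetical, and the sed script is put in double quotes purely so the single quotes are easier to see.

```shell
# quote_lines LIST_FILE
# quote_lines is a hypothetical wrapper around the sed idea above.
# Every line comes out wrapped in single quotes, with each embedded single
# quote rewritten as '\'' so a POSIX shell reads the name back unchanged.
quote_lines() {
    sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/" "$1"
}
```

The output is meant to be pasted into a generated script or passed through eval, never shown to grep directly.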
The only time I had to use this was when feeding output into bash again.
What if my shell does not handle that?
xargs fails, grep -r fails, the shell's for loop fails.
Do we have other options? YES.
Escape all input suitable for your shell, and make a script.
But you know what, I got bored, and writing automated scripts for csh just seems wrong. So I am going to stop here.
Take home note
Use the right tool for the job. Writing an interpreter in bc is perfectly possible, but it is just plain wrong. Install coreutils, perl, a better grep, whatever. It makes life better.