How do you feed in an entire directory of input files using cat in Unix?
I'd like to run a program on a directory of files. I know how to do this with one file, using
cat myFile.xml | myProgram.py
How can I run myProgram.py over a folder, say myFolder?
Thanks!
I like
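ls | xargs cat | myProgram.py
for its functional language feel, though it has to be run from inside the folder and breaks on filenames containing whitespace. YMMV.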
Assuming your program can accept a filename as its first command line argument, one way is to use find
to find all the files in the folder, and then use xargs
to run your program for each of them:
find myFolder | xargs -n 1 myProgram.py
The -n 1
means "run the program once per file". If your program is happy to receive multiple filenames on its command line, you can omit the -n 1
and xargs
will run your program fewer times with multiple files on its command line.
(find
will do a recursive search, so you'll get all the files in and under myFolder. You can use find myFolder -maxdepth 1
to prevent that.)
(Thanks to @Personman for pointing out that this will run the program for the folder itself as well as the files. You can use find myFolder -type f
to tell find
to only return regular files.)
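One caveat with piping find into xargs is that filenames containing spaces get split into several arguments. A minimal sketch of a safer variant, assuming your find and xargs support -print0 and -0 (GNU and BSD versions do); the scratch folder and file names are made up for the demo, and cat stands in for myProgram.py:

```shell
# Demo setup: a scratch folder containing one awkward filename (hypothetical names).
dir=$(mktemp -d)
mkdir "$dir/myFolder"
printf 'a\n' > "$dir/myFolder/one.xml"
printf 'b\n' > "$dir/myFolder/two words.xml"

# -type f skips the folder itself; -print0 and -0 pass names NUL-separated,
# so a space inside a filename does not split it into two arguments.
# `cat` is a stand-in for myProgram.py; sort makes the output order stable.
result=$(find "$dir/myFolder" -type f -print0 | xargs -0 -n 1 cat | sort)
echo "$result"

rm -rf "$dir"
```

With your own program, the pipeline would look like find myFolder -type f -print0 | xargs -0 -n 1 myProgram.py.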
How about:
for x in myFolder/*
do
cat "$x" | myProgram.py
done
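A slightly more defensive version of that loop, as a sketch: it quotes the variable, skips anything that is not a regular file, and redirects the file into the program instead of going through cat. The scratch folder and its contents are invented for the demo, and tr stands in for myProgram.py:

```shell
# Demo setup (hypothetical names): two files and a subdirectory.
dir=$(mktemp -d)
mkdir "$dir/myFolder" "$dir/myFolder/sub"
printf 'x\n' > "$dir/myFolder/a b.xml"
printf 'y\n' > "$dir/myFolder/c.xml"

out=""
for x in "$dir"/myFolder/*
do
    [ -f "$x" ] || continue    # skip subdirectories (and an unmatched glob)
    # `tr` is a stand-in for myProgram.py, reading the file on stdin
    out="$out$(tr 'a-z' 'A-Z' < "$x")"
done
echo "$out"

rm -rf "$dir"
```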
If you are just trying to execute your data program on a bunch of files, the easiest/least complicated way is to use -exec in find.
Say you wanted to execute data on all txt files in the current directory (and subdirectories). This is all you'd need:
find . -name "*.txt" -exec data {} \;
If you wanted to restrict it to the current directory, you could do this:
find . -maxdepth 1 -name "*.txt" -exec data {} \;
There are lots of options with find.
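One of those options worth knowing: ending -exec with + instead of \; makes find pass many filenames to a single run of the command, much like xargs without -n 1. The sketch below uses a scratch folder with made-up files, and a tiny sh -c script that prints its argument count stands in for the data program:

```shell
# Demo setup (hypothetical files).
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# `\;` runs the command once per file: three runs, one argument each.
per_file=$(find "$dir" -name "*.txt" -exec sh -c 'echo "$#"' sh {} \; | tr '\n' ' ')

# `+` batches the filenames: a single run that receives all three.
batched=$(find "$dir" -name "*.txt" -exec sh -c 'echo "$#"' sh {} +)

echo "per file: $per_file"
echo "batched:  $batched"

rm -rf "$dir"
```

Use + only if your program accepts several filenames at once.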
cat myFolder/* | myProgram.py
Or cat *.xml | myProgram.py
This writes the contents of every .xml file to stdout, which is then piped into your program's stdin. All the files are combined into one stream.
myProgram.py *.xml
has the shell expand the pattern so that every matching filename is passed to your program as an argument, like this: myProgram.py file1.xml file2.xml file3.xml ... filen.xml
Each file remains separate and the script can tell one from another.
Python / Perl / sh scripts that loop over their arguments usually handle that, in the basic case, the same as myProgram.py file1.xml; myProgram.py file2.xml; myProgram.py filen.xml
with the ;
separating one command from the next.
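To see the difference between the two forms, here is a small sketch with wc -l standing in for myProgram.py; the scratch folder and file contents are invented for the demo:

```shell
# Demo setup (hypothetical files): 1 line and 2 lines.
dir=$(mktemp -d)
printf 'one\n' > "$dir/file1.xml"
printf 'two\nthree\n' > "$dir/file2.xml"

# One combined stream: the reader sees 3 lines and no file boundaries.
combined=$(cat "$dir"/*.xml | wc -l | tr -d ' ')

# Separate arguments: the program knows which lines came from which file.
# awk only normalizes the column padding, which differs between systems.
separate=$(cd "$dir" && wc -l *.xml | awk '{print $1, $2}')

echo "combined: $combined"
echo "$separate"

rm -rf "$dir"
```

The first form reports a single count of 3; the second lists 1 for file1.xml, 2 for file2.xml, and a total.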
Play with it and welcome to Unix!