Controlling shell command line wildcard expansion in C or C++
I'm writing a program, foo, in C++. It's typically invoked on the command line like this:
foo *.txt
My main() receives the arguments in the normal way. On many systems, argv[1] is literally *.txt, and I have to call system routines to do the wildcard expansion. On Unix systems, however, the shell expands the wildcard before invoking my program, and all of the matching filenames will be in argv.
Suppose I wanted to add a switch to foo that causes it to recurse into subdirectories.
foo -a *.txt
would process all text files in the current directory and all of its subdirectories.
I don't see how this is done, since, by the time my program gets a chance to see the -a, the shell has already done the expansion and the user's *.txt input is lost. Yet there are common Unix programs that work this way. How do they do it?
In Unix land, how can I control the wildcard expansion?
(Recursing through subdirectories is just one example. Ideally, I'm trying to understand the general solution to controlling the wildcard expansion.)
Your program has no influence over the shell's command line expansion. Which program will be called is determined after all the expansion is done, so it's already too late to change anything about the expansion programmatically.
The user calling your program, on the other hand, can construct whatever command line they like. Shells allow you to easily prevent wildcard expansion, usually by putting the argument in single quotes:
program -a '*.txt'
If your program is called like that, it will receive two parameters: -a and *.txt.
On Unix, you should just leave it to the user to manually prevent wildcard expansion if it is not desired.
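If the pattern does reach the program unexpanded, as in the quoted invocation above, the program can expand it itself. Here is a minimal sketch using POSIX glob(); expand_pattern is a hypothetical helper name, and treating errors the same as "no match" is a simplification chosen for brevity.

```cpp
#include <glob.h>

#include <cstddef>
#include <string>
#include <vector>

// Expand one wildcard pattern roughly the way the shell would have.
// Returns the matching paths; an empty vector means no match or an error.
// (expand_pattern is a hypothetical helper, not part of any library.)
std::vector<std::string> expand_pattern(const std::string& pattern)
{
    std::vector<std::string> paths;
    glob_t results{};  // zero-initialised so globfree() is always safe

    // Flags of 0: default behaviour, results are returned sorted.
    if (glob(pattern.c_str(), 0, nullptr, &results) == 0) {
        for (std::size_t i = 0; i < results.gl_pathc; ++i)
            paths.emplace_back(results.gl_pathv[i]);
    }
    globfree(&results);  // release whatever glob() allocated
    return paths;
}
```

With this, a main() that receives argv[2] as the literal *.txt could call expand_pattern(argv[2]) and iterate over the result, just as it would iterate over shell-expanded arguments.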
As the other answers said, the shell does the wildcard expansion - and you stop it from doing so by enclosing arguments in quotes.
Note that options -R and -r are usually used to indicate recursion - see cp, ls, etc for examples.
Assuming you organize things appropriately so that wildcards are passed to your program as wildcards and you want to do recursion, then POSIX provides routines to help:
- nftw - file tree walk (recursive access).
- fnmatch, glob, wordexp - to do filename matching and expansion.
There is also ftw, which is very similar to nftw, but it is marked 'obsolescent' so new code should not use it.
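As a rough sketch of how a recursive switch might combine these routines, the following walks a directory tree with nftw() and tests each file name with fnmatch(). The names find_matching, check_entry, g_pattern and g_found are made up for illustration, and the limit of 16 open file descriptors is an arbitrary choice.

```cpp
#include <fnmatch.h>
#include <ftw.h>
#include <sys/stat.h>

#include <string>
#include <vector>

// Note: when compiling as C with glibc, nftw() requires a feature-test
// macro such as _XOPEN_SOURCE >= 500 defined before including <ftw.h>.

namespace {
// nftw() has no user-data argument, so the callback reads its inputs
// from file-scope state. That makes this sketch non-reentrant.
const char* g_pattern = nullptr;
std::vector<std::string>* g_found = nullptr;

int check_entry(const char* path, const struct stat*, int typeflag,
                struct FTW* info)
{
    // info->base is the offset of the final name component within path.
    // FTW_F means "not a directory and not a symbolic link".
    if (typeflag == FTW_F && fnmatch(g_pattern, path + info->base, 0) == 0)
        g_found->push_back(path);
    return 0;  // a non-zero return would stop the walk
}
}  // namespace

// Collect every non-directory entry under root whose name matches pattern.
std::vector<std::string> find_matching(const std::string& root,
                                       const std::string& pattern)
{
    std::vector<std::string> found;
    g_pattern = pattern.c_str();
    g_found = &found;
    // FTW_PHYS: do not follow symbolic links. Errors simply end the walk.
    nftw(root.c_str(), check_entry, 16, FTW_PHYS);
    g_pattern = nullptr;
    g_found = nullptr;
    return found;
}
```

Called as find_matching(".", "*.txt"), this would return the text files in the current directory and everything below it, provided the user quoted the pattern so that it arrived unexpanded.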
Adrian asked:
But I can say ls -R *.txt without single quotes and get a recursive listing. How does that work?
To adapt the question to a convenient location on my computer, let's review:
$ ls -F | grep '^m'
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte/
$ ls -R1 m*
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte:
multithread.ec
multithread.ec.original
multithread2.ec
$
So, I have a sub-directory 'mte' that contains three files. And I have six files with names that start 'm'.
- When I type 'ls -R1 m*', the shell notes the metacharacter '*' and uses its equivalent of glob() or wordexp() to expand that into the list of names:
  - makefile
  - mapmain.pl
  - minimac.group
  - minimac.passwd
  - minimac_13.terminal
  - mkmax.sql.bz2
  - mte
- Then the shell arranges to run '/bin/ls' with 9 arguments (program name, option -R1, plus 7 file names), followed by a terminating null pointer.
- The ls command notes the options (recursive and single-column output), and gets to work.
  - The first 6 names (as it happens) are simple files, so there is nothing recursive to do.
  - The last name is a directory, so ls prints its name and its contents, invoking its equivalent of nftw() to do the job.
- At this point, it is done.
- This uncontrived example doesn't show what happens when there are multiple directories, and so the description above over-simplifies the processing.
- Specifically, ls processes the non-directory names first, and then processes the directory names in alphabetic order (by default), and does a depth-first scan of each directory.
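The ordering described in the last item can be sketched as follows. This only illustrates the idea and is not how ls is actually implemented; split_operands and Operands are hypothetical names, and std::sort here uses plain byte order rather than the locale-aware collation a real ls applies.

```cpp
#include <sys/stat.h>

#include <algorithm>
#include <string>
#include <vector>

// Hypothetical container for the two groups of command line operands.
struct Operands {
    std::vector<std::string> files;  // everything that is not a directory
    std::vector<std::string> dirs;   // directories, to be listed afterwards
};

// Separate already-expanded names into non-directories and directories,
// each group sorted, so the caller can handle the files first.
Operands split_operands(const std::vector<std::string>& names)
{
    Operands out;
    for (const auto& name : names) {
        struct stat sb;
        // Names that cannot be stat()ed are treated as plain files here
        // for brevity; a real tool would report an error instead.
        if (stat(name.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode))
            out.dirs.push_back(name);
        else
            out.files.push_back(name);
    }
    std::sort(out.files.begin(), out.files.end());
    std::sort(out.dirs.begin(), out.dirs.end());
    return out;
}
```

A caller would print out.files first and then walk each entry of out.dirs in turn, which matches the output order in the transcript above.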
foo -a '*.txt'
Part of the shell's job (on Unix) is to expand command line wildcard arguments. You prevent this with quotes.
Also, on Unix systems, the "find" command does what you want:
find . -name '*.txt'
will list all matching files recursively from the current directory down.
Thus, you could do
foo `find . -name '*.txt'`
I wanted to point out another way to turn off wildcard expansion. You can tell your shell to stop expanding wildcards with the noglob option.
With bash, use set -o noglob:
> touch a b c
> echo *
a b c
> set -o noglob
> echo *
*
And with csh, use set noglob:
> echo *
a b c
> set noglob
> echo *
*