Python: How to save the output of os.system [duplicate]
In Python, if I download a file by calling "wget" through os.system, it prints output like this on the screen:
Resolving...
Connecting to ...
HTTP request sent, awaiting response...
100%[====================================================================================================================================================================>] 19,535,176 8.10M/s in 2.3s
and so on.
What can I do to save this output to a file instead of showing it on the screen?
Currently I am running the command as follows:
import os
theurl = "< file location >"
downloadCmd = "wget " + theurl
os.system(downloadCmd)
The os.system function runs the command via a shell, so you can put any stdio redirects there as well. You should also pass the -q (quiet) flag to wget.
cmd = "wget -q " + theurl + " >/dev/null 2>&1"
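If the goal is to keep the output rather than discard it, the same redirect can point at a file instead of /dev/null. A minimal sketch for Python 3, where build_wget_command and the default log path wget.log are hypothetical names chosen for illustration. The -q flag is left out here because it would leave nothing to log, and shlex.quote keeps characters such as & or spaces in the URL from being interpreted by the shell:

```python
import shlex


def build_wget_command(url, log_path="wget.log"):
    """Return a shell command that sends all wget output to log_path.

    build_wget_command and the default log_path are illustrative names,
    not something taken from the original question.
    """
    # Quote both values so shell metacharacters in them are treated literally.
    # "2>&1" sends stderr to the same file as stdout.
    return "wget {} > {} 2>&1".format(shlex.quote(url), shlex.quote(log_path))
```

It would then be used as os.system(build_wget_command(theurl)).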
However, there are better ways of doing this in Python, such as the pycurl wrapper for libcurl, or the "stock" urllib2 module (urllib.request in Python 3).
To answer your direct question, and as others have mentioned, you should strongly consider using the subprocess module. Here's an example:
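from subprocess import Popen, PIPE, STDOUT
# Merge stderr into stdout so everything wget prints is captured.
wget = Popen(['/usr/bin/wget', theurl], stdout=PIPE, stderr=STDOUT)
stdout, nothing = wget.communicate()
# communicate() returns bytes on Python 3, so write in binary mode.
with open('wget.log', 'wb') as wgetlog:
    wgetlog.write(stdout)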
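The output can also go straight to the log file without passing through Python's memory, by handing an open file object to subprocess. A sketch assuming Python 3.5+ (for subprocess.run); run_and_log is a hypothetical helper name:

```python
import subprocess


def run_and_log(args, log_path):
    """Run args (a list, no shell involved) and write its output to log_path.

    Returns the exit code of the process.
    """
    with open(log_path, "wb") as log:
        # stderr=STDOUT merges both streams into the same file.
        completed = subprocess.run(args, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode
```

For the case in the question this would be called as run_and_log(["wget", theurl], "wget.log"). A non-zero return value suggests the download failed.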
But there is no need to call out to the system to download a file; let Python do the heavy lifting for you.
Using urllib:
try:
    # python 2.x
    from urllib import urlretrieve
except ImportError:
    # python 3.x
    from urllib.request import urlretrieve
urlretrieve(theurl, local_filename)
Or urllib2 (Python 2 only):
import urllib2
response = urllib2.urlopen(theurl)
# Open in binary mode so the downloaded bytes are written unchanged.
with open(local_filename, 'wb') as dl:
    dl.write(response.read())
local_filename is the destination path of your choosing. It is sometimes possible to determine this value automatically, but the approach depends on your circumstance.
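One common way to pick local_filename automatically is to take the last path segment of the URL. The sketch below assumes Python 3 (where urllib2's urlopen lives in urllib.request) and a URL whose path ends in a usable file name; filename_from_url, download, and the fallback name "download" are hypothetical names for illustration:

```python
import os
import posixpath
import shutil
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def filename_from_url(url, default="download"):
    """Guess a local filename from the last path segment of url."""
    # urlparse drops the query string; unquote turns %20 back into a space.
    name = posixpath.basename(unquote(urlparse(url).path))
    return name or default


def download(url, dest_dir="."):
    """Stream url into dest_dir and return the path that was written."""
    local_filename = os.path.join(dest_dir, filename_from_url(url))
    # Binary mode plus copyfileobj avoids loading the whole file into memory.
    with urlopen(url) as response, open(local_filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_filename
```

Calling download(theurl) would then save the file in the current directory under the name taken from the URL. This does not cover servers that announce a different name in a Content-Disposition header.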
As others have noted, you can use Python native library modules to do your I/O, or you can modify the command line to redirect the output.
But for full control over the output, the best thing is to use the Python subprocess module instead of os.system(). Using subprocess would let you capture the output and inspect it, or feed arbitrary data into standard input.
When you want a quick-and-dirty way to run something, use os.system(). When you want full control over how you run something, use subprocess.
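As a rough illustration of that extra control, the following sketch sends a string to a process's standard input and captures whatever it prints. It assumes Python 3.5+ (for subprocess.run), and run_with_input is a hypothetical helper name:

```python
import subprocess


def run_with_input(args, text):
    """Run args, send text to its standard input, and return (exit code, output).

    stderr is merged into stdout so the caller gets everything in one string.
    """
    completed = subprocess.run(
        args,
        input=text,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # work with str instead of bytes
    )
    return completed.returncode, completed.stdout
```

The returned output can then be inspected, parsed, or written to a file as needed.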
The wget process is just writing to its standard streams (wget sends its progress messages to STDERR by default), and these are still "wired" to the terminal.
To get it to stop doing this, redirect (or close) said filehandles. Look at the subprocess module, which allows configuring said filehandles when starting a process. (os.system just leaves the STDOUT/STDERR of the spawned process alone, so they are inherited; the subprocess module is more flexible.)
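For example, both filehandles can be pointed at the null device so that nothing reaches the terminal. A minimal sketch assuming Python 3.3+ (for subprocess.DEVNULL); run_silently is a hypothetical helper name:

```python
import subprocess


def run_silently(args):
    """Run args with stdout and stderr discarded; return the exit code."""
    return subprocess.call(
        args,
        stdout=subprocess.DEVNULL,  # available since Python 3.3
        stderr=subprocess.DEVNULL,
    )
```

With wget this would look like run_silently(["wget", theurl]). The exit code is still available for checking whether the download succeeded.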
See Working with Python subprocess - Shells, Processes, Streams, Pipes, Redirects and More for lots of nice examples and explanations (it introduces the concepts of STDIN/STDOUT/STDERR and works from there).
There are likely better ways to handle this than using wget -- but I'll leave such to other answers.
Happy coding.