Why does my Python script randomly get killed?

Basically, I have a list of 30,000 URLs. The script goes through the URLs and downloads them (with a 3-second delay in between), then stores the HTML in a database.

And it loops and loops...

Why does it randomly get "Killed."? I didn't touch anything.

Edit: this happens on 3 of my Linux machines. The machines are on a Rackspace cloud with 256 MB of memory. Nothing else is running.


Looks like you might be running out of memory -- might easily happen on a long-running program if you have a "leak" (e.g., due to accumulating circular references). Does Rackspace offer any easily usable tools to keep track of a process's memory, so you can confirm if this is the case? Otherwise, this kind of thing is not hard to monitor with normal Linux tools from outside the process. Once you have determined that "out of memory" is the likely cause of death, Python-specific tools such as pympler can help you track exactly where the problem is coming from (and thus determine how to avoid those references -- be it by changing them to weak references, or other simpler approaches -- or otherwise remove the leaks).
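One way to watch memory from inside the script is the standard `resource` module. This is only a minimal sketch: the helper name `peak_rss_mb` is made up for illustration, and it assumes a Unix system where `ru_maxrss` is reported in kilobytes (Linux) or bytes (macOS).

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


# Hypothetical usage: call this every 100 URLs or so inside the download
# loop and watch whether the figure keeps climbing toward the 256 MB limit.
print("peak RSS: %.1f MB" % peak_rss_mb())
```

If the number keeps growing for as long as the loop runs, a leak is a plausible explanation, and that is the point where a tool such as pympler becomes worth the effort.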


In cases like this, you should check the log files.

I use Debian and Ubuntu, so the main log file for me is: /var/log/syslog

If you use Red Hat, I think that log is: /var/log/messages

If something happens that is as exceptional as the kernel killing your process, there will be a log event explaining it.

I suspect you are being hit by the Out Of Memory Killer.
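If you would rather not read the log by eye, something like the following could filter it. It is a rough sketch: the function names are hypothetical, and the search pattern is an assumption about the wording the kernel typically uses, which varies between kernel versions.

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer acts.
# The exact wording differs between kernel versions, so treat this
# pattern as an assumption and adjust it to what your log contains.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|killed process",
                         re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines that look like OOM-killer messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/syslog"):
    """Scan one log file; reading it may need elevated privileges."""
    with open(path, errors="replace") as log:
        return find_oom_lines(log)
```

Calling `scan_log()` on Debian or Ubuntu, or `scan_log("/var/log/messages")` on Red Hat, returns the matching lines. An empty list does not rule the OOM killer out, since older entries may already have been rotated into another file.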


Is it possible that it's hitting an uncaught exception? Are you running this from a shell, or is it being run from cron or in some other automated way? If it's automated, the output may not be displayed anywhere.
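To rule this out when the script runs unattended, you could wrap the entry point so that any uncaught exception ends up in a file. A minimal sketch, where `run_logged`, the logger name and the `crawler.log` path are all made up for illustration:

```python
import logging


def run_logged(main, log_path="crawler.log"):
    """Run main() and write any uncaught exception to log_path."""
    logger = logging.getLogger("crawler")
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    try:
        main()
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("script died with an uncaught exception")
        raise
    finally:
        logger.removeHandler(handler)
        handler.close()
```

You would call it as `run_logged(main)`, with `main` being whatever function runs the download loop. If the process disappears and the file contains no traceback, an exception becomes a less likely cause than an external kill.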


Are you using a queue manager or process manager of some sort? I got apparently random "Killed" messages when the batch queue manager I was using sent SIGUSR2 once the time was up.

Otherwise I strongly favor the out of memory option.
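To check whether some manager is signalling the script, you could install handlers that record what arrives. A sketch under a few assumptions: the names `received` and `record_signal` are made up, the platform is Linux, and the signal is a catchable one. SIGKILL, which the kernel's OOM killer uses, cannot be caught, so it will never show up here.

```python
import os
import signal

received = []  # signal numbers seen so far


def record_signal(signum, frame):
    """Note which signal arrived so the main loop can report it."""
    received.append(signum)


# SIGKILL cannot be caught; these are catchable signals a manager might send.
for sig in (signal.SIGTERM, signal.SIGUSR1, signal.SIGUSR2):
    signal.signal(sig, record_signal)

# Print the PID so the handlers can be tried by hand: kill -USR2 <pid>
print("running as pid", os.getpid())
```

Note that with these handlers installed the script keeps running instead of dying, so the main loop should check `received` now and then, log what it finds, and exit cleanly if that is what you want.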


For those who came here because of MySQL, these answers may be helpful:

Use SSCursor, as one of them suggests:

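import MySQLdb
import MySQLdb.cursors

# SSCursor keeps the result set on the server instead of loading it all into memory
conn = MySQLdb.connect(host=DB_HOST, user=DB_USER, db=DB_NAME,
                       passwd=DB_PASSWORD, charset="utf8",
                       cursorclass=MySQLdb.cursors.SSCursor)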

and iterate over the cursor, as another answer suggests:

cursor = conn.cursor()
cursor.execute("select * from very_big_table;")
# Iterate over the cursor itself so rows are fetched one at a time
for row in cursor:
    # do what you want here
    pass

Do pay attention to what the documentation says: "You MUST retrieve the entire result set and close() the cursor before additional queries can be performed on the connection." So if you want to write at the same time, you should use another connection, or you will get:

`_mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now")`