Logging null bytes via Apache
I have a problem logging binary data to stdout via Apache.
After configuring logging, I try to log the string '\x31\x00':
logging.getLogger().info('\x31\x00')
Everything works if I use the Python console; I see the expected output:
2011-05-01 22:21:27,430 INFO [test_logging:9][test_logging] 1
But if I log via Apache and mod_wsgi, I get a traceback:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/logging/__init__.py", line 789, in emit
    stream.write(fs % msg)
TypeError: write() argument 1 must be string without null bytes, not str
Where is the bug? Where should I start digging?
My logging configuration:
[loggers]
keys=root
[formatters]
keys=stdoutFormatter
[handlers]
keys=stdoutHandler
[logger_root]
level=NOTSET
handlers=stdoutHandler
[handler_stdoutHandler]
class=StreamHandler
formatter=stdoutFormatter
args=(sys.stdout,)
[formatter_stdoutFormatter]
format=%(asctime)s %(levelname)s [%(module)s:%(lineno)d][%(funcName)s] %(message)s
Apache version 2.2.16
Python version 2.6.4
Mod_wsgi 2.8
You could just use
logging.getLogger().info('%r', binary_bytes)
and it should do the right thing.
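A minimal sketch of why '%r' helps: the logger writes the repr of the value, so the null byte is emitted as the four printable characters \x00 and never reaches the stream raw. The sketch assumes Python 3 (on Python 2 the repr has no b prefix), uses an in-memory stream as a stand-in for sys.stdout, and the logger name null_byte_demo is made up for illustration.

```python
import io
import logging

# In-memory stream standing in for sys.stdout (assumption for this demo).
buf = io.StringIO()

# 'null_byte_demo' is a hypothetical logger name used only here.
logger = logging.getLogger('null_byte_demo')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(buf))
logger.propagate = False  # keep the demo output out of the root logger

binary_bytes = b'\x31\x00'

# %r formats the value with repr(), so the null byte is written
# as the printable escape sequence \x00 rather than a raw NUL.
logger.info('%r', binary_bytes)

logged = buf.getvalue()
print(logged)
```

The logged text contains no raw null byte, only its escaped form, so the stream has nothing to reject.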
Nothing is wrong with Apache/mod_wsgi - it's just that console output streams are not supposed to be used for binary data.
The error is stated right there: stream.write cannot take a string argument containing null bytes.
Perhaps you should write a function that takes a string which might contain null bytes (or other unprintable characters) and replaces them with printable escape sequences. Passing in a string like '\x31\x00' would then result in the string '1\\x00' being printed.
Or, if the string you're logging is all binary data, simply convert each character into its \xDD escape code equivalent. Or just print every character in the string as a simple two-digit hex code, so the entire string is logged as a sequence of hex codes.
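The two conversions described above could be sketched roughly as follows. The function names escape_unprintable and to_hex are made up for this example, and the choice to also escape the backslash itself (so the output stays unambiguous) is an assumption, not something the original answer specifies.

```python
def escape_unprintable(data):
    """Replace non-printable bytes with \\xDD escapes, keep printable ASCII."""
    out = []
    for byte in bytearray(data):  # bytearray yields ints on Python 2.6+ and 3
        # Printable ASCII is kept as-is; the backslash (0x5c) is escaped too,
        # so a literal backslash cannot be confused with an escape sequence.
        if 0x20 <= byte < 0x7f and byte != 0x5c:
            out.append(chr(byte))
        else:
            out.append('\\x%02x' % byte)
    return ''.join(out)


def to_hex(data):
    """Render every byte as a two-digit hex code, separated by spaces."""
    return ' '.join('%02x' % byte for byte in bytearray(data))


print(escape_unprintable(b'\x31\x00'))  # 1\x00
print(to_hex(b'\x31\x00'))              # 31 00
```

Either result is plain printable text, so it can be passed to logging.getLogger().info() without tripping the null-byte check.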
You have provided a byte-string when a character (Unicode) string is expected. Remember, in Python 2.x the "string" type is really a byte-string not a character string. (This follows C, where the "char" type is really a byte and 'A' is really 0x41.) If you use either the u'string' syntax or the unicode() built-in directly before logging, that will ensure that only character strings are logged. In that case, byte-strings that cannot be decoded into character strings using ASCII encoding before logging will get you an exception at that point instead of from within the Apache calls.
To actually log byte-strings, which seems to be what you want, you will first need to encode them somehow into (Unicode) character strings. base64 is simple to do, but the data will need to be decoded again to be human-readable. I wrote a hex-dump function, which took me a few hours to get exactly the way I wanted it.
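For illustration, here is a rough sketch of both options. The base64 part uses only the standard library. The hex_dump function is a hypothetical minimal version (offset, hex bytes, printable-ASCII column), not the function the answer refers to, and its layout is an assumption.

```python
import base64


def hex_dump(data, width=16):
    """Hypothetical minimal hex dump: offset, hex bytes, ASCII column."""
    raw = bytearray(data)
    lines = []
    for offset in range(0, len(raw), width):
        chunk = raw[offset:offset + width]
        hex_part = ' '.join('%02x' % b for b in chunk)
        # Non-printable bytes are shown as '.' in the text column.
        text_part = ''.join(chr(b) if 0x20 <= b < 0x7f else '.' for b in chunk)
        # Pad the hex column so the text column lines up on short rows.
        lines.append('%08x  %-*s  %s' % (offset, width * 3 - 1, hex_part, text_part))
    return '\n'.join(lines)


data = b'\x31\x00'

# base64: compact and safe to log, but must be decoded to be readable.
encoded = base64.b64encode(data).decode('ascii')
print(encoded)                      # MQA=
assert base64.b64decode(encoded) == data

# Hex dump: longer, but human-readable directly in the log.
print(hex_dump(data))
```

base64 suits logs that a tool will read back later, while a hex dump is easier to inspect by eye.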