Handling \r\n vs \n newlines in Python on Mac vs Windows
I have a Python script that gave different output when run on a Windows machine and when run on a Mac. On digging deeper, I discovered that this was because when Python read in line breaks on the Mac (from a file), it read in \r\n, while somehow on Windows the \r disappears.
Thus, if I change every \n in the script to \r\n, it works fine on the Mac. But if I do that, it stops working on the Windows PC.
Is there an easy way to fix this problem?
Different platforms use different codes for "new line". Windows uses \r\n, Unix uses \n, old Macs used \r, and yes, there are some systems that use \n\r too.
When you open a file in text mode in Python 3, it will convert all newlines to '\n' and be done with it.
infile = open("filename", 'r')
Text mode is default, so if you say nothing, it's text mode. But it's always better to be explicit:
infile = open("filename", 'rt')
If you don't want the translation of line endings to happen, open the file in binary mode:
infile = open("filename", 'rb')
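As a rough illustration of the difference between the two modes, here is a minimal sketch. The helper name read_both_ways and the scratch file are made up for this example; the file is written with mixed line endings and then read back in text and in binary mode under Python 3.

```python
import os
import tempfile

def read_both_ways(raw_bytes):
    """Write raw_bytes to a scratch file, then read it in text and binary mode."""
    # Hypothetical scratch file, used only for this illustration.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(raw_bytes)
        # Text mode (Python 3): '\r\n' and '\r' are translated to '\n'.
        with open(path, "rt", encoding="ascii") as f:
            as_text = f.read()
        # Binary mode: the bytes come back exactly as stored.
        with open(path, "rb") as f:
            as_bytes = f.read()
    finally:
        os.remove(path)
    return as_text, as_bytes

text, data = read_both_ways(b"one\r\ntwo\rthree\n")
print(repr(text))  # 'one\ntwo\nthree\n'
print(repr(data))  # b'one\r\ntwo\rthree\n'
```

The text-mode result should be the same on Windows, Mac and Linux, which is usually what a cross-platform script wants.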
In Python 2 it's different. There this conversion would only happen by default on Windows. If you wanted it to happen on other platforms, you could add the universal newline flag:
infile = open("filename", 'rU')
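However, you say that you are on Python 3, and there it happens in text mode on all platforms, so adding the U flag makes no difference. Note that the U flag was removed in Python 3.11, where passing it raises an error, so it is best left out.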
'U' mode:
Python 2:
I guess it may depend on what you're reading from, but the built-in open() function takes a 'mode' parameter, and if you pass 'U' for the mode, Python 2 will take care of the newlines in a cross-platform way transparently. It requires that Python be built with universal newline support, but test it out!
https://docs.python.org/2/library/functions.html#open
Python 3:
In Python 3, the behaviour that the 'U' mode used to enable is the default, as the docs explain:
There is an additional mode character permitted, 'U', which no longer has any effect, and is considered deprecated. It previously enabled universal newlines in text mode, which became the default behaviour in Python 3.0. Refer to the documentation of the newline parameter for further details.
https://docs.python.org/3/library/functions.html#open
In Python 3, the open() function has a newline parameter:
newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:
When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
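The reading rules quoted above can be checked with a small sketch. The helper read_with and its sample bytes are assumptions made for this example, not part of the docs; it reads the same mixed-ending data with several newline settings.

```python
import os
import tempfile

def read_with(newline, raw_bytes=b"a\r\nb\rc\n"):
    """Return what readlines() yields for a given newline= setting."""
    # Hypothetical scratch file, used only for this illustration.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(raw_bytes)
        with open(path, "r", encoding="ascii", newline=newline) as f:
            return f.readlines()
    finally:
        os.remove(path)

# None: universal newlines, every ending translated to '\n'.
print(read_with(None))    # ['a\n', 'b\n', 'c\n']
# '': universal newlines, endings returned untranslated.
print(read_with(""))      # ['a\r\n', 'b\r', 'c\n']
# '\n': only '\n' ends a line, nothing is translated.
print(read_with("\n"))    # ['a\r\n', 'b\rc\n']
# '\r\n': only '\r\n' ends a line, nothing is translated.
print(read_with("\r\n"))  # ['a\r\n', 'b\rc\n']
```

If the goal is simply to get the same result on every platform, leaving newline at its default of None is normally enough.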
The old way of using the U mode specifier has been deprecated in favor of this new way.
'U' universal newlines mode (deprecated)
In Python 3, pass the newline keyword argument to open() (for example newline='\n') to choose the line delimiter used when writing text files. For more information, please see:
https://pythonconquerstheuniverse.wordpress.com/2011/05/08/newline-conversion-in-python-3/
http://docs.python.org/3/library/functions.html#open
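A minimal sketch of the effect on writing follows. The helper written_bytes is a hypothetical name used only here; it writes the same text with different newline settings and then inspects the raw bytes that ended up on disk.

```python
import os
import tempfile

def written_bytes(text, newline):
    """Write text in text mode with the given newline=, return the raw bytes."""
    # Hypothetical scratch file, used only for this illustration.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding="ascii", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

sample = "x\ny\n"
# newline='\n': no translation, identical bytes on every platform.
print(written_bytes(sample, "\n"))    # b'x\ny\n'
# newline='\r\n': every '\n' is written as '\r\n', on every platform.
print(written_bytes(sample, "\r\n"))  # b'x\r\ny\r\n'
# newline=None (the default): '\n' becomes os.linesep, so this one varies.
print(written_bytes(sample, None))
```

Passing an explicit newline value is one way to make a script produce byte-identical files on Windows and on a Mac.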
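On Windows, both work fine: if I write a file with either of the two (\r or \n), Python interprets it as a line break in both cases. When I use "\r\n", however, it is interpreted as a double line break (Python 3 on Windows). This is presumably because text mode on Windows translates the \n to \r\n on writing, so "\r\n" ends up as \r\r\n in the file.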
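The doubled line break can be reproduced on any platform with a small in-memory sketch. It assumes that passing newline="\r\n" to io.TextIOWrapper is a fair stand-in for the default text-mode translation on Windows; the variable names are made up for the example.

```python
import io

def encode_like_windows(text):
    """Return the bytes produced when '\\n' is translated to '\\r\\n' on write."""
    raw = io.BytesIO()
    # newline="\r\n" stands in for the Windows default (os.linesep == "\r\n").
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    return raw.getvalue()

# Writing "\r\n" explicitly: the '\n' is expanded again, leaving a stray '\r'.
doubled = encode_like_windows("first\r\nsecond\n")
print(doubled)  # b'first\r\r\nsecond\r\n'

# Normalizing to '\n' before writing avoids the extra '\r'.
clean = encode_like_windows("first\r\nsecond\n".replace("\r\n", "\n"))
print(clean)    # b'first\r\nsecond\r\n'
```

In other words, in text mode it is usually safest to write plain \n and let Python do the platform translation.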