Reading lines from a text file in Python (Windows)
I am working on a simple import routine in Python that translates a text file into a JSON file format for our system.
import json

# Open text file for reading
txtFile = open('Boating.Make.txt', 'r')

# Create picklist obj
picklistObj = dict()
picklistObj['name'] = 'Boating.Make'
picklistObj['items'] = list()

i = 0

# Iterate through each make in text file
for line in txtFile:
    picklistItemObj = dict()
    picklistItemObj['value'] = str(i)
    picklistItemObj['text'] = line.strip()
    picklistItemObj['selectable'] = True
    picklistObj['items'].append(picklistItemObj)
    i = i + 1

txtFile.close()

picklistJson = json.dumps(picklistObj, indent=4)
print(picklistJson)

picklistFile = open('Boating.Make.json', 'w')
picklistFile.write(picklistJson)
picklistFile.close()
My question is, why do I need the "strip"? I thought that Python was supposed to magically know the newline constant for whatever environment I am currently in. Am I missing something?
I should clarify that the text file I am reading from is an ASCII file that contains lines of text separated by '\r\n'.
Python keeps the newline characters when you iterate over the lines of a file. For example, when iterating over a text file such as
foo
bar
you get two strings: "foo\n" and "bar\n". If you don't want the terminal newline characters, you call strip().
I am not a fan of this behavior by the way.
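A minimal sketch of that behavior, assuming Python 3 and using io.StringIO as a stand-in for an open text file (the sample contents are made up). It also shows that rstrip("\n") removes only the line terminator, whereas strip() removes all leading and trailing whitespace:

```python
import io

# Stand-in for a file opened in text mode; the contents are invented for illustration.
fake_file = io.StringIO("  foo\nbar  \nbaz")

raw_lines = list(fake_file)                              # terminators are kept
no_newline = [line.rstrip("\n") for line in raw_lines]   # drop only the newline
stripped = [line.strip() for line in raw_lines]          # drop all surrounding whitespace

print(raw_lines)
print(no_newline)
print(stripped)
```

Note that the last line has no terminator at all when the file does not end with a newline, so the code should not assume one is always present.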
See the Python 2 documentation for open():
Python is usually built with universal newlines support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program.
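The translation described in that passage can be observed directly. The sketch below assumes Python 3, where newline translation is the default for text mode and no 'U' flag is needed. The sample bytes and the temporary file are illustrative assumptions, not part of the original question:

```python
import os
import tempfile

# Hypothetical sample data: one line per newline convention.
data = b"unix\nold-mac\rwindows\r\nlast"

# Write the bytes verbatim so no translation happens on the way in.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(data)

try:
    # Default text mode: every convention is translated to "\n".
    with open(path, "r", encoding="ascii") as f:
        translated = list(f)

    # newline="" still splits on all three conventions,
    # but returns the terminators untranslated.
    with open(path, "r", encoding="ascii", newline="") as f:
        untranslated = list(f)
finally:
    os.remove(path)

print(translated)
print(untranslated)
```

In both modes the terminator stays attached to each line. Translation changes what the terminator looks like, not whether it is present.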
You need the strip() because "for line in file:" keeps the line terminators on the lines. This isn't explicitly stated in the docs (at least in the 2.7.1 docs I'm looking at), but iteration behaves like file.readline(), which is explicitly documented as retaining the newline.
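To compare the two side by side, here is a small sketch assuming Python 3, with io.StringIO standing in for a real file and made-up contents. It also shows str.splitlines() as one alternative that returns lines without their terminators:

```python
import io

text = "Alpha\nBravo\nCharlie\n"  # invented sample content

# readline() returns one line at a time, terminator included.
f = io.StringIO(text)
first = f.readline()

# Iteration continues from the current position and also keeps terminators.
rest = list(f)

# str.splitlines() is an alternative that drops the terminators.
without_terminators = text.splitlines()

print(repr(first))
print(rest)
print(without_terminators)
```

splitlines() works on a string already in memory, so it suits small files read with read(). For large files, iterating and trimming each line avoids loading everything at once.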
Try the following in a Python interpreter to see what the language does:
open('test1.txt', 'wb').write(b'Hello\nWorld!')
open('test2.txt', 'wb').write(b'Hello\r\nWorld!')
print(list(open('test1.txt'))) # Shows ['Hello\n', 'World!']
print(list(open('test2.txt'))) # Shows ['Hello\n', 'World!']
Python does recognize the correct newlines. Rather than calling strip on your strings, you might want to write myString.replace('\n', ''), which removes only newline characters and leaves other leading and trailing whitespace intact. Check the documentation:
>>> help(str.strip)
Help on method_descriptor:

strip(...)
    S.strip([chars]) -> str

    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.
>>> help(str.replace)
Help on method_descriptor:

replace(...)
    S.replace(old, new[, count]) -> str

    Return a copy of S with all occurrences of substring
    old replaced by new. If the optional argument count is
    given, only the first count occurrences are replaced.
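The practical difference between the two methods shows up on a string that has both surrounding whitespace and an interior newline. A short sketch with an invented sample string:

```python
s = "  first line\nsecond line \n"  # invented sample string

stripped = s.strip()            # trims whitespace at both ends only
replaced = s.replace("\n", "")  # deletes every newline, including interior ones

print(repr(stripped))
print(repr(replaced))
```

For a single line taken from a file the interior case does not arise, so the choice mostly comes down to whether leading and trailing spaces should be preserved.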