
Reading lines from a text file in Python (Windows)

I am working on a simple Python import routine that translates a text file into a JSON file for our system.

import json

# Open text file for reading
txtFile = open('Boating.Make.txt', 'r')

# Create picklist obj
picklistObj = dict()
picklistObj['name'] = 'Boating.Make'
picklistObj['items'] = list()

i = 0
# Iterate through each make in text file
for line in txtFile:
    picklistItemObj = dict()
    picklistItemObj['value'] = str(i)
    picklistItemObj['text'] = line.strip()
    picklistItemObj['selectable'] = True
    picklistObj['items'].append(picklistItemObj)
    i = i + 1
txtFile.close()

picklistJson = json.dumps(picklistObj, indent=4)
print(picklistJson)

picklistFile = open('Boating.Make.json', 'w')
picklistFile.write(picklistJson)
picklistFile.close()

My question is: why do I need the strip()? I thought Python was supposed to magically know the newline convention of whatever environment I am running in. Am I missing something?

I should clarify that the text file I am reading from is an ASCII file that contains lines of text separated by '\r\n'.


Python keeps the newline characters when you iterate over the lines of a file. For example, when iterating over a text file such as

foo
bar

you get two strings: "foo\n" and "bar\n". If you don't want the trailing newline characters, call strip() on each line.
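To make that concrete, here is a minimal sketch of what iteration hands back and what two common clean-ups do to it. The sample entries ("Bayliner", "Sea Ray"), the temporary file, and the helper read_lines are hypothetical, not taken from the question. Note that strip() removes all leading and trailing whitespace, while rstrip('\r\n') removes only the line terminator.

```python
import os
import tempfile

def read_lines(path, clean):
    """Return every line of the file after applying `clean` to it."""
    with open(path) as handle:
        return [clean(line) for line in handle]

# Hypothetical sample data: one indented entry and one with trailing spaces,
# both terminated Windows-style.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as handle:
    handle.write(b'  Bayliner\r\nSea Ray  \r\n')

raw = read_lines(path, lambda line: line)                     # terminators kept
stripped = read_lines(path, lambda line: line.strip())        # all edge whitespace gone
trimmed = read_lines(path, lambda line: line.rstrip('\r\n'))  # only terminators gone
os.remove(path)

print(raw)
print(stripped)
print(trimmed)
```

If the surrounding spaces in an entry are meaningful, rstrip('\r\n') is the safer choice; for a picklist like the one in the question, strip() is probably what you want anyway.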

I am not a fan of this behavior, by the way.


See the Python 2 documentation for open():

Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'
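The 'U' flag quoted above belongs to Python 2's built-in open(). As a rough illustration of the same idea, the sketch below uses io.open (available in Python 2.6+ and identical to the built-in open() in Python 3), whose newline parameter controls the translation. The file contents and variable names are made up for the example.

```python
import io
import os
import tempfile

# Hypothetical sample: three lines, each ending with a different terminator.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as handle:
    handle.write(b'unix\nold-mac\rwindows\r\n')

# newline=None (the default): every terminator is recognised and
# translated to '\n' before the line reaches your code.
with io.open(path, 'r', encoding='ascii', newline=None) as handle:
    translated = list(handle)

# newline='': lines are still split on all three terminators,
# but the terminators are passed through untranslated.
with io.open(path, 'r', encoding='ascii', newline='') as handle:
    untranslated = list(handle)

os.remove(path)

print(translated)
print(untranslated)
```

Either way the terminator stays attached to each line; universal newlines only decide which characters it is made of.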


You need the strip() because "for line in file:" keeps the line terminators on the lines. This is not explicitly stated in the docs (at least not in the 2.7.1 docs I'm looking at), but iteration behaves much like file.readline(), whose documentation does explicitly state that it retains the newline.
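A quick way to check this yourself is to read the same file both ways and compare. This is only a sketch with a throwaway temporary file; it also shows read().splitlines(), one alternative (not mentioned above) that drops the terminators for you.

```python
import os
import tempfile

# Throwaway sample file with two newline-terminated lines.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as handle:
    handle.write(b'foo\nbar\n')

# readline() returns the first line with its newline attached.
with open(path) as handle:
    first_by_readline = handle.readline()

# Iterating over the file object yields the same string.
with open(path) as handle:
    first_by_iteration = next(iter(handle))

# str.splitlines() splits on line boundaries and discards them by default.
with open(path) as handle:
    without_terminators = handle.read().splitlines()

os.remove(path)

print(repr(first_by_readline))
print(repr(first_by_iteration))
print(without_terminators)
```

Keep in mind that read().splitlines() loads the whole file into memory, which is fine for a small picklist but not for very large files.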


Try the following in a Python interpreter to see what the language does:

open('test1.txt', 'wb').write(b'Hello\nWorld!')
open('test2.txt', 'wb').write(b'Hello\r\nWorld!')
print(list(open('test1.txt'))) # Shows ['Hello\n', 'World!']
print(list(open('test2.txt'))) # Shows ['Hello\n', 'World!']

Python does recognize the correct newlines. Rather than calling strip() on your strings, you might want to write myString.replace('\n', ''), which removes only the newline characters and leaves any other whitespace alone. Check the documentation:

>>> help(str.strip)
Help on method_descriptor:

strip(...)
    S.strip([chars]) -> str

    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.

>>> help(str.replace)
Help on method_descriptor:

replace(...)
    S.replace(old, new[, count]) -> str

    Return a copy of S with all occurrences of substring
    old replaced by new.  If the optional argument count is
    given, only the first count occurrences are replaced.
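The practical difference between the two methods shows up on strings with extra whitespace or embedded newlines. The strings below are invented for illustration:

```python
# A line as it might come out of a file: leading spaces, a trailing tab,
# and the newline that iteration keeps.
line = '  Sea Ray\t\n'

via_strip = line.strip()               # trims ALL leading/trailing whitespace
via_replace = line.replace('\n', '')   # deletes only the newline character

# A string with a newline in the middle behaves differently again:
# strip() only touches the ends, replace() touches every occurrence.
multi = 'one\ntwo\n'
multi_strip = multi.strip()
multi_replace = multi.replace('\n', '')

print(repr(via_strip))
print(repr(via_replace))
print(repr(multi_strip))
print(repr(multi_replace))
```

So replace('\n', '') preserves surrounding spaces and tabs but also joins anything separated by an embedded newline, while strip() cleans both ends and leaves the middle alone.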