
Parsing a tab-delimited text file to replace columns with one vertical list (Python)

I'm very new to Python and I know this is a pretty basic question. I have a text file with columns of data. I want to remove the columns and make it one long list.

I have the following code:

for line in open('feddocs_2011.txt', 'r'):
    segmentedLine = line.split("/t")
    print segmentedLine

This seems to create a separate string for each line, but I think I may need to loop through each of those new strings to split those next. I thought it would have put everything following a tab on a new line. I tried the following, but got an error message that "list" doesn't have a split function.

while segmentedLine:
    item = segmentedLine.split("\t")
    print item

Thanks very much for any input.


You've got the lines split properly in the first loop. What you want to do then is have a second for loop to iterate over each tab-separated item. That'll look like this:

for line in open('feddocs_2011.txt', 'r'):
    segmentedLine = line.split("\t")
    for item in segmentedLine:
        print item

Or more concisely, without the temporary variable:

for line in open('feddocs_2011.txt', 'r'):
    for item in line.split("\t"):
        print item
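
For reference, here is a minimal, self-contained sketch of the same nested loop in Python 3 syntax. It assumes the separator is the escape "\t"; the "/t" in the question is a literal slash followed by the letter t, so it never matches a tab. The sample data and the helper name flatten_tab_lines are made up for illustration and stand in for the real file.

```python
import io


def flatten_tab_lines(lines):
    """Return every tab-separated field from an iterable of lines as one flat list."""
    items = []
    for line in lines:
        # Drop the trailing newline so it does not stick to the last field.
        for item in line.rstrip("\n").split("\t"):
            items.append(item)
    return items


# Hypothetical sample data standing in for feddocs_2011.txt.
sample = io.StringIO("foo\tbar\tbaz\nbla\tbla\tbla\n")
flat = flatten_tab_lines(sample)
for item in flat:
    print(item)

# "/t" is not a tab, so splitting on it leaves the line in one piece.
unsplit = "foo\tbar".split("/t")
print(unsplit)
```

With a real file, passing the open file object (for example inside a with open(...) block) in place of sample should behave the same way.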


What about:

x = [line.split('\t') for line in open('file.txt')]

and you can join the lists, if you want:

sum(x, [])
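
As a rough sketch of what that joining step does: sum(x, []) builds a new list at every addition, so for large files itertools.chain.from_iterable is usually the lighter option. Note also that the last field of each row still carries its newline. The rows below are hypothetical stand-ins for what line.split('\t') would return.

```python
from itertools import chain

# Hypothetical rows, shaped like the output of line.split('\t') per line.
# The last field of each row keeps the line's trailing "\n".
x = [["foo", "bar", "baz\n"], ["bla", "bla", "bla\n"]]

# Both expressions flatten the list of lists into one list.
joined_with_sum = sum(x, [])
joined_with_chain = list(chain.from_iterable(x))

# Strip the leftover newlines from the flattened items.
cleaned = [item.rstrip("\n") for item in joined_with_chain]
print(cleaned)
```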

[Edit]

If the fields themselves contain no spaces (so tabs and newlines are the only whitespace in the file), you can simply do:

x = open('file.txt').read().split()
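
The caveat matters because split() with no argument splits on any run of whitespace, not only on tabs. A small sketch with made-up data shows the difference:

```python
# Hypothetical file contents where some fields contain spaces.
text = "New York\tNY\nLos Angeles\tCA\n"

# No-argument split() breaks on spaces, tabs and newlines alike.
by_whitespace = text.split()

# Splitting each line on "\t" keeps multi-word fields intact.
by_tab = [field for line in text.splitlines() for field in line.split("\t")]

print(by_whitespace)
print(by_tab)
```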


So you have

foo<tab>bar<tab>baz
bla<tab>bla<tab>bla

and you want it to be

foo
bar
baz
bla
bla
bla

Right?

Then you can just do

with open("myfile.txt", "r") as f:
    text = f.read().replace("\t", "\n")

Now text is a single string. If you want a list of all the items instead (["foo", "bar", "baz", "bla", "bla", "bla"]), split that string into lines afterwards (the file has already been read, so don't call f.read() a second time):

items = text.splitlines()
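
One detail worth checking when turning the string into a list: if the file ends with a newline, split("\n") leaves an empty string at the end, while splitlines() does not. A quick sketch, using an in-memory string in place of the file contents:

```python
# Hypothetical contents of myfile.txt, ending with a newline.
text = "foo\tbar\nbla\tbla\n".replace("\t", "\n")

with_split = text.split("\n")        # keeps a trailing empty string
with_splitlines = text.splitlines()  # no trailing empty string

print(with_split)
print(with_splitlines)
```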


If I understand correctly, what you're after is:

import itertools
# Split on tabs (the file is tab-delimited), stripping only the trailing newline
print '\n'.join(itertools.chain.from_iterable(line.rstrip('\n').split('\t') for line in open('feddocs_2011.txt', 'r')))


"put everything following a tab on a new line"

If this is all you want, why not just use the str.replace function?

for line in open('feddocs_2011.txt', 'r'):
    segmented_line = line.replace('\t', '\n')
    print(segmented_line)

If, for some reason, you want to keep the tabs:

for line in open('feddocs_2011.txt', 'r'):
    segmented_line = line.replace('\t', '\t\n')
    print(segmented_line)