Parsing a tab-delimited text file to replace columns with one vertical list (Python)
I'm very new to Python and I know this is a pretty basic question. I have a text file with columns of data. I want to remove the columns and make it one long list.
I have the following code:
for line in open('feddocs_2011.txt', 'r'):
    segmentedLine = line.split("/t")
    print segmentedLine
This seems to create a separate string for each line, but I think I may need to loop through each of those new strings to split those next. I thought it would have put everything following a tab on a new line. I tried the following, but got an error message that "list" doesn't have a split function.
while segmentedLine:
    item = segmentedLine.split("\t")
    print item
Thanks very much for any input.
You've got the lines split properly in the first loop. What you want to do then is have a second for loop to iterate over each tab-separated item. That'll look like this:
for line in open('feddocs_2011.txt', 'r'):
    segmentedLine = line.split("\t")
    for item in segmentedLine:
        print item
Or more concisely, without the temporary variable:
for line in open('feddocs_2011.txt', 'r'):
    for item in line.split("\t"):
        print item
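If the goal is one list rather than printed output, the same nested loop can collect the items instead. A minimal sketch, assuming Python 3; the helper name flatten_tab_lines is made up for illustration, and an in-memory sample with invented contents stands in for feddocs_2011.txt:

```python
import io

def flatten_tab_lines(lines):
    """Return every tab-separated field from an iterable of lines as one flat list."""
    items = []
    for line in lines:
        # rstrip("\n") drops the line ending so the last field stays clean
        for item in line.rstrip("\n").split("\t"):
            items.append(item)
    return items

# Stand-in for open('feddocs_2011.txt'); the sample contents are invented.
sample = io.StringIO("foo\tbar\tbaz\nbla\tbla\tbla\n")
print("\n".join(flatten_tab_lines(sample)))
```

Any iterable of lines works here, so an open file object can be passed in place of the sample.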
What about:
x = [line.split('\t') for line in open('file.txt')]
And you can join the lists into one, if you want:
sum(x, [])
[Edit]
If your file only has tabs (no spaces) separating the items, you can simply do:
x = open('file.txt').read().split()
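The difference between the two approaches shows up as soon as a field contains a space. A small sketch, assuming Python 3 and an invented sample string in place of the file; itertools.chain.from_iterable is swapped in for sum(x, []) as a standard-library way to flatten that avoids repeated list copying:

```python
from itertools import chain

data = "New York\tBoston\nSan Jose\tAustin\n"  # invented sample, not the real file

# Split each line on tabs only, then flatten the list of lists.
rows = [line.split("\t") for line in data.splitlines()]
by_tab = list(chain.from_iterable(rows))

# split() with no argument splits on any run of whitespace, spaces included.
by_whitespace = data.split()

print(by_tab)         # ['New York', 'Boston', 'San Jose', 'Austin']
print(by_whitespace)  # ['New', 'York', 'Boston', 'San', 'Jose', 'Austin']
```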
So you have
foo<tab>bar<tab>baz
bla<tab>bla<tab>bla
and you want it to be
foo
bar
baz
bla
bla
bla
Right?
Then you can just do
with open("myfile.txt", "r") as f:
    text = f.read().replace("\t", "\n")
Now text is a single string. If you want a list of all the items instead (["foo", "bar", "baz", "bla", "bla", "bla"]), use
text = f.read().replace("\t", "\n").split("\n")
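One thing to watch with this approach: if the file ends with a newline, splitting on "\n" leaves an empty string at the end of the list. A sketch of one way around that, assuming Python 3 and an invented sample string standing in for f.read():

```python
text = "foo\tbar\tbaz\nbla\tbla\tbla\n"  # stands in for f.read()

naive = text.replace("\t", "\n").split("\n")
# The final "\n" produces a trailing empty string.
print(naive)   # ['foo', 'bar', 'baz', 'bla', 'bla', 'bla', '']

# splitlines() does not produce that trailing empty entry.
items = text.replace("\t", "\n").splitlines()
print(items)   # ['foo', 'bar', 'baz', 'bla', 'bla', 'bla']
```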
If I understand correctly, what you're after is:
import itertools
print '\n'.join(list(itertools.chain(*[line.strip().split('\t') for line in open('feddocs_2011.txt', 'r')])))
"put everything following a tab on a new line"
If this is all you want, why not just use the str.replace function?
for line in open('feddocs_2011.txt', 'r'):
    segmented_line = line.replace('\t', '\n')
    print(segmented_line)
If, for some reason, you want to keep the tabs:
for line in open('feddocs_2011.txt', 'r'):
    segmented_line = line.replace('\t', '\t\n')
    print(segmented_line)
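Note that each line read from the file still ends with its own newline, so print adds a second one and the output gets a blank line after every original row. A minimal sketch of one way to avoid that, assuming Python 3; the helper name tabs_to_newlines and the sample rows are invented for illustration:

```python
def tabs_to_newlines(lines):
    """Join lines after turning each tab into a newline; line endings are kept as-is."""
    return "".join(line.replace("\t", "\n") for line in lines)

# Invented rows standing in for the lines of feddocs_2011.txt.
sample = ["foo\tbar\tbaz\n", "bla\tbla\tbla\n"]

# end="" stops print from adding an extra newline after the last item.
print(tabs_to_newlines(sample), end="")
```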