Any idea how this can be read as a CSV? - Python

I have a CSV formatted in the following way:

ID=123[DLE]KEY=test[DLE]KEY2=VAL123

where [DLE] is the "Data link escape" control character


Any idea how I could use this with the csv standard library?

Do I need to edit each row in order to make it compatible?

Edit: my main problem is the "KEY=VALUE" formatting

Thanks guys


Your data is not actually in CSV format, so I'd give up on trying to use the csv module for it. Instead, I'd write a generator that takes each line, calls .split('\x10') on it, calls .split('=', 1) on each piece, and yields the result as a dict:

def dgen(fin):
  for line in fin:
    # Split the line on DLE, then split each chunk on its first '='.
    yield dict(chunk.split('=', 1)
        for chunk in line.rstrip('\r\n').split('\x10'))
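For reference, here is a small self-contained sketch of the same generator idea, run against an in-memory sample rather than a real file. The name `parse_dle_lines` and the sample data are made up for illustration, and it assumes every chunk contains an '='.

```python
import io

DLE = '\x10'  # the "Data link escape" control character

def parse_dle_lines(lines):
    """Yield one dict per line of DLE-separated KEY=VALUE chunks."""
    for line in lines:
        line = line.rstrip('\r\n')
        if not line:
            continue  # skip blank lines
        # Split only on the first '=' so values may contain '=' themselves.
        yield dict(chunk.split('=', 1) for chunk in line.split(DLE))

# Hypothetical sample standing in for an open file object.
sample = io.StringIO(
    'ID=123\x10KEY=test\x10KEY2=VAL123\n'
    'ID=456\x10KEY=test2\x10KEY2=val456\n'
)
rows = list(parse_dle_lines(sample))
print(rows)
```

Any iterable of lines works here, so an open file can be passed in place of the StringIO object.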


You can parse it by simply overriding the delimiter (pretend the snowman is your DLE):

import csv
testdata='ID=123☃KEY=test☃KEY2=VAL123\nID=456☃KEY=test2☃KEY2=val456'
testdataiter=testdata.splitlines()
reader = csv.reader(testdataiter, delimiter='☃')
for row in reader:
    print (','.join(row))

outputs:

ID=123,KEY=test,KEY2=VAL123
ID=456,KEY=test2,KEY2=val456

Check the help for the csv module: any of the dialect parameters can be overridden in the reader constructor.
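If the same settings are needed in several places, one option is to register them once as a named dialect. This is only a sketch: the dialect name 'dle' is made up, and turning quoting off assumes the data never uses quote characters to protect a delimiter.

```python
import csv

# 'dle' is a hypothetical dialect name; '\x10' is the DLE control character.
csv.register_dialect('dle', delimiter='\x10', quoting=csv.QUOTE_NONE)

lines = ['ID=123\x10KEY=test\x10KEY2=VAL123']
rows = list(csv.reader(lines, dialect='dle'))
print(rows)
```

Each row still comes back as a list of raw KEY=VALUE strings, so the splitting on '=' has to happen afterwards.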


To parse your KEY=VAL into a dict, change to:

...
for row in reader:
    rowdict = dict(x.split('=', 1) for x in row)
    print(rowdict)

outputs:

{'ID': '123', 'KEY': 'test', 'KEY2': 'VAL123'}
{'ID': '456', 'KEY': 'test2', 'KEY2': 'val456'}
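If some fields might lack an '=' or carry one inside the value, a slightly more defensive conversion could look like the sketch below. The helper name `chunks_to_dict` and the choice to map a bare key to None are assumptions, not something the question specifies.

```python
def chunks_to_dict(row):
    """Turn ['K=V', ...] into a dict; chunks without '=' map to None."""
    result = {}
    for chunk in row:
        # partition splits on the first '=' only and never raises.
        key, sep, value = chunk.partition('=')
        result[key] = value if sep else None
    return result

parsed = chunks_to_dict(['ID=123', 'KEY=a=b', 'FLAG'])
print(parsed)
```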


In Python you can pass the csv module a custom delimiter character (see the csv module docs):

>>> import csv
>>> spamReader = csv.reader(open('eggs.csv', newline=''), delimiter=' ', quotechar='|')


Is this what you're after?

>>> fields = ["ID", "KEY", "KEY2"]
>>> mydict = csv.DictReader(open("csv.txt", newline=""), delimiter=chr(16), fieldnames=fields)
>>> mylist = [line for line in mydict]
>>> mylist
[{'ID': 'ID=123', 'KEY': 'KEY=test', 'KEY2': 'KEY2=VAL123'}]

Edit: Better answer given above


import csv
reader = csv.reader(open(datafile, newline=""), delimiter=chr(16))
# Split each field on its first "=" only, so values may contain "=".
data = (dict(i.split("=", 1) for i in row) for row in reader)

This gives you a generator of dicts. You can turn it into a list or a tuple:

data_list = list(data)

The result is a list of dicts:

[{'ID': '123', 'KEY': 'test', 'KEY2': 'VAL123'},
 {'ID': '456', 'KEY': 'test2', 'KEY2': 'val456'}]