How do I convert multiple sets of data running left to right into rows running top to bottom, the Pythonic way?

The following is a sample of the sets of contacts for each company, going from left to right.

ID Company ContactFirst1 ContactLast1 Title1 Email1     ContactFirst2 ContactLast2 Title2 Email2
1  ABC     John          Doe          CEO    jd@abc.com Steve         Bern         CIO    sb@abc.com

How do I get them to go top to bottom as shown?

ID Company ContactFirst ContactLast Title Email
1  ABC     John         Doe         CEO   jd@abc.com
1  ABC     Steve        Bern        CIO   sb@abc.com

I am hoping there is a Pythonic way of solving this task. Any pointers or samples are really appreciated!

P.S.: In the actual file, there are 10 sets of contacts going from left to right, and there are a few thousand such records. It is a CSV file, and I loaded it into MySQL to manipulate the data.


Here is a somewhat cleaner version of the above. It presumes the data is comma-delimited. The sample data appears to be fixed-width instead; I suggest loading it in Excel or OpenOffice and resaving as actual comma-delimited CSV.

import csv

def main():
    infname  = 'contacts.csv'
    outfname = 'per_contact.csv'

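    # newline='' lets the csv module handle line endings itself (Python 3)
    with open(infname, newline='') as inf, open(outfname, 'w', newline='') as outf:
        inCsv  = csv.reader(inf)
        outCsv = csv.writer(outf)

        next(inCsv)  # skip header row (inCsv.next() only works in Python 2)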
        outCsv.writerow(['ID', 'Company', 'ContactFirst', 'ContactLast', 'Title', 'Email'])

        for row in inCsv:
            id_co = row[:2]
            for contact in (row[i:i+4] for i in range(2, len(row), 4)):
                if any(c.strip() for c in contact):  # at least one cell contains data?
                    outCsv.writerow(id_co+contact)

if __name__=="__main__":
    main()
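
If the file really is whitespace-aligned like the sample in the question, rather than comma-delimited, the conversion step could also be done in Python instead of a spreadsheet. This is only a minimal sketch: the function name whitespace_to_csv is made up for illustration, and it assumes that no field contains whitespace and that no cell is empty (an empty cell would shift the remaining columns to the left).

```python
import csv
import io

def whitespace_to_csv(text):
    """Convert whitespace-aligned rows to comma-delimited CSV text.

    Assumption: fields never contain spaces and are never empty.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        fields = line.split()  # split on any run of whitespace
        if fields:             # skip blank lines
            writer.writerow(fields)
    return out.getvalue()

# The two sample lines from the question
sample = (
    "ID Company ContactFirst1 ContactLast1 Title1 Email1     ContactFirst2 ContactLast2 Title2 Email2\n"
    "1  ABC     John          Doe          CEO    jd@abc.com Steve         Bern         CIO    sb@abc.com\n"
)
print(whitespace_to_csv(sample))
```

The resulting text can be saved as contacts.csv and fed to the script above. For real data with multi-word titles or company names, a proper fixed-width parser or the spreadsheet route is the safer choice.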


This should do what you want:

import csv

# The character that separates the fields in each row
field_delimiter = '\t'
# The number of fields for each contact
contact_fields = 4

# File in, file out (newline='' is what the csv module expects in Python 3)
with open('foo.txt', 'r', newline='') as f_in, open('bar.txt', 'w', newline='') as f_out:
    csv_in = csv.reader(f_in, delimiter=field_delimiter)
    csv_out = csv.writer(f_out, delimiter=field_delimiter)

    # Iterate through the file, breaking each line into "contact" sized chunks
    # and spitting those chunks out as individual lines, into a new file.
    for index, fields in enumerate(csv_in):
        # Set aside the ID and company fields, since they are shared by all of the contacts
        record_id = fields.pop(0)  # renamed from "id" to avoid shadowing the builtin
        company = fields.pop(0)

        # Step through the line one contact-sized chunk at a time
        for end in range(contact_fields, len(fields) + 1, contact_fields):
            # Write the shared fields plus this contact's fields as one output row.
            csv_out.writerow([record_id, company] + fields[end - contact_fields:end])
            if index == 0:
                # This is the header line, only preserve the first set.
                break
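
To check the chunking logic without touching any files, the core step can be pulled into a small function and run on the sample row from the question. This is a sketch, not part of either answer as posted: the name split_contacts and its parameters are hypothetical, and it also skips contact sets whose cells are all blank, which is likely to matter when some companies have fewer than 10 contacts.

```python
def split_contacts(row, shared=2, contact_fields=4):
    """Split one wide row into one row per contact.

    row            -- list of cell values for a single input record
    shared         -- number of leading fields repeated on every output row
    contact_fields -- number of fields that make up one contact
    """
    head = row[:shared]
    result = []
    for start in range(shared, len(row), contact_fields):
        contact = row[start:start + contact_fields]
        # Keep the contact only if at least one cell holds data
        if any(cell.strip() for cell in contact):
            result.append(head + contact)
    return result

row = ["1", "ABC", "John", "Doe", "CEO", "jd@abc.com",
       "Steve", "Bern", "CIO", "sb@abc.com"]
for line in split_contacts(row):
    print(line)
# ['1', 'ABC', 'John', 'Doe', 'CEO', 'jd@abc.com']
# ['1', 'ABC', 'Steve', 'Bern', 'CIO', 'sb@abc.com']
```

With 10 contact sets per record, as in the actual file, the same call works unchanged because the loop runs until the end of the row.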