
Sequentially Combining Three Lists Created with Regular Expressions - Python

I have a text file I am reading with three regular expressions. I'd like to combine each item from each regex search line by line and print them out using the same format as the last line below. I just cannot get the loop right to combine everything.

Sample text from three different sources (you can see that information is sometimes missing and at other times presented in a different format):

  1. Bond Name O/F C/F Cpn MTR FICO CAL WALB 1mCPR 60+ CE CWL 2004-6 2A5 0.95 0.09 L+39 4 49 200 4 28.62 47.69%

  2. Bond Name O/F C/F Cpn FICO CAL WALB 60+ Notes Offer CSMC 06-9 7A1 25.00 12.01 L+45 727 26 577 33.29 FLT,AS,0.0% 50-00

  3. Type CUSIP Bond Name Term Offer Structure PRIME 17312KAB8 CMSI 07-5 1A2 7/7 92.50 LCF

import re

with open("cusip.txt") as f:
    read_string = f.read()

cusip_reg_exp = re.compile(r'\s[0-9]{3}[a-zA-Z0-9]{6}\s')
cusip_result = cusip_reg_exp.findall(read_string)

bond_name_reg_exp = re.compile(r'\s[A-Z]{3,5}\s[0-9]{4}\D{1,3}\S{1,3}\s{1,2}\w{1,3}')
bond_name_result = bond_name_reg_exp.findall(read_string)

bond_price_name_reg_ex = re.compile(r'[$]{0,1}[0-9]{1,2}[-]{1}[0-9]{2}')
bond_price_result = bond_price_name_reg_ex.findall(read_string)

print(cusip_result[0],bond_name_result[0],bond_price_result[0])


You can use zip, or itertools.izip on Python 2 (on Python 3, zip already returns a lazy iterator and izip no longer exists):

for i, j, k in zip(cusip_result, bond_name_result, bond_price_result):
    print(i, j, k)

Depending on the format of the file, the csv module might be helpful too (instead of using regular expressions to extract the content).

You could also iterate over each line and extract the relevant information per line.
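As a rough sketch of the per-line approach: the three patterns below are the ones from the question (written as raw strings), and the sample lines are the question's own with the list numbering removed. The helper name extract_fields and the choice of None for a missing field are assumptions made for illustration, not part of the original post.

```python
import re

# Patterns copied from the question, written as raw strings.
CUSIP_RE = re.compile(r'\s[0-9]{3}[a-zA-Z0-9]{6}\s')
BOND_NAME_RE = re.compile(r'\s[A-Z]{3,5}\s[0-9]{4}\D{1,3}\S{1,3}\s{1,2}\w{1,3}')
BOND_PRICE_RE = re.compile(r'[$]{0,1}[0-9]{1,2}[-]{1}[0-9]{2}')


def extract_fields(line):
    """Return (cusip, bond_name, price) for one line; None marks a missing field."""
    fields = []
    for pattern in (CUSIP_RE, BOND_NAME_RE, BOND_PRICE_RE):
        match = pattern.search(line)
        # Keep only the first hit per line and trim the surrounding whitespace.
        fields.append(match.group().strip() if match else None)
    return tuple(fields)


# Sample lines from the question (hypothetical stand-in for reading cusip.txt).
sample_lines = [
    "Bond Name O/F C/F Cpn MTR FICO CAL WALB 1mCPR 60+ CE "
    "CWL 2004-6 2A5 0.95 0.09 L+39 4 49 200 4 28.62 47.69%",
    "Bond Name O/F C/F Cpn FICO CAL WALB 60+ Notes Offer "
    "CSMC 06-9 7A1 25.00 12.01 L+45 727 26 577 33.29 FLT,AS,0.0% 50-00",
    "Type CUSIP Bond Name Term Offer Structure "
    "PRIME 17312KAB8 CMSI 07-5 1A2 7/7 92.50 LCF",
]

rows = [extract_fields(line) for line in sample_lines]
for row in rows:
    print(row)
```

Because each row comes from a single line, a field that is missing on that line shows up as None instead of silently shifting the later values up, which is what can happen when three separate findall lists are zipped together.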


If all of those lists will be the same length, you can concatenate each corresponding entry (delimited by a space) to create a list of the combined strings, and then concatenate those (delimited by a newline) to create the displayed list of results. I decided to do it with some list comprehension wizardry (no for loops!).

print('\n'.join([' '.join([cusip_item, bond_name_item, bond_price_item]) for (cusip_item, bond_name_item, bond_price_item) in zip(cusip_result, bond_name_result, bond_price_result)]))

Hopefully that serves your needs. If not, I'm sure there will be several other interpretations to this question :)

Edit: I realize it's a bit long, but you could shorten the variable names. Alternatively (or in addition), you could assign zip(cusip_result, bond_name_result, bond_price_result) to a variable before the comprehension. I just can't help myself with these things though, I love hot Python one-liners!
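The "same length" condition above matters because zip stops at the shortest input. A minimal Python 3 sketch of the named-rows variant follows, using itertools.zip_longest to pad instead of truncate. The three short lists are made-up stand-ins for the findall results, and the "N/A" fill value is an arbitrary choice.

```python
from itertools import zip_longest

# Hypothetical stand-ins for cusip_result, bond_name_result, bond_price_result.
cusips = ["17312KAB8"]
names = ["CWL 2004-6 2A5", "CMSI 07-5 1A2"]
prices = ["50-00"]

# Plain zip silently drops everything past the shortest list.
truncated = list(zip(cusips, names, prices))

# zip_longest keeps every entry and pads the gaps with a placeholder.
padded = list(zip_longest(cusips, names, prices, fillvalue="N/A"))

# Naming the rows first keeps the join expression short.
combined = "\n".join(" ".join(row) for row in padded)
print(combined)
```

Note that padding only keeps the rows from being dropped; it does not tell you which source line a value came from, so the lists can still be misaligned if a field is missing in the middle of the file.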
