In Python, removing thousands comma from numbers in a list where the numbers are separated by commas

2023-03-18 09:00

I have a list of data similar to that below:

a = ['"105', '424"', '"102', '629"', '"104', '307"']

I want this data to be in a form similar to that below:

a = ['105424', '102629', '104307']

I am unsure how to proceed. I thought about removing all the commas, re-inserting commas only where they separate values, and then removing the quotation marks, but I am finding this quite challenging.

I'm assuming this data was originally in a CSV file, where fields that contain commas are quoted ("105,424","102,629","104,307"), and that you are then splitting on commas:

>>> '"105,424","102,629","104,307"'.split(',')
['"105', '424"', '"102', '629"', '"104', '307"']

Instead, let the csv module do the work, since it handles the double quotes for you:

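import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('u:\\foobar.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # fields arrive unquoted, so only the thousands commas remain to strip
        print([x.replace(',', '') for x in row])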

This prints: ['105424', '102629', '104307']
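
To try the same idea without a file on disk, here is a minimal sketch that feeds the quoted line from above to csv.reader through io.StringIO. It assumes Python 3, and the names line and cleaned are illustrative rather than part of the original answer.

```python
import csv
import io

# The quoted line from the answer above, held in memory instead of a file.
line = '"105,424","102,629","104,307"'

# csv.reader accepts any iterable of lines; StringIO stands in for an open file.
reader = csv.reader(io.StringIO(line))

cleaned = []
for row in reader:
    # Each field arrives with its quotes already removed, e.g. '105,424'.
    cleaned.extend(field.replace(',', '') for field in row)

print(cleaned)
```

Because the reader understands the quoting, the comma inside each field is never treated as a separator, so no re-pairing of fragments is needed.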

Does your data look something like:

"123", "123,456", "123,456,789"

If so, try this:

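import re

# named text rather than input, to avoid shadowing the builtin
text = '"123", "123,456", "123,456,789"'

# raw string so the backslashes reach the regex engine unchanged
reg = re.compile(r'"(\d{1,3}(,\d{3})*)"')

# findall yields (whole number, last comma group) tuples; keep the first item
stringValues = [wholematch.replace(',', '') for wholematch, _endmatch
                in reg.findall(text)]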

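This regex should also work on numbers that have both thousands separators and decimal places. Note that it has three groups, so findall returns 3-tuples and the unpacking above needs a third name.

reg = re.compile(r'"(\d{1,3}(,\d{3})*(\.\d*)?)"')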
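
A minimal sketch of the decimal-aware pattern in use, assuming Python 3. The sample string and the names decimal_pattern, sample and values are made up for illustration; indexing the first tuple item avoids having to name every group.

```python
import re

# Same pattern as above, written as a raw string.
# Group 1 is the whole number, group 3 the optional decimal part.
decimal_pattern = re.compile(r'"(\d{1,3}(,\d{3})*(\.\d*)?)"')

# Hypothetical sample data with one decimal value.
sample = '"123", "1,234.56", "123,456,789"'

# findall returns one 3-tuple per match; the first entry is the whole number.
values = [groups[0].replace(',', '') for groups in decimal_pattern.findall(sample)]

print(values)
```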

If the source data is CSV, you should use @steven's answer.

Regardless, here's how you could process what you pasted.

As @troutwine stated, this will only work if the number parts are always in pairs.

a = ['"105', '424"', '"102', '629"', '"104', '307"']

def pairwise(iterable):
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    a = iter(iterable)
    # zip pulls twice from the same iterator, so items come out in pairs
    # (Python 3; in Python 2 use itertools.izip instead)
    return zip(a, a)

result = []

for x, y in pairwise(a):
    result.append(''.join([x, y]).strip('"'))

print(result)

Gives:

['105424', '102629', '104307']

Pairwise snippet from here: Iterating over every two elements in a list

If you'll never have an unmatched pair, loop over a range half the size of the input list: join the item at the current index with the next one, strip the quotes, and then skip ahead to the current index plus two.
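
The loop described above could look something like this. It is a sketch under the same assumption (every number is split into exactly two pieces), written for Python 3, and it uses a step of 2 in range rather than advancing the index by hand.

```python
a = ['"105', '424"', '"102', '629"', '"104', '307"']

result = []
# Visit indices 0, 2, 4, ... so each pass handles one pair.
for i in range(0, len(a), 2):
    # Join the current piece with the next one, then drop the quotes.
    joined = a[i] + a[i + 1]
    result.append(joined.replace('"', ''))

print(result)
```

With an odd number of items, a[i + 1] raises IndexError on the last pass, which is why the pairing assumption matters.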

Reduce to the rescue:

l = ['"105', '424"', '"102', '629"', '"104', '307"', '"123', '456', '789"', '"123"']

from functools import reduce  # reduce is no longer a builtin in Python 3

# Concatenate everything and split by ", get non-empties
l2 = [num for num in reduce(lambda x, y: x+y, l).split('"') if num != '']

# Output:
# ['105424', '102629', '104307', '123456789', '123']
print(l2)

A few caveats, though: this code handles numbers beyond thousands (e.g., 1,457,664), but it assumes that every number was wrapped in double quotes.
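
To make that caveat concrete, here is a small sketch that swaps reduce for str.join (a different but equivalent way to concatenate the pieces) and then runs the same approach on a hypothetical list containing unquoted values. It assumes Python 3, and all names and the second data set are invented for illustration.

```python
def merge_quoted(pieces):
    """Concatenate the fragments, then split on the double quotes."""
    joined = ''.join(pieces)
    # Empty strings appear between adjacent quotes, so filter them out.
    return [part for part in joined.split('"') if part]

quoted = ['"105', '424"', '"123', '456', '789"', '"123"']
print(merge_quoted(quoted))

# Hypothetical input where 12 and 34 were never quoted:
# nothing separates them once the list is concatenated.
unquoted = ['12', '34', '"5', '678"']
print(merge_quoted(unquoted))
```

The second call returns '1234' as a single value, which shows how unquoted neighbours get fused together.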

As others have said, though, you should revisit how you retrieve the data, as there are most likely ways to get the values correctly without handling the double quotes yourself. This was a fun little challenge nonetheless.
