Python string split decimals from end of string

I use nlst on an FTP server, which returns directories as a list. The format of the returned list is as follows:

[xyz123,abcde345,pqrst678].

I have to separate each element of the list into two parts such that part1 = xyz and part2 = 123, i.e. split the string at the beginning of the integer part. Any help on this will be appreciated!


>>> import re
>>> re.findall(r'\d+|[a-z]+', 'xyz123')
['xyz', '123']


For example, using the re module:

>>> import re
>>> items = ['xyz123','ABCDE345','pqRst678']
>>> regex = r'(\D+)(\d+)'
>>> for item in items:
...    m = re.match(regex, item)
...    (alpha, digits) = m.groups()
...    print(alpha, digits)
...
xyz 123
ABCDE 345
pqRst 678


Use the regular expression module re:

import re

def splitEntry(entry):
    # search (not match) so the pattern can be found at the end of the string
    firstDecMatch = re.search(r"\d+$", entry)
    alpha, numeric = "", ""
    if firstDecMatch:
        pos = firstDecMatch.start(0)
        alpha, numeric = entry[:pos], entry[pos:]
    else:  # no digits found at end of string
        alpha = entry
    return (alpha, numeric)

Note that the regular expression is `\d+$`, which matches all digits at the end of the string. Digits in the first part of the string are not counted, e.g. xy3zzz134 -> "xy3zzz", "134". I opted for that because you say you are expecting filenames, and filenames can include numbers. Of course, it's still a problem if the filename itself ends with numbers.
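
As a minimal sketch of the trailing-digits behaviour described above, the snippet below uses re.search to locate the final run of digits and prints the result for a few inputs. The helper name split_trailing_digits and the sample strings "readme" and "2024" are my own illustrations, not part of the question.

```python
import re

# Hypothetical helper: split off the run of digits at the very end of a string.
TRAILING_DIGITS = re.compile(r"\d+$")

def split_trailing_digits(entry):
    # search() scans the whole string; match() would only test position 0
    match = TRAILING_DIGITS.search(entry)
    if match is None:
        # no digits at the end: everything is the alphabetic part
        return entry, ""
    return entry[:match.start()], entry[match.start():]

for name in ["xyz123", "xy3zzz134", "readme", "2024"]:
    print(name, "->", split_trailing_digits(name))
```

Digits in the middle of a name stay in the first part, and a name with no trailing digits comes back with an empty second part.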


Another non-re answer:

>>> import itertools
>>> [''.join(x[1]) for x in itertools.groupby('xyz123', lambda x: x.isalpha())]
['xyz', '123']


If you don't want to use regex, then you can do something like this. Note that I have not tested this so there could be a bug or typo somewhere.

items = ["xyz123", "abcde345", "pqrst678"]  # avoid shadowing the built-in list
newlist = []
for item in items:
    for pos, char in enumerate(item):
        if char.isdigit():
            # split at the first digit
            newlist.append([item[:pos], item[pos:]])
            break


>>> import re
>>> [re.findall(r'(.*?)(\d+$)',x)[0] for x in ['xyz123','ABCDE345','pqRst678']]
[('xyz', '123'), ('ABCDE', '345'), ('pqRst', '678')]


I don't think it's that difficult without re:

>>> s="xyz123"
>>> for n,i in enumerate(s):
...   if i.isdigit(): x=n ; break
...
>>> [ s[:x], s[x:] ]
['xyz', '123']

>>> s="abcde345"
>>> for n,i in enumerate(s):
...   if i.isdigit(): x=n ; break
...
>>> [ s[:x], s[x:] ]
['abcde', '345']
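
Since the title asks for the digits at the end of the string, another non-re possibility is to strip digits from the right and take whatever was removed as the numeric part. This is only a sketch under the assumption that the numeric part consists of ASCII digits 0-9; the function name split_at_trailing_digits is hypothetical.

```python
import string

def split_at_trailing_digits(s):
    # rstrip removes every trailing character found in string.digits ("0123456789")
    head = s.rstrip(string.digits)
    # whatever was stripped off is the numeric tail (may be empty)
    return head, s[len(head):]

for entry in ["xyz123", "abcde345", "pqrst678"]:
    print(split_at_trailing_digits(entry))
```

Unlike the loop above, this does not fail when a string contains no digits at all; it just returns an empty second part.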