Dividing a string into a list according to a given format

I have a string like "SAB_bARGS_D". I want the string to be divided into a list of characters, but whenever there is a _ sign, the underscore and the character after it should be appended to the previous character.

So the answer to above should be ['S','A','B_b','A','R','G','S_D']

It can be done with a for loop traversing the string, but is there a built-in function that I can use?

Thanks a lot


Update

Hello all,

Thanks to Robert Rossney and aaronasterling I got the answer I needed, but I have a very similar follow-up question that I will ask here. Suppose the string can now contain either a letter, or a letter followed by _ and a number. How can I split the string into a list now? The suggested solutions no longer work, since S_10 would be split into S_1 and 0. It would be helpful if someone could show how to do this with a regular expression. Thanks a lot.


I know, I'll use regular expressions:

>>> import re
>>> pattern = "[^_]_[^_]|[^_]"
>>> re.findall(pattern, "SAB_bARGS_D", re.IGNORECASE)
['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']

The pattern tries to match 3 characters in a row - non-underscore, underscore, non-underscore - and, failing that, tries to match a non-underscore character.
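The order of the two alternatives matters, because the regex engine tries them left to right at each position. As a small sketch of that point (the variable names are only illustrative, and re.IGNORECASE is left out because the character classes contain no letters), swapping the alternatives makes the three-character branch unreachable:

```python
import re

text = "SAB_bARGS_D"

# Three-character alternative first: it wins whenever an underscore follows.
three_first = re.findall(r"[^_]_[^_]|[^_]", text)

# Single-character alternative first: it always succeeds on a non-underscore,
# so the three-character branch is never reached and underscores are skipped.
one_first = re.findall(r"[^_]|[^_]_[^_]", text)

print(three_first)  # ['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
print(one_first)    # ['S', 'A', 'B', 'b', 'A', 'R', 'G', 'S', 'D']
```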


I would probably use a for loop.

def a_split(inp_string):
    res = []
    if not inp_string: return res  # allows us to assume the string is nonempty

    # This avoids taking res[-1] when res is empty if the string starts with _
    # and simplifies the loop.
    inp = iter(inp_string)   
    last = next(inp)
    res.append(last)

    for c in inp:
        if '_' in (c, last): # might want to use (c == '_' or last == '_')
            res[-1] += c
        else:
            res.append(c)
        last = c
    return res

You can get a small performance gain by storing res.append in a local variable and calling that directly, instead of looking up the local variable res and then performing an attribute lookup for the append method on every iteration.

A string like 'a_b_c' will not be split at all. No behavior was specified for this case, but it wouldn't be too hard to modify the function to do something else. Also, a string like '_ab' will split into ['_a', 'b'], and similarly 'ab_' will split into ['a', 'b_'].
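As a rough sketch of the local-variable idea and of the edge cases just described, here is a variant of the function. The name a_split_fast is made up for this illustration, and whether the speedup is noticeable would need to be measured on your own data:

```python
def a_split_fast(inp_string):
    res = []
    if not inp_string:
        return res

    append = res.append  # look up the bound method once, outside the loop

    chars = iter(inp_string)
    last = next(chars)
    append(last)

    for c in chars:
        if c == '_' or last == '_':
            res[-1] += c  # glue the underscore, or the char after it, to the previous item
        else:
            append(c)
        last = c
    return res


print(a_split_fast("SAB_bARGS_D"))  # ['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
print(a_split_fast("a_b_c"))        # ['a_b_c']
print(a_split_fast("_ab"))          # ['_a', 'b']
print(a_split_fast("ab_"))          # ['a', 'b_']
```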


Using a regular expression

>>> import re
>>> s="SAB_bARGS_D"
>>> re.findall("(.(?:_.)?)",s)
['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
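For the follow-up case in the update (a letter, or a letter followed by _ and a number), one possible approach is to make the suffix an underscore plus one or more digits. This is only a sketch, assuming that "letter" means ASCII A-Z or a-z and "number" means a run of decimal digits; the pattern and variable names are illustrative:

```python
import re

# A letter, optionally followed by "_" and one or more digits.
# Assumption: ASCII letters only; adjust the character class if needed.
pattern = r"[A-Za-z](?:_\d+)?"

single = re.findall(pattern, "S_10")
mixed = re.findall(pattern, "SA_10BC_2")

print(single)  # ['S_10']
print(mixed)   # ['S', 'A_10', 'B', 'C_2']
```

Note that re.findall silently skips any characters that do not fit the pattern, so input that breaks the assumed format will not raise an error.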