Dividing a string into a list according to a given format
I have a string like "SAB_bARGS_D". What I want is for the string to be divided into a list of characters, except that whenever there is a _ sign, it and the next character get appended to the previous one.
So the result for the string above should be ['S','A','B_b','A','R','G','S_D']
It can be done with a for loop traversing the string, but is there a built-in function that I can use?
Thanks a lot
Update
Hello all,
Thanks to Robert Rossney and aaronasterling I got the required answer, but I have a very similar question that I am going to ask here as well. Let's say that my string now follows the criterion that it can contain either a letter, or a letter followed by _ and a number. How can I separate the string into a list now? The suggested solutions cannot be used as they are, since S_10 would be separated into S_1 and 0. It would be helpful if someone could tell me how to do this using a regular expression. Thanks a lot.
I know, I'll use regular expressions:
>>> import re
>>> pattern = "[^_]_[^_]|[^_]"
>>> re.findall(pattern, "SAB_bARGS_D", re.IGNORECASE)
['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
The pattern tries to match 3 characters in a row - non-underscore, underscore, non-underscore - and, failing that, tries to match a non-underscore character.
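As a rough sketch of how that pattern behaves outside the interactive session, the snippet below wraps it in a small function. The names PATTERN and split_pairs are made up for illustration, and re.IGNORECASE is left out on the assumption that it is not needed, since neither character class depends on case.

```python
import re

# Same pattern as above: "x_y" triples first, single non-underscore characters otherwise.
# PATTERN and split_pairs are hypothetical names, not part of the original answer.
PATTERN = re.compile(r"[^_]_[^_]|[^_]")

def split_pairs(s):
    """Return the pieces matched by PATTERN, scanning left to right."""
    return PATTERN.findall(s)

print(split_pairs("SAB_bARGS_D"))  # ['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
print(split_pairs("a_b_c"))        # ['a_b', 'c']
print(split_pairs("S_10"))         # ['S_1', '0']
```

Note that for 'a_b_c' the second underscore is silently dropped, because neither alternative can match a bare underscore, and that 'S_10' is split after the first digit.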
I would probably use a for loop.
def a_split(inp_string):
    res = []
    if not inp_string: return res  # allows us to assume the string is nonempty
    # This avoids taking res[-1] when res is empty if the string starts with _
    # and simplifies the loop.
    inp = iter(inp_string)
    last = next(inp)
    res.append(last)
    for c in inp:
        if '_' in (c, last):  # might want to use (c == '_' or last == '_')
            res[-1] += c
        else:
            res.append(c)
        last = c
    return res
You will be able to get some performance gain by storing res.append in a local variable and calling that directly, instead of looking up the local variable res and then performing an attribute lookup to get the append method on every iteration.
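A minimal sketch of that local-variable idea follows. The name a_split_fast is hypothetical, and whether the saving is measurable depends on the input size and the Python version, so it is worth timing before relying on it.

```python
def a_split_fast(inp_string):
    """Loop-based split with the bound method res.append cached in a local."""
    res = []
    if not inp_string:
        return res
    append = res.append  # looked up once instead of once per character
    inp = iter(inp_string)
    last = next(inp)
    append(last)
    for c in inp:
        if c == '_' or last == '_':
            res[-1] += c  # glue the underscore, or the character after it, to the previous piece
        else:
            append(c)
        last = c
    return res

print(a_split_fast("SAB_bARGS_D"))  # ['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
```

The behaviour is intended to match the version without the cached method; only the lookup of append changes.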
If there is a string like 'a_b_c' then it will not be split. No behavior was specified for this case, but it wouldn't be too hard to modify the function to do something else. Also, a string like '_ab' will split into ['_a', 'b'], and similarly 'ab_' will split into ['a', 'b_'].
Using a regular expression
>>> import re
>>> s="SAB_bARGS_D"
>>> re.findall("(.(?:_.)?)",s)
['S', 'A', 'B_b', 'A', 'R', 'G', 'S_D']
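For the follow-up case in the question (a letter, or a letter followed by _ and a number), one possible adaptation is to make the part after the underscore match a run of digits instead of a single character. This is only a sketch under two assumptions that the question does not spell out: letters are ASCII, and "a number" means one or more decimal digits. TOKEN and split_tokens are illustrative names.

```python
import re

# A letter, optionally followed by an underscore and one or more digits.
# Assumes ASCII letters and unsigned integers; adjust the classes if that is wrong.
TOKEN = re.compile(r"[A-Za-z](?:_\d+)?")

def split_tokens(s):
    """Split s into letters and letter_number tokens."""
    return TOKEN.findall(s)

print(split_tokens("S_10AB_2C"))  # ['S_10', 'A', 'B_2', 'C']
```

Because \d+ is greedy, S_10 stays in one piece instead of being cut into S_1 and 0. Characters that fit neither form are skipped by findall rather than reported as errors.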