Python name grabber

If I have strings in the format

(static string) name (different static string ) message (last static string)

(static string) name (different static string ) message (last static string)

(static string) name (different static string ) message (last static string)

(static string) name (different static string ) message (last static string)

What would be the best way to search through the messages for a word and generate an array of all the names that had that word in their message?


>>> s="(static string) name (different static string ) message (last static string)"
>>> _,_,s=s.partition("(static string)")
>>> name,_,s=s.partition("(different static string )")
>>> message,_,s=s.partition("(last static string)")
>>> name
' name '
>>> message
' message '
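The partition steps above can be wrapped in a loop to answer the original question. This is only a minimal sketch: the function name `names_with_word` and the sample names and messages are made up for illustration, and it assumes a "word" means a whitespace-separated token of the message.

```python
def names_with_word(lines, word):
    """Return the names whose message contains `word` as a whole token."""
    names = []
    for line in lines:
        # Peel off the three static parts one at a time, left to right.
        _, _, rest = line.partition("(static string)")
        name, _, rest = rest.partition("(different static string )")
        message, _, _ = rest.partition("(last static string)")
        # split() breaks the message on whitespace, so "hello" will not
        # match inside a longer token such as "hellos".
        if word in message.split():
            names.append(name.strip())
    return names

# Hypothetical sample data in the format from the question.
lines = [
    "(static string) alice (different static string ) hello world (last static string)",
    "(static string) bob (different static string ) goodbye (last static string)",
]
print(names_with_word(lines, "hello"))
```

With this sample data only alice's message contains the token hello, so only her name is collected.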


Given a string like this:

Foo NameA Bar MessageA Baz

this regex will match it:

Foo\s+(\w+)\s+Bar\s+(\w+)\s+Baz

Group 1 will be the name, group 2 will be the message. Foo, Bar, and Baz are the static parts.

Here it is in the Python REPL:

Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> s = "Foo NameA Bar MessageA Baz"
>>> m = re.match(r"Foo\s+(\w+)\s+Bar\s+(\w+)\s+Baz", s)
>>> m.group(0)
'Foo NameA Bar MessageA Baz'
>>> m.group(1)
'NameA'
>>> m.group(2)
'MessageA'
>>> 
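To run the same idea over many lines and collect the matching names, something like the following might work. It is a sketch under assumptions: the message group is widened from `(\w+)` to `(.*?)` so a message can contain several words, the comparison is case-insensitive, and `names_mentioning` and the sample text are hypothetical.

```python
import re

# Foo, Bar and Baz stand in for the static parts, as in the session above.
# (.*?) is a lazy match that stays on one line, because "." does not
# match a newline by default.
PATTERN = re.compile(r"Foo\s+(\w+)\s+Bar\s+(.*?)\s+Baz")

def names_mentioning(text, word):
    """Return the names whose message contains `word` (case-insensitive)."""
    names = []
    for m in PATTERN.finditer(text):
        name, message = m.group(1), m.group(2)
        if word.lower() in message.lower().split():
            names.append(name)
    return names

# Hypothetical sample input, one record per line.
text = """Foo NameA Bar hello there Baz
Foo NameB Bar nothing here Baz
Foo NameC Bar well Hello Baz"""
print(names_mentioning(text, "hello"))
```

`finditer` walks the whole text in one pass, so the input does not need to be split into lines first.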


Here's a full answer showing how to do it using replace().

strings = ['(static string) name (different static string ) message (last static string)',
           '(static string) name (different static string ) message (last static string)',
           '(static string) name (different static string ) message (last static string)',
           '(static string) name (different static string ) message (last static string)',
           '(static string) name (different static string ) message (last static string)',
           '(static string) name (different static string ) message (last static string)']

results = []
target_word = 'message'
separators = ['(static string)', '(different static string )', '(last static string)']

for s in strings:
    for sep in separators:
        s = s.replace(sep, '')
    name, message = s.split()
    if target_word in message:
        results.append((name, message))

>>> results
[('name', 'message'), ('name', 'message'), ('name', 'message'), ('name', 'message'), ('name', 'message'), ('name', 'message')]

Note that this matches any message that contains target_word as a substring. It does not check word boundaries: running it with target_word = 'message' and with target_word = 'sag' produces the same results. You may need regular expressions if your word matching is more complicated.
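If whole-word matching is what you want, one possible approach is a `\b` word-boundary regex in place of the `in` test. The helper name `contains_word` is made up for this sketch.

```python
import re

def contains_word(message, word):
    """True if `word` occurs in `message` as a whole word, not just a substring."""
    # re.escape keeps any special characters in `word` from being
    # interpreted as regex syntax.
    return re.search(r"\b" + re.escape(word) + r"\b", message) is not None

print(contains_word("a short message", "message"))  # True
print(contains_word("a short message", "sag"))      # False
```

In the loop above, `if contains_word(message, target_word):` would then replace `if target_word in message:`.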


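with open("file") as f:
    for line in f:
        # Splitting on ")" leaves each value right before the next "(".
        for item in line.split(")"):
            try:
                print(item[:item.index("(")])
            except ValueError:
                # No "(" in this piece (e.g. the trailing newline); skip it.
                pass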

Output:

$ more file
(static string) name (different static string ) message (last static string)
(static string) name (different static string ) message (last static string)
(static string) name (different static string ) message (last static string)
(static string) name (different static string ) message (last static string)
$ python python.py

 name
 message

 name
 message

 name
 message

 name
 message