
Python - Extract important string information

I have the following string

http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342

What is the best way to extract the id value, in this case 32434242423423234?

Regardz, Mladjo


You could just use a regular expression, e.g.:

import re

s = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"

m = re.search(r'controller/id(\d+)\?',s)
if m:
    # A single concatenated string prints the same on Python 2 and Python 3
    print("Found the id: " + m.group(1))

If you need the value as a number rather than a string, you can use int(m.group(1)). There are plenty of other ways of doing this that might be more appropriate, depending on the larger goal of your code, but without more context it's hard to say.
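
The same idea can be wrapped in a small reusable function. This is only a minimal sketch, written for Python 3: the name extract_id and the choice to return None when nothing matches are assumptions for illustration, not part of the answer above.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
ID_PATTERN = re.compile(r'controller/id(\d+)\?')

def extract_id(url):
    """Return the id as an int, or None if the URL does not match (hypothetical helper)."""
    m = ID_PATTERN.search(url)
    return int(m.group(1)) if m else None

s = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"
print(extract_id(s))
```

Because the pattern ends in \?, a URL without a query string will not match; dropping the trailing \? relaxes that.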


>>> from urllib.parse import urlparse  # Python 3; on Python 2 use: from urlparse import urlparse
>>> res=urlparse("http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342")
>>> res.path
'/variable/controller/id32434242423423234'
>>> import posixpath
>>> posixpath.split(res.path)
('/variable/controller', 'id32434242423423234')
>>> directory,filename=posixpath.split(res.path)
>>> filename[2:]
'32434242423423234'

Using urlparse and posixpath might be too much for this case, but I think it is the clean way to do it.
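
The same steps can be wrapped in a small helper. A minimal sketch, assuming Python 3's urllib.parse; the helper name id_from_path and the prefix check are assumptions added for illustration.

```python
import posixpath
from urllib.parse import urlparse

def id_from_path(url, prefix="id"):
    """Return the last path segment with the prefix removed, or None (hypothetical helper)."""
    path = urlparse(url).path              # '/variable/controller/id32434242423423234'
    filename = posixpath.basename(path)    # 'id32434242423423234'
    if not filename.startswith(prefix):
        return None
    return filename[len(prefix):]

url = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"
print(id_from_path(url))
```

Unlike filename[2:], the prefix check avoids silently returning a wrong value when the last path segment does not start with "id".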


>>> s
'http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342'
>>> s.split("id")
['http://example.com/variable/controller/', '32434242423423234?param1=321&param2=4324342']
>>> s.split("id")[-1].split("?")[0]
'32434242423423234'
>>>
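
Splitting on "id" works for this URL, but it cuts in the wrong place if "id" appears anywhere else in the string (for example in a parameter name). A slightly more defensive sketch follows. It assumes the id always comes after the last "/id" in the path and ends at the first "?", which is an assumption about the URL layout rather than something stated in the question. The function name id_by_split is hypothetical.

```python
def id_by_split(s):
    """Return the text between the last '/id' of the path and the query string, or None."""
    # Drop the query string first, so an "id" inside a parameter cannot interfere.
    path, _, _query = s.partition("?")
    # Then take everything after the last "/id" in what remains.
    _, sep, id_value = path.rpartition("/id")
    # sep is empty when "/id" was not found at all.
    return id_value if sep else None

s = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"
print(id_by_split(s))
```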


While regex is THE way to go, I have written a string parser for simple cases. In a way, it is the (incomplete) reverse of the string formatting operation from PEP 3101. This is very convenient because it means that you do not have to learn another way of specifying the strings.

For example:

>>> 'The answer is {:d}'.format(42)
'The answer is 42'

The parser does the opposite:

>>> Parser('The answer is {:d}')('The answer is 42') 
42

For your case, if you want an int as output:

>>> url = 'http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342'
>>> fmt = 'http://example.com/variable/controller/id{:d}?param1=321&param2=4324342'
>>> Parser(fmt)(url)
32434242423423234

If you want a string:

>>> fmt = 'http://example.com/variable/controller/id{:s}?param1=321&param2=4324342'
>>> Parser(fmt)(url)
'32434242423423234'

If you want to capture more things in a dict:

>>> fmt = 'http://example.com/variable/controller/id{id:s}?param1={param1:s}&param2={param2:s}'
>>> Parser(fmt)(url)
{'id': '32434242423423234', 'param1': '321', 'param2': '4324342'}

Or in a tuple:

>>> fmt = 'http://example.com/variable/controller/id{:s}?param1={:s}&param2={:s}'
>>> Parser(fmt)(url)
('32434242423423234', '321', '4324342')

Give it a try, it is hosted here
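
For readers curious how such a reverse-format parser could work, here is a minimal sketch built on the standard library's string.Formatter().parse and re. It is not the implementation of the Parser referred to above: the class name SimpleParser and the support for only {:d} and {:s} fields are assumptions made for illustration.

```python
import re
from string import Formatter

# Hypothetical stand-in for the answer's Parser; only {:d} and {:s} are handled.
_SPECS = {
    "d": (r"[-+]?\d+", int),
    "s": (r".+?", str),
    "": (r".+?", str),
}

class SimpleParser:
    def __init__(self, fmt):
        pattern = []
        self.names = []
        self.converters = []
        # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
        for literal, name, spec, _conversion in Formatter().parse(fmt):
            pattern.append(re.escape(literal))
            if name is None:              # trailing literal text with no field
                continue
            regex, convert = _SPECS[spec]
            pattern.append("(" + regex + ")")
            self.names.append(name)
            self.converters.append(convert)
        # \Z forces the whole input to be consumed
        self.regex = re.compile("".join(pattern) + r"\Z")

    def __call__(self, text):
        m = self.regex.match(text)
        if m is None:
            return None
        values = [conv(g) for conv, g in zip(self.converters, m.groups())]
        if self.names and all(self.names):    # every field named: return a dict
            return dict(zip(self.names, values))
        if len(values) == 1:                  # single unnamed field: bare value
            return values[0]
        return tuple(values)                  # several unnamed fields: tuple

url = 'http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342'
fmt = 'http://example.com/variable/controller/id{id:s}?param1={param1:s}&param2={param2:s}'
print(SimpleParser(fmt)(url))
```

The sketch turns each literal part of the format string into an escaped regex fragment and each field into a capture group, then converts the captured text according to the field's format spec.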
