
Python regex to parse the device name out of an Android user agent

I'm parsing Android user agents with Python 2.5. So far I've worked out a regex that works for most Android user agents and captures the major and minor version:

(?P<browser>Android) (?P<major_version>\d*).(?P<minor_version>\d*)

The above regex works for the example below:

Mozilla/5.0 (Linux; U; Android 2.2; en-gb; Nexus One Build/FRF50) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1

However, I'd also like to capture which Android device this is. Using this reference, I see a common pattern for where the device name appears in Android user agents: http://www.botsvsbrowsers.com/category/6/index.html

It always seems to come after the language tag (such as "en-gb;") and before "Build/".

How should I modify my regex so that, in the example above, I can parse out "Nexus One"?

Another Android user agent example:

Mozilla/5.0 (Linux; U; Android 2.1; en-us; HTC Legend Build/cupcake) AppleWebKit/530.17 (KHTML, like Gecko) Version/4.0 Mobile Safari/530.17

In this example I'm looking to get "HTC Legend".


Try this:

(?P<browser>Android) (?P<major_version>\d*)\.(?P<minor_version>\d*);[^;]*;(?P<device>[ \w]+) Build\/


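Alternatively, this version requires at least one digit in each version field and consumes the space before the device name, so the device group has no leading whitespace:

(?P<browser>Android)\s(?P<major_version>\d+)\.(?P<minor_version>\d+);[^;]*;\s(?P<device>.+)\sBuild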
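To see how the two suggested patterns behave on the user agents from the question, here is a minimal sketch. The names PATTERN_A, PATTERN_B, and parse_android_ua are made up for illustration and do not come from the answers. It assumes the user agent follows the "Android X.Y; language; device Build/" layout described in the question; anything else returns None.

```python
import re

# First suggested pattern: the device group can capture a leading space,
# because the space after the language's ";" is matched by [ \w]+.
PATTERN_A = re.compile(
    r"(?P<browser>Android) (?P<major_version>\d*)\.(?P<minor_version>\d*);"
    r"[^;]*;(?P<device>[ \w]+) Build\/"
)

# Second suggested pattern: \s consumes the space before the device name.
PATTERN_B = re.compile(
    r"(?P<browser>Android)\s(?P<major_version>\d+)\.(?P<minor_version>\d+);"
    r"[^;]*;\s(?P<device>.+)\sBuild"
)


def parse_android_ua(user_agent, pattern=PATTERN_B):
    """Return the named groups as a dict, or None if the pattern does not match.

    Hypothetical helper for illustration only.
    """
    match = pattern.search(user_agent)
    if match is None:
        return None
    fields = match.groupdict()
    # Strip so both patterns yield the same device string.
    fields["device"] = fields["device"].strip()
    return fields


NEXUS_UA = (
    "Mozilla/5.0 (Linux; U; Android 2.2; en-gb; Nexus One Build/FRF50) "
    "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
)
HTC_UA = (
    "Mozilla/5.0 (Linux; U; Android 2.1; en-us; HTC Legend Build/cupcake) "
    "AppleWebKit/530.17 (KHTML, like Gecko) Version/4.0 Mobile Safari/530.17"
)

nexus = parse_android_ua(NEXUS_UA)
htc = parse_android_ua(HTC_UA, PATTERN_A)
```

With the first pattern, calling .strip() on the device group is worthwhile since the raw capture starts with a space. The [ \w]+ class in that pattern also will not match device names containing characters such as "-" or "/", whereas the .+ in the second pattern accepts any characters up to the last " Build" in the string.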
