Regex matching URL authority parts

I need to match these parts of the following string:

(user)@(hostname):(port)

User and port are optional. At first I managed it with this regular expression:

(?:([^@]*)@)?([^\:]+)(?:\:(\d+))?

For foo@bar:80 this matches:

foo
bar
80

But when it comes to an IPv6 host like foo@[2001:0db8:85a3:08d3:1319:8a2e:0370:7344]:80, the preceding regex doesn't work as expected, because the host group stops at the first colon:

foo
[2001
0

So now I'm looking for a regular expression that can also match a host enclosed in square brackets and containing colons, while capturing it without the square brackets. :) I've done that with the following regex:

(?:([^@]*)@)(?:\[(.+)\]|([^:]+))(?:\:(\d+))?

foo
2001:0db8:85a3:08d3:1319:8a2e:0370:7344
<empty>
80

But this is ugly, because either group 2 or group 3 will always be empty. Is there any way to combine them into a single backreference?

I'm using boost::regex, which uses Perl-style regex syntax as far as I know.

Thanks and regards

reeaal


(?:([^@]*)@)(\[.+\]|[^:]+)(?:\:(\d+))?

Group 2 then holds the host either way, but you'll have to strip off the [] if it's an IPv6 address. That should be fairly trivial though.
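As a rough illustration of the "match, then strip the brackets" approach, here is a minimal sketch. It uses std::regex as a stand-in for boost::regex (an assumption; the pattern uses no Perl-only features, so it should port over), makes the user part optional again as in the question's first pattern, and introduces a hypothetical Authority struct and parse_authority helper that are not part of the original post.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for illustration only.
struct Authority {
    std::string user;  // empty if there is no "user@" prefix
    std::string host;  // IPv6 literals are stored without the brackets
    std::string port;  // empty if there is no ":port" suffix
};

// Parses "[user@]host[:port]" with a single capture group for the host.
// Returns false if the whole input does not match the pattern.
bool parse_authority(const std::string& input, Authority& out) {
    // Group 1: user, group 2: host (possibly bracketed), group 3: port.
    static const std::regex re(R"((?:([^@]*)@)?(\[.+\]|[^:]+)(?::(\d+))?)");

    std::smatch m;
    if (!std::regex_match(input, m, re)) {
        return false;
    }

    std::string host = m[2].str();
    // Strip the surrounding [] of an IPv6 literal.
    if (host.size() >= 2 && host.front() == '[' && host.back() == ']') {
        host = host.substr(1, host.size() - 2);
    }

    out.user = m[1].str();
    out.host = host;
    out.port = m[3].str();
    return true;
}
```

Calling parse_authority("foo@[2001:0db8:85a3:08d3:1319:8a2e:0370:7344]:80", a) should leave the bare address in a.host, with a.user and a.port filled as in the bar example from the question.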

You could also do it with an optional [ and ] before and after the host group, plus lookaround assertions, but that's REALLY ugly; your fellow programmers will thank you if you just KISS and use the above. Here's the option anyway:

(?:([^@]*)@)\[?((?<=\[).+(?=\])|[^:]+)\]?(?:\:(\d+))?