Regular Expression to extract path + filename from URL
I am trying to extract 'vets/gth/summary.htm' from 'http://www.abc.gh.gov/vets/gth/summary.htm' by using the following regular expression: ^http:\/\/www.abc.gh.gov
I get the following output: 'ets/gth/summary.htm'. The 'v' in 'vets' is missing.
If I change the url to 'http://www.abc.gh.gov/rets/gth/summary.htm' it works fine. The regex does not work when the first letter after 'http://www.abc.gh.gov/' is one of the following 'httpwwwabcghov'. Notice that these letters are present in 'http://www.abc.gh.gov/'.
Please advise.
Change your regex to ^(http:\/\/www.abc.gh.gov)
to force the whole block, nothing less and nothing more.
Why don't you just add a / to the end of the regex (escaped, of course, as \/) so you can just search for the slash?
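For illustration, here is a minimal sketch of that idea in Python. The question does not say which language or function is being used, so Python, the `re` module, and the names `PREFIX` and `strip_prefix` are all assumptions made for this example. The pattern is the one from the question with the trailing slash added and the dots escaped so that they only match a literal '.'.

```python
import re

# Assumed pattern: the prefix from the question, with the dots escaped
# and the trailing slash included so the path starts right after it.
PREFIX = re.compile(r"^http://www\.abc\.gh\.gov/")


def strip_prefix(url: str) -> str:
    """Return the URL with the fixed prefix removed (hypothetical helper)."""
    # Replace the matched prefix with an empty string. The ^ anchor means
    # the pattern can only match at the start of the string.
    return PREFIX.sub("", url, count=1)


print(strip_prefix("http://www.abc.gh.gov/vets/gth/summary.htm"))
# prints "vets/gth/summary.htm"
```

Because the prefix is matched as one whole sequence rather than as a set of characters, the leading 'v' of 'vets' is left alone.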
As I mentioned in the comment, I don't know what you mean by "output" since the normal output of a regular expression execution is the part that matched your expression, not the part that didn't.
However, I would recommend the following approach:
- Find the index of the third /
- Take the substring from index + 1 to the end.
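The two steps above can be sketched as follows. This is only one possible reading of the approach, written in Python as an assumption since the thread does not name a language; `path_after_host` is a hypothetical helper name, and returning an empty string when there is no third slash is a choice made for this example.

```python
def path_after_host(url: str) -> str:
    """Return everything after the third '/' in url (hypothetical helper)."""
    index = -1
    # Step 1: walk forward to the third slash. In "http://host/path" the
    # first two slashes belong to "//" and the third one ends the host.
    for _ in range(3):
        index = url.find("/", index + 1)
        if index == -1:
            # Fewer than three slashes: there is no path to return.
            return ""
    # Step 2: substring from index + 1 to the end.
    return url[index + 1:]


print(path_after_host("http://www.abc.gh.gov/vets/gth/summary.htm"))
# prints "vets/gth/summary.htm"
```

This avoids regular expressions entirely, so it does not depend on the host name and cannot drop leading characters of the path.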