Regex not operator
I need to use a regex to pull a value out of a URL domain, excluding the host (e.g. wordpress) and the domain type (e.g. .com). The URLs are dynamic and contain 2-3 values for each result (www.example.com or example.org). I am trying to use this expression, but it only strips the first letter of each item I am attempting to exclude:
Expression
(?!wordpress|com|www)(\w+|\d+)
String
example.wordpress.com
Results
- example
- ordpress
- om
Desired Result
example
Any assistance would be greatly appreciated
Anchor your regular expression:
\b(?!wordpress|com|www)(\w+|\d+)\b
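To see why the anchors matter, here is a small JavaScript sketch that runs both patterns against the sample string from the question. The variable names are only illustrative. Without \b, the lookahead rejects a match starting at the "w" of "wordpress", but the engine then retries one character later, where the remaining text no longer begins with an excluded word.

```javascript
// The sample host name from the question.
const host = "example.wordpress.com";

// Without \b the engine may start a match inside an excluded word.
const unanchored = /(?!wordpress|com|www)(\w+|\d+)/g;
// With \b a match can only begin and end at a word boundary.
const anchored = /\b(?!wordpress|com|www)(\w+|\d+)\b/g;

const unanchoredMatches = host.match(unanchored);
const anchoredMatches = host.match(anchored);

console.log(unanchoredMatches); // [ 'example', 'ordpress', 'om' ]
console.log(anchoredMatches);   // [ 'example' ]
```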
You might also want to consider whether (\w+|\d+) is really what you mean. \w already includes digits. Also, there are other characters allowed in URLs, such as -. Do you need to handle these?
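If hyphens do need handling, one possible variation is sketched below in JavaScript. It is an assumption on my part, not something from the question: extractLabels is a hypothetical helper, the character class is widened to [\w-], and lookarounds stand in for \b because \b would treat "-" as a boundary. The lookahead also rejects only whole excluded labels, so a label such as "community" is not dropped just because it starts with "com". Lookbehind requires an ES2018-capable engine.

```javascript
// Hypothetical helper (not from the answer above): returns every label of a
// host name except the excluded ones.
function extractLabels(host) {
  // (?<![\w-])                         a label starts where no word char or hyphen precedes
  // (?!(?:wordpress|com|www)(?![\w-])) skip only whole excluded labels
  // [\w-]+                             letters, digits, underscores and hyphens
  const pattern = /(?<![\w-])(?!(?:wordpress|com|www)(?![\w-]))[\w-]+/g;
  return host.match(pattern) || [];
}

console.log(extractLabels("my-site.wordpress.com")); // [ 'my-site' ]
console.log(extractLabels("community.example.org")); // [ 'community', 'example', 'org' ]
```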
If I were to do something like that, I would take advantage of the format of the URL: anything (dot) second-level domain (dot) top-level domain:
^(?:(?<level3>.*)[.])?(?<level2>[^.]+)[.](?<level1>[^.]+)$
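As a rough JavaScript sketch of this idea, the snippet below uses a form of the pattern in which the third-level part and its dot are optional as a unit and the last two labels may not contain dots, so a greedy .* cannot swallow them. The splitHost name and the returned object shape are hypothetical, chosen only for illustration, and named groups assume an ES2018-capable engine.

```javascript
// Hypothetical helper: splits a host name into its domain levels.
function splitHost(host) {
  const pattern = /^(?:(?<level3>.*)[.])?(?<level2>[^.]+)[.](?<level1>[^.]+)$/;
  const match = pattern.exec(host);
  if (match === null) return null; // fewer than two labels, e.g. "localhost"
  const { level3, level2, level1 } = match.groups;
  // level3 is undefined when the host has only two labels.
  return { level3: level3 ?? null, level2, level1 };
}

console.log(splitHost("example.wordpress.com"));
// { level3: 'example', level2: 'wordpress', level1: 'com' }
console.log(splitHost("example.org"));
// { level3: null, level2: 'example', level1: 'org' }
```

For the string in the question, the wanted value is then splitHost(host).level3.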
Or are you only after the part that comes after the domain?
(/\/(?!\/).*?\/(.*)/).exec("http://www.google.com/sdfsdf/fdsff")[1]
// returns sdfsdf/fdsff