How to use Python regex to match words beginning with a hash or question mark?
This should be easy, and this regex works fine for finding words beginning with specific characters, but I can't get it to match hashes and question marks.
This works and matches words beginning with a:
r = re.compile(r"\b([a])(\w+)\b")
But these don't match. I tried:
r = re.compile(r"\b([#?])(\w+)\b")
r = re.compile(r"\b([\#\?])(\w+)\b")
r = re.compile( r"([#\?][\w]+)?")
I even tried matching just hashes:
r = re.compile( r"([#][\w]+)?")
r = re.compile( r"([/#][\w]+)?")
text = "this is one #tag and this is ?another tag"
items = r.findall(text)
I am expecting to get:
[('#', 'tag'), ('?', 'another')]
\b matches the empty position between a \w and a \W character (or between a \W and a \w). When a # or ? follows a space or starts the string, both sides are \W, so there is no \b before it.
In other words: remove the first word boundary.
Not:
r = re.compile(r"\b([#?])(\w+)\b")
but
r = re.compile(r"([#?])(\w+)\b")
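To see the difference side by side, here is a minimal sketch that runs both patterns against the text from the question. The names with_boundary and without_boundary are only illustrative.

```python
import re

text = "this is one #tag and this is ?another tag"

# Leading \b requires a \w/\W transition, which never occurs between
# a space and '#' or '?', so this pattern finds nothing here.
with_boundary = re.compile(r"\b([#?])(\w+)\b")

# Dropping the leading \b lets the marker character match directly.
without_boundary = re.compile(r"([#?])(\w+)\b")

print(with_boundary.findall(text))     # []
print(without_boundary.findall(text))  # [('#', 'tag'), ('?', 'another')]
```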
If you are using Python, regex does not have to be the first tool you reach for:
>>> text = "this is one #tag and this is ?another tag"
>>> for word in text.split():
...   if word.startswith("#") or word.startswith("?"):
...     print(word)
...
#tag
?another
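If the tuple output from the question is wanted without regex, the same idea can be sketched as a list comprehension. This is only one possible way to write it; the name pairs is illustrative, and unlike \w+ this keeps any trailing punctuation attached to the word (for example "#tag," would give ('#', 'tag,')).

```python
text = "this is one #tag and this is ?another tag"

# str.startswith accepts a tuple of prefixes; len(word) > 1 skips a
# lone '#' or '?' that has no word after it.
pairs = [
    (word[0], word[1:])
    for word in text.split()
    if len(word) > 1 and word.startswith(("#", "?"))
]

print(pairs)  # [('#', 'tag'), ('?', 'another')]
```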
The first \b won't match before # or ?; use (?:^|\s) instead.
Also, the \b at the end is unnecessary, because \w+ is greedy and already consumes every word character it can.
r = re.compile(r"(?:^|\s)([#?])(\w+)")
text = "#head this is one #tag and this is ?another tag, but not this?one"
print(r.findall(text))
# Output: [('#', 'head'), ('#', 'tag'), ('?', 'another')]
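For comparison, here is a small sketch that runs the pattern without any leading anchor and the (?:^|\s) pattern on the same text. The names loose and strict are only illustrative. The difference shows up on "this?one", where the marker sits in the middle of a word.

```python
import re

text = "#head this is one #tag and this is ?another tag, but not this?one"

# No leading anchor: the marker may appear anywhere, even mid-word.
loose = re.compile(r"([#?])(\w+)\b")

# Marker must be at the start of the string or follow whitespace.
strict = re.compile(r"(?:^|\s)([#?])(\w+)")

print(loose.findall(text))
# [('#', 'head'), ('#', 'tag'), ('?', 'another'), ('?', 'one')]
print(strict.findall(text))
# [('#', 'head'), ('#', 'tag'), ('?', 'another')]
```

Which one is right depends on whether a marker inside a word should count as the start of a tag.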
 