Find word on given position in text
Is there a more elegant (Pythonic and efficient) way to find the word at a given position?
import re

FIRST_WORD = re.compile(r'^(\w+)', re.UNICODE)
LAST_WORD = re.compile(r'(\w+)$', re.UNICODE)

def _get_word(text, position):
    """
    Get the word at the given position
    """
    assert position >= 0
    assert position < len(text)
    # get second part of word:
    # slice string and get first word
    match = FIRST_WORD.search(text[position:])
    assert match is not None
    postfix = match.group(1)
    # get first part of word, can be empty:
    # slice text and get last word
    match2 = LAST_WORD.search(text[:position])
    prefix = match2.group(1) if match2 else ''
    return prefix + postfix
# Index 21 is the "l" in "Earl"; index 20 is the "r".
>>> _get_word("Hello, my name is Earl.", 21)
Earl
>>> _get_word("Hello, my name is Earl.", 20)
Earl
Thanks
Here's how I'd do it:
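s = "Hello, my name is Earl."

def get_word(text, position):
    words = text.split()
    # index of the separator that follows the current word
    characters = -1
    for word in words:
        # + 1 accounts for the space after the word;
        # this assumes words are separated by single spaces
        characters += len(word) + 1
        if characters >= position:
            return word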
>>> get_word(s, 21)
Earl.
Stripping off the punctuation can be done with str.strip(),
regular expressions, or something hacky like
final = ''
for c in word:
    if c.lower() in 'abcdefghijklmnopqrstuvwxyz':
        final += c
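The two non-hacky options mentioned above could look roughly like this. It is only a sketch: the helper names strip_with_str and strip_with_regex are made up for the example, and both remove punctuation only at the ends of the word.

```python
import re
import string

def strip_with_str(word):
    # Remove leading and trailing ASCII punctuation only.
    return word.strip(string.punctuation)

def strip_with_regex(word):
    # Remove leading and trailing runs of non-word characters.
    return re.sub(r'^\W+|\W+$', '', word)

print(strip_with_str("Earl."))    # Earl
print(strip_with_regex("Earl."))  # Earl
```

Neither version touches punctuation inside the word, so "don't" comes back unchanged.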
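import string

s = "Hello, my name is Earl."

def get_word(text, position):
    # part of the word before the position (after the last space)
    _, _, start = text[:position].rpartition(' ')
    # part of the word from the position up to the next space
    word, _, _ = text[position:].partition(' ')
    return start + word

print(get_word(s, 21).strip(string.punctuation))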
The following solution collects the alphabetic characters around the given position:
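def get_word(text, position):
    if position < 0 or position >= len(text):
        return ''
    str_list = []
    # walk left from position, stopping at the start of the text
    i = position
    while i >= 0 and text[i].isalpha():
        str_list.insert(0, text[i])
        i -= 1
    # walk right from position + 1, stopping at the end of the text
    i = position + 1
    while i < len(text) and text[i].isalpha():
        str_list.append(text[i])
        i += 1
    return ''.join(str_list)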
The following is a test case:
get_word("Hello, my name is Earl.", 21) # 'Earl'
get_word("Hello, my name is Earl.", 20) # 'Earl'
I don't think it is a good idea to split the text into words with the split
function here, because the position is essential to this problem. If the text contains
consecutive blanks, split discards them, and the word list no longer lines up with the
original character positions.
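To illustrate the concern, here is a small sketch. With no arguments, split() collapses runs of whitespace, so the original offsets cannot be recovered from the word list. One alternative that keeps positions is to scan with re.finditer and compare the position against each match's span. This approach is an assumption of the example and was not proposed in the thread, and the name word_at is hypothetical.

```python
import re

def word_at(text, position):
    # Return the \w+ run whose span covers position, or '' if none does.
    for match in re.finditer(r'\w+', text):
        if match.start() <= position < match.end():
            return match.group()
    return ''

text = "Hello,   my name is Earl."   # three spaces after the comma
print(text.split())        # ['Hello,', 'my', 'name', 'is', 'Earl.']
print(word_at(text, 9))    # my
print(word_at(text, 7))    # prints an empty line: position 7 is a space
```

Because each match carries its own start and end offsets, extra blanks between words do not shift the result.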