Check if a substring is in a string in Python
I have two DataFrames. For each word in the first df, I need to check whether it occurs as a substring in any string of the second df, and collect the words that do occur into a list.
First df(word):
| word |
|---|
| apples |
| dog |
| cat |
| cheese |
Second df(sentence):
| sentence |
|---|
| apples grow on a tree |
| ... |
| I love cheese |
I tried this one:
tru = []
for i in word['word']:
    if i in sentence['sentence'].values:
        tru.append(i)
And this one:
tru = []
for i in word['word']:
    if sentence['sentence'].str.contains(i):
        tru.append(i)
I expect to get a list like ['apples',..., 'cheese']
One possible way is to use Series.str.extractall:
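import pandas as pd
# Sample data from the question, held as Series
df_word = pd.Series(["apples", "dog", "cat", "cheese"])
df_sentence = pd.Series(["apples grow on a tree", "i love cheese"])
# Join the words into one alternation pattern, "(apples|dog|cat|cheese)", and extract every match
matches = df_sentence.str.extractall(f"({'|'.join(df_word)})")
matches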
Output:
              0
  match
0 0      apples
1 0      cheese
You can then convert the results to a list:
matches[0].unique().tolist()
Output:
['apples', 'cheese']
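The second attempt in the question can also be made to work. `str.contains` returns a boolean Series with one value per sentence, so it cannot be used directly in an `if`; reducing it with `.any()` gives a single True/False. The following is only a minimal sketch: it assumes the DataFrame and column names from the question (`word['word']`, `sentence['sentence']`), uses the sample rows shown there, and names the result `found` for illustration. Passing `regex=False` makes each word a literal substring rather than a regular expression.

```python
import pandas as pd

# Sample data; names follow the question, rows are the ones shown there.
word = pd.DataFrame({"word": ["apples", "dog", "cat", "cheese"]})
sentence = pd.DataFrame({"sentence": ["apples grow on a tree", "I love cheese"]})

# str.contains yields one boolean per sentence, so collapse it with .any()
# before using it as a condition. regex=False means plain substring matching.
found = [
    w for w in word["word"]
    if sentence["sentence"].str.contains(w, regex=False).any()
]
print(found)  # ['apples', 'cheese']
```

This loops over the words in Python, so it may be slower than a single `extractall` call when the word list is long, but it avoids building a regular expression out of the words.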