I want to extract all the words in df1 that match entries in df2.
import pandas as pd
df1 = pd.DataFrame(['Dog has 4 legs.It has 2 eyes.', 'Fish has fins', 'Cat has paws.It eats fish', 'Monkey has tail'], columns=['Description'])
df2 = pd.DataFrame(['Fish', 'Legs', 'Eyes'], columns=['Parts'])
df1:
|---------------------------------|
| Description                     |
|---------------------------------|
| Dog has 4 legs.It has 2 eyes.   |
|---------------------------------|
| Fish has fins                   |
|---------------------------------|
| Cat has paws.It eats fish       |
|---------------------------------|
| Monkey has tail                 |
|---------------------------------|
df2:
|---------|
| Parts   |
|---------|
| Fish    |
|---------|
| Legs    |
|---------|
| Eyes    |
|---------|
Expected output:
|---------------------------------|-----------|
| Description                     | Parts     |
|---------------------------------|-----------|
| Dog has 4 legs.It has 2 eyes.   | Legs,Eyes |
|---------------------------------|-----------|
| Fish has fins                   | Fish      |
|---------------------------------|-----------|
| Cat has paws.It eats fish       | Fish      |
|---------------------------------|-----------|
| Monkey has tail                 |           |
|---------------------------------|-----------|
Posted on 2020-05-07 12:23:29
@Datanovice's solution is better in that everything stays within Pandas. Here is an alternative that is also faster (string operations in Pandas are not that fast):
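from itertools import product
from collections import defaultdict
# Lower-case the parts once so the comparison is case-insensitive
res = df2.Parts.str.lower().array
d = defaultdict(list)
# Check every (description, part) pair for a substring match
for description, word in product(df1.Description, res):
    if word in description.lower():
        d[description].append(word)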
d
defaultdict(list,
            {'Dog has 4 legs.It has 2 eyes.': ['legs', 'eyes'],
             'Fish has fins': ['fish'],
             'Cat has paws.It eats fish': ['fish']})
df1['parts'] = df1.Description.map(d).str.join(',')
                     Description      parts
0  Dog has 4 legs.It has 2 eyes.  legs,eyes
1                  Fish has fins       fish
2      Cat has paws.It eats fish       fish
3                Monkey has tail
https://stackoverflow.com/questions/61657342
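For comparison, here is a minimal sketch of one way to keep everything inside Pandas. It is not necessarily what @Datanovice posted; it assumes that a regex alternation built from df2.Parts and passed to Series.str.findall is acceptable, and the names pattern and matches are made up for illustration.

```python
import re
import pandas as pd

df1 = pd.DataFrame(
    ['Dog has 4 legs.It has 2 eyes.', 'Fish has fins',
     'Cat has paws.It eats fish', 'Monkey has tail'],
    columns=['Description'])
df2 = pd.DataFrame(['Fish', 'Legs', 'Eyes'], columns=['Parts'])

# Build a single alternation pattern from the lower-cased parts.
# re.escape guards against parts that contain regex metacharacters.
pattern = '|'.join(re.escape(part) for part in df2['Parts'].str.lower())

# str.findall returns a list of matches per row (an empty list if none).
matches = df1['Description'].str.lower().str.findall(pattern)

# Join each list into a comma-separated string; empty lists become ''.
df1['parts'] = matches.str.join(',')
print(df1)
```

Unlike the loop above, findall lists matches in the order they occur in the text and repeats a part that appears more than once, so the two approaches can differ on other inputs.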