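This should be an easy one. The task is to check whether the strings in one column contain all the words stored in another string, and then do something based on the result. Here is a simple example: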
import pandas as pd
df = pd.DataFrame({'Strings':["The brown","fox smoked 6", "cigarettes per day", "in his cave"],
'Set': ["Alpha", "Beta", "Gamma", "Delta"]})
>>> df
Set Strings
0 Alpha The brown
1 Beta fox smoked 6
2 Gamma cigarettes per day
3 Delta in his cave
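Now I would like to check, for each row of df["Strings"], whether it contains the word "smoked" and the number "6" (here the second row, index 1, matches). If it does, the new column df["Result"] should equal df["Set"] with the words "health damaging" appended. If not, it should just copy what is in df["Set"]. The output should look like this: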
>>> df_final
Set Strings Result
0 Alpha The brown Alpha
1 Beta fox smoked 6 Beta health damaging
2 Gamma cigarettes per day Gamma
3 Delta in his cave Delta
Answered on 2016-02-10 20:54:04
You can construct a mask for your two conditions and pass it to np.where:
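In [20]:
import numpy as np
# True only for rows whose string contains both '6' and 'smoked'
mask = (df['Strings'].str.contains('6')) & (df['Strings'].str.contains('smoked'))
In [23]:
# Append the label where the mask is True, otherwise copy 'Set' unchanged
df['Result'] = np.where(mask, df['Set'] + ' health damaging', df['Set'])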
df
Out[23]:
Set Strings Result
0 Alpha The brown Alpha
1 Beta fox smoked 6 Beta health damaging
2 Gamma cigarettes per day Gamma
3 Delta in his cave Delta
Here the mask tests for the presence of your strings using .str.contains, and the two conditions are combined with & to make the mask.
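If the required words really are stored together in another string, as the question describes, the same mask idea can be built in a loop instead of writing one condition per word. The following is only a sketch under assumptions: the variable name `required` and the choice to split it on whitespace are illustrative and not part of the original answer, and `regex=False` is used so each word is matched as a literal substring.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Strings': ["The brown", "fox smoked 6", "cigarettes per day", "in his cave"],
    'Set': ["Alpha", "Beta", "Gamma", "Delta"],
})

# Hypothetical input: the words to look for, stored in a single string
required = "smoked 6"

# Start from an all-True mask and AND in one condition per required word
mask = pd.Series(True, index=df.index)
for word in required.split():
    mask &= df['Strings'].str.contains(word, regex=False)

# Append the label where every word was found, otherwise copy 'Set'
df['Result'] = np.where(mask, df['Set'] + ' health damaging', df['Set'])
print(df)
```

Note that this is still a substring test, so "smoked" would also match inside a longer word such as "unsmoked". If whole-word matching matters, a regular expression with word boundaries would be one option.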
https://stackoverflow.com/questions/35325474