Q: Searching entire rows of a pandas DataFrame for multiple string values in Python

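Stack Overflow user

Asked 2018-06-14 04:44:05

5 answers, 10.2K views, 0 followers, 2 votes

In a pandas DataFrame, I want to search row by row for multiple string values. If a row contains one of the string values, the function should write a 1 for that row into a new, initially empty column at the end of the df, and a 0 otherwise.

There are already several tutorials on how to select rows of a pandas DataFrame that match a (partial) string.

For example: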

import pandas as pd

#create sample data
data = {'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
        'launched': [1983,1984,1984,1984],
        'discontinued': [1986, 1985, 1984, 1986]}

df = pd.DataFrame(data, columns = ['model', 'launched', 'discontinued'])
df

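I took the example above from this site: https://davidhamann.de/2017/06/26/pandas-select-elements-by-string/

How can I search entire rows for multiple values: 'int', 'tos', '198'?

The result should then go into a new column next to discontinued: a column int that holds 1 or 0 depending on whether the row contains that keyword.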

python

string

pandas

dataframe

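5 Answers

Stack Overflow user

Accepted answer

Posted 2018-06-27 05:14:49

The simplest way that avoids fancy pandas tools is to use nested for loops. I hope someone can offer a better solution, but this is how I would approach it: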

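def check_all_for(column_name, search_terms):
    # Flag each row of the global df: 1 if any cell contains any search term, else 0
    flags = []
    for _, row in df.iterrows():  # iterrows yields (index, row) pairs
        flag = 0
        for element in row:
            for search_term in search_terms:
                # case-insensitive substring test on the cell's string form
                if search_term.lower() in str(element).lower():
                    flag = 1
        flags.append(flag)
    df[column_name] = flags  # assign once; writing to the iterrows row would not update df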

This assumes you have already defined the DataFrame as df and want to flag the new column with 1s and 0s.
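As a variation on the loop approach, the DataFrame can be passed in explicitly instead of relying on a global df. This is only a sketch: the function name flag_rows and the column name 'found' are made up for illustration, and the sample data is the one from the question.

```python
import pandas as pd

def flag_rows(frame, column_name, search_terms):
    """Return a copy of frame with a 0/1 column marking rows that contain any term."""
    lowered = [term.lower() for term in search_terms]
    flags = []
    for _, row in frame.iterrows():
        # Compare against the lower-cased string form of every cell in the row
        cells = [str(value).lower() for value in row]
        flags.append(int(any(term in cell for term in lowered for cell in cells)))
    result = frame.copy()  # leave the caller's DataFrame untouched
    result[column_name] = flags
    return result

df = pd.DataFrame({'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
                   'launched': [1983, 1984, 1984, 1984],
                   'discontinued': [1986, 1985, 1984, 1986]})

flagged = flag_rows(df, 'found', ['int', 'tos', '1985'])
print(flagged['found'].tolist())  # [0, 1, 1, 1]
```

Returning a copy keeps the function free of side effects, at the cost of one extra copy of the data.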

Votes: 0

Stack Overflow user

Posted 2018-06-14 04:53:59

If you have

l=['int', 'tos', '198']

then you can use str.contains with the terms joined by '|' to find every model that contains any of these words

df.model.str.contains('|'.join(l))

0    False
1    False
2     True
3     True
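One caveat worth keeping in mind: str.contains interprets the pattern as a regular expression by default, so a search term containing characters such as '.' or '+' would need escaping before being joined with '|'. A minimal sketch, assuming the sample df from the question; the names terms, pattern and mask are chosen here for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
                   'launched': [1983, 1984, 1984, 1984],
                   'discontinued': [1986, 1985, 1984, 1986]})

terms = ['int', 'tos', '198']

# Escape each term so it is matched literally, then build a regex alternation
pattern = '|'.join(re.escape(term) for term in terms)

# case=False ignores case; na=False treats missing values as "no match"
mask = df['model'].str.contains(pattern, case=False, na=False)
print(mask.tolist())  # [False, False, True, True]
```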

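Edit

If the goal is to check all columns, as @jpp explained, I'd suggest the following. The output shown is for l = ['int', 'tos', '1985'], as in @jpp's answer; with '198' every row would match through the year columns.

from functools import reduce
m = '|'.join(l)  # regex alternation of the search terms
res = reduce(lambda a,b: a | b, [df[col].astype(str).str.contains(m) for col in df.columns])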

0    False
1     True
2     True
3     True

If you want it as a column of integer values, simply do

df['new_col'] = res.astype(int)

     new_col
0    0
1    1
2    1
3    1
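The question also mentions a column named after the keyword itself (a column int holding 1 or 0). One possible reading is one indicator column per search term. The following is a rough sketch under that assumption, using the sample df from the question; naming each new column after its term is a choice made here, not something the answers above prescribe.

```python
import re
import pandas as pd

df = pd.DataFrame({'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
                   'launched': [1983, 1984, 1984, 1984],
                   'discontinued': [1986, 1985, 1984, 1986]})

terms = ['int', 'tos', '198']

# Convert once, before adding columns, so the new flag columns are not searched
as_text = df.astype(str)

for term in terms:
    # Boolean DataFrame: True where a cell contains the term (matched literally)
    hits = as_text.apply(lambda col: col.str.contains(re.escape(term), case=False))
    # A row is flagged if any of its cells matched
    df[term] = hits.any(axis=1).astype(int)

print(df)
```

With this data, the int and tos columns flag only the two Macintosh rows, while 198 flags every row because all launch years contain it.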

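Votes: 4

Stack Overflow user

Posted 2018-06-14 06:56:28

If I understand correctly, you want to check whether any of the strings occurs in any column of each row. This is not trivial because you have mixed types (integers, strings). One way is to use pd.DataFrame.apply with a custom function.

The main point to remember is to convert the whole DataFrame to str type, since you cannot test for a substring inside an integer.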

match = ['int', 'tos', '1985']

def string_finder(row, words):
    if any(word in field for field in row for word in words):
        return True
    return False

df['isContained'] = df.astype(str).apply(string_finder, words=match, axis=1)

print(df)

            model  launched  discontinued  isContained
0            Lisa      1983          1986        False
1          Lisa 2      1984          1985         True
2  Macintosh 128K      1984          1984         True
3  Macintosh 512K      1984          1986         True
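A possible alternative to the row-wise custom function is to join each row into a single string and run one str.contains over the result. This is only a sketch, assuming the sample df from the question and that no search term contains the separator (a newline here), so a match cannot span two cells.

```python
import re
import pandas as pd

df = pd.DataFrame({'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
                   'launched': [1983, 1984, 1984, 1984],
                   'discontinued': [1986, 1985, 1984, 1986]})

match = ['int', 'tos', '1985']
pattern = '|'.join(re.escape(word) for word in match)

# Join the string form of every cell in a row into one text per row
row_text = df.astype(str).agg('\n'.join, axis=1)

df['isContained'] = row_text.str.contains(pattern)
print(df['isContained'].tolist())  # [False, True, True, True]
```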

Votes: 2

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: https://stackoverflow.com/questions/50845987
