Python: matching CSV values in a variable against a CSV file

Stack Overflow user
Asked 2018-07-08 19:19:11

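Using an API query, I receive a huge JSON response with many attributes.

I am trying to parse only certain fields from the response into comma-separated CSV format.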

    >>> import json
    >>> resp = { "status":"success", "msg":"", "data":[ { "website":"https://www.blahblah.com", "severity":"low", "location":"unknown", "asn_number":"AS4134 Chinanet", "longitude":121.3997000000, "epoch_timestamp":1530868957, "id":"c1e15eccdd1f31395506fb85" }, { "website":"https://www.jhonedoe.co.uk/sample.pdf", "severity":"low", "location":"unknown", "asn_number":"AS4134 Chinanet", "longitude":120.1613998413, "epoch_timestamp":1530868957, "id":"933bf229e3e95a78d38223b2" } ] }
    >>> response = json.loads(json.dumps(resp))
    >>> KEYS = 'website', 'asn_number', 'severity'
    >>> for attribute in response['data']:
    ...     csv_response = ','.join(attribute[key] for key in KEYS)
    ...     print(csv_response)

Printing csv_response gives the values of the queried keys:

https://www.blahblah.com,AS4134 Chinanet,low
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
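
A note on the join above: `','.join(...)` only works when every selected value is a string and none of them contains a comma. The following is a minimal sketch, assuming the same `KEYS` tuple and a hypothetical helper name `to_csv_line` (not part of the original code), that lets the standard `csv` module handle quoting and string conversion:

```python
import csv
import io

KEYS = ('website', 'asn_number', 'severity')


def to_csv_line(record, keys=KEYS):
    """Return one CSV-formatted line (no trailing newline) for the chosen keys."""
    buf = io.StringIO()
    # lineterminator='' keeps the result free of a trailing newline
    csv.writer(buf, lineterminator='').writerow([record[k] for k in keys])
    return buf.getvalue()


record = {
    "website": "https://www.blahblah.com",
    "severity": "low",
    "asn_number": "AS4134 Chinanet",
    "longitude": 121.3997,
}
print(to_csv_line(record))
```

For the sample data the output is the same as with the plain join; the difference only shows up when a value contains a comma or a quote character.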

Now, I have a CSV file in the /tmp/ directory.

/tmp$cat 08_july_2018.csv
http://download2.freefiles-10.de,AS24940 Hetzner Online GmbH,high
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
http://download2.freefiles-11.de,AS24940 Hetzner Online GmbH,high
www.solener.com,AS20718 ARSYS INTERNET S.L.,low
https://www.blahblah.com,AS4134 Chinanet,low
www.telewizjairadio.pl,AS29522 Krakowskie e-Centrum Informatyczne JUMP Dziedzic,high 

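I am trying to check whether the values we get from the JSON response (csv_response) are present in the /tmp/08_july_2018.csv file.

If a csv_response value matches any row in 08_july_2018.csv, I want to mark the condition as "passed".

Any suggestions on how to match the CSV values in a variable against the file in the /tmp/ directory and mark the condition as passed?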


1 Answer

Stack Overflow user

Answered 2018-07-08 19:22:07

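You can use Pandas (the code below is from a Jupyter notebook). Pandas gives you a lot of flexibility for matching columns in a CSV.

You need to add a header line to the file you want to read, so add: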

website,asn,severity

to the top of the 08_july_2018.csv file.
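
If editing the file is not an option, `read_csv` can be given the column names instead. A minimal sketch, assuming the column names from the header above; the `io.StringIO` object only stands in for the real file path so that the example is self-contained:

```python
import io

import pandas as pd

# Stand-in for /tmp/08_july_2018.csv (two of its rows, no header line)
raw = io.StringIO(
    "http://download2.freefiles-10.de,AS24940 Hetzner Online GmbH,high\n"
    "https://www.blahblah.com,AS4134 Chinanet,low\n"
)

# header=None: the first line is data; names=...: supply the column labels
t2 = pd.read_csv(raw, header=None, names=['website', 'asn', 'severity'])
print(t2.columns.tolist())
```

With a real file, the path string would be passed in place of `raw`.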

import pandas as pd
import json

resp = { "status":"success", "msg":"",
         "data":[ { "website":"https://www.blahblah.com", 
                    "severity":"low",
                    "location":"unknown",
                    "asn_number":"AS4134 Chinanet", 
                    "longitude":121.3997000000, 
                    "epoch_timestamp":1530868957, 
                    "id":"c1e15eccdd1f31395506fb85" },
                  { "website":"https://www.jhonedoe.co.uk/sample.pdf", 
                    "severity":"low",
                    "location":"unknown",
                    "asn_number":"AS4134 Chinanet", 
                    "longitude":120.1613998413, 
                    "epoch_timestamp":1530868957, 
                    "id":"933bf229e3e95a78d38223b2" } ] }


t1 = pd.DataFrame(resp['data'])
t1.set_index('website', inplace=True)
print(t1) 

t2 = pd.read_csv('/tmp/08_july_2018.csv')
t2.set_index('website', inplace=True)
print(t2) 

# To check whether a record is present in both sources, query the combined
# dataframe (t3), which holds all records aligned on the key (website).
# Columns from the first dataframe get the suffix _1, those from the second
# get _2, so columns that share a name in the original data can be told apart.
# Selecting the rows with equal severity gives the matching records;
# extend or modify the query for the fields you want to match on.
# All rows with equal severity:
#   (t3['severity_1'] == t3['severity_2'])
# Only rows where both severities are 'low':
#   (t3['severity_1'] == t3['severity_2']) & (t3['severity_1'] == 'low')
t3 = pd.concat([ t1.add_suffix('_1'), t2.add_suffix('_2')], axis=1)
t3['MTCH'] = t3[(t3['severity_1'] == t3['severity_2'])]['asn_number_1']
t3.dropna(inplace=True)
print(t3['MTCH'].values)

This prints:

...
['AS4134 Chinanet' 'AS4134 Chinanet']
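
As a variation on the `concat` approach, a left merge with `indicator=True` labels each API record as found or not found in the file. This is a sketch rather than the method used above: the small frames `api` and `csv_rows`, the `status` column, and the `example.invalid` row are made up for illustration, and both frames are assumed to use the same column names.

```python
import pandas as pd

# Records from the API response (second row is a made-up non-matching example)
api = pd.DataFrame({
    'website': ['https://www.blahblah.com', 'https://example.invalid/x'],
    'asn': ['AS4134 Chinanet', 'AS0 Example'],
    'severity': ['low', 'high'],
})

# Rows as they would be read from the CSV file
csv_rows = pd.DataFrame({
    'website': ['https://www.blahblah.com', 'www.solener.com'],
    'asn': ['AS4134 Chinanet', 'AS20718 ARSYS INTERNET S.L.'],
    'severity': ['low', 'low'],
})

# how='left' keeps every API record; indicator=True adds a '_merge' column
# whose value is 'both' when the same row also exists in csv_rows
merged = api.merge(csv_rows, on=['website', 'asn', 'severity'],
                   how='left', indicator=True)
merged['status'] = (merged['_merge'] == 'both').map({True: 'passed', False: 'failed'})
print(merged[['website', 'status']])
```

Matching on all three columns means a record only passes when the whole row agrees; shortening the `on` list loosens the match.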

Alternatively, iterate over all matching records in t3 and pick the fields you need from each row:

for i, row in t3[(t3['severity_1'] == t3['severity_2'])].iterrows():
   print(i, row['severity_2']) # add other fields from t3           
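
If exact whole-line equality is enough, the check from the question can also be sketched without Pandas. The helper names `load_lines` and `check` are hypothetical, the `example.invalid` line is made up to show a failing case, and a temporary file stands in for /tmp/08_july_2018.csv. Lines are stripped before comparison because a stray trailing space in the file would otherwise prevent a match.

```python
import os
import tempfile


def load_lines(path):
    """Read a file into a set of whitespace-stripped, non-empty lines."""
    with open(path, encoding='utf-8') as fh:
        return {line.strip() for line in fh if line.strip()}


def check(csv_responses, known_lines):
    """Map each response line to 'passed' or 'failed'."""
    return {line: ('passed' if line in known_lines else 'failed')
            for line in csv_responses}


# Temporary stand-in for the CSV file (note the trailing space on the last row)
sample = (
    "https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low\n"
    "www.solener.com,AS20718 ARSYS INTERNET S.L.,low\n"
    "https://www.blahblah.com,AS4134 Chinanet,low \n"
)
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write(sample)
path = tmp.name

known = load_lines(path)
os.remove(path)

results = check(
    [
        "https://www.blahblah.com,AS4134 Chinanet,low",
        "https://example.invalid/x,AS0 Example,high",
    ],
    known,
)
for line, status in results.items():
    print(status, line)
```

A set makes each lookup cheap, which matters when the file is large and many response lines have to be checked.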
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/51231442
