I am very new to using Python with CSV files and web data, and I am trying to find a solution to my problem. I have two CSV files (urls and visits).
url.csv columns: short_code, full_url, time_created, user_id, premium_user, country
visits.csv columns: short_code, visit_time, browser_type, version, platform, ipaddress, country
I am writing some Python code to get the following:
1. Return urls which only have visitors from the same country as the url was created from
2. Get the URL with the shortest time between when the URL was created and when the first visit was recorded
3. Gets a count of visits to each short code by each unique visitor
Below is my code, which so far only imports the data from my cloud storage.
Links to my files: https://www.dropbox.com/s/u193iv6ybeges92/url.csv?dl=0 https://www.dropbox.com/s/u3vmdra41p3qjgv/visits.csv?dl=0
import csv
import requests
from pprint import pprint


def same_country_only(visits, urls):
    """Return urls which only have visitors from the same country as the url was created from"""
    pass


def shortest_first_visit(visits, urls):
    """Get the URL with the shortest time between when the URL was created and when the first visit was recorded"""
    pass


def unique_visitors(visits, urls):
    """Gets a count of visits to each short code by each unique visitor"""
    pass


if __name__ == '__main__':
    urls_response = requests.get('<<my_url>>').text
    urls_dr = csv.DictReader(urls_response.splitlines(), delimiter=',')
    urls = [dict(url) for url in urls_dr]
    pprint(urls[0])  # example format
    print('\n' + '*' * 60 + '\n')

    visits_response = requests.get('<<my_url>>').text
    visits_dr = csv.DictReader(visits_response.splitlines(), delimiter=',')
    visits = [dict(visit) for visit in visits_dr]
    pprint(visits[0])  # example format
    print('\n' + '*' * 60 + '\n')

    pprint(same_country_only(visits, urls))
    pprint(shortest_first_visit(visits, urls))
    pprint(unique_visitors(visits, urls))
Any help is greatly appreciated.
Sample CSV (the first row is the header)
url.csv
id  short_cod  long_url    created_ti  creator_id  premium  country
1   GTq6Bl     https://w   2018-07-2   78          FALSE    CA
2   EmazTI     https://as  2018-07-2   124         FALSE    GB
3   tT54Bl     https://bi  2018-07-2   97          FALSE    GB
4   6ZTSle     https://gi  2018-07-2   98          FALSE    US
5   3akWjJ     https://e   2018-07-2   11          FALSE    JP
6   m7NoUy     https://bl  2018-07-2   34          TRUE     JP
7   lszSBy     https://m   2018-07-2   90          FALSE    US
8   PnTavE     https://ha  2018-07-2   1           FALSE    GB
9   QkXxbV     https://d   2018-07-2   109         FALSE    CN
visits.csv
browser_t  visit_time  short_cod  country  platform  ip_address
Chrome     2018-07-2   GTq6Bl     IT       Windows   78.110.51.215
Firefox    2018-07-2   GTq6Bl     IT       Linux     27.243.245.232
Chrome     2018-07-2   GTq6Bl     JP       Mac OS    97.155.155.73
Chrome     2018-07-2   GTq6Bl     RU       Linux     85.201.130.148
Chrome     2018-07-2   GTq6Bl     GB       Linux     26.90.189.168
Chrome     2018-07-2   GTq6Bl     CN       Android   58.203.242.175
Edge       2018-07-2   GTq6Bl     KR       Windows   84.11.120.228
Safari     2018-07-2   GTq6Bl     KR       iOS       46.72.81.132
Firefox    2018-07-2   GTq6Bl     IT       Linux     30.47.125.89
Safari     2018-07-2   GTq6Bl     CA       iOS       85.245.10.160
Firefox    2018-07-2   GTq6Bl     RU       Windows   43.13.144.48
Chrome     2018-07-2   GTq6Bl     IT       Android   65.74.182.22
Posted on 2018-08-19 00:48:47
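Use pandas for this. pandas can join two CSV files on a common column with merge; the `on` argument names the key column used for the join (here, short_code).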
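To make that concrete, here is a minimal sketch of how the three functions from the question might look with pandas. It is not a definitive implementation. The sample rows are made up for illustration, the column names are assumed to match the column lists in the question (short_code, time_created, visit_time, ipaddress, country), and the helper `merge_visits` is a hypothetical name. It also assumes a "unique visitor" means a distinct IP address, which the question does not specify.

```python
import pandas as pd

# Made-up sample rows; column names are assumed to follow the question's lists.
urls = pd.DataFrame({
    "short_code": ["aaa", "bbb", "ccc"],
    "time_created": ["2018-07-01 10:00:00", "2018-07-01 11:00:00",
                     "2018-07-01 12:00:00"],
    "country": ["CA", "GB", "US"],
})
visits = pd.DataFrame({
    "short_code": ["aaa", "aaa", "bbb", "bbb", "bbb", "ccc"],
    "visit_time": ["2018-07-01 10:05:00", "2018-07-01 10:30:00",
                   "2018-07-01 11:00:30", "2018-07-01 11:20:00",
                   "2018-07-01 11:25:00", "2018-07-01 12:10:00"],
    "ipaddress": ["1.1.1.1", "1.1.1.1", "2.2.2.2", "3.3.3.3",
                  "2.2.2.2", "4.4.4.4"],
    "country": ["CA", "CA", "GB", "IT", "GB", "US"],
})


def merge_visits(visits, urls):
    """Hypothetical helper: join each visit to its URL on short_code."""
    # Both frames have a "country" column, so the suffixes keep them apart.
    merged = visits.merge(urls, on="short_code", suffixes=("_visit", "_url"))
    merged["visit_time"] = pd.to_datetime(merged["visit_time"])
    merged["time_created"] = pd.to_datetime(merged["time_created"])
    return merged


def same_country_only(visits, urls):
    """Short codes whose visits all come from the URL's own country."""
    merged = merge_visits(visits, urls)
    same = merged["country_visit"] == merged["country_url"]
    # A short code qualifies only if every one of its visits matches.
    # URLs with no visits at all drop out of the inner join.
    all_same = same.groupby(merged["short_code"]).all()
    return sorted(all_same[all_same].index)


def shortest_first_visit(visits, urls):
    """Short code with the smallest gap between creation and first visit."""
    merged = merge_visits(visits, urls)
    merged["delay"] = merged["visit_time"] - merged["time_created"]
    first_delay = merged.groupby("short_code")["delay"].min()
    best = first_delay.sort_values().index[0]
    return best, first_delay[best]


def unique_visitors(visits, urls):
    """Visit counts per (short_code, ipaddress) pair.

    Assumption: one IP address equals one unique visitor.
    """
    return visits.groupby(["short_code", "ipaddress"]).size().to_dict()
```

With real data you would build the two frames with `pd.read_csv` instead of the inline dictionaries and then call the functions the same way as in the question's `__main__` block. The default inner join is a deliberate choice here: a URL that was never visited has no first visit to measure, so it is left out rather than reported with a missing value.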
https://stackoverflow.com/questions/51868723