
Q: Pandas: count values between other repeating values

Stack Overflow user

Asked 2022-08-16 14:31:14

2 answers, 39 views, 0 followers, 0 votes

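I'm working with some vehicle detection data and want to extract the number of vehicles detected during green lights and the number detected during red lights.

What is the most efficient way to count the vehicles detected between each green and red light start, as a percentage of all detected vehicles?

Green light start is Event Code = 1,
red light start is Event Code = 10,
vehicle detected is Event Code = 82.

CSV sample: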

Signal Id,Timestamp,Event Code,Event Parameter
14,2022-08-01 13:10:49.600,1,8
14,2022-08-01 13:10:52.500,82,32
14,2022-08-01 13:10:58.000,82,32
14,2022-08-01 13:11:01.200,82,32
14,2022-08-01 13:11:03.700,82,32
14,2022-08-01 13:11:04.200,82,32
14,2022-08-01 13:11:10.100,82,32
14,2022-08-01 13:11:16.000,82,32
14,2022-08-01 13:11:45.500,10,8
14,2022-08-01 13:12:10.200,82,32
14,2022-08-01 13:12:19.300,82,32
14,2022-08-01 13:12:30.300,82,32
14,2022-08-01 13:12:46.600,1,8
14,2022-08-01 13:12:51.400,82,32
14,2022-08-01 13:13:35.600,82,32
14,2022-08-01 13:13:42.800,10,8
14,2022-08-01 13:13:52.000,82,32
14,2022-08-01 13:13:57.000,82,32
14,2022-08-01 13:14:03.300,82,32
14,2022-08-01 13:14:04.500,82,32
14,2022-08-01 13:14:09.300,1,8
14,2022-08-01 13:14:29.800,82,32
14,2022-08-01 13:14:42.200,82,32
14,2022-08-01 13:14:46.000,82,32
14,2022-08-01 13:14:47.400,82,32
14,2022-08-01 13:15:36.800,10,8

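For this snippet, that would be 13 on green and 7 on red, so 65% of vehicles arrive on green.

I have the file broken down into a single direction, because my first attempt involved adding a column, parsing the CSV line by line, flipping a boolean back and forth every time a code 1 or 10 went by, and writing that value into the new column next to each detection. That seemed very rudimentary, and I thought pandas might have a better way to do the calculation. I looked into the groupby() method, but I think I would need to modify a detector number, which would again mean parsing line by line and changing the number. Is there a better, more efficient way to extract this data?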
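For reference, the row-by-row approach described above might look roughly like the sketch below. It is only an illustration of that baseline: `SAMPLE` and `count_by_phase` are hypothetical names, and the CSV text is a handful of non-contiguous rows taken from the sample, so the counts differ from the 13 and 7 of the full snippet.

```python
import csv
import io

# A few rows picked from the question's CSV sample (not the full snippet)
SAMPLE = """Signal Id,Timestamp,Event Code,Event Parameter
14,2022-08-01 13:10:49.600,1,8
14,2022-08-01 13:10:52.500,82,32
14,2022-08-01 13:10:58.000,82,32
14,2022-08-01 13:11:45.500,10,8
14,2022-08-01 13:12:10.200,82,32
14,2022-08-01 13:12:46.600,1,8
"""


def count_by_phase(lines):
    """Count detections (code 82) under the most recent signal state."""
    counts = {"Green": 0, "Red": 0}
    phase = None  # unknown until the first code 1 or 10 is seen
    for row in csv.DictReader(lines):
        code = int(row["Event Code"])
        if code == 1:
            phase = "Green"
        elif code == 10:
            phase = "Red"
        elif code == 82 and phase is not None:
            counts[phase] += 1
    return counts


counts = count_by_phase(io.StringIO(SAMPLE))
print(counts)  # {'Green': 2, 'Red': 1}
```

Detections that appear before the first signal event are skipped here, since their phase cannot be known from the file alone.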

python-3.x

pandas

2 Answers

Stack Overflow user

Accepted answer

Posted 2022-08-16 14:40:25

You can use boolean masking and value_counts.

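# True on signal-change rows (codes 1 and 10), False on detections (82)
m = df['Event Code'].ne(82) # or .isin([1, 10])

# Keep only the signal codes, carry them forward onto the detections,
# then label and count the detection rows
out = (df['Event Code'].where(m).ffill()[~m]
                       .map({1: 'Green', 10: 'Red'})
                       .value_counts()
       )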

Output:

Green    13
Red       7
Name: Event Code, dtype: int64

With .value_counts(normalize=True):

Green    0.65
Red      0.35
Name: Event Code, dtype: float64
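To try this without the CSV file, a minimal self-contained sketch could rebuild only the Event Code column in the same order as the question's sample. The variable names (`codes`, `is_signal`, `phase`, `detections`) are illustrative, and mapping the codes to labels before the forward fill is a small variation on the snippet above, not a different technique.

```python
import pandas as pd

# Event codes in the same order as the question's CSV sample
codes = ([1] + [82] * 7 + [10] + [82] * 3 + [1] + [82] * 2
         + [10] + [82] * 4 + [1] + [82] * 4 + [10])
df = pd.DataFrame({"Signal Id": 14, "Event Code": codes})

# True on signal-change rows, False on vehicle detections
is_signal = df["Event Code"].isin([1, 10])

# Label the signal rows (detections become NaN), then carry each label forward
phase = df["Event Code"].map({1: "Green", 10: "Red"}).ffill()

# Keep only the detection rows and count them per phase
detections = phase[~is_signal]
counts = detections.value_counts()
shares = detections.value_counts(normalize=True)

print(counts.to_dict())  # {'Green': 13, 'Red': 7}
print(shares.to_dict())
```

Any detections logged before the first signal event would keep a NaN label and be dropped by value_counts, which matches the idea that their phase is unknown.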

Votes: 1

Stack Overflow user

Posted 2022-08-16 14:58:41

You can try this.

import pandas as pd

# `data` and `columns` are assumed to hold the CSV rows and header names from the question
df = pd.DataFrame(data, columns=columns)

# Label the signal-change rows, then carry each label forward to the detections that follow
df['Event Type'] = None
df.loc[df['Event Code'] == 1, 'Event Type'] = 'green light start'
df.loc[df['Event Code'] == 10, 'Event Type'] = 'red light start'
df['Event Type'] = df['Event Type'].ffill()

# Exclude the signal-change rows themselves so that only detections are counted
cars_on_green_light = df[(df['Event Type'] == 'green light start') & (df['Event Code'] != 1)].shape[0]
cars_on_red_light = df[(df['Event Type'] == 'red light start') & (df['Event Code'] != 10)].shape[0]

total_cars_arriving = df[df['Event Code'] == 82].shape[0]

percent_green_cars = cars_on_green_light / total_cars_arriving * 100
percent_red_cars = cars_on_red_light / total_cars_arriving * 100


print(f"""
cars_on_green_light : {cars_on_green_light}
cars_on_red_light   : {cars_on_red_light}
total_cars_arriving : {total_cars_arriving}
percent_green_cars  : {percent_green_cars}
percent_red_cars    : {percent_red_cars}
""")

Output:

cars_on_green_light : 13
cars_on_red_light   : 7
total_cars_arriving : 20
percent_green_cars  : 65.0
percent_red_cars    : 35.0
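If a breakdown per individual green or red phase is also useful, the same forward-fill idea can be combined with a running phase number. The sketch below is one possible way to do that, not part of the answer above; the column names `Phase Id`, `Phase`, and `Detected` are made up for illustration, and the event codes follow the order of the question's sample.

```python
import pandas as pd

# Event codes in the same order as the question's CSV sample
codes = ([1] + [82] * 7 + [10] + [82] * 3 + [1] + [82] * 2
         + [10] + [82] * 4 + [1] + [82] * 4 + [10])
df = pd.DataFrame({"Event Code": codes})

# Every signal-change row starts a new phase, so a cumulative sum numbers the phases
is_signal = df["Event Code"].isin([1, 10])
df["Phase Id"] = is_signal.cumsum()

# Label each row with the colour of the phase it belongs to
df["Phase"] = df["Event Code"].map({1: "Green", 10: "Red"}).ffill()

# Count the detections (code 82) inside each phase
per_phase = (
    df.assign(Detected=df["Event Code"].eq(82))
      .groupby(["Phase Id", "Phase"])["Detected"]
      .sum()
      .reset_index()
)
print(per_phase)
```

Summing `Detected` by `Phase` gives back the overall 13 green and 7 red totals; the final red phase shows 0 because the sample ends on the signal change itself.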

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/73375714
