I have a requirement: I need to find the most popular start hour. Below is the code that helped me arrive at the correct solution.
import time
import pandas as pd
import numpy as np
# (earlier code omitted here; it leads up to the following steps)
df = pd.read_csv(CITY_DATA[selected_city])
# convert the Start Time column to datetime
df['Start Time'] = pd.to_datetime(df['Start Time'])
# extract hour from the Start Time column to create an hour column
df['hour'] = df['Start Time'].dt.hour
# extract month and day of week from Start Time to create new columns
df['month'] = df['Start Time'].dt.month
# .dt.weekday_name was removed in pandas 1.0; .dt.day_name() is its replacement
df['day_of_week'] = df['Start Time'].dt.day_name()
# find the most popular hour
popular_hour = df['hour'].mode()[0]
Below is the sample output I get when I run
print(df['hour'])
0 15
1 17
2 8
3 13
4 14
5 9
6 9
7 17
8 16
9 17
10 7
11 17
Name: hour, Length: 300000, dtype: int64
The output I get when I run
print(type(df['hour']))
<class 'pandas.core.series.Series'>
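The most popular start hour is stored in popular_hour, and it equals 17 (which is the correct value).
However, I don't understand .mode().
What does .mode() do here, and why is it used?
Can the same concept be used to compute the most popular month and the most popular day of the week, regardless of their data types?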
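To make the question concrete, here is a minimal sketch of how .mode() behaves on a small Series. The hour values are the twelve rows visible in the sample output above, not the full 300000-row column, and the timestamps in start_times are hypothetical values made up for illustration. As far as I can tell, .mode() returns a Series (because several values can tie for most frequent), and indexing with [0] picks the first of them; it seems to work the same way for integers and for strings.

```python
import pandas as pd

# The twelve hour values visible in the sample output (not the full column).
hours = pd.Series([15, 17, 8, 13, 14, 9, 9, 17, 16, 17, 7, 17], name="hour")

# .mode() returns a Series holding the most frequent value(s), sorted ascending.
# It is a Series rather than a scalar because several values can tie.
hour_modes = hours.mode()
popular_hour = hour_modes[0]  # [0] picks the first (here the only) mode
print(popular_hour)

# Hypothetical timestamps, made up for illustration only.
start_times = pd.to_datetime(pd.Series([
    "2017-01-02 09:00:00",  # Monday
    "2017-01-09 17:30:00",  # Monday
    "2017-06-14 08:15:00",  # Wednesday
    "2017-06-16 17:45:00",  # Friday
]))
months = start_times.dt.month          # integer column
day_names = start_times.dt.day_name()  # string column

# Months 1 and 6 both appear twice, so .mode() returns both of them.
popular_month_modes = months.mode()
print(popular_month_modes.tolist())

# The same call works on strings; "Monday" appears twice.
popular_day = day_names.mode()[0]
print(popular_day)
```

In the tied case, taking [0] silently keeps only the smallest of the tied values, so it may be worth checking the length of the Series that .mode() returns before indexing into it.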
https://stackoverflow.com/questions/52996816