I want to create a stacked histogram like the one shown below.
Here is my code:
import numpy as np
import pandas as pd
import datetime
import matplotlib.pyplot as plt

def stackhist(x, y):
    # split the values in x by the labels in y
    # (the top-level pd.groupby function was removed in pandas 1.0)
    grouped = x.groupby(y)
    data = [d for _, d in grouped]
    labels = [l for l, _ in grouped]
    plt.figure(figsize=(20, 10))
    plt.hist(data, histtype="bar", stacked=True, label=labels)
    plt.legend()

# make data distribution
mu, sigma = 12.2, 1.2
distribution = np.random.normal(mu, sigma, 200)
# turn each fractional hour into an 'HH:MM:SS' string
times = [(datetime.time(hour=int(x), minute=int((x - int(x))*60.0), second=int(((x - int(x)) * 60 - int((x - int(x))*60.0))*60.0))).strftime('%H:%M:%S') for x in distribution]
df = pd.DataFrame(columns=['time', 'department'])
df.time = times
df['department'] = df['department'].fillna(pd.Series(np.random.choice(['Shoes', 'Hats', 'Shirts', 'Pants'],
                                                                      p=[0.1, 0.15, 0.375, 0.375], size=len(df))))
stackhist(df['time'], df['department'])
plt.show()
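Here is the output. Note that the x labels are all the different times stacked on top of one another. How can I make the axis show only the hours, 10-11-12-13-14-15-16, rather than the minutes?
Thanks for your attention.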
Posted on 2018-07-25 08:42:52
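Your first real problem here is that you are not keeping your data as datetime.time elements: the chained strftime call at the end of the expression turns every value into a string. You end up with time strings, which matplotlib treats as separate absolute labels instead of doing what you want, namely binning them by hour.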
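To make the difference concrete, here is a minimal sketch using only the standard library. The sample value 12.75 is a hypothetical input chosen for illustration, not a value from the question's data. It shows that strftime returns a string, while the hour attribute of the same datetime.time object is a plain integer that can be binned numerically.

```python
import datetime

# Hypothetical sample value: 12.75 as a fractional hour, i.e. 12:45:00
x = 12.75
hours = int(x)
minutes = int((x - hours) * 60)
t = datetime.time(hour=hours, minute=minutes)

as_string = t.strftime('%H:%M:%S')  # a str: every distinct time becomes its own label
as_hour = t.hour                    # an int: values can be grouped into hourly bins

print(type(as_string).__name__, as_string)  # str 12:45:00
print(type(as_hour).__name__, as_hour)      # int 12
```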
The code below shows how to fix your times and then produces the plot.
Let me know if this makes sense.
import numpy as np
import pandas as pd
import datetime
import matplotlib.pyplot as plt

def stackhist(x, y):
    # split the values in x by the labels in y
    # (the top-level pd.groupby function was removed in pandas 1.0)
    grouped = x.groupby(y)
    data = [d for _, d in grouped]
    labels = [l for l, _ in grouped]
    plt.figure(figsize=(20, 10))
    plt.hist(data, histtype="bar", stacked=True, label=labels)
    plt.legend()

mu, sigma = 12.2, 1.2
distribution = np.random.normal(mu, sigma, 1000)
# only pull the hour from the datetime time
times = [(datetime.time(hour=int(x), minute=int((x - int(x))*60.0), second=int(((x - int(x)) * 60 - int((x - int(x))*60.0))*60.0))).strftime('%H') for x in distribution]
# make data frame since you used one
df = pd.DataFrame(columns=['time', 'department'])
df.time = times
# set times to integer instead of string so they will sort automatically
df['time'] = df['time'].astype(int)
# fill department data
df['department'] = df['department'].fillna(pd.Series(np.random.choice(['Shoes', 'Hats', 'Shirts', 'Pants'],
                                                                      p=[0.1, 0.15, 0.375, 0.375], size=len(df))))
stackhist(df['time'], df['department'])
plt.show()
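One detail the code above leaves to matplotlib's defaults is where the bins fall: plt.hist uses 10 equal-width bins unless told otherwise, and those do not generally line up with whole hours. The following is a minimal sketch of one way to get one bar per hour, not part of the original answer. The seeded random generator, the clipping to a valid hour range, and the helper name hour_bin_edges are all assumptions introduced for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: seeded synthetic data standing in for the question's data set
rng = np.random.default_rng(0)
hours = np.clip(rng.normal(12.2, 1.2, 1000), 0, 23.99).astype(int)
departments = rng.choice(['Shoes', 'Hats', 'Shirts', 'Pants'],
                         p=[0.1, 0.15, 0.375, 0.375], size=hours.size)
df = pd.DataFrame({'time': hours, 'department': departments})

def hour_bin_edges(values):
    """Hypothetical helper: edges at h - 0.5 so each bar is centred on a whole hour."""
    return np.arange(values.min() - 0.5, values.max() + 1.5, 1.0)

edges = hour_bin_edges(df['time'])

# One Series of hours per department, in the same order as the labels
grouped = df.groupby('department')['time']
data = [d for _, d in grouped]
labels = [l for l, _ in grouped]

fig, ax = plt.subplots(figsize=(10, 5))
ax.hist(data, bins=edges, histtype='bar', stacked=True, label=labels)
# Put one tick on every whole hour that occurs in the data
ax.set_xticks(np.arange(df['time'].min(), df['time'].max() + 1))
ax.set_xlabel('hour of day')
ax.legend()
```

Call plt.show() afterwards to display the figure. Passing explicit edges through the bins argument keeps each bar tied to exactly one hour, whatever range the random sample happens to cover.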
https://stackoverflow.com/questions/51508811