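Maybe some of you can help me draw the chart described below.
I have a data frame with survey data on whether people traveled last year (Yes, No) and which means of transportation they used (Airplane, Car, Train).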
import pandas as pd
import numpy as np
data = {'Travel': ['Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
        'Transporation': ['Airplaine', np.nan, 'Car', 'Train', 'Train', 'Car', 'np.nan']}
df = pd.DataFrame(data, columns=['Travel', 'Transporation'])
  Travel Transporation
0    Yes     Airplaine
1     No           NaN
2    Yes           Car
3    Yes         Train
4    Yes         Train
5    Yes           Car
6     No           NaN
I plotted a count plot for the first question and added the relative percentages of respondents who answered Yes and No.
import matplotlib.pyplot as plt
import seaborn as sns
ax = sns.countplot(y='Travel', data=df, palette=['green', 'red'])
ax.set_yticklabels(ax.get_yticklabels(), rotation=45)
ax.set_title('Travel last year')
ax.set_ylabel('')
total = df.shape[0]
for p in ax.patches:
    # the bar width is the count, so width / total is the share of respondents
    percentage = '{:.1f}%'.format(100 * p.get_width() / total)
    x = p.get_x() + p.get_width()  # right end of the bar
    y = p.get_y() + p.get_height() / 2 - 0.05
    ax.annotate(percentage, (x, y), size=12)
plt.show()
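The percentages annotated on the bars can be cross-checked without plotting. This is only a small sketch that uses the Travel column of the sample data; the variable name `shares` is introduced here for illustration.

```python
import pandas as pd

# Travel column from the sample data above
df = pd.DataFrame({'Travel': ['Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'No']})

# Share of each answer in percent, rounded to one decimal like the bar labels
shares = df['Travel'].value_counts(normalize=True).mul(100).round(1)
print(shares)
```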

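In the same plot, I would like to turn the Yes bar into a stacked bar that shows which means of transportation the people who answered Yes used. The final figure should look like this: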

Answered 2021-08-02 13:39:06
The simplest way I know is to group the pandas DataFrame like this:
# fillna covers the real NaN; replace covers the literal string 'np.nan' in the sample data
df_plot = df.fillna('_Hidden').replace('np.nan', '_Hidden').groupby(['Travel', 'Transporation']).size().reset_index().pivot(columns = 'Transporation', index = 'Travel', values = 0)
Then you can plot it with:
ax = df_plot.plot(kind = 'barh', stacked = True)
Finally, you can add the percentages:
total = df.shape[0]
yes = len(df[df['Travel'] == 'Yes'])/total
no = len(df[df['Travel'] == 'No'])/total
for p in ax.patches:
    width, height = p.get_width(), p.get_height()
    x, y = p.get_xy()
    x = x + width                    # right end of this segment
    y = y + height / 2 - 0.05
    # a segment that ends exactly at the full bar length gets the overall share
    if x/total == yes:
        ax.annotate(f'{round(100*yes, 1)}%', (x, y), size = 12)
    if x/total == no:
        ax.annotate(f'{round(100*no, 1)}%', (x, y), size = 12)
    if width != 0:
        x, y = p.get_xy()
        if y > 0:                    # only the Yes bar (the upper one) gets segment labels
            ax.text(x + width/2,
                    y + height/2,
                    '{:.0f} %'.format(100*width/(yes*total)),
                    horizontalalignment = 'center',
                    verticalalignment = 'center')
Full code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = {'Travel': ['Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
        'Transporation': ['Airplaine', np.nan, 'Car', 'Train', 'Train', 'Car', 'np.nan']}
df = pd.DataFrame(data, columns = ['Travel','Transporation'])
# count respondents per (Travel, Transporation) pair and pivot to one column per transport
df_plot = df.fillna('_Hidden').replace('np.nan', '_Hidden').groupby(['Travel', 'Transporation']).size().reset_index().pivot(columns = 'Transporation', index = 'Travel', values = 0)
ax = df_plot.plot(kind = 'barh', stacked = True)
ax.legend(['Airplaine', 'Car', 'Train'])
ax.set_yticklabels(ax.get_yticklabels(), rotation = 45)
ax.set_title('Travel last year')
ax.set_ylabel('')
total = df.shape[0]
yes = len(df[df['Travel'] == 'Yes'])/total
no = len(df[df['Travel'] == 'No'])/total
for p in ax.patches:
    width, height = p.get_width(), p.get_height()
    x, y = p.get_xy()
    x = x + width                    # right end of this segment
    y = y + height / 2 - 0.05
    # a segment that ends exactly at the full bar length gets the overall share
    if x/total == yes:
        ax.annotate(f'{round(100*yes, 1)}%', (x, y), size = 12)
    if x/total == no:
        ax.annotate(f'{round(100*no, 1)}%', (x, y), size = 12)
    if width != 0:
        x, y = p.get_xy()
        if y > 0:                    # only the Yes bar (the upper one) gets segment labels
            ax.text(x + width/2,
                    y + height/2,
                    '{:.0f} %'.format(100*width/(yes*total)),
                    horizontalalignment = 'center',
                    verticalalignment = 'center')
plt.show()
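The labels above are placed by comparing patch coordinates with `==`, which works for this sample but could misfire if an intermediate segment happened to end at the same value. A possible variant, sketched below under the same sample data, builds the counts with `pd.crosstab` and derives the label positions from the counts instead of from the patches. The helper name `plot_travel` is hypothetical and not part of the answer; it assumes the bars are drawn at positions 0, 1, ... in index order, which is how pandas lays out bar plots.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def plot_travel(df):
    """Hypothetical helper (not from the answer): stacked barh with percentage labels."""
    # Same '_Hidden' placeholder as above for the real NaN and the string 'np.nan'
    transport = df['Transporation'].fillna('_Hidden').replace('np.nan', '_Hidden')
    counts = pd.crosstab(df['Travel'], transport)
    ax = counts.plot(kind='barh', stacked=True, legend=False)
    ax.legend()  # the automatic legend skips labels that start with '_'
    total = len(df)
    for pos, (answer, row) in enumerate(counts.iterrows()):
        row_total = row.sum()
        # overall share of this answer, written at the right end of the bar
        ax.annotate(f'{100 * row_total / total:.1f}%', (row_total, pos),
                    va='center', size=12)
        if answer != 'Yes':
            continue
        left = 0
        for n in row:
            if n > 0:
                # share of this transport among the Yes respondents
                ax.text(left + n / 2, pos, f'{100 * n / row_total:.0f} %',
                        ha='center', va='center')
            left += n
    return ax, counts

data = {'Travel': ['Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
        'Transporation': ['Airplaine', np.nan, 'Car', 'Train', 'Train', 'Car', 'np.nan']}
df = pd.DataFrame(data)
ax, counts = plot_travel(df)
```

Call `plt.show()` afterwards to display the figure. Because `crosstab` fills missing combinations with 0 rather than NaN, no float comparison against the bar geometry is needed.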

https://stackoverflow.com/questions/68622348