Q: How do I import a CSV into separate variables so I can "count" occurrences for a Matplotlib bar chart?

Stack Overflow user

Asked 2018-06-04 03:34:14

Answers: 1, Views: 110, Followers: 0, Votes: 0

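I am currently writing some descriptive analysis of a CSV file.

What I want to do is count how often each location occurs in the CSV file and plot those counts as a bar chart.

I would like to know whether there is a way to import the "many" different locations as separate variables, because I need to add 1 to the matching variable whenever a regular expression matches a row of the CSV.

The CSV column and its values are as follows (for convenience, some values are abbreviated as variable names):

Borough: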

COL = 0
Barnet = 0
Bexley = 0
BAD = 0
Brent = 0
Bromley = 0
Camden = 0
Croydon = 0
Ealing = 0
Enfield = 0
Greenwich = 0
Hackney = 0
HAF = 0
Haringey = 0
Harrow = 0
Havering = 0
Hillingdon = 0
Hounslow = 0
Islington = 0
KAC = 0
KUT = 0
Lambeth = 0
Lewisham = 0
Merton = 0
Newham = 0
Redbridge = 0
RUT = 0
Southwark = 0
Sutton = 0
TowerHamlets = 0
WalthamForest = 0
Wandsworth = 0
Westminster = 0
OuterBorough = 0
InnerBorough = 0

Below is my current code, followed by an image of its output:

#Start of Imports 
import csv 
import sys 
import numpy as np 
import pandas as pd 
import re 
import matplotlib.pyplot as plt 
#End of Imports

#Start of Declarations 
COL = 0
Barnet = 0
Bexley = 0
BAD = 0
Brent = 0
Bromley = 0
Camden = 0
Croydon = 0
Ealing = 0
#This is as far as I got when I thought something was wrong?
Enfield = 0
Greenwich = 0
Hackney = 0
HAF = 0
Haringey = 0
Harrow = 0
Havering = 0
Hillingdon = 0
Hounslow = 0
Islington = 0
KAC = 0
KUT = 0
Lambeth = 0
Lewisham = 0
Merton = 0
Newham = 0
Redbridge = 0
RUT = 0
Southwark = 0
Sutton = 0
TowerHamlets = 0
WalthamForest = 0
Wandsworth = 0
Westminster = 0
OuterBorough = 0
InnerBorough = 0
#End of Declarations

#Starts reading 'csv file'  
csv = pd.read_csv ('land-area-population-density-london.csv') #Not sure what this does, index_col=3)

#Start of IF Statement  
csva = np.array(csv) 
for column in np.arange(0, csva.shape[0]): 
    if re.match(r"Barnet", str(csva[column][2])) is not None:         
        Barnet = Barnet + 1
    elif re.match(r"Bexley", str(csva[column][2])) is not None:         
        Bexley = Bexley + 1
    elif re.match(r"City of London", str(csva[column][2])) is not None:         
        COL = COL + 1
    elif re.match(r"Barking and Dagenham", str(csva[column][2])) is not None:         
        BAD = BAD + 1
    elif re.match(r"Brent", str(csva[column][2])) is not None:         
        Brent = Brent + 1
    elif re.match(r"Bromley", str(csva[column][2])) is not None:         
        Bromley = Bromley + 1
    elif re.match(r"Camden", str(csva[column][2])) is not None:         
        Camden = Camden + 1
    elif re.match(r"Croydon", str(csva[column][2])) is not None:         
        Croydon = Croydon + 1
    elif re.match(r"Ealing", str(csva[column][2])) is not None:         
        Ealing = Ealing + 1
#End of IF Statement

#Start of graph fields
#Below: Places is the labels for the placesvar
places = ('Barnet', 'Bexley', 'City of London', 'Barking and Dagenham', 'Brent', 'Bromley', 'Camden', 'Croydon', 'Ealing')
#Below: placesvar the actual 'places' pulled from CSV
placesvar = [Barnet, Bexley, COL, BAD, Brent, Bromley, Camden, Croydon, Ealing]
#Y Positioning numpy.arange (Again no idea what this does) length 'places pulled from csv'
y_pos = np.arange(len(placesvar)) 
#End of graph fields

#Start of Graph positions and Names 
plt.bar(y_pos, placesvar, align='center') 
plt.xticks(y_pos, places, rotation=60) 
plt.ylabel('No. of occurance') 
plt.xlabel('Locations') 
plt.title('Occurance of Locations') 
#plt.savefig('file.png')(Commented out for testing) 
#End of Graph positions and Names
plt.show()

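This is the output of my current code. Some of the values in the "Borough" column are missing from it. Image:

I apologize for any obvious problems; I am new to Python.

Below is the output after adding print(csv.head()):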

       Codes             ...             Population per hectare 2011
0  E09000001             ...                               23.405268
1  E05000026             ...                               99.968726
2  E05000027             ...                               76.304188
3  E05000028             ...                               89.914330
4  E05000029             ...                               29.647929

[5 rows x 10 columns]

Tags: python, pandas, csv, numpy, matplotlib

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-06-04 05:49:09

I hope I understood your question correctly...

Here is a simplified example of your problem (I believe):

Input data, test.csv:

    Codes   Borough
0   E1  A
1   E2  B
2   E1  A
3   E1  A
4   E3  C
5   E2  B
6   E1  A
7   E3  C
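To reproduce this input, the listing can be written out as a file. This is only a minimal sketch: it assumes the file is tab-separated and that the leftmost numbers in the listing are the pandas row index rather than a stored column, so the file holds just the Codes and Borough columns. The names rows, frame and check are illustrative.

```python
from pathlib import Path

import pandas as pd

# Rows copied from the listing; the 0..7 index is assumed not to be stored in the file.
rows = [
    ("E1", "A"), ("E2", "B"), ("E1", "A"), ("E1", "A"),
    ("E3", "C"), ("E2", "B"), ("E1", "A"), ("E3", "C"),
]
frame = pd.DataFrame(rows, columns=["Codes", "Borough"])

# Write a tab-separated file without the row index.
out_path = Path("test.csv")
frame.to_csv(out_path, sep="\t", index=False)

# Read it back to confirm the separator and the column names round-trip.
check = pd.read_csv(out_path, sep="\t")
print(check)
```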

Code:

import pandas as pd
import matplotlib.pyplot as plt

# read in the data
data = pd.read_csv('./test.csv', sep='\t')  # adjust sep to your needs

# count occurrences of each borough
occurrences = data.loc[:, 'Borough'].value_counts()

# plot the counts as a bar chart
plt.bar(occurrences.index, occurrences.values)
plt.show()

Instead of data.loc, you can also use data.iloc[:, 2], which selects the third column by position rather than the column named Borough.
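One caveat is that iloc positions are zero-based and depend on the layout of the file. The sketch below assumes a two-column, tab-separated stand-in for test.csv, held in memory so it does not need the file. In that layout Borough sits at position 1, whereas index 2 matches the question's code, which reads csva[column][2]. Looking the position up with columns.get_loc avoids hard-coding it. The names raw, pos, by_name and by_position are illustrative.

```python
import io

import pandas as pd

# Assumed stand-in for test.csv: tab-separated, two columns, no stored index.
raw = (
    "Codes\tBorough\n"
    "E1\tA\nE2\tB\nE1\tA\nE1\tA\n"
    "E3\tC\nE2\tB\nE1\tA\nE3\tC\n"
)
data = pd.read_csv(io.StringIO(raw), sep="\t")

# Label-based selection: pick the column by its name.
by_name = data.loc[:, "Borough"].value_counts()

# Position-based selection: look up the zero-based position instead of hard-coding it.
pos = data.columns.get_loc("Borough")
by_position = data.iloc[:, pos].value_counts()

print(pos)
print(by_name.to_dict())
```

Both selections return the same Series of counts, so the choice only matters when the column name or the column order may change.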

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/50670263
