print(wine_rev.price.dtype)
Output: float64 (dtype gives the type of a single column).
wine_rev.dtypes
For the whole table, use the plural form with a trailing s:
country object
description object
designation object
points int64
price float64
province object
region_1 object
region_2 object
taster_name object
taster_twitter_handle object
title object
variety object
winery object
critic object
test_id int32
dtype: object
object
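As a rough illustration of the difference between dtype and dtypes, here is a minimal sketch. The three-row wine_rev below is a hypothetical stand-in with made-up values, not the real dataset. Text columns such as country are listed as object in the output above; the exact label for text columns may differ between pandas versions.

```python
import pandas as pd

# Hypothetical three-row stand-in for wine_rev; the values are made up.
wine_rev = pd.DataFrame({
    "country": ["Italy", "Portugal", "US"],
    "points": [87, 87, 90],
    "price": [15.0, 14.0, 65.0],
})

# dtype (singular) belongs to one column, which is a Series.
print(wine_rev.price.dtype)

# dtypes (plural) belongs to the DataFrame and lists one entry per column.
print(wine_rev.dtypes)
```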
astype() converts a column from one type to another:
wine_rev.points.astype('float64')
0 87.0
1 87.0
2 87.0
3 87.0
4 87.0
...
129966 90.0
129967 90.0
129968 90.0
129969 90.0
129970 90.0
Name: points, Length: 129971, dtype: float64
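The conversion above returns a new Series and leaves the source column alone. A small sketch, assuming a hypothetical three-value points Series rather than the full dataset:

```python
import pandas as pd

# Hypothetical stand-in for the points column; the values are made up.
points = pd.Series([87, 87, 90], name="points")

# astype returns a new Series with the requested dtype.
points_float = points.astype("float64")
print(points_float.dtype)

# The original keeps its integer dtype unless the result is assigned back.
print(points.dtype)
```

To make the change stick on a DataFrame, assign the result back to the column.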
wine_rev.index.dtype
The index has a dtype of its own, here dtype('int64').
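Entries with missing values are given the value NaN, short for "Not a Number". These NaN values are always of the float64 dtype.
To select NaN entries, use pd.isnull() or its companion pd.notnull():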
wine_rev[pd.isnull(wine_rev.country)]
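The selection above works through a boolean mask. The sketch below uses a hypothetical three-row frame (made-up values) to show the mask, both selectors, and why an integer column with a gap ends up as float64: NaN is itself a floating-point value.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for wine_rev; the values are made up.
wine_rev = pd.DataFrame({
    "country": ["Italy", np.nan, "US"],
    "points": [87, 87, 90],
})

# pd.isnull returns a boolean Series that is True where the value is missing.
missing_mask = pd.isnull(wine_rev.country)
missing_rows = wine_rev[missing_mask]

# pd.notnull is the complement and keeps the rows that have a value.
present_rows = wine_rev[pd.notnull(wine_rev.country)]
print(len(missing_rows), len(present_rows))

# Integers mixed with NaN are stored as float64.
with_gap = pd.Series([87, np.nan, 90])
print(with_gap.dtype)
```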
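wine_rev.region_2.fillna('Unknown')
fillna() returns a copy with every NaN replaced by 'Unknown'; the original data is not changed.
wine_rev.taster_twitter_handle.replace("@kerinokeefe", "@kerino")
replace() swaps the first value for the second.
rename() changes index labels and column names:
wine_rev.rename(columns={'points':'score'})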
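The fillna, replace, and rename calls above all return new objects. A minimal sketch on a hypothetical two-row frame (the values, including the second Twitter handle, are made up) shows that the original stays as it was until the result is assigned back:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for wine_rev; the values are made up.
wine_rev = pd.DataFrame({
    "region_2": ["Napa", np.nan],
    "taster_twitter_handle": ["@kerinokeefe", "@vossroger"],
    "points": [87, 90],
})

filled = wine_rev.region_2.fillna("Unknown")
handles = wine_rev.taster_twitter_handle.replace("@kerinokeefe", "@kerino")
renamed = wine_rev.rename(columns={"points": "score"})

# The original frame still has its missing value and its old column name.
missing_before = int(wine_rev.region_2.isnull().sum())
print(missing_before, list(wine_rev.columns))

# Assign the result back to keep the change.
wine_rev["region_2"] = filled
```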
To rename index labels instead, pass a dictionary to the index parameter:
wine_rev.rename(index={0:'michael',1:'ming'})
rename_axis() changes the name of the row index and of the column index (here "酒" means "wine" and '特征' means "features"):
wine_rev.rename_axis("酒",axis='rows').rename_axis('特征',axis='columns')
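Relabeling index entries and naming the axes are easy to confuse, so here is a small sketch of both on a hypothetical three-row frame with made-up values. The axis names "wines" and "fields" are illustrative choices.

```python
import pandas as pd

# Hypothetical stand-in for wine_rev; the values are made up.
wine_rev = pd.DataFrame({"points": [87, 87, 90], "price": [15.0, 14.0, 65.0]})

# rename(index=...) changes individual row labels; labels missing from the
# dictionary are kept as they are.
relabeled = wine_rev.rename(index={0: "michael", 1: "ming"})
print(list(relabeled.index))

# rename_axis changes the name of each axis, not the labels on it.
named = wine_rev.rename_axis("wines", axis="rows").rename_axis("fields", axis="columns")
print(named.index.name, named.columns.name)
```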
concat(), join(), and merge()
canadian_youtube = pd.read_csv("../input/youtube-new/CAvideos.csv")
british_youtube = pd.read_csv("../input/youtube-new/GBvideos.csv")
pd.concat([canadian_youtube, british_youtube])
left = canadian_youtube.set_index(['title', 'trending_date'])
right = british_youtube.set_index(['title', 'trending_date'])
left.join(right, lsuffix='_CAN', rsuffix='_UK')
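Since the CSV files above may not be at hand, the same two operations can be tried on small frames. The sketch below uses two hypothetical two-row tables (titles, dates, and view counts are made up) that share the title and trending_date columns.

```python
import pandas as pd

# Hypothetical stand-ins for the Canadian and British tables; values are made up.
canadian = pd.DataFrame({
    "title": ["a", "b"],
    "trending_date": ["17.14.11", "17.14.11"],
    "views": [100, 200],
})
british = pd.DataFrame({
    "title": ["b", "c"],
    "trending_date": ["17.14.11", "17.14.11"],
    "views": [300, 400],
})

# concat stacks the rows; each frame keeps its own index labels.
stacked = pd.concat([canadian, british])
print(list(stacked.index))

# join aligns rows on the shared index. The suffixes separate the columns
# that have the same name in both frames.
left = canadian.set_index(["title", "trending_date"])
right = british.set_index(["title", "trending_date"])
joined = left.join(right, lsuffix="_CAN", rsuffix="_UK")
print(joined)
```

join() defaults to a left join, so rows that appear only in the left frame are kept and their right-hand columns are filled with NaN.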