I tried to convert a column from data type float64 to int64 using the following command:
df['column name'].astype(int64)
but got an error:
NameError: name 'int64' is not defined
The column holds a number of people, but it is formatted as 7500000.0. Any idea how I can simply change this float64 into int64?
Answered 2017-05-14 02:03:11
Solution for pandas 0.24+ for converting numeric columns with missing values:
df = pd.DataFrame({'column name':[7500000.0,7500000.0, np.nan]})
print (df['column name'])
0 7500000.0
1 7500000.0
2 NaN
Name: column name, dtype: float64
df['column name'] = df['column name'].astype(np.int64)
ValueError: Cannot convert non-finite values (NA or inf) to integer
#http://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html
df['column name'] = df['column name'].astype('Int64')
print (df['column name'])
0 7500000
1 7500000
2 NaN
Name: column name, dtype: Int64
I think you need to change it to numpy.int64:
df['column name'].astype(np.int64)
Sample:
df = pd.DataFrame({'column name':[7500000.0,7500000.0]})
print (df['column name'])
0 7500000.0
1 7500000.0
Name: column name, dtype: float64
df['column name'] = df['column name'].astype(np.int64)
#same as (older pandas only: the pd.np alias was deprecated in pandas 1.0)
#df['column name'] = df['column name'].astype(pd.np.int64)
print (df['column name'])
0 7500000
1 7500000
Name: column name, dtype: int64
If the column contains some NaNs, they need to be replaced with some int (e.g. 0) via fillna, because the type of NaN is float:
df = pd.DataFrame({'column name':[7500000.0,np.nan]})
df['column name'] = df['column name'].fillna(0).astype(np.int64)
print (df['column name'])
0 7500000
1 0
Name: column name, dtype: int64
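Also check the documentation - missing data casting rules.
EDIT:
Converting values that contain NaNs through .values is buggy; the NaN silently becomes a garbage integer: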
df = pd.DataFrame({'column name':[7500000.0,np.nan]})
df['column name'] = df['column name'].values.astype(np.int64)
print (df['column name'])
0 7500000
1 -9223372036854775808
Name: column name, dtype: int64
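One way to guard against that silent corruption is to check for non-finite values before casting. The following is only a minimal sketch: `to_int64_checked` is a hypothetical helper name chosen for illustration, not a pandas function, and it assumes pandas 0.24+ for `Series.to_numpy`.

```python
import numpy as np
import pandas as pd


def to_int64_checked(series: pd.Series) -> pd.Series:
    """Cast a float Series to int64, raising if any value is NaN or inf.

    Hypothetical helper for illustration only; not part of pandas.
    """
    values = series.to_numpy(dtype="float64")
    if not np.isfinite(values).all():
        raise ValueError("column contains NaN or inf; fill or drop them first")
    return series.astype(np.int64)


df = pd.DataFrame({'column name': [7500000.0, np.nan]})

# Fill the missing value explicitly, then cast through the checked helper.
filled = to_int64_checked(df['column name'].fillna(0))
print(filled.tolist())
```

Calling `to_int64_checked(df['column name'])` without the `fillna(0)` raises a ValueError instead of returning a bogus integer.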
Answered 2017-05-14 02:09:20
You may need to pass in the string 'int64':
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1.0, 2.0]}) # some test dataframe
>>> df['a'].astype('int64')
0 1
1 2
Name: a, dtype: int64
There are some alternative ways to specify a 64-bit integer:
>>> df['a'].astype('i8') # integer with 8 bytes (64 bit)
0 1
1 2
Name: a, dtype: int64
>>> import numpy as np
>>> df['a'].astype(np.int64) # native numpy 64 bit integer
0 1
1 2
Name: a, dtype: int64
Or use np.int64 directly on the column (but it returns a numpy.array):
>>> np.int64(df['a'])
array([1, 2], dtype=int64)
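As a quick check that the spellings above are interchangeable, here is a small sketch that applies each of them to the same test column and compares the resulting dtypes. The variable names `results` and `arr` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1.0, 2.0]})

# Each spelling names the same 64-bit signed integer dtype.
results = [df['a'].astype(spec) for spec in ('int64', 'i8', np.int64)]
print([str(r.dtype) for r in results])

# astype keeps a pandas Series; converting via NumPy yields a plain ndarray.
arr = np.asarray(df['a']).astype(np.int64)
print(type(results[0]).__name__, type(arr).__name__)
```

Note that the bare name `int64` from the question works in none of these forms: it has to be either quoted as a string or qualified as `np.int64`.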
Answered 2019-03-05 04:16:52
This seems to be a little buggy in Pandas 0.23.4?
If there are np.nan values, then this throws an error, as expected:
df['col'] = df['col'].astype(np.int64)
But if errors='ignore' is used, no values are changed from float to int, which is not what I expected:
df['col'] = df['col'].astype(np.int64,errors='ignore')
It works if I convert the np.nan values first:
df['col'] = df['col'].fillna(0).astype(np.int64)
df['col'] = df['col'].astype(np.int64)
Now I can't figure out how to get null values back in place of the zeros, since this converts everything back to float again:
df['col'] = df['col'].replace(0,np.nan)
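For keeping nulls alongside integers, one option is the nullable integer dtype 'Int64' (capital I). It is not available in the 0.23.4 release mentioned above, so this sketch assumes pandas 0.24 or later, and it assumes every non-missing value is a whole number.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [7500000.0, np.nan, 12.0]})

# 'Int64' is pandas' nullable integer extension dtype, distinct from NumPy's int64.
df['col'] = df['col'].astype('Int64')

print(df['col'].dtype)
print(df['col'].isna().tolist())    # the missing entry is still missing
print(df['col'].dropna().tolist())  # the remaining values are integers
```

With this dtype there is no need for the fillna(0) and replace(0, np.nan) round trip, which would also turn genuine zeros into nulls.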
https://stackoverflow.com/questions/43956335