我有以下来自SQL查询的DataFrame:
(Pdb) pp total_rows
ColumnID RespondentCount
0 -1 2
1 3030096843 1
2 3030096845 1
我想像这样旋转它:
total_data = total_rows.pivot_table(cols=['ColumnID'])
(Pdb) pp total_data
ColumnID -1 3030096843 3030096845
RespondentCount 2 1 1
[1 rows x 3 columns]
total_rows.pivot_table(cols=['ColumnID']).to_dict('records')[0]
{3030096843: 1, 3030096845: 1, -1: 2}
但我希望确保将303列转换为字符串,而不是整数,这样我就可以得到以下结果:
{'3030096843': 1, '3030096845': 1, -1: 2}
发布于 2018-11-15 21:53:30
如果您需要将所有列转换为字符串,您可以简单地使用:
df = df.astype(str)
如果您需要除少数列之外的所有列都是字符串/对象,然后返回并将其他列转换为您需要的任何列(在本例中为integer),这是很有用的:
df[["D", "E"]] = df[["D", "E"]].astype(int)
发布于 2021-07-30 18:17:20
有四种方法可以将列转换为字符串
1. astype(str)
df['column_name'] = df['column_name'].astype(str)
2. values.astype(str)
df['column_name'] = df['column_name'].values.astype(str)
3. map(str)
df['column_name'] = df['column_name'].map(str)
4. apply(str)
df['column_name'] = df['column_name'].apply(str)
让我们来看看每种类型的性能。
#importing libraries
import numpy as np
import pandas as pd
import time
#creating four sample dataframes using dummy data
df1 = pd.DataFrame(np.random.randint(1, 1000, size =(10000000, 1)), columns =['A'])
df2 = pd.DataFrame(np.random.randint(1, 1000, size =(10000000, 1)), columns =['A'])
df3 = pd.DataFrame(np.random.randint(1, 1000, size =(10000000, 1)), columns =['A'])
df4 = pd.DataFrame(np.random.randint(1, 1000, size =(10000000, 1)), columns =['A'])
#applying astype(str)
time1 = time.time()
df1['A'] = df1['A'].astype(str)
print('time taken for astype(str) : ' + str(time.time()-time1) + ' seconds')
#applying values.astype(str)
time2 = time.time()
df2['A'] = df2['A'].values.astype(str)
print('time taken for values.astype(str) : ' + str(time.time()-time2) + ' seconds')
#applying map(str)
time3 = time.time()
df3['A'] = df3['A'].map(str)
print('time taken for map(str) : ' + str(time.time()-time3) + ' seconds')
#applying apply(str)
time4 = time.time()
df4['A'] = df4['A'].apply(str)
print('time taken for apply(str) : ' + str(time.time()-time4) + ' seconds')
输出
time taken for astype(str): 5.472359895706177 seconds
time taken for values.astype(str): 6.5844292640686035 seconds
time taken for map(str): 2.3686647415161133 seconds
time taken for apply(str): 2.39758563041687 seconds
与其余两种技术相比,map(str)
和apply(str)
花费的时间更少
发布于 2020-11-14 06:43:20
我通常使用这个:
pd['Column'].map(str)
https://stackoverflow.com/questions/22005911
复制相似问题