Renaming files:
import os
path = 'D:/ave'
files = os.listdir(path)
for i in files:
    original = path + '/' + i
    new = path + '/' + str(20180) + i  # prepend "20180" to every name
    os.rename(original, new)  # for once I actually remembered to indent
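Note that os.listdir also returns subfolders, so the loop above renames those too. A minimal sketch of a variant that only touches regular files; the helper name add_prefix and the throwaway temp folder are made up for illustration and are not part of the original notes:

```python
import os
import tempfile

def add_prefix(folder, prefix):
    """Prepend `prefix` to every regular file directly inside `folder`."""
    renamed = []
    # Take a sorted snapshot first so freshly renamed files are not visited again.
    for name in sorted(os.listdir(folder)):
        src = os.path.join(folder, name)
        if not os.path.isfile(src):  # skip subfolders
            continue
        os.rename(src, os.path.join(folder, prefix + name))
        renamed.append(prefix + name)
    return renamed

# Try it on a temporary folder instead of real data.
with tempfile.TemporaryDirectory() as demo:
    for name in ('a.txt', 'b.txt'):
        open(os.path.join(demo, name), 'w').close()
    os.mkdir(os.path.join(demo, 'keep'))
    renamed = add_prefix(demo, '20180')
    after = sorted(os.listdir(demo))

print(renamed)
print(after)
```

Running it against a temp folder first is a cheap way to check the new names before pointing it at D:/ave.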
@numba.jit
This decorator emits warnings when used together with numpy.
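os.walk traverses a folder recursively (subfolders included); os.listdir only lists the entries directly inside a folder (it does not descend into subfolders).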
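The difference between the two calls can be seen on a small throwaway tree. This is only a sketch; the file names top.txt and sub/nested.txt are invented for the demo:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build: root/top.txt and root/sub/nested.txt
    os.makedirs(os.path.join(root, 'sub'))
    for rel in ('top.txt', os.path.join('sub', 'nested.txt')):
        with open(os.path.join(root, rel), 'w') as f:
            f.write('x')

    # os.listdir: names directly inside root (files and folders alike)
    direct = sorted(os.listdir(root))

    # os.walk: visits root and every subfolder; collect files relative to root
    walked = sorted(
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, dirnames, filenames in os.walk(root)
        for name in filenames
    )

print(direct)
print(walked)
```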
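Checking memory usage:
import sys
import numpy as np
a = np.linspace(1, 10000, 10000)
print(sys.getsizeof(a))  # bytes used by the array; a first step toward saving scarce memory on big data
a = a.astype('int16')  # 1..10000 fits in int16 (max 32767)
print(sys.getsizeof(a))  # compare the memory usage after the downcast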
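As a complement to sys.getsizeof, the nbytes attribute reports the size of the element data alone (number of elements times itemsize). A small sketch, with an added range check before downcasting that is not in the original notes:

```python
import numpy as np

a = np.linspace(1, 10000, 10000)  # float64 by default
info = np.iinfo(np.int16)         # value range of the target type

# Downcast only if every value fits into int16.
fits = info.min <= a.min() and a.max() <= info.max
small = a.astype('int16') if fits else a

print(a.dtype, a.itemsize, a.nbytes)
print(small.dtype, small.itemsize, small.nbytes)
```

Here each element shrinks from 8 bytes to 2, so the data buffer becomes a quarter of its original size.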
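Using a boolean mask
Two arrays have the same shape; replace the zeros in one array with the corresponding values from the other:
>>> a = np.array([[1, 2, 0], [2, 0, 0], [-3, -1, 0]])
>>> b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a[a == 0] = b[a == 0]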
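If the original array should stay untouched, np.where gives the same result as a new array. A short sketch using the same a and b as above:

```python
import numpy as np

a = np.array([[1, 2, 0], [2, 0, 0], [-3, -1, 0]])
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Where a is 0 take the value from b, otherwise keep a; a itself is not modified.
filled = np.where(a == 0, b, a)
print(filled)
# [[ 1  2  3]
#  [ 2  5  6]
#  [-3 -1  9]]
```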
Matrix multiplication
np.dot
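A quick sketch of how np.dot differs from the element-wise * operator; for 2-D arrays the @ operator gives the same result as np.dot. The two small matrices are made up for the demo:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

product = np.dot(a, b)   # matrix product: rows of a times columns of b
same = a @ b             # equivalent for 2-D arrays
elementwise = a * b      # NOT a matrix product: multiplies entry by entry

print(product)       # [[19 22], [43 50]]
print(elementwise)   # [[ 5 12], [21 32]]
```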
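pandas intersection, union, difference, left join, etc.:
1. Scenario: filtering DataFrames that all have the same columns
Example:
import pandas as pd
from pandas import DataFrame
df1 = DataFrame([['a', 10, 'M'],
                 ['b', 11, 'M'],
                 ['c', 11, 'F'],
                 ['a', 10, 'F'],
                 ['c', 11, 'M']],
                columns=['name', 'age', 'sex'])
df2 = DataFrame([['a', 10, 'M'],
                 ['b', 11, 'F']],
                columns=['name', 'age', 'sex'])
Intersection: print(pd.merge(df1, df2, on=['name', 'age', 'sex']))
Union: print(pd.merge(df1, df2, on=['name', 'age', 'sex'], how='outer'))
Difference (drop from df1 the rows that also exist in df2):
# df2 is added twice so that each of its rows is guaranteed to be a duplicate;
# pd.concat is used because DataFrame.append was removed in pandas 2.0
df1 = pd.concat([df1, df2, df2])
df1 = df1.drop_duplicates(subset=['name', 'age', 'sex'], keep=False)
print(df1)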
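The heading mentions a left join, which the notes do not show. One possible sketch: a left merge with indicator=True marks where each row came from, and keeping the 'left_only' rows yields the same difference without the double-append trick. The M/F values stand in for the sex column; this is an alternative, not the method used above:

```python
import pandas as pd

cols = ['name', 'age', 'sex']
df1 = pd.DataFrame([['a', 10, 'M'], ['b', 11, 'M'], ['c', 11, 'F'],
                    ['a', 10, 'F'], ['c', 11, 'M']], columns=cols)
df2 = pd.DataFrame([['a', 10, 'M'], ['b', 11, 'F']], columns=cols)

# Left join: keep every row of df1; the _merge column says whether df2 matched.
merged = pd.merge(df1, df2, on=cols, how='left', indicator=True)

# Rows found only in df1 form the difference df1 minus df2.
diff = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(diff)
```

Unlike drop_duplicates(keep=False), this keeps rows that happen to be duplicated inside df1 itself.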
Pandas groupby: grouped computation
import pandas as pd
import numpy as np
# df_tables is assumed to be an existing DataFrame with 'pointxy' and 'doy' columns
group = df_tables.groupby(['pointxy', 'doy'])
b = group.mean()
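name = a['pointxy']
ak = list(name)
xian = []
for i in range(len(name)):
    akk = ak[i]
    tempa = akk[:3]  # keep the first three characters of each point ID
    xian.append(tempa)
xian = np.array(xian).reshape((len(name), 1))
a['xian'] = xian
# repeat the column 8760 times (8760 = 365 * 24, presumably one copy per hour of 2018)
a1 = np.tile(xian, (8760, 1))
dfarr['xian'] = a1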
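# store the parsed dates on dfarr: a1 is a NumPy array here and cannot take a named column
dfarr['c'] = pd.to_datetime(dfarr['date2018'], format='%Y-%m-%d %H:%M:%S')
month = [i.month for i in dfarr['c']]
dfarr['month'] = month
quarter = [i.quarter for i in dfarr['c']]  # quarter of the year
dfarr['quarter'] = quarter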
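a1 = dfarr[dfarr.o3 != 999999]  # drop rows where o3 is 999999 (presumably a missing-value code)
group = a1.groupby(['xian', 'month'])
b = group.mean(numeric_only=True)  # numeric_only skips text columns such as date2018
b.to_csv('D:/minxinan/temp/o3.csv', encoding='gbk')
## not bothering with indentation any more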
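The same pipeline can be sketched on a tiny synthetic table using vectorized string and datetime accessors instead of Python loops. The point IDs, dates and o3 values below are invented; only the column names follow the notes:

```python
import pandas as pd

# Synthetic stand-in for dfarr (all values are made up).
df = pd.DataFrame({
    'pointxy': ['101001', '101002', '202001', '202002', '101001', '202001'],
    'date2018': ['2018-01-01 00:00:00', '2018-01-01 01:00:00',
                 '2018-01-01 00:00:00', '2018-01-01 01:00:00',
                 '2018-04-01 00:00:00', '2018-04-01 00:00:00'],
    'o3': [10.0, 20.0, 30.0, 999999.0, 50.0, 70.0],
})

df['xian'] = df['pointxy'].str[:3]  # first three characters of the ID
ts = pd.to_datetime(df['date2018'], format='%Y-%m-%d %H:%M:%S')
df['month'] = ts.dt.month
df['quarter'] = ts.dt.quarter

valid = df[df['o3'] != 999999]  # drop the 999999 placeholder rows
result = valid.groupby(['xian', 'month'])['o3'].mean()
print(result)
```

Selecting the 'o3' column before calling mean keeps text columns out of the aggregation.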