From multiple CSV columns to DataFrame columns with a calculation

Stack Overflow user
Asked 2021-10-01 10:52:25
Answers: 1 · Views: 43 · Followers: 0 · Votes: 1

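I have 10 CSV files, each containing at least timestamp, price and amount columns.

I want to add 10 columns with a VWAP calculation to my DataFrame. I tried creating the columns and then concatenating them into the DataFrame, but it does not work at all. I have tried many approaches; the main problem is that I cannot create a new column containing the computed rows: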

import pandas as pd
import os
import glob
from IPython.display import display, HTML
import csv
# use glob to get all the csv files 
# in the folder

path = os.getcwd()
csv_files = glob.glob(os.path.join("*.csv"))

""" 
#To change the name of every columns
liste1 = []
header_list = []
for f in csv_files:
    liste1.append(f)
header_list = [a.strip(".csv") for a in liste1]
 """
def add(f):
    df = pd.read_csv(f, header=0)
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.groupby(pd.Grouper(key = "timestamp", freq = "h")).agg("mean").reset_index()
    price = df["price"]
    amount = df["amount"]
    return df.assign(vwap  = (price * amount).cumsum() / amount.cumsum())

for f in csv_files:
    df = pd.read_csv(f, header=0)
    df2 = pd.concat(add(f))
    df2.to_csv(r"C:\Users\vion1\Ele\Engie\Sorbonne\resultat\resultat_projet_4.csv", encoding='utf-8', index=False, mode = "a")

Thanks for your help.

Traceback:

TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_16732/557098648.py in <module>
     31 for f in csv_files:
     32     df = pd.read_csv(f, header=0)
---> 33     df2 = pd.concat(add(f))
     34     df2.to_csv(r"C:\Users\vion1\Ele\Engie\Sorbonne\resultat\resultat_projet_4.csv", encoding='utf-8', index=False, mode = "a")
     35 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

~\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    292     ValueError: Indexes have overlapping values: ['a']
    293     """
--> 294     op = _Concatenator(
    295         objs,
    296         axis=axis,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    327     ):
    328         if isinstance(objs, (ABCSeries, ABCDataFrame, str)):
--> 329             raise TypeError(
    330                 "first argument must be an iterable of pandas "
    331                 f'objects, you passed an object of type "{type(objs).__name__}"'

TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"
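
The last line of the traceback points at the cause: pd.concat expects an iterable (for example a list) of pandas objects, while add(f) returns a single DataFrame. A minimal sketch of the difference, using a small made-up DataFrame rather than the asker's files:

```python
import pandas as pd

# Made-up data for illustration only; the column names mirror the question.
df = pd.DataFrame({"price": [10.0, 11.0], "amount": [2.0, 3.0]})

# Passing a bare DataFrame reproduces the TypeError from the traceback.
try:
    pd.concat(df)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Wrapping the frames in a list is the form pd.concat accepts.
combined = pd.concat([df, df], ignore_index=True)

print(error_message)
print(combined.shape)
```

In the question's loop this means collecting the per-file results in a list first and calling pd.concat once on that list after the loop.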

Stack Overflow user

Accepted answer

Answered 2021-10-01 11:10:56

If you only need the aggregated values in the output:

def add(df):
    # Takes an already-loaded DataFrame; read_csv now happens in the loop below
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    # Average price and amount within each hour
    df = df.groupby(pd.Grouper(key="timestamp", freq="h")).agg("mean").reset_index()
    price = df["price"]
    amount = df["amount"]
    # Return only the cumulative VWAP as a Series
    return (price * amount).cumsum() / amount.cumsum()

out = []
for f in csv_files:
    df = pd.read_csv(f, header=0)
    # Collect one VWAP Series per file
    out.append(add(df))

# Join all Series side by side, one column per file
df2 = pd.concat(out, ignore_index=True, axis=1)
# Write the result once, without append mode
df2.to_csv(r"C:\Users\vion1\Ele\Engie\Sorbonne\resultat\resultat_projet_4.csv",
           encoding='utf-8')
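
With ignore_index=True and axis=1 the output columns are simply numbered 0 to 9, and because of reset_index the files are lined up by row position, not by hour. If named columns aligned on the hourly timestamp are preferred, one possible variant is sketched below. It is not the answerer's code: the helper name hourly_vwap, the in-memory "files" and the exchange_a / exchange_b labels are all hypothetical, chosen so the example runs without touching disk.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the CSV files; real code would pass file paths.
fake_files = {
    "exchange_a": "timestamp,price,amount\n"
                  "2021-10-01 10:05:00,100,1\n"
                  "2021-10-01 10:35:00,102,3\n"
                  "2021-10-01 11:10:00,104,2\n",
    "exchange_b": "timestamp,price,amount\n"
                  "2021-10-01 10:15:00,50,4\n"
                  "2021-10-01 11:20:00,52,4\n",
}

def hourly_vwap(source, name):
    """Return the cumulative VWAP of hourly means as a named Series."""
    df = pd.read_csv(source, parse_dates=["timestamp"])
    # Keep the hourly timestamp as the index so columns align by hour.
    hourly = df.groupby(pd.Grouper(key="timestamp", freq="h"))[["price", "amount"]].mean()
    vwap = (hourly["price"] * hourly["amount"]).cumsum() / hourly["amount"].cumsum()
    return vwap.rename(name)

# One named Series per source, then a single concat along the columns.
columns = [hourly_vwap(io.StringIO(text), name) for name, text in fake_files.items()]
result = pd.concat(columns, axis=1)
print(result)
```

With real files, the name could be derived from the file name and the path passed directly as source. Hours that are missing in one file would then show up as NaN in that file's column.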
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/69404671