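I have a CSV file with N key columns, plus a column containing an expression that references 1 to N of the key columns. I want to replace each of those references with the value from the corresponding key column of that row. Hopefully the example below clarifies what I mean.
The key columns below are A, B, C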

Desired output:
20_A
20_B
30_A
30_B
40_C_4
40_C_5

My solution is:
import pandas as pd

keys = ['Age', 'Type', 'Delay']
csv_path = "path/to/file"
df = pd.read_csv(csv_path)
for index, row in df.iterrows():
    key1_list = row[keys[0]].split(",")
    key2_list = row[keys[1]].split(",")
    key3_list = row[keys[2]].split(",")
    expression = row['Expression']
    # Iterate over all combinations of key column values and export a chart for each one
    for KEY1 in key1_list:
        for KEY2 in key2_list:
            for KEY3 in key3_list:
                string = expression
                string = string.replace("<" + keys[0] + ">", KEY1)
                string = string.replace("<" + keys[1] + ">", KEY2)
                string = string.replace("<" + keys[2] + ">", KEY3)
                print(string)

However, I would like to generalize my code to work for any number of key columns, so that I only need to update the keys list at the start. This would require nesting the loops to a depth of len(keys). But I can't figure out how to generalize the loops to an arbitrary depth with flat code. I looked at itertools but couldn't find what I need. I think recursion might work, but I would prefer to avoid it.
Posted on 2018-08-25 02:50:53
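Recursion could certainly solve this for you, but before going down that path you should take another look at itertools. What you need is the product of the keys' value lists, which generates every possible combination of key values.
One way to achieve this is as follows: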
import itertools

import pandas as pd

csv_path = "path/to/file"
df = pd.read_csv(csv_path)
# Find the available keys from the data frame instead of entering them manually.
# "Expression" is assumed to be the last column; it is not a key.
keys = list(df.columns[:-1])
for index, row in df.iterrows():
    # Collect each key's values in a list of lists
    # (the order must be preserved, so a dict is avoided).
    key_list = []
    for key in keys:
        # The code uses ',' as the value separator in each cell.
        # Does this work in a CSV file?
        # str() guards against cells that pandas parsed as numbers.
        key_list.append(str(row[key]).split(','))
    expression = row['Expression']
    # All key combinations are then generated with itertools.product.
    combos = itertools.product(*key_list)
    # Each combo is then handled separately.
    for combo in combos:
        string = expression
        # Replace each key in order.
        # Must be done sequentially since the depth is not known in advance.
        for key, value in zip(keys, combo):
            string = string.replace('<' + key + '>', value)
        print(string)

Hopefully this code is understandable and does what you want. Otherwise, let me know and I'll try to clarify further.
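To see why itertools.product can stand in for the nested loops, here is a minimal standalone sketch that does not need pandas or a CSV file. The helper name expand_expression and the sample row are made up for illustration and are not taken from the question's data; the sketch also assumes every cell can be treated as a comma-separated string.

```python
import itertools


def expand_expression(expression, row, keys, sep=","):
    """Return the expression expanded for every combination of key values.

    `row` maps each key name to a separator-delimited string of values.
    (Hypothetical helper, for illustration only.)
    """
    # One list of candidate values per key, in the same order as `keys`.
    value_lists = [str(row[key]).split(sep) for key in keys]
    results = []
    # product(*value_lists) iterates like len(keys) nested for loops.
    for combo in itertools.product(*value_lists):
        text = expression
        for key, value in zip(keys, combo):
            text = text.replace("<" + key + ">", value)
        results.append(text)
    return results


# Made-up sample row, not taken from the question's CSV.
keys = ["Age", "Type", "Delay"]
row = {"Age": "40", "Type": "C", "Delay": "4,5"}
print(expand_expression("<Age>_<Type>_<Delay>", row, keys))
# prints ['40_C_4', '40_C_5']
```

itertools.product varies the last list fastest, so the results come out in the same order as the original nested loops would print them.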
https://stackoverflow.com/questions/52007263