The goal is to deepen an existing MultiIndex df.
So, given a df like the following:
col1 col2
mylevelA_caseA__VAR_A bar one -1.012046 0.808332
mylevelA_caseA__VAR_B bar two -0.558629 -0.358550
mylevelA_caseB__VAR_A baz one 1.514448 -1.045073
mylevelA_caseB__VAR_B baz two 1.268511 -1.100705
mylevelB_caseC__VAR_C foo one -2.108172 -1.694602
mylevelB_caseC__VAR_C_D foo two -0.629493 -0.005071
mylevelB_caseC__VAR_E qux one 0.596771 -0.964429
mylevelB_caseD__VAR_A qux two 0.257154 -0.248278

I would like to expand the MultiIndex into something like this.

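At this stage, note that in the first index level there is a double underscore (__) before the keyword VAR.
To achieve something like the figure above, we drafted the following code: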
import pandas as pd
import numpy as np

arrays = [["mylevelA_caseA__VAR_A", "mylevelA_caseA__VAR_B", "mylevelA_caseB__VAR_A",
           "mylevelA_caseB__VAR_B", "mylevelB_caseC__VAR_C", "mylevelB_caseC__VAR_C_D",
           "mylevelB_caseC__VAR_E", "mylevelB_caseD__VAR_A"],
          ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
          ["one", "two", "one", "two", "one", "two", "one", "two"]]
df = pd.DataFrame(np.random.randn(8, 2), index=arrays, columns=['col1', 'col2'])
# print(df)
idx_ls = df.index.values.tolist()
new_multiindex = []
for x in idx_ls:
    b = x[0]
    vv = b.split('_')
    c = []          # becomes truthy once the empty piece from the double underscore is seen
    new_data = []   # pieces before the double underscore
    mvar = []       # pieces after the double underscore
    for xx in vv:
        if not c:
            if xx:
                new_data.append(xx)
            else:
                c = 1
        else:
            if xx:
                mvar.append(xx)
    ntuple = (*new_data, "_ ".join(mvar), *x)
    new_multiindex.append(ntuple)
    t = 1
df = df.reindex(new_multiindex, copy=True)
print(df)

which produced:
col1 col2
mylevelA caseA VAR_ A mylevelA_caseA__VAR_A bar one NaN NaN
VAR_ B mylevelA_caseA__VAR_B bar two NaN NaN
caseB VAR_ A mylevelA_caseB__VAR_A baz one NaN NaN
VAR_ B mylevelA_caseB__VAR_B baz two NaN NaN
mylevelB caseC VAR_ C mylevelB_caseC__VAR_C foo one NaN NaN
VAR_ C_ D mylevelB_caseC__VAR_C_D foo two NaN NaN
VAR_ E mylevelB_caseC__VAR_E qux one NaN NaN
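caseD VAR_ A mylevelB_caseD__VAR_A qux two NaN NaN

There are two problems.
First: col1 and col2 return NaN.
Second: I would like to know whether there is a more compact way to do this, with fewer lines of code in the for loop.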
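
On the first problem, a likely cause is that reindex looks the new labels up in the existing index instead of relabelling the rows. None of the six-level tuples exist in the original three-level index, so every row comes back as NaN. The following is a minimal sketch, assuming the goal is only to relabel the rows: it assigns a new MultiIndex built with pd.MultiIndex.from_tuples. The two-row frame and the names new_tuples and relabelled are illustrative and not part of the original code; the sketch also joins the variable name as VAR_A rather than VAR_ A.

```python
import numpy as np
import pandas as pd

# Small illustrative frame with the same three-level index layout as the question.
arrays = [
    ["mylevelA_caseA__VAR_A", "mylevelB_caseC__VAR_C_D"],
    ["bar", "foo"],
    ["one", "two"],
]
df = pd.DataFrame(np.arange(4.0).reshape(2, 2), index=arrays, columns=["col1", "col2"])

new_tuples = []
for label in df.index:
    # Split once on the double underscore: prefix levels on the left, variable name on the right.
    prefix, var = label[0].split("__", 1)
    new_tuples.append((*prefix.split("_"), var, *label))

# Assigning the index relabels the rows instead of looking labels up, so the data is kept.
relabelled = df.copy()
relabelled.index = pd.MultiIndex.from_tuples(new_tuples)
print(relabelled)
```

Splitting on the double underscore first also shortens the loop, which touches on the second problem, but it assumes every label contains exactly one double underscore.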
Posted on 2021-08-23 15:12:26
You can also try this:
df.set_index(
    df.index.get_level_values(0).str.split("_", n=3, expand=True), append=True
).droplevel(5).reorder_levels([3, 4, 5, 0, 1, 2])

Output:
col1 col2
mylevelA caseA VAR_A mylevelA_caseA__VAR_A bar one 2.925263 0.065379
VAR_B mylevelA_caseA__VAR_B bar two -1.544370 0.383090
caseB VAR_A mylevelA_caseB__VAR_A baz one -0.260279 -0.264885
VAR_B mylevelA_caseB__VAR_B baz two 0.071172 -0.201748
mylevelB caseC VAR_C mylevelB_caseC__VAR_C foo one -0.319578 -0.909871
VAR_C_D mylevelB_caseC__VAR_C_D foo two -1.058169 -0.465444
VAR_E mylevelB_caseC__VAR_E qux one -0.432982 -1.999376
caseD VAR_A mylevelB_caseD__VAR_A qux two -0.704989 -0.298849

https://stackoverflow.com/questions/68894537
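
To see why the answer drops level 5, it may help to look at the split step on its own. The sketch below uses a two-label Index (the names labels, parts and cleaned are illustrative) and shows that the empty string between the two underscores becomes a level of its own, which then has to be removed.

```python
import pandas as pd

labels = pd.Index(["mylevelA_caseA__VAR_A", "mylevelB_caseC__VAR_C_D"])

# n=3 caps the number of splits at three, so everything after the third
# underscore (for example VAR_C_D) stays in one piece.
# expand=True turns the pieces into the levels of a MultiIndex.
parts = labels.str.split("_", n=3, expand=True)
print(parts.tolist())

# The empty string produced by the double underscore sits at position 2 here.
# In the answer it is position 5, because the three original levels come first.
cleaned = parts.droplevel(2)
print(cleaned.tolist())
```

The cap of three splits relies on every label having exactly two prefix parts before the double underscore; labels with a different layout would need a different n.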