Q: Flattening a hierarchy of parent and child IDs in Python

Stack Overflow user

Asked on 2022-02-11 13:48:01

1 answer, 295 views, 0 followers, 2 votes

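I have two columns of supervisory hierarchy IDs: one holds the parent ID and the other holds the child ID. The parent chain starts at the CEO, and each child is a direct report, such as the CMO or anyone else who reports directly to the CEO. I am trying to build the entire supervisory organization (sup org) hierarchy from these two columns.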

Sup Org Name | Superior Org ID | Sup Org ID
CEO          |                 | 111
CPO          | 111             | 222
CTO          | 111             | 221
Engineering  | 221             | 223
PM           | 222             | 224
Backend Dev. | 223             | 228

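My goal is to spread the hierarchy across multiple columns, one column per level, so that the whole structure can be read off row by row.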

Level 1 | Level 2 | Level 3 | Level 4
111     | 222     | 224     |
111     | 221     | 223     |
111     | 221     | 223     | 228

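And so on.

I would also like a version of the table above with the IDs replaced by the sup org names. Both forms would be useful for my data.

I have tried a lot of self-joins, but there must be a cleaner way.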

python

pandas

1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-02-12 18:40:00

With the following toy dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        "name": {
            0: "CEO",
            1: "CPO",
            2: "CTO",
            3: "Engineering",
            4: "PM",
            5: "Backend_Dev",
            6: "COO",
        },
        "ID": {0: 111, 1: 222, 2: 221, 3: 223, 4: 224, 5: 228, 6: 220},
        "superior_ID": {0: "", 1: 111, 2: 111, 3: 221, 4: 222, 5: 223, 6: 111},
    }
)

You can try this:

# Setup
df = df.sort_values(by=["superior_ID", "ID"]).reset_index(drop=True)
highest_ID = df.loc[df["superior_ID"] == "", "ID"].values[0]
number_of_paths = df["superior_ID"].value_counts()[highest_ID]
paths = {i + 1: [] for i in range(number_of_paths)}

# Iterate to find all paths (111/222/224 is one path)
for i, row in df.iterrows():
    if row["superior_ID"] == highest_ID:
        paths[i + 1].append(highest_ID)
        paths[i + 1].append(row["ID"])
        continue
    for path in paths.values():
        if row["superior_ID"] in path:
            path.append(row["ID"])

# Create new df concatenating found paths as rows
new_df = pd.DataFrame()
for path in paths.values():
    s = pd.Series(path)
    new_df = pd.concat([new_df, s], axis=1)

# Transpose and cleanup
new_df = (
    new_df
    .T
    .pipe(lambda x: x.iloc[x.isna().sum(axis=1).mul(-1).argsort()])
    .fillna(999)
    .astype(int)
    .pipe(lambda x: x.set_axis([f"level_{i+1}" for i in x.columns], axis="columns"))
    .replace(999, "")
    .reset_index(drop=True)
)

Which gives:

print(new_df)
# Output
   level_1  level_2 level_3 level_4
0      111      220
1      111      222     224        
2      111      221     223     228
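
The loop above keeps one path per direct report of the top-level ID, so it appears to assume that nothing branches below level 2, and it only emits the longest path in each branch. As a possible alternative, here is a minimal sketch that walks upward from every ID to the root, which also yields intermediate rows such as 111/221/223 and a name-based version of the same table. It assumes each ID has at most one superior. The names `root_paths`, `id_levels` and `name_levels` are hypothetical and do not come from the original answer.

```python
import pandas as pd

# Same toy data as in the answer; the root's superior_ID is an empty string.
df = pd.DataFrame(
    {
        "name": ["CEO", "CPO", "CTO", "Engineering", "PM", "Backend_Dev", "COO"],
        "ID": [111, 222, 221, 223, 224, 228, 220],
        "superior_ID": ["", 111, 111, 221, 222, 223, 111],
    }
)


def root_paths(frame, id_col="ID", parent_col="superior_ID"):
    """Return one root-to-node path (list of IDs) per non-root row."""
    # Map each child ID to its parent ID; rows with a missing or empty
    # parent are treated as roots and left out of the mapping.
    parent_of = {
        int(child): int(parent)
        for child, parent in zip(frame[id_col], frame[parent_col])
        if pd.notna(parent) and parent != ""
    }
    paths = []
    for node in frame[id_col]:
        path = [int(node)]
        # Climb from the node towards the root, one parent at a time.
        while path[-1] in parent_of:
            parent = parent_of[path[-1]]
            if parent in path:  # guard against malformed, cyclic data
                raise ValueError(f"cycle detected at ID {parent}")
            path.append(parent)
        if len(path) > 1:  # the root alone does not form a path
            paths.append(path[::-1])
    return paths


paths = root_paths(df)

# Pad shorter paths with None so every row has the same number of levels.
width = max(len(p) for p in paths)
cols = [f"level_{i + 1}" for i in range(width)]
padded = [p + [None] * (width - len(p)) for p in paths]

# Nullable integers keep the IDs as integers next to missing levels.
id_levels = pd.DataFrame(padded, columns=cols).astype("Int64")

# Same table with each ID replaced by its name.
name_of = dict(zip(df["ID"], df["name"]))
name_levels = pd.DataFrame(
    [[None if i is None else name_of[i] for i in p] for p in padded],
    columns=cols,
)

print(id_levels)
print(name_levels)
```

Rows come out in the order of the input frame, one per non-root ID. If only the longest path of each branch is wanted, as in the output above, the paths that are a prefix of another path could be filtered out before building the dataframes.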

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/71081165
