首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >根据计算将Pandas列展开为分隔行

根据计算将Pandas列展开为分隔行
EN

Stack Overflow用户
提问于 2021-04-30 19:08:41
回答 2查看 101关注 0票数 1

我有一个像这样的DataFrame。

代码语言:javascript
复制
df1 = pd.DataFrame(columns=['ID', 'Divide', 'Object', 'List'], data=[ ['A, B', 2, 20, [0, 5]], ['C, D', 2, 40, [10, 15, 35]], ['E, F', 2, 20, [11, 15]], ['G', 1, 10, [1, 5]], ['H', 1, 10, ''], ['I, J', 2, 20, ''] ])

|    | ID   |   Divide |   Object | List         |
|---:|:-----|---------:|---------:|:-------------|
|  0 | A, B |        2 |       20 | [0, 5]       |
|  1 | C, D |        2 |       40 | [10, 15, 35] |
|  2 | E, F |        2 |       20 | [11, 15]     |
|  3 | G    |        1 |       10 | [1, 5]       |
|  4 | H    |        1 |       10 |              |
|  5 | I, J |        2 |       20 |              |

每个ID都需要有自己的行。但是,List列的数据属于每个ID。逻辑如下:

  1. 如果列中有单个ID (没有),则不需要更改。如果有两个ID( )
  2. Then,第一个ID包含,1
  3. Second First ID有来自E 1250: Object/Divided 1
  4. SecondE 129IDE 230列表E 232的E 232对象/拆分: Object - 1

的条目。

最后一张表如下所示:

代码语言:javascript
复制
|    | ID   |   Divide |   Object | List   |
|---:|:-----|---------:|---------:|:-------|
|  0 | A    |        2 |       20 | 0, 5   |
|  1 | B    |        2 |       20 |        |
|  2 | C    |        2 |       40 | 10, 15 |
|  3 | D    |        2 |       40 | 35     |
|  4 | E    |        2 |       20 |        |
|  5 | F    |        2 |       20 | 11, 15 |
|  6 | G    |        1 |       10 | 1, 5   |
|  7 | H    |        1 |       10 |        |
|  8 | I    |        2 |       20 |        |
|  9 | J    |        2 |       20 |        |

如果是列表,那么 lists 就可以用来将列表夷为平地。但是,我不知道该由谁来应用DataFrame中的计算逻辑来解析细节。谢谢

EN

回答 2

Stack Overflow用户

发布于 2021-04-30 19:29:16

你可以试试这个:

代码语言:javascript
复制
import pandas as pd
df = pd.DataFrame(columns=['ID', 'Divide', 'Object', 'List'], data=[ ['A, B', 2, 20, [0, 5]], ['C, D', 2, 40, [10, 15, 35]], ['E, F', 2, 20, [11, 15]], ['G', 1, 10, [1, 5]], ['H', 1, 10, ''], ['I, J', 2, 20, ''] ])

def split_list(lst, limit):
    l1 = list()
    l2 = list()
    for e in lst:
        if e <= limit:
            l1.append(e)
        else:
            l2.append(e)
    return l1, l2

df['ID'] = df['ID'].str.split(', ')
df['Limit'] = df['Object'] / df['Divide']
df['List'] = df.apply(lambda row: dict(zip(row['ID'], split_list(row['List'], row['Limit']))), axis=1)
df = df.explode('ID')
df['List'] = df.apply(lambda row: row['List'].get(row['ID']), axis=1)

print(df)


# Out[192]:
#   ID  Divide  Object      List  Limit
# 0  A       2      20    [0, 5]   10.0
# 0  B       2      20        []   10.0
# 1  C       2      40  [10, 15]   20.0
# 1  D       2      40      [35]   20.0
# 2  E       2      20        []   10.0
# 2  F       2      20  [11, 15]   10.0
# 3  G       1      10    [1, 5]   10.0
# 4  H       1      10        []   10.0
# 5  I       2      20        []   10.0
# 5  J       2      20        []   10.0
票数 0
EN

Stack Overflow用户

发布于 2021-04-30 20:50:39

尝试同时爆炸ID和列表,然后根据ID的顺序进行有条件筛选。

代码语言:javascript
复制
import pandas as pd

df1 = pd.DataFrame(columns=['ID', 'Divide', 'Object', 'List'],
                   data=[['A, B', 2, 20, [0, 5]],
                         ['C, D', 2, 40, [10, 15, 35]],
                         ['E, F', 2, 20, [11, 15]],
                         ['G', 1, 10, [1, 5]],
                         ['H', 1, 10, ''],
                         ['I, J', 2, 20, '']])

# Split and Explode ID
df1['ID'] = df1['ID'].str.split(', ')

# Group By Each ID and set index so that First and Second IDs are tracked
df1 = df1.explode('ID') \
    .groupby(level=0) \
    .apply(lambda x: x.reset_index()) \
    .droplevel(0)
# Calculate Cap For Later
df1['cap'] = df1['Object'] // df1['Divide'] - 1


def split_lists(g):
    # If more than 1 row and non-empty list
    if len(g) > 1 and not g['List'].empty:
        # Check if is the First ID
        if g['level_0'].iloc[0] == 0:
            # Filter Less Than Equal To Cap
            g['List'] = g['List'][g['List'] <= g['cap']]
        else:
            # Filter Greater Than Cap
            g['List'] = g['List'][g['List'] > g['cap']]
    return g


# Explode Lists Group By ID filter using function
# Regroup and convert back to lists
df2 = df1 \
    .explode('List') \
    .reset_index() \
    .groupby('ID') \
    .apply(split_lists) \
    .groupby('ID')['List'] \
    .apply(lambda x: x.dropna().tolist())

# Drop Extra Columns from df1 and merge back
out = df1.drop(columns=['List', 'index', 'cap']) \
    .merge(df2, left_on='ID', right_index=True, how='left') \
    .reset_index(drop=True)

print(out)

退出:

代码语言:javascript
复制
  ID  Divide  Object      List
0  A       2      20    [0, 5]
1  B       2      20        []
2  C       2      40  [10, 15]
3  D       2      40      [35]
4  E       2      20        []
5  F       2      20  [11, 15]
6  G       1      10    [1, 5]
7  H       1      10        []
8  I       2      20        []
9  J       2      20        []

带有附加列的DF1

代码语言:javascript
复制
   index ID  Divide  Object          List  cap
0      0  A       2      20        [0, 5]    9
1      0  B       2      20        [0, 5]    9
0      1  C       2      40  [10, 15, 35]   19
1      1  D       2      40  [10, 15, 35]   19
0      2  E       2      20      [11, 15]    9
1      2  F       2      20      [11, 15]    9
0      3  G       1      10        [1, 5]    9
0      4  H       1      10                  9
0      5  I       2      20                  9
1      5  J       2      20                  9

滤波重组后的DF2

代码语言:javascript
复制
ID
A      [0, 5]
B          []
C    [10, 15]
D        [35]
E          []
F    [11, 15]
G      [1, 5]
H          []
I          []
J          []
Name: List, dtype: object
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/67339269

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档