I have a CSV file that looks like this:
MONTHS A B C MONTHS A B C
0 15024 15700 27795 0 22536 23550 41692
1 12415 15700 23001 1 18623 23550 34501
2 10620 15700 19434 2 15930 23550 29151
3 9304 15700 16696 3 13957 23550 25045
4 8296 15700 14542 4 12444 23550 21813
What I am looking for is to append the repeated columns to one another, so that the result looks like this:
MONTHS A B C D
0 15024 15700 27795 1
1 12415 15700 23001 1
2 10620 15700 19434 1
3 9304 15700 16696 1
4 8296 15700 14542 1
0 22536 23550 41692 2
1 18623 23550 34501 2
2 15930 23550 29151 2
3 13957 23550 25045 2
4 12444 23550 21813 2
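The last column, D, just indicates the order in which each block of columns appeared. The long way would be to read specific columns and then reshape them somehow. Is there a simpler way to do this?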
Posted on 2021-05-18 21:15:49
Try:
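from io import StringIO
import pandas as pd
csvfile = StringIO("""MONTHS A B C MONTHS A B C
0 15024 15700 27795 0 22536 23550 41692
1 12415 15700 23001 1 18623 23550 34501
2 10620 15700 19434 2 15930 23550 29151
3 9304 15700 16696 3 13957 23550 25045
4 8296 15700 14542 4 12444 23550 21813""")
# Whitespace-delimited input; r'\s+' also matches a single space between fields.
# pandas renames the duplicate headers to MONTHS.1, A.1, B.1, C.1.
df = pd.read_csv(csvfile, sep=r'\s+', engine='python')
# Split 'A.1' into ('A', '1'); unsuffixed names get NaN as the second level.
df.columns = df.columns.str.split('.', expand=True)
# Move the suffix level into the row index, then turn the index into columns.
df_out = df.stack(1).rename_axis(['Month', 'D']).reset_index()
# NaN (first block) becomes 1, '1' (second block) becomes 2.
df_out['D'] = df_out['D'].fillna(0).astype(int).add(1)
df_out = df_out.sort_values(['D','Month'])
print(df_out)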
Output:
   Month  D      A      B      C  MONTHS
0      0  1  15024  15700  27795       0
2      1  1  12415  15700  23001       1
4      2  1  10620  15700  19434       2
6      3  1   9304  15700  16696       3
8      4  1   8296  15700  14542       4
1      0  2  22536  23550  41692       0
3      1  2  18623  23550  34501       1
5      2  2  15930  23550  29151       2
7      3  2  13957  23550  25045       3
9      4  2  12444  23550  21813       4
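If the MultiIndex/stack route feels opaque, a different approach (not the one used above) is to slice the wide frame into fixed-width blocks by position and concatenate them. This is only a minimal sketch: it assumes every repeated block has exactly the four columns MONTHS, A, B, C in the same order, and the names `wide`, `block_width`, `blocks` and `long_df` are made up for illustration.

```python
from io import StringIO

import pandas as pd

raw = StringIO("""MONTHS A B C MONTHS A B C
0 15024 15700 27795 0 22536 23550 41692
1 12415 15700 23001 1 18623 23550 34501
2 10620 15700 19434 2 15930 23550 29151
3 9304 15700 16696 3 13957 23550 25045
4 8296 15700 14542 4 12444 23550 21813""")

# Duplicate headers are read as MONTHS, A, B, C, MONTHS.1, A.1, B.1, C.1.
wide = pd.read_csv(raw, sep=r"\s+")

block_width = 4  # assumption: each block is MONTHS, A, B, C
names = list(wide.columns[:block_width])  # clean names from the first block

blocks = []
for d, start in enumerate(range(0, wide.shape[1], block_width), start=1):
    # Take one block of columns by position and give it the clean names.
    block = wide.iloc[:, start:start + block_width].copy()
    block.columns = names
    block["D"] = d  # order in which the block appeared
    blocks.append(block)

# Stack the blocks on top of each other with a fresh 0..n-1 index.
long_df = pd.concat(blocks, ignore_index=True)
print(long_df)
```

This keeps the column order MONTHS, A, B, C, D from the question and should extend to more than two blocks, provided the block width stays constant.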
https://stackoverflow.com/questions/67593787