
[Data Analysis and Visualization] Concatenate and Combine

import numpy as np
import pandas as pd
from pandas import Series,DataFrame

Concatenate

For arrays (matrices), use np.concatenate. For Series and DataFrame, use pd.concat.

# Create a 3x3 matrix
arr1 = np.arange(9).reshape(3,3)
arr1
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
# Create a second 3x3 matrix
arr2 = np.arange(9).reshape(3,3)
arr2
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
# Concatenate the two matrices; by default (axis=0) the second is stacked below the first
np.concatenate([arr1,arr2])
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
# Concatenate the two matrices horizontally (axis=1)
np.concatenate([arr1,arr2],axis=1)
array([[0, 1, 2, 0, 1, 2],
       [3, 4, 5, 3, 4, 5],
       [6, 7, 8, 6, 7, 8]])
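
To make the effect of the axis argument concrete, here is a minimal sketch that checks the resulting shapes. The comparison with np.vstack and np.hstack is an addition for illustration and is not part of the original notebook.

```python
import numpy as np

arr1 = np.arange(9).reshape(3, 3)
arr2 = np.arange(9).reshape(3, 3)

# axis=0 (the default) stacks rows: (3, 3) + (3, 3) -> (6, 3)
vertical = np.concatenate([arr1, arr2], axis=0)
# axis=1 places the arrays side by side: (3, 3) + (3, 3) -> (3, 6)
horizontal = np.concatenate([arr1, arr2], axis=1)

print(vertical.shape)    # (6, 3)
print(horizontal.shape)  # (3, 6)

# For 2-D arrays, np.vstack and np.hstack produce the same results
same_as_vstack = np.array_equal(vertical, np.vstack([arr1, arr2]))
same_as_hstack = np.array_equal(horizontal, np.hstack([arr1, arr2]))
print(same_as_vstack, same_as_hstack)  # True True
```

All dimensions other than the one being joined must match, which is why two 3x3 arrays can be joined along either axis.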
# Create a Series
s1 = Series([1,2,3],index=['x','y','z'])
s1
x    1
y    2
z    3
dtype: int64
# Create a second Series
s2 = Series([4,5],index=['a','b'])
s2
a    4
b    5
dtype: int64
# concat: join vertically (the default, axis=0)
pd.concat([s1,s2])
x    1
y    2
z    3
a    4
b    5
dtype: int64
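
Vertical concatenation keeps the original index labels as they are. As a sketch of two options that pd.concat offers for controlling the resulting index (ignore_index and keys, neither of which appears in the original notebook):

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
s2 = pd.Series([4, 5], index=['a', 'b'])

# ignore_index=True discards the original labels and renumbers from 0
renumbered = pd.concat([s1, s2], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4]

# keys=... adds an outer index level recording which input each value came from
labelled = pd.concat([s1, s2], keys=['first', 'second'])
print(labelled.loc['second'].tolist())  # [4, 5]
```

The key names 'first' and 'second' are arbitrary labels chosen for this example.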
# concat: join horizontally (axis=1), which produces a new DataFrame
pd.concat([s1,s2],axis=1)
/Users/bennyrhys/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=False'.

To retain the current behavior and silence the warning, pass 'sort=True'.

     0    1
a  NaN  4.0
b  NaN  5.0
x  1.0  NaN
y  2.0  NaN
z  3.0  NaN
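
With axis=1, pd.concat aligns the inputs on their index labels and fills the gaps with NaN. The sketch below contrasts the default outer join with join='inner'. The Series s3 is a hypothetical input, introduced only so that the two Series share some labels.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
s3 = pd.Series([4, 5], index=['y', 'z'])  # hypothetical Series sharing labels with s1

# Default join='outer': union of the labels, NaN where a Series has no value
outer = pd.concat([s1, s3], axis=1)
# join='inner': keep only the labels present in both Series
inner = pd.concat([s1, s3], axis=1, join='inner')

print(outer.shape)           # (3, 2)
print(inner.index.tolist())  # ['y', 'z']
```

An inner join of the original s1 and s2 would be empty, because those two Series have no labels in common.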

# Create a DataFrame
df1 = DataFrame(np.random.rand(4,3), columns=['x','y','z'])
df1
          x         y         z
0  0.118006  0.976428  0.286200
1  0.554356  0.739202  0.441234
2  0.987343  0.032884  0.963760
3  0.730118  0.617397  0.943546

# Create a second DataFrame
df2 = DataFrame(np.random.randn(3,3), columns=['x','y','a'])
df2
          x         y         a
0  0.792735  0.927720  1.960326
1 -1.015684  0.524749  1.002970
2 -0.676568  0.378511  0.103341

# Concatenate; the default is vertical (axis=0)
pd.concat([df1,df2])
/Users/bennyrhys/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=False'.

To retain the current behavior and silence the warning, pass 'sort=True'.

          a         x         y         z
0       NaN  0.118006  0.976428  0.286200
1       NaN  0.554356  0.739202  0.441234
2       NaN  0.987343  0.032884  0.963760
3       NaN  0.730118  0.617397  0.943546
0  1.960326  0.792735  0.927720       NaN
1  1.002970 -1.015684  0.524749       NaN
2  0.103341 -0.676568  0.378511       NaN
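
The result above repeats the row labels 0 to 2 and contains NaN in the columns that only one frame has. As a rough sketch of how pd.concat's ignore_index, sort, and join parameters can change that (these options are not used in the original notebook):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(4, 3), columns=['x', 'y', 'z'])
df2 = pd.DataFrame(np.random.randn(3, 3), columns=['x', 'y', 'a'])

# ignore_index=True replaces the repeated row labels with 0..6;
# sort=False leaves the columns unsorted, as the FutureWarning suggests
stacked = pd.concat([df1, df2], ignore_index=True, sort=False)
print(stacked.index.tolist())     # [0, 1, 2, 3, 4, 5, 6]

# join='inner' keeps only the columns common to both frames, so no NaN appears
common = pd.concat([df1, df2], join='inner', ignore_index=True)
print(common.columns.tolist())    # ['x', 'y']
print(common.isna().any().any())  # False
```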

Combine

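How combine_first works: given two sets of data, wherever the first (calling) object holds NaN, the value from the second object fills it in. Data that exists only in the second object is merged into the result as well.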

s1 = Series([2,np.nan,4,np.nan], index=['A','B','C','D'])
s1
A    2.0
B    NaN
C    4.0
D    NaN
dtype: float64
s2 = Series([1,2,3,4], index=['A','B','C','D'])
s2
A    1
B    2
C    3
D    4
dtype: int64
# Fill values from s2 into s1 (wherever s1 is NaN, the value from s2 is used)
s1.combine_first(s2)
A    2.0
B    2.0
C    4.0
D    4.0
dtype: float64
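
One way to see what combine_first does is to compare it with fillna. In the sketch below, the comparison and the Series s3 (with an extra label 'E') are assumptions added for illustration and do not come from the original notebook.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([2, np.nan, 4, np.nan], index=['A', 'B', 'C', 'D'])
s2 = pd.Series([1, 2, 3, 4], index=['A', 'B', 'C', 'D'])

combined = s1.combine_first(s2)
filled = s1.fillna(s2)  # same result here because the indexes are identical
print(combined.equals(filled))  # True

# Hypothetical Series with a label ('E') that s1 does not have
s3 = pd.Series([9, 9], index=['D', 'E'])
union_index = s1.combine_first(s3).index.tolist()
kept_index = s1.fillna(s3).index.tolist()
print(union_index)  # ['A', 'B', 'C', 'D', 'E']
print(kept_index)   # ['A', 'B', 'C', 'D']
```

The difference shows up only when the indexes differ: combine_first returns the union of both indexes, while fillna keeps the index of s1.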
# Create a new DataFrame
df1 = DataFrame({
    'x':[1,np.nan,3,np.nan],
    'y':[5,np.nan,7,np.nan],
    'z':[9,np.nan,11,np.nan]
})
df1
     x    y     z
0  1.0  5.0   9.0
1  NaN  NaN   NaN
2  3.0  7.0  11.0
3  NaN  NaN   NaN

# Create a second DataFrame
df2 = DataFrame({
    'z':[np.nan,10,np.nan,12],
    'a':[1,2,3,4]
})
df2
      z  a
0   NaN  1
1  10.0  2
2   NaN  3
3  12.0  4

df1.combine_first(df2)
     a    x    y     z
0  1.0  1.0  5.0   9.0
1  2.0  NaN  NaN  10.0
2  3.0  3.0  7.0  11.0
3  4.0  NaN  NaN  12.0
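
In the example above, df1 and df2 never hold a value in the same cell, so it does not show which side wins when both do. The sketch below uses two small hypothetical frames, left and right, to illustrate that the calling object takes priority and that the order of the call therefore matters.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: both hold a non-NaN value in row 0 of column 'z'
left = pd.DataFrame({'z': [9.0, np.nan]})
right = pd.DataFrame({'z': [0.0, 10.0], 'a': [1, 2]})

left_first = left.combine_first(right)
right_first = right.combine_first(left)

# The caller's non-NaN values are kept; the argument only fills the gaps
print(left_first['z'].tolist())   # [9.0, 10.0]
print(right_first['z'].tolist())  # [0.0, 10.0]

# Column 'a' exists only in right, but appears in both results
print(sorted(left_first.columns))  # ['a', 'z']
```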
