专栏首页浪淘沙hive 窗口分析函数

hive 窗口分析函数

hive 窗口分析函数

0: jdbc:hive2://localhost:10000> select * from t_access; ±---------------±--------------------------------±----------------------±-------------±-+ | t_access.ip | t_access.url | t_access.access_time | t_access.dt | ±---------------±--------------------------------±----------------------±-------------±-+ | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 20170804 | | 192.168.33.3 | http://www.edu360.cn/teach | 2017-08-04 15:35:20 | 20170804 | | 192.168.33.4 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 20170804 | | 192.168.33.4 | http://www.edu360.cn/job | 2017-08-04 16:30:20 | 20170804 | | 192.168.33.5 | http://www.edu360.cn/job | 2017-08-04 15:40:20 | 20170804 | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 20170805 | | 192.168.44.3 | http://www.edu360.cn/teach | 2017-08-05 15:35:20 | 20170805 | | 192.168.33.44 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 20170805 | | 192.168.33.46 | http://www.edu360.cn/job | 2017-08-05 16:30:20 | 20170805 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-05 15:40:20 | 20170805 | | 192.168.133.3 | http://www.edu360.cn/register | 2017-08-06 15:30:20 | 20170806 | | 192.168.111.3 | http://www.edu360.cn/register | 2017-08-06 15:35:20 | 20170806 | | 192.168.34.44 | http://www.edu360.cn/pay | 2017-08-06 15:30:20 | 20170806 | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 20170806 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 20170806 | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 20170806 | | 192.168.33.25 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 20170806 | | 192.168.33.36 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 20170806 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 20170806 | ±---------------±--------------------------------±----------------------±-------------±-+

LAG函数

select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, lag(access_time,1,0) over(partition by ip order by access_time)as last_access_time from t_access;

±---------------±--------------------------------±---------------------±----±---------------------±-+ | ip | url | access_time | rn | last_access_time | ±---------------±--------------------------------±---------------------±----±---------------------±-+ | 192.168.111.3 | http://www.edu360.cn/register | 2017-08-06 15:35:20 | 1 | 0 | | 192.168.133.3 | http://www.edu360.cn/register | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.33.25 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 1 | 0 | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | 0 | | 192.168.33.3 | http://www.edu360.cn/teach | 2017-08-04 15:35:20 | 2 | 2017-08-04 15:30:20 | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 3 | 2017-08-04 15:35:20 | | 192.168.33.36 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 1 | 0 | | 192.168.33.4 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | 0 | | 192.168.33.4 | http://www.edu360.cn/job | 2017-08-04 16:30:20 | 2 | 2017-08-04 15:30:20 | | 192.168.33.44 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 1 | 0 | | 192.168.33.46 | http://www.edu360.cn/job | 2017-08-05 16:30:20 | 1 | 0 | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-05 16:30:20 | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 3 | 2017-08-06 16:30:20 | | 192.168.33.5 | http://www.edu360.cn/job | 2017-08-04 15:40:20 | 1 | 0 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-05 15:40:20 | 1 | 0 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 2 | 2017-08-05 15:40:20 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 3 | 2017-08-06 15:40:20 | | 192.168.34.44 | http://www.edu360.cn/pay | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.44.3 | http://www.edu360.cn/teach | 2017-08-05 15:35:20 | 1 | 0 | ±---------------±--------------------------------±---------------------±----±---------------------±-+

LEAD函数

select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, lead(access_time,1,0) over(partition by ip order by access_time)as last_access_time from t_access; ±---------------±--------------------------------±---------------------±----±---------------------±-+ | ip | url | access_time | rn | last_access_time | ±---------------±--------------------------------±---------------------±----±---------------------±-+ | 192.168.111.3 | http://www.edu360.cn/register | 2017-08-06 15:35:20 | 1 | 0 | | 192.168.133.3 | http://www.edu360.cn/register | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.33.25 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 1 | 0 | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 15:35:20 | | 192.168.33.3 | http://www.edu360.cn/teach | 2017-08-04 15:35:20 | 2 | 2017-08-05 15:30:20 | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 3 | 0 | | 192.168.33.36 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 1 | 0 | | 192.168.33.4 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 16:30:20 | | 192.168.33.4 | http://www.edu360.cn/job | 2017-08-04 16:30:20 | 2 | 0 | | 192.168.33.44 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 1 | 0 | | 192.168.33.46 | http://www.edu360.cn/job | 2017-08-05 16:30:20 | 1 | 2017-08-06 16:30:20 | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-06 16:30:20 | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 3 | 0 | | 192.168.33.5 | http://www.edu360.cn/job | 2017-08-04 15:40:20 | 1 | 0 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-05 15:40:20 | 1 | 2017-08-06 15:40:20 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 2 | 2017-08-06 15:40:20 | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 3 | 0 | | 192.168.34.44 | http://www.edu360.cn/pay | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.44.3 | http://www.edu360.cn/teach | 2017-08-05 15:35:20 | 1 | 0 | ±---------------±--------------------------------±---------------------±----±---------------------±-+

FIRST_VALUE 函数

例:取每个用户访问的第一个页面 select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, first_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time from t_access; ±---------------±--------------------------------±---------------------±----±--------------------------------±-+ | ip | url | access_time | rn | last_access_time | ±---------------±--------------------------------±---------------------±----±--------------------------------±-+ | 192.168.111.3 | http://www.edu360.cn/register | 2017-08-06 15:35:20 | 1 | http://www.edu360.cn/register | | 192.168.133.3 | http://www.edu360.cn/register | 2017-08-06 15:30:20 | 1 | http://www.edu360.cn/register | | 192.168.33.25 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 1 | http://www.edu360.cn/job | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | http://www.edu360.cn/stu | | 192.168.33.3 | http://www.edu360.cn/teach | 2017-08-04 15:35:20 | 2 | http://www.edu360.cn/stu | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 3 | http://www.edu360.cn/stu | | 192.168.33.36 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 1 | http://www.edu360.cn/excersize | | 192.168.33.4 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | http://www.edu360.cn/stu | | 192.168.33.4 | http://www.edu360.cn/job | 2017-08-04 16:30:20 | 2 | http://www.edu360.cn/stu | | 192.168.33.44 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 1 | http://www.edu360.cn/stu | | 192.168.33.46 | http://www.edu360.cn/job | 2017-08-05 16:30:20 | 1 | http://www.edu360.cn/job | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 2 | http://www.edu360.cn/job | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 3 | http://www.edu360.cn/job | | 192.168.33.5 | http://www.edu360.cn/job | 2017-08-04 15:40:20 | 1 | http://www.edu360.cn/job | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-05 15:40:20 | 1 | http://www.edu360.cn/job | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 2 | http://www.edu360.cn/job | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 3 | http://www.edu360.cn/job | | 192.168.34.44 | http://www.edu360.cn/pay | 2017-08-06 15:30:20 | 1 | http://www.edu360.cn/pay | | 192.168.44.3 | http://www.edu360.cn/teach | 2017-08-05 15:35:20 | 1 | http://www.edu360.cn/teach | ±---------------±--------------------------------±---------------------±----±--------------------------------±-+

LAST_VALUE 函数

例:取每个用户访问的最后一个页面 select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, last_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time from t_access; ±---------------±--------------------------------±---------------------±----±--------------------------------±-+ | ip | url | access_time | rn | last_access_time | ±---------------±--------------------------------±---------------------±----±--------------------------------±-+ | 192.168.111.3 | http://www.edu360.cn/register | 2017-08-06 15:35:20 | 1 | http://www.edu360.cn/register | | 192.168.133.3 | http://www.edu360.cn/register | 2017-08-06 15:30:20 | 1 | http://www.edu360.cn/register | | 192.168.33.25 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 1 | http://www.edu360.cn/job | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | http://www.edu360.cn/stu | | 192.168.33.3 | http://www.edu360.cn/teach | 2017-08-04 15:35:20 | 2 | http://www.edu360.cn/stu | | 192.168.33.3 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 3 | http://www.edu360.cn/stu | | 192.168.33.36 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 1 | http://www.edu360.cn/excersize | | 192.168.33.4 | http://www.edu360.cn/stu | 2017-08-04 15:30:20 | 1 | http://www.edu360.cn/stu | | 192.168.33.4 | http://www.edu360.cn/job | 2017-08-04 16:30:20 | 2 | http://www.edu360.cn/stu | | 192.168.33.44 | http://www.edu360.cn/stu | 2017-08-05 15:30:20 | 1 | http://www.edu360.cn/stu | | 192.168.33.46 | http://www.edu360.cn/job | 2017-08-05 16:30:20 | 1 | http://www.edu360.cn/job | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 2 | http://www.edu360.cn/job | | 192.168.33.46 | http://www.edu360.cn/excersize | 2017-08-06 16:30:20 | 3 | http://www.edu360.cn/job | | 192.168.33.5 | http://www.edu360.cn/job | 2017-08-04 15:40:20 | 1 | http://www.edu360.cn/job | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-05 15:40:20 | 1 | http://www.edu360.cn/job | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 2 | http://www.edu360.cn/job | | 192.168.33.55 | http://www.edu360.cn/job | 2017-08-06 15:40:20 | 3 | http://www.edu360.cn/job | | 192.168.34.44 | http://www.edu360.cn/pay | 2017-08-06 15:30:20 | 1 | http://www.edu360.cn/pay | | 192.168.44.3 | http://www.edu360.cn/teach | 2017-08-05 15:35:20 | 1 | http://www.edu360.cn/teach | ±---------------±--------------------------------±---------------------±----±--------------------------------±-+

/* 累计报表–分析函数实现版 */ – sum() over() 函数 select id ,month ,sum(amount) over(partition by id order by month rows between unbounded preceding and current row) from (select id,month, sum(fee) as amount from t_test group by id,month) tmp;

打序号

score  rownumber  rankover  dense_rank  ntile 89    1      1     1       1 90    2      2     2       1 90    3     2      2      1 91    4      4    3      2 92    5      5    4      2

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • 前端学习笔记

    1.css简介 用来修饰html样式的一种语言,层叠样式表 增强复用性 方便后期维护 2.css样式引入方式: (1)内嵌...

    曼路
  • SparkCore 编程

    2.创建一个数组,根据数据创建一个Bean对象,继承Order,实现序列化(Serializable).从而对数组进行排序。

    曼路
  • 关于数组的算法

    3.给定一个数组和一个数num,把小于num的书放在数组左边,大于num的书放在数组右边

    曼路
  • 20篇最值得一读的深度推荐系统与CTR预估论文

    微博上近日流传一个段子,“2020年曾是各大科幻片中遥远的未来,但是现在离这个遥远的未来也只有6个月时间了”。只是借此感慨一下2019年转瞬之间半年的时间已经过...

    AI科技大本营
  • Exchange 2007迁移2010后部分手机邮箱不能使用解决办法

    近期一个客户Exchange 2007迁移到了2010后,出现了一个问题,部分的手机邮箱无法使用,经过测试发现如下问题:

    李珣
  • 说实话,你的API接口在高并发面前不堪一击!

    衡量一个接口是否能够支撑高并发,我们最先想到的就是他的QPS是多少,影响QPS大小的关键因素是你的系统是否支持分布式、高并发,当提起这几三个词的时候,是不是很多...

    猿天地
  • C++核心准则C.100:定义容器时遵从STL标准‍

    C.100: Follow the STL when defining a container

    面向对象思考
  • python基础教程:内置函数(一)

    Python解释器内置了很多函数和类型,可以随时随地在编程中使用。其实,我们在前面的学习中已经接触过好多内置函数。

    一墨编程学习
  • PostgreSQL 与大小写的“坑”

    其实每种数据库都有自己的特色,PostgreSQL 也不例外,其中如果你留心PostgreSQL被最常问及的问题之一,就是大小写的问题。今天的讨论不涉及数据库名...

    AustinDatabases
  • 静态文件与APP

    郭大侠

扫码关注云+社区

领取腾讯云代金券

玩转腾讯云 有奖征文活动