文明网i中国精神文明建设门户网站,昵图网免费图片大全 图库 背景墙,电商网站建设培训学校,网站 被黑注#xff1a;参考文章#xff1a;
SQL连续增长问题--HQL面试题35_sql判断一个列是否连续增长-CSDN博客文章浏览阅读2.6k次#xff0c;点赞6次#xff0c;收藏30次。目录0 需求分析1 数据准备3 小结0 需求分析假设我们有一张订单表shop_order shop_id,order_id,order_time…注参考文章
SQL连续增长问题--HQL面试题35_sql判断一个列是否连续增长-CSDN博客文章浏览阅读2.6k次点赞6次收藏30次。目录0 需求分析1 数据准备3 小结0 需求分析假设我们有一张订单表shop_order shop_id,order_id,order_time,order_amt 我们需要计算过去至少3天销售金额连续增长的商户shop_id。数据如下shop_idorder_amtorder_time11002021-05-10 10:03:5411012021-05-10 10:04:5413002021-0_sql判断一个列是否连续增长https://blog.csdn.net/godlovedaniel/article/details/119080882
0 需求分析 现有一张订单表shop_order ,含有字段shop_id,order_id,order_time,order_amt 需要统计过去至少连续3天销售金额连续增长的商户shop_id。
1 数据准备
create table shop_order(shop_id int,order_amt int,order_time string
)
row format delimited fields terminated by \t;
load data local inpath /opt/module/hive_data/shop_order.txt into table shop_order;2 数据分析 完整的代码如下
with tmp as (selectshop_id,to_date(order_time) as dt,sum(order_amt) as amtfrom shop_ordergroup by shop_id, to_date(order_time)
)
selectshop_id
from (select *,-- 判断日期是否连续date_sub(dt, row_number() over (partition by shop_id order by dt )) as order_date_difffrom (selectshop_id,dt,amt,--判断销售额是否增长-- 当前行的销售金额与上一行的销售金额之间的差值 order_amt_diffamt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_diff from tmp) t1-- 差值大于0的代表销售额增长where order_amt_diff 0) t2
group by shop_id, order_date_diff
having count(1) 3;
输出结果为 shop_id 为2
上述代码分析 step1: 求出每家商户销售金额连续增长的记录
with tmp as (selectshop_id,to_date(order_time) as dt,sum(order_amt) as amtfrom shop_ordergroup by shop_id, to_date(order_time)
)select *
from (selectshop_id,dt,amt,--判断销售额是否增长-- 当前行的销售金额与上一行的销售金额之间的差值 order_amt_diffamt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_difffrom tmp) t1-- 差值大于0的代表销售额增长
where order_amt_diff 0 step2: 求出每家商户至少连续3天销售金额连续增长在step1的基础上还要求dt是连续的
with tmp as (selectshop_id,to_date(order_time) as dt,sum(order_amt) as amtfrom shop_ordergroup by shop_id, to_date(order_time)
)select *,-- 判断日期是否连续date_sub(dt, row_number() over (partition by shop_id order by dt )) as order_date_diff
from (selectshop_id,dt,amt,--判断销售额是否增长-- 当前行的销售金额与上一行的销售金额之间的差值 order_amt_diffamt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_difffrom tmp) t1-- 差值大于0的代表销售额增长
where order_amt_diff 0 step3: 对商户shop_id以及日期差值order_date_diff这两个字段分组求出最终结果
with tmp as (selectshop_id,to_date(order_time) as dt,sum(order_amt) as amtfrom shop_ordergroup by shop_id, to_date(order_time)
)
selectshop_id
from (select *,-- 判断日期是否连续date_sub(dt, row_number() over (partition by shop_id order by dt )) as order_date_difffrom (selectshop_id,dt,amt,--判断销售额是否增长-- 当前行的销售金额与上一行的销售金额之间的差值 order_amt_diffamt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_diff --判断是否增长from tmp) t1-- 差值大于0的代表销售额增长where order_amt_diff 0) t2
group by shop_id, order_date_diff
having count(1) 3; 3 小结 date_sub(日期减少函数) 语法date_sub(string startdate,int days)返回值string说明返回 开始日期startdate 减去days天后的日期举例select date_sub(2024-02-01,3) ---2024-01-29 lag 语法lag(column,n,default) over(partition by ....order by....)说明取得column列前边的第n行数据如果存在则返回如果不存在返回默认值default 针对【日期连续】等类型的题型一般处理思路先计算date_sub(dt, row_number() over (partition by shop_id order by dt )) as dt_diff 再对dt_diff 分组求count()值 针对【xx连续增长】等类型的题型一般处理思路利用前后函数lag或者lead往前/往后取一行计算两者的差值diff再利用 if( diff 0,1,0) as flag 等条件判断函数 进行打标签基于标签再进行后续的分组计算.......