Great Video👍 My sol- With t1 as( select date_value as d, state, Row_number() over(partition by state order by date_value) as r, Row_number() over(order by date_value) as r2 from tasks )
select min(d) as start_date, max(d) as end_date , min(state) from t1 group by (r2-r) Order by start_date;
you will get wrong answer if you have the status fail in 7th and success in 8th of that month, so it's always better to group by dates(as ankit told in vedio) instead group on numbers.
i was asked the reverse of this question ,ie output here was input and i need to generate the exploded form which is input in your case, it was in expedia interview@@ankitbansal6
Hi Ankit, Thanks for your valuable resources ;with cte_1 as ( select *,ROW_NUMBER() over (order by date_value) - ROW_NUMBER() over (partition by state order by date_value) differ from tasks ) select min(date_value) start_date,max(date_value) end_date ,state,differ from cte_1 group by state,differ order by start_date,end_date
Thanks Ankit for the amazing sql course . with cte as ( select * , rank() over(partition by state order by date_value) as rn , ROW_NUMBER() over(order by date_value ) as row from tasks ) select min(date_value) as start_date , max(date_value) as end_date , state from cte group by (rn-row) , state order by min(date_value)
This was the simplest trick I have come across.. I did the same in my project but found the last date first for consecutive dates and then found the min date fir every group. It looks messy but works.. this one is awesome..
with cte as ( SELECT *, lag(state, 1,state) over (order by date_value) as prev from tasks ), cte2 as( SELECT *, sum(case when prev = state then 0 else 1 end) over (order by date_value) as flag from cte ) SELECT min(date_value) as start, max(date_value) as endDate, min(state) as state from cte2 GROUP BY flag
Awesome. My approach. with cte as( select *, row_number() over(order by state, date_value) rn, date_value - row_number() over(order by state, date_value)::int as dt from tasks) select min(date_value) as start_dt, max(date_value) as end_dt, dt from cte group by dt, state;
select min(date_value) as start_date, max(date_value) as end_date, state from (select *, date_value - (rn || 'days'):: interval as daydiff from (select *, row_number() over (partition by state order by date_value) rn from tasks3)a)v group by state, daydiff; Hey Ankit, I really forgot this trick, thanks a lot for reminding !
;with cte as( select *,DAY(date_value)- ROW_NUMBER() over (partition by state order by date_value) rnk from tasks ) select MIN(date_value) as start_date, MAX(date_value) as end_date, state from cte group by state,rnk
with cte as (select *,day(date_value) -ROW_NUMBER() over(partition by state order by date_value) as r from tasks) select min(date_value),max(date_value),max(state) from cte group by r
Thanks Ankit for the rescue, I was literally searching for such simple solution.You are really doing an amazing job. 👏👏 If possible please make a video on how to handle such scenarios in case of non continuous range of dates.
@@ankitbansal6 with cte as( select date_value,state,lead(state) over( order by date_value)as statenext ,lag(state) over( order by date_value)as stateprev ,iif(lead(state) over( order by date_value)=state,1,0) as flagnext ,iif(lag(state) over( order by date_value)=state,1,0) as flagprev from tasks), cte2 as (select date_value,lead(date_value,1,date_value) over (order by date_value) as end_date ,state,flagnext,flagprev from cte where flagnext=0 or flagprev=0) select date_value,end_date,state from cte2 where flagnext=1 or(flagnext=0 and flagprev=0)
@@ankitbansal6 Hi Ankit! This is your colleague from your last firm "Prashant Priyadarshi". You are doing an amazing job. I love to get my hands dirty by solving the amazing problems you post. Keep up the great work you are doing. 😀
Banty - The below solution will help you with the non-continuous dates as well. Happy Learning! with prev_state as ( select * , LAG(state,1,state) over (partition by null order by date_value) as prev_state from tasks ) , cal_state_change as ( select * ,sum(case when state = prev_state then 0 else 1 end) over (partition by null order by date_value) as state_change_flag from prev_state) select start_date, end_date, state from ( select state_change_flag, min(state) as state, MIN(date_value) as start_date, MAX(date_value) as end_date from cal_state_change group by state_change_flag ) src;
Thanks a lot Ankit for posting such a interesting question.Please find my solution below : with groups as (Select * ,row_number() over ( partition by state order by date_value) as rn ,dateadd(day , -1*row_number() over ( partition by state order by date_value) ,date_value) as group_date from tasks ) select min(date_value) as start_date ,max(date_value) as end_date,state from groups group by group_date,state order by min(date_value);
with cte as ( select *, rank() over(partition by state order by date_value) as rk, row_number() over(order by date_value) as rn from tasks ) select state, min(date_value), max(date_value) from cte group by rn-rk;
thanks a lot bro ,i am not able to understand any solution,even the video solution But fortunately i got your comment and Boom 💣 I understand Thanks again Bro
a similar question was asked to me in interview instead of state it was price. I had to create window of dates when ever price changes. I could not solve it that time. it really looks simple in first instance but it is not. :)
Thanks for the awesome video. This method works on any date_value whether continuos or not. Here is the output I tried - "2019-01-11" "2019-01-12" "fail" "2019-01-01" "2019-01-02" "success" "2019-01-10" "2019-01-10" "success" "2019-01-13" "2019-01-13" "success"...
@@ankitbansal6 Input table has below data :- "2019-01-01" "success" "2019-01-02" "success" "2019-01-10" "success" "2019-01-11" "fail" "2019-01-12" "fail" "2019-01-13" "success" Solution for it is :- with grp_dt as( SELECT *,row_number() over (partition by state order by date_value) as rn, dateadd( day,-1*row_number() over (partition by state order by date_value),date_value) as group_date from tasks2) SELECT min(date_value) as start_date,max(date_value) as end_date,state from grp_dt group by state,group_date order by state,group_date
We can also simply use the lag function to see which row has its state different from the previous one. This is a really cool trick though. Thanks a lot!
with required_data as ( select date_value, state, lag1, sum(lag1) over (order by date_value) as grp from ( select date_value, state, case when lag(state) over (order by date_value) = state then 0 else 1 end as lag1 from tasks ) ) select min(date_value) start_date, max(date_value) end_date, state from required_data group by state, grp order by start_date
I used the CTE and a different approach then yours.. as that what came to my mind at the time of solving the problem statement.. Please share your feedback. :) with cte1 as ( Select *, case when lag(state) over(order by date_value) != State then 1 else 0 end as is_Change from tasks) , groups as ( Select date_value, state, sum(is_change) over(order by date_value) as grp_number from cte1 ) Select State, min(Date_Value) as Time_Start, Max(Date_Value) as Time_End from groups group by grp_number, state order by Time_Start;
there is a chance that some intermediate dates could be missing, there is no guarantee for cosecutive dates. My solution : MS SQL with cte1 as ( select *, (case when lag(state) over (order by date_value)=state then 0 else 1 end) flag from tasks ), cte2 as ( select *, sum(flag) over (order by date_value) as state_group from cte1 ) select state,min(date_value) as start_date, max(date_value) as end_date from cte2 group by state,state_group
my approach:: with cte as ( select *, ROW_NUMBER() OVER(order by date_value) - ROW_NUMBER() OVER(partition by state order by date_value) as rn2 from tasks ) select min(date_value) as start_date ,max(date_value) as end_date ,state from cte group by rn2, state order by 1
--solution with temp as ( --calculate current and previous state select date_value,state,coalesce(lag(state) over(order by date_value),'XX') prev_state from tasks ), temp1 as ( select date_value,state,prev_state, case when state prev_state then 1 else 0 end flag -- set flag to 1 when state changes and 0 if state doesn't change from temp ), temp2 as ( select date_value,state, sum(flag) over( order by date_value rows between unbounded preceding and 0 preceding) flag1 -- every change in state will get new number from temp1 order by date_value ) select min(date_value),max(date_value),state from temp2 group by state,flag1
with tmp as (select *,lag(state) over (order by date_value )lag_st, lead(state) over (order by date_value) lead_st from tasks) ,tmp1 as (select *,lead(DATEADD(Day,-1,date_value)) over (order by date_value) prev from tmp where lag_st is null or lead_st is null or tmp.statelag_st ) select date_value as start_date, coalesce(prev,date_value) as end_date ,state from tmp1 select * from tasks
@@ankitbansal6 @Gulshan Kumar I ran the query by taking these as input data: insert into tasks values ('2019-01-01','success'), ('2019-01-02','success'), ('2019-01-10','success'), ('2019-01-11','fail') ,('2019-01-12','fail'), ('2019-01-13','success') So for date 2019-01-10 , group_date is coming 2019-01-07 (as day is 10 and row_number will 3 so 10-3 = 7) When we do groupby in outer query, it will treat it as different group! Hope you are getting my point.
I used another strategy that you showed in the 'on-off' question video. with cte as ( select date_value,state,leaded,sum(case when state='fail' and leaded='success' then 1 when state='success' and leaded='fail' then 1 else 0 end) over (order by date_value) as grouped from ( select date_value,state,lag(state,1,state) over(order by date_value) as leaded from tasks )A ) select min(date_value) as start_date,max(date_value) as end_date,state from cte group by grouped;
with cte as (select *,lag(date_value,1,date_value) over(partition by state order by date_value) as dd from tasks), cte2 as (select *,case when datediff(day,dd,date_value)
Another approach: with cte as(select date_value,state,prev_state,sum(case when state='fail' and prev_state='success' then 1 else 0 end) over(order by date_value) as grp from(select date_value,state,lag(state,1,state) over(order by date_value) as prev_state from tasks) A) ,cte2 as(select grp,state, min(date_value) as start_date,max(date_value) as end_date from cte group by grp,state order by start_date) select start_date,end_date,state from cte2
my approach: with cte as( select *,day(date_value)- rank() over(partition by state order by date_value) as diff from tasks) select min(date_value) as start_Date,max(date_value) as end_date,max(state) as state from cte group by diff order by start_date
with base as (Select date_value, lag(state,1,state) over() as prev_state, state from tasks), base2 as (Select *, sum(case when (prev_state = 'success' and state = 'fail') or (prev_state = 'fail' and state = 'success') then 1 else 0 End) over(order by date_value) as flag from base) Select min(date_value) as start_date, max(date_value) as end_date, state from base2 group by flag
Good solution!! Had few issues in query, corrected it and below is updated one ;with base as (Select date_value, lag(state,1,state) over(order by date_value) as prev_state, state from #tasks), base2 as (Select date_value,state, sum(case when prev_state != state then 1 else 0 End) over(order by date_value) as flag from base GROUP BY prev_state,date_value,state) Select min(date_value) as start_date, max(date_value) as end_date, state from base2 group by flag,state order by start_date
My approach with cte as(select *,row_number() over(order by date_value) - row_number() over(partition by state order by date_value) as grup from tasks) select min(date_value) as start_date,max(date_value) as end_date,max(state) as state from cte group by grup;
with cte as ( Select *, datepart(day,date_value )- row_number()over(partition by state order by date_value) as rn from tasks ) Select min(date_value) as start_date, max(date_value) as end_date ,state from cte group by rn, state order by min(date_value)
Hello sir hope you are doing well: My solution to the problem is: with cte as ( select *,ROW_NUMBER() over(order by date_value) as id from task ) ,cteone as ( select *,ROW_NUMBER() over(order by date_value) as ide from cte where state = 'success' ) ,ctetwo as ( select date_value,state,id-ide as final from cteone ) ,ctethree as ( select min(date_value) as min,max(date_value) as max,'sucess' as status from ctetwo group by final ) ,ctefour as ( select *,ROW_NUMBER() over(order by date_value) as valuation from cte where state = 'fail' ) select min(date_value) as min,max(date_value) as max,'fail' as status from ctefour group by id-valuation union select * from ctethree order by min
with cte as ( select date_value, (DAY(date_value) - row_number() over (partition by state order by date_value)) as bucket, state from tasks) select MIN(date_value) as start_date , MAX(date_value) as end_date , state from cte group by state, bucket order by start_date
#This query is for MYSQL select min(date_value) as start_date,max(date_value) as end_date,state from ( select date_value,state,row_number() over(partition by state order by date_value) as rn from tasks order by date_value )a group by date_sub(date_value,interval rn day)
with cte_next_state as( select *,lead(state) over(order by date_value) next_state from tasks ), next_null as( select date_value,state, if(state!=next_state or next_state is null,date_value,null) nxt from cte_next_state ) select * from next_null where nxt is not null
with cte as ( select rn,state,date_value, coalesce( (rn- lag (rn) over(partition by state order by rn asc)),1) as diff from (select *, row_number() over( order by date_value) as rn from tasks))
select min (date_value) as start, max(date_value) as end, state from cte group by state,diff order by rn asc
with cte as( select * , rank() over(partition by state order by date_value) as rn1 , extract(day from date_value) - rank() over(partition by state order by date_value) as group_ from tasks ) select state, min(date_value) as start_date , max(date_value) as end_date from cte group by 1,group_
select min(date_value) as start_date , max(date_value) as end_date ,max(state) as state from( select * ,row_number() over (order by date_value ) - row_number() over ( partition by state order by date_value) as r2 from tasks ) group by r2
with cte as (select *,row_number()over(partition by state order by date_value) as rn from tasks order by date_value) ,b as (select first_value(date_value)over(partition by extract(day from date_value)-rn ) as start_date , last_value(date_value)over(partition by extract(day from date_value)-rn ) as end_date,state from cte order by date_value) select start_date,end_date,state from b group by start_date,end_date,state
Easier method (Tabitosan): select min(cast(date_value as date)),max(cast(date_value as date)),max(state) from ( select date_value,state,rn1-rn as diff from ( select date_value,state,row_number() over(partition by state order by cast(date_value as date)) rn, row_number() over(order by cast(date_value as date)) rn1 from d )q1 )q1 group by diff
my solution- gaps and island problem select min(date_value) as start_date, max(date_value)as end_date, max(state) as state from (select date_value, state,prev_state,state_ind , sum(state_ind) over(order by date_value) as state_islands from ( select date_value, state,prev_state, (case when prev_state = state then 0 else 1 end) as state_ind from ( select date_value, state, lag(state,1) over(order by date_value) as prev_state from tasks )tmp )tmp2 ) group by state_islands
(select * from tasks order by date_value limit 1) union select t1.date_value,t1.state from tasks t1 inner join tasks t2 where (t1.date_value-t2.date_value)=1 and t1.state!=t2.state
%sql with cte as ( select datediff(lead(date_value) over (order by date_value),date_value) as diff,state,date_value from tasks) select min(date_value),max(date_value),state from cte group by state,diff;
with cte as ( select date_value, state, row_number()over(order by date_value) as rn, row_number()over(partition by state order by date_value) as rn_s from tasks), cte1 as ( select date_value, state, (rn - rn_s) as difference from cte) select min(date_value), max(date_value), state from cte1 group by difference, state;
with cte as( select date_value,state, row_number() over(partition by state order by date_value) as rows_number, dateadd(day,-1*row_number() over(partition by state order by date_value),date_value) as bs from tasks) select min(date_value) as start_dt,max(date_value) as end_dt,state from cte group by state,bs order by start_dt
Thanks for the video, please share for non-contiguous dates as well. Here's my solution without DATE_ADD: WITH CTE as (SELECT t.*, row_number() over(partition by state order by date_value) as parts_rank, row_number() over(order by date_value) as whole_rank FROM tasks t order by date_value) SELECT MIN(date_value) as min_date, MAX(date_value) as max_date, state FROM CTE GROUP BY whole_rank - parts_rank;
Alternate Solution : with grp_data as (select *, sum(lag1) over (order by date_value) as grp from (select *, case when lag(state) over () = state then 0 else 1 end as lag1 from tasks) a) select state, min(date_value) as start_date, max(date_value) as end_date from grp_data group by state,grp order by start_date
Another Approach to solve this Problem . with cte_1 as (select date_value,state, nvl(lag(state) over(order by date_value),'0') prev from tasks), cte_2 as (select date_value,state, case when state prev then 1 else 0 end as flag from cte_1), cte_3 as (select date_value, state , sum(flag) over (order by date_value) groups from cte_2) select state , groups, min(date_value) , max(date_value) from cte_3 group by state , groups
My solution!!!!! ;with cte as( select*, NTILE(2)over(order by date_value)as num from tasks) select MIN(date_value)as start_date,MAX(date_value)as end_date,state from cte group by state,num
MYSQL Solution with base as (select*, day(date_value) - row_number() over (partition by state order by date_value) as group_value from tasks order by date_value ) select min(date_value) as start_range, max(date_value) as end_range ,max(state) as state from base group by group_value;
we use recursive approach : with recursive cte as ( select min(start)as minn , max(end_date) as maxx from (select min(date_value) as start , max(date_value) as end_date , state from (select *,date_add(date_value , interval -1*cast(row_number() over(partition by state order by date_value) as signed) day) as rn from que9_tasks) m group by rn) n union all select date_add(minn , interval 1 day) , maxx from cte where minn
MYSQL Solution with base as (select *,row_number() over(partition by state order by date_value) as rnk from tasks), base_modified as ( select *,day(date_sub(date_value,interval rnk day)) as grp from base ) select min(date_value) as start_date, max(date_value) as end_date, max(state) as state from base_modified group by state,grp order by start_date,end_date
with abc as ( select date_value ,state, case when state = lag(state) over(order by date_value) then 0 else 1 end as rnk from tasks ), bcd as (select a.* , sum(rnk)over(order by date_value) sm from abc a) select min(date_value) start_date,max(date_value) end_date, max(state) as state from bcd group by state , sm
Here is my solution. Creating the 2 new columns with create series of number (say Rownumber) firstly by datewise, then datewise + statewise (partition by) and then substract the second from first which will give you the groupwise same number. Then the logic remains same. ;WITH CTE AS ( select *, SUM(1) OVER (ORDER BY date_value) as RN_datewise, SUM(1) OVER (PARTITION BY state ORDER BY date_value, state) as RN_dateWise_statewise from tasks ), CTE1 AS ( select date_value, state, (RN_datewise - RN_dateWise_statewise) as state_group FROM CTE ) select MIN(date_value) as start_date, MAX(date_value) as end_date, state FROM CTE1 GROUP BY state, state_group ORDER BY start_date
with temp as ( select * , dateadd(day, -1 * ROW_NUMBER() over(partition by state order by date_value), date_value) group_date from tasks ) select min(date_value) start_date, max(date_value) end_date, state from temp group by group_date, state order by start_date
My approach: with cte AS ( Select *,lag(state,1,'dedault') over(order BY date_value) AS prev_val FROM tasks ), cte2 AS ( Select *,SUM(case when state!=prev_val then 1 else 0 end) over(order by date_value) AS ky FROM cte ) Select min(date_value) as start_date,max(date_value) AS end_date,max(state) AS state From cte2 Group BY ky
solution using LAG window function : with temp1 as( SELECT t.*, IF(LAG(state) OVER (ORDER BY date_value) = state, 0, 1) AS rk FROM tasks t), temp2 as( select date_value,state,sum(rk) over (order by date_value) rk1 from temp1) select min(date_value) as start_date,max(date_value) as end_date,state from temp2 group by rk1,state;
select min(date_value),max(date_value),state from( select *,row_number() over(order by date_value) as rn ,row_number() over(partition by state order by date_value ) as r , row_number() over(order by date_value)-row_number() over(partition by state order by date_value ) as diff from task order by rn)p group by diff
with cte as ( select *, row_number() over(partition by state order by date_value) as rnk, rank() over(order by date_value) as rnk_1, rank() over(order by date_value)-row_number() over(partition by state order by date_value) as final_rnk from tasks) Select min(date_value) as min_date, max(date_value) as max_date, state from cte where state='success' or state='fail' group by final_rnk, state order by min_date My solution.
with cte as ( select *, row_number() over (partition by state order by date_value) as rn, date_add(date_value, interval -1*cast(row_number() over (partition by state order by date_value) as signed)day) as reference_date from tasks order by date_value) select state,reference_date, min(date_value), max(date_value) from cte group by reference_date,state
Used the row number to create the group without date add ********************** WITH A AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY state ORDER BY date_value) AS r, ROW_NUMBER() OVER (ORDER BY date_value) AS rn FROM tasks ORDER BY date_value) SELECT MIN(date_value) AS start_date, MAX(date_value) AS end_date, state FROM (SELECT date_value, state, rn - r AS parts FROM A) B GROUP BY state, parts
It cannot be applied to this script: create table tasks ( date_value date, state varchar(10) ); insert into tasks values ('2019-01-01','success'),('2019-01-02','success'),('2019-01-03','success'),('2019-01-04','fail') ,('2019-01-05','fail'),('2019-01-06','fail'), ('2019-01-07','success')
with cte1 as ( select * , ROW_NUMBER() over(partition by state order by date_value) as r, DATEADD(day, -1*ROW_NUMBER() over(partition by state order by date_value), date_value) as group_date from tasks ) select MIN(date_value) as start, MAX(date_value) as stop, state from cte1 group by group_date, state order by start;
with cte as( select date_value, state, row_number() over( partition by state order by date_value ) as rn , DATEPART(DAY ,date_value) as d ,DATEPART(DAY ,date_value) -row_number() over( partition by state order by date_value ) as diff from tasks) select min(date_value) as start_d ,max(date_value) as end_d , cte.state from cte group by diff, state order by min(date_value)
My Solutionin mysql with cte as ( select date_value, state, sum(partition_flag) OVER(order by date_value) as rolling_sum_partition from ( select date_value, state, case when state = lag(state) over() then 0 else 1 end as partition_flag from tasks)temp ) select min(date_value) as start_date, max(date_value) as end_date, state from cte group by rolling_sum_partition, state;
with cte as( select date_value,state,abs(ROW_NUMBER() over(partition by state order by date_value asc)-ROW_NUMBER() over(order by date_value)) diff, ROW_NUMBER() over(partition by state order by date_value asc) rn,ROW_NUMBER() over(order by date_value) rn2 from tasks) select min(date_value),max(date_value),state from cte group by state,diff
Great Video👍
My sol-
With t1 as(
select date_value as d, state,
Row_number() over(partition by state order by date_value) as r,
Row_number() over(order by date_value) as r2
from tasks )
select min(d) as start_date, max(d) as end_date , min(state)
from t1
group by (r2-r) Order by start_date;
good one
group by was amazing loved it
you will get wrong answer if you have the status fail in 7th and success in 8th of that month, so it's always better to group by dates(as ankit told in vedio) instead group on numbers.
I realize my self today am not average performer, I need some one who can teach like you 🙏🙏🙏 . Keep on going bro
More power to you 🔥🔥
i was asked the reverse of this question ,ie output here was input and i need to generate the exploded form which is input in your case, it was in expedia interview@@ankitbansal6
Hi Ankit,
Thanks for your valuable resources
;with cte_1 as
(
select *,ROW_NUMBER() over (order by date_value) - ROW_NUMBER() over (partition by state order by date_value) differ
from tasks
)
select min(date_value) start_date,max(date_value) end_date ,state,differ from cte_1
group by state,differ
order by start_date,end_date
this will work even if there is missing dates in bw great thanks
Thanks Ankit for the amazing sql course .
with cte as (
select * , rank() over(partition by state order by date_value) as rn , ROW_NUMBER() over(order by date_value ) as row from tasks )
select min(date_value) as start_date , max(date_value) as end_date , state from cte group by (rn-row) , state
order by min(date_value)
This was the simplest trick I have come across.. I did the same in my project but found the last date first for consecutive dates and then found the min date fir every group. It looks messy but works.. this one is awesome..
Actual ultimate thing is providing the create table and insert data in the description.... Huge thanks to Guruji.
I was asked similar question using status - "Online / Offline" in one product based interview. This is quite good! Keep up the good work Ankit :)
with cte as (
SELECT *, lag(state, 1,state) over (order by date_value) as prev
from tasks
), cte2 as(
SELECT *, sum(case when prev = state then 0 else 1 end) over (order by date_value) as flag
from cte
)
SELECT min(date_value) as start, max(date_value) as endDate, min(state) as state
from cte2
GROUP BY flag
Your videos are very good, simply said, easy to understand. I also learnt this technique about an year ago and see this being asked in many interviews
Awesome. My approach.
with cte as(
select *,
row_number() over(order by state, date_value) rn,
date_value - row_number() over(order by state, date_value)::int as dt
from tasks)
select min(date_value) as start_dt,
max(date_value) as end_dt,
dt
from cte
group by dt, state;
Thanks for posting 👏
Mind blowing trick. Never thought this way of solving any problem.
select min(date_value) as start_date, max(date_value) as end_date, state from
(select *, date_value - (rn || 'days'):: interval as daydiff from
(select *, row_number() over (partition by state order by date_value) rn from tasks3)a)v
group by state, daydiff;
Hey Ankit, I really forgot this trick, thanks a lot for reminding !
;with cte as(
select *,DAY(date_value)-
ROW_NUMBER() over (partition by state order by date_value) rnk
from tasks
)
select MIN(date_value) as start_date, MAX(date_value) as end_date, state from cte
group by state,rnk
with cte as (select *,day(date_value) -ROW_NUMBER() over(partition by state order by date_value) as r
from tasks)
select min(date_value),max(date_value),max(state) from cte group by r
Superb logic.. very helful trick while working with continuous date..
Keep watching
Thanks Ankit for the rescue, I was literally searching for such simple solution.You are really doing an amazing job. 👏👏
If possible please make a video on how to handle such scenarios in case of non continuous range of dates.
Sure will do
@@ankitbansal6
with cte as(
select date_value,state,lead(state) over( order by date_value)as statenext
,lag(state) over( order by date_value)as stateprev
,iif(lead(state) over( order by date_value)=state,1,0) as flagnext
,iif(lag(state) over( order by date_value)=state,1,0) as flagprev
from tasks),
cte2 as (select date_value,lead(date_value,1,date_value) over (order by date_value) as end_date
,state,flagnext,flagprev from cte where flagnext=0 or flagprev=0)
select date_value,end_date,state from cte2 where flagnext=1 or(flagnext=0 and flagprev=0)
@@ankitbansal6 Hi Ankit! This is your colleague from your last firm "Prashant Priyadarshi". You are doing an amazing job. I love to get my hands dirty by solving the amazing problems you post. Keep up the great work you are doing. 😀
Banty - The below solution will help you with the non-continuous dates as well. Happy Learning!
with prev_state as (
select *
, LAG(state,1,state) over (partition by null order by date_value) as prev_state
from tasks
)
, cal_state_change as (
select *
,sum(case when state = prev_state then 0 else 1 end) over (partition by null order by date_value) as state_change_flag
from prev_state)
select start_date, end_date, state from (
select state_change_flag, min(state) as state, MIN(date_value) as start_date, MAX(date_value) as end_date
from cal_state_change
group by state_change_flag
) src;
@@prashantpriyadarshi7665 Glad you are enjoying 😊
great video bro, excellent solution. Mind-blowing, thank you so much for providing such precious content
This Exact logic was asked in an interview which I couldnt answer. Wish I had seen this . :-) .. Great explanation
Thank you 😊
Thanks a lot Ankit for posting such a interesting question.Please find my solution below :
with groups as
(Select
*
,row_number() over ( partition by state order by date_value) as rn
,dateadd(day , -1*row_number() over ( partition by state order by date_value) ,date_value) as group_date
from tasks )
select
min(date_value) as start_date
,max(date_value) as end_date,state
from
groups
group by
group_date,state
order by
min(date_value);
Thanks for posting 👏
with cte as (
select *, rank() over(partition by state order by date_value) as rk,
row_number() over(order by date_value) as rn
from tasks
)
select state, min(date_value), max(date_value)
from cte
group by rn-rk;
thanks a lot bro ,i am not able to understand any solution,even the video solution
But fortunately i got your comment
and
Boom
💣
I understand
Thanks again Bro
❤
a similar question was asked to me in interview instead of state it was price. I had to create window of dates when ever price changes. I could not solve it that time. it really looks simple in first instance but it is not. :)
True.
Thanks for the awesome video. This method works on any date_value whether continuos or not.
Here is the output I tried -
"2019-01-11" "2019-01-12" "fail"
"2019-01-01" "2019-01-02" "success"
"2019-01-10" "2019-01-10" "success"
"2019-01-13" "2019-01-13" "success"...
I don't see your query
@@ankitbansal6
Input table has below data :-
"2019-01-01" "success"
"2019-01-02" "success"
"2019-01-10" "success"
"2019-01-11" "fail"
"2019-01-12" "fail"
"2019-01-13" "success"
Solution for it is :-
with grp_dt as(
SELECT *,row_number() over (partition by state order by date_value) as rn,
dateadd( day,-1*row_number() over (partition by state order by date_value),date_value) as group_date
from tasks2)
SELECT min(date_value) as start_date,max(date_value) as end_date,state from grp_dt
group by state,group_date
order by state,group_date
We can also simply use the lag function to see which row has its state different from the previous one. This is a really cool trick though. Thanks a lot!
kindly share the soln by lead/lag, i couldnt do it
with required_data as (
select date_value, state, lag1,
sum(lag1) over (order by date_value) as grp
from
(
select date_value,
state,
case when lag(state) over (order by date_value) = state then 0 else 1 end as lag1
from tasks
)
)
select min(date_value) start_date,
max(date_value) end_date,
state
from required_data
group by state,
grp
order by start_date
we can just use this trick only if all date is consecutive, right? I mean if date 2019-01-01 then 2019-01-03, we can not use this trick anymore?
I used the CTE and a different approach then yours..
as that what came to my mind at the time of solving the problem statement..
Please share your feedback. :)
with cte1 as (
Select *, case when lag(state) over(order by date_value) != State then 1 else 0 end as is_Change
from tasks)
, groups as (
Select date_value, state, sum(is_change) over(order by date_value) as grp_number
from cte1
)
Select State, min(Date_Value) as Time_Start, Max(Date_Value) as Time_End
from groups
group by grp_number, state
order by Time_Start;
there is a chance that some intermediate dates could be missing, there is no guarantee for cosecutive dates.
My solution : MS SQL
with cte1 as (
select *, (case when lag(state) over (order by date_value)=state then 0 else 1 end) flag from tasks
),
cte2 as (
select *, sum(flag) over (order by date_value) as state_group from cte1
)
select state,min(date_value) as start_date, max(date_value) as end_date from cte2 group by state,state_group
suppose we have start and end date and we want the first manuall table?
How do we need to achive if the date column has timestamp values in it and each row has difference in seconds, hours and its not consistant.?
HI Ankit, But there could be a diff which is common for 2 streaks. I dont think that is taken care of in the code.
Agar bhai ham direct subtract use kare with rn and date_value jab b to bhai yehi aaega ??
Great Video, similar question asked in my interview
Great!
my approach::
with cte as (
select *, ROW_NUMBER() OVER(order by date_value) -
ROW_NUMBER() OVER(partition by state order by date_value) as rn2
from tasks
)
select min(date_value) as start_date ,max(date_value) as end_date ,state
from cte group by rn2, state
order by 1
mazza aaya.
In oracle its not working,
when we write partition by state, fail comes at the top @Ankit Bansal
--solution
with temp as ( --calculate current and previous state
select date_value,state,coalesce(lag(state) over(order by date_value),'XX') prev_state
from tasks
),
temp1 as ( select date_value,state,prev_state,
case when state prev_state then 1 else 0 end flag -- set flag to 1 when state changes and 0 if state doesn't change
from temp ),
temp2 as (
select date_value,state,
sum(flag) over( order by date_value rows between unbounded preceding and 0 preceding) flag1 -- every change in state will get new number
from temp1 order by date_value
) select min(date_value),max(date_value),state
from temp2
group by state,flag1
with tmp as (select *,lag(state) over (order by date_value )lag_st,
lead(state) over (order by date_value) lead_st from tasks)
,tmp1 as (select *,lead(DATEADD(Day,-1,date_value)) over (order by date_value) prev from tmp
where lag_st is null or lead_st is null or tmp.statelag_st
)
select date_value as start_date, coalesce(prev,date_value) as end_date ,state from tmp1
select * from tasks
What would be the logic if date won't be continues? Like 2020-01-01,2020-01-02,2020-01-09?
Assuming that the data is sorted, we can create a column Row_Number like 1, 2, 3, 4 and then apply the same logic.
Agree with Gulshan.
@@ankitbansal6 @Gulshan Kumar
I ran the query by taking these as input data: insert into tasks values
('2019-01-01','success'),
('2019-01-02','success'),
('2019-01-10','success'),
('2019-01-11','fail')
,('2019-01-12','fail'),
('2019-01-13','success')
So for date 2019-01-10 , group_date is coming 2019-01-07 (as day is 10 and row_number will 3 so 10-3 = 7)
When we do groupby in outer query, it will treat it as different group!
Hope you are getting my point.
@@harshSingh-if4zb you need to take first row number on full table and use that column instead of date column when doing minus to create group
How to create date range if the the dates are not contentious (difference between dates can be more than 2)?
tricky question with amazing solution
great one!! it just blew my mind sir. thanks for making this :)
Glad you liked it!
I used another strategy that you showed in the 'on-off' question video.
with cte as
(
select date_value,state,leaded,sum(case when state='fail' and leaded='success' then 1 when state='success' and leaded='fail' then 1 else 0 end) over (order by date_value) as grouped from
(
select date_value,state,lag(state,1,state) over(order by date_value) as leaded
from tasks
)A
)
select min(date_value) as start_date,max(date_value) as end_date,state
from cte
group by grouped;
Great stuff 👍
amazing, quite a innovative approach
bro could you explain this..
@@Vaibha293 He used running total concept then used that total as group by
with cte as
(select *,lag(date_value,1,date_value) over(partition by state order by date_value) as dd from tasks),
cte2 as
(select *,case when datediff(day,dd,date_value)
Can you please tell me how to do reverse ... Input table as as start_date,end_date,state... Output result as date_value and state
this solution works when there is no miss of a day in the date_value column?
I had the same doubt...
Another approach:
with cte as(select date_value,state,prev_state,sum(case when state='fail' and prev_state='success' then 1 else 0 end) over(order by date_value) as grp
from(select date_value,state,lag(state,1,state) over(order by date_value) as prev_state from tasks) A)
,cte2 as(select grp,state, min(date_value) as start_date,max(date_value) as end_date from cte group by grp,state order by start_date)
select start_date,end_date,state from cte2
Thanks for posting 👍
I really liked this question! Mind blowing :D
Thanks for sharing this video
my approach:
with cte as(
select *,day(date_value)- rank() over(partition by state order by date_value)
as diff from tasks)
select min(date_value) as start_Date,max(date_value) as end_date,max(state) as state
from cte group by diff order by start_date
Excellent logic 🧠👌
with base as
(Select date_value,
lag(state,1,state) over() as prev_state,
state
from tasks),
base2 as
(Select *,
sum(case
when (prev_state = 'success' and state = 'fail') or (prev_state = 'fail' and state = 'success') then 1
else 0
End) over(order by date_value) as flag
from base)
Select
min(date_value) as start_date,
max(date_value) as end_date,
state
from base2
group by flag
Good solution!! Had few issues in query, corrected it and below is updated one
;with base as
(Select date_value,
lag(state,1,state) over(order by date_value) as prev_state,
state
from #tasks),
base2 as
(Select date_value,state,
sum(case
when prev_state != state then 1
else 0
End) over(order by date_value) as flag
from base
GROUP BY prev_state,date_value,state)
Select
min(date_value) as start_date,
max(date_value) as end_date,
state
from base2
group by flag,state
order by start_date
My approach
with cte as(select *,row_number() over(order by date_value) - row_number() over(partition by state order by date_value) as grup from tasks)
select min(date_value) as start_date,max(date_value) as end_date,max(state) as state from cte group by grup;
with cte as
(
Select *,
datepart(day,date_value )- row_number()over(partition by state order by date_value) as
rn from tasks
)
Select min(date_value) as start_date, max(date_value) as end_date ,state
from cte
group by rn, state
order by min(date_value)
Hello sir hope you are doing well:
My solution to the problem is:
with cte as
(
select *,ROW_NUMBER() over(order by date_value) as id from task
)
,cteone as
(
select *,ROW_NUMBER() over(order by date_value) as ide from cte
where state = 'success'
)
,ctetwo as
(
select date_value,state,id-ide as final from cteone
)
,ctethree as
(
select min(date_value) as min,max(date_value) as max,'sucess' as status from ctetwo
group by final
)
,ctefour as
(
select *,ROW_NUMBER() over(order by date_value) as valuation from cte
where state = 'fail'
)
select min(date_value) as min,max(date_value) as max,'fail' as status from ctefour
group by id-valuation
union
select * from ctethree
order by min
Neatly explained. Thank you.
Thanks a lot for this video
There is catch in the explained solution , what if dates are not consecutive in the groups
with cte as (
select
date_value,
(DAY(date_value) - row_number() over (partition by state order by date_value)) as bucket,
state
from tasks)
select MIN(date_value) as start_date ,
MAX(date_value) as end_date ,
state
from
cte
group by state, bucket
order by start_date
Make a video for non continous dates please
Okay.
Please make more videos on this Island and Gaps problem
Sure
#This query is for MYSQL
select min(date_value) as start_date,max(date_value) as end_date,state
from
(
select date_value,state,row_number() over(partition by state order by date_value) as rn
from tasks order by date_value
)a group by date_sub(date_value,interval rn day)
A good approach to learn...thanks
Glad you liked it!
with cte_next_state as(
select *,lead(state) over(order by date_value) next_state from tasks
),
next_null as(
select date_value,state,
if(state!=next_state or next_state is null,date_value,null) nxt
from cte_next_state
)
select * from next_null where nxt is not null
with cte as (
select rn,state,date_value,
coalesce( (rn- lag (rn) over(partition by state order by rn asc)),1) as diff
from
(select *,
row_number() over( order by date_value) as rn
from tasks))
select min (date_value) as start,
max(date_value) as end,
state
from cte
group by state,diff
order by rn asc
with cte as(
select * , rank() over(partition by state order by date_value) as rn1
, extract(day from date_value) - rank() over(partition by state order by date_value) as group_
from tasks
)
select state, min(date_value) as start_date , max(date_value) as end_date from cte
group by 1,group_
select min(date_value) as start_date , max(date_value) as end_date ,max(state) as state from(
select * ,row_number() over (order by date_value ) - row_number() over ( partition by state order by date_value) as r2 from tasks )
group by r2
Ankit sir thank you for sharing
My pleasure
with cte as (select *,row_number()over(partition by state order by date_value) as rn from tasks order by date_value)
,b as (select first_value(date_value)over(partition by extract(day from date_value)-rn ) as start_date ,
last_value(date_value)over(partition by extract(day from date_value)-rn ) as end_date,state from cte order by date_value)
select start_date,end_date,state from b group by start_date,end_date,state
Easier method (Tabitosan):
select min(cast(date_value as date)),max(cast(date_value as date)),max(state) from
(
select date_value,state,rn1-rn as diff from
(
select date_value,state,row_number() over(partition by state order by cast(date_value as date)) rn,
row_number() over(order by cast(date_value as date)) rn1
from d
)q1
)q1
group by diff
my solution- gaps and island problem
select min(date_value) as start_date, max(date_value)as end_date, max(state) as state
from
(select date_value, state,prev_state,state_ind , sum(state_ind) over(order by date_value) as state_islands
from
(
select date_value, state,prev_state, (case when prev_state = state then 0 else 1 end) as state_ind
from
(
select date_value, state,
lag(state,1) over(order by date_value) as prev_state
from tasks
)tmp
)tmp2
)
group by state_islands
(select * from tasks order by date_value limit 1)
union
select
t1.date_value,t1.state
from tasks t1
inner join tasks t2
where (t1.date_value-t2.date_value)=1 and t1.state!=t2.state
%sql
with cte as (
select datediff(lead(date_value) over (order by date_value),date_value) as diff,state,date_value
from tasks)
select min(date_value),max(date_value),state from cte
group by state,diff;
with cte as (
select date_value, state,
row_number()over(order by date_value) as rn,
row_number()over(partition by state order by date_value) as rn_s
from tasks),
cte1 as (
select date_value, state, (rn - rn_s) as difference
from cte)
select min(date_value), max(date_value), state from cte1
group by difference, state;
very useful and interesting
mind blowing trick
with cte as(
select date_value,state,
row_number() over(partition by state order by date_value) as rows_number,
dateadd(day,-1*row_number() over(partition by state order by date_value),date_value) as bs
from tasks)
select min(date_value) as start_dt,max(date_value) as end_dt,state
from cte
group by state,bs
order by start_dt
Thanks for the video, please share for non-contiguous dates as well.
Here's my solution without DATE_ADD:
WITH CTE as (SELECT t.*, row_number() over(partition by state order by date_value) as parts_rank,
row_number() over(order by date_value) as whole_rank
FROM tasks t
order by date_value)
SELECT MIN(date_value) as min_date, MAX(date_value) as max_date, state
FROM CTE
GROUP BY whole_rank - parts_rank;
Sure. Thanks for posting 🙏
Alternate Solution :
with grp_data as
(select *,
sum(lag1) over (order by date_value) as grp
from
(select *,
case when lag(state) over () = state then 0 else 1 end as lag1
from tasks) a)
select state, min(date_value) as start_date, max(date_value) as end_date
from grp_data group by state,grp order by start_date
Awesome solution 👍
Another Approach to solve this Problem .
with cte_1 as (select date_value,state,
nvl(lag(state) over(order by date_value),'0') prev from tasks),
cte_2 as (select date_value,state,
case when state prev then 1 else 0 end as flag
from cte_1),
cte_3 as (select date_value, state ,
sum(flag) over (order by date_value) groups from cte_2)
select state , groups, min(date_value) , max(date_value)
from cte_3 group by state , groups
My solution!!!!!
;with cte as(
select*,
NTILE(2)over(order by date_value)as num
from
tasks)
select
MIN(date_value)as start_date,MAX(date_value)as end_date,state
from
cte
group by state,num
Great explanation
MYSQL Solution
with base as (select*,
day(date_value) - row_number() over (partition by state order by date_value) as group_value
from tasks order by date_value )
select min(date_value) as start_range, max(date_value) as end_range ,max(state) as state
from base group by group_value;
how we get the reverse of the table
we use recursive approach :
with recursive cte as (
select min(start)as minn , max(end_date) as maxx from
(select min(date_value) as start , max(date_value) as end_date , state from
(select *,date_add(date_value , interval -1*cast(row_number() over(partition by state order by date_value) as signed) day) as rn from que9_tasks) m
group by rn) n
union all
select date_add(minn , interval 1 day) , maxx from cte
where minn
MYSQL Solution
with base as (select *,row_number() over(partition by state order by date_value) as rnk from tasks),
base_modified as (
select *,day(date_sub(date_value,interval rnk day)) as grp from base )
select min(date_value) as start_date,
max(date_value) as end_date,
max(state) as state from base_modified group by state,grp order by start_date,end_date
non continous date ranges cries in corner .
with abc as (
select date_value ,state, case when state = lag(state) over(order by date_value) then 0 else 1 end as rnk from tasks
),
bcd as (select a.* , sum(rnk)over(order by date_value) sm from abc a)
select min(date_value) start_date,max(date_value) end_date, max(state) as state from bcd group by state , sm
unable to perform this in mysql, getting syntax error because of -1
What is the error ?
@@ankitbansal6 syntax error
Here is my solution. Creating the 2 new columns with create series of number (say Rownumber) firstly by datewise, then datewise + statewise (partition by) and then substract the second from first which will give you the groupwise same number. Then the logic remains same.
;WITH CTE
AS
(
select *,
SUM(1) OVER (ORDER BY date_value) as RN_datewise,
SUM(1) OVER (PARTITION BY state ORDER BY date_value, state) as RN_dateWise_statewise
from tasks
),
CTE1
AS
(
select date_value, state, (RN_datewise - RN_dateWise_statewise) as state_group
FROM CTE
)
select MIN(date_value) as start_date, MAX(date_value) as end_date, state
FROM CTE1
GROUP BY state, state_group
ORDER BY start_date
with temp as
(
select *
, dateadd(day, -1 * ROW_NUMBER() over(partition by state order by date_value), date_value) group_date
from tasks
)
select min(date_value) start_date, max(date_value) end_date, state
from temp
group by group_date, state
order by start_date
My approach:
with cte AS
(
Select *,lag(state,1,'dedault') over(order BY date_value) AS prev_val FROM tasks
),
cte2 AS
(
Select *,SUM(case when state!=prev_val then 1 else 0 end) over(order by date_value) AS ky
FROM cte
)
Select min(date_value) as start_date,max(date_value) AS end_date,max(state) AS state
From cte2
Group BY ky
solution using LAG window function :
with temp1 as(
SELECT t.*, IF(LAG(state) OVER (ORDER BY date_value) = state, 0, 1) AS rk
FROM tasks t),
temp2 as(
select date_value,state,sum(rk) over (order by date_value) rk1
from temp1)
select min(date_value) as start_date,max(date_value) as end_date,state
from temp2
group by rk1,state;
select min(date_value),max(date_value),state from(
select *,row_number() over(order by date_value) as rn ,row_number() over(partition by state order by date_value ) as r ,
row_number() over(order by date_value)-row_number() over(partition by state order by date_value ) as diff
from task
order by rn)p
group by diff
with cte as (
select *,
row_number() over(partition by state order by date_value) as rnk,
rank() over(order by date_value) as rnk_1,
rank() over(order by date_value)-row_number() over(partition by state order by date_value) as final_rnk
from tasks)
Select min(date_value) as min_date, max(date_value) as max_date, state
from cte
where state='success' or state='fail'
group by final_rnk, state
order by min_date
My solution.
with cte as (
select *, row_number() over (partition by state order by date_value) as rn,
date_add(date_value, interval -1*cast(row_number() over (partition by state order by date_value) as signed)day)
as reference_date
from tasks order by date_value)
select state,reference_date,
min(date_value), max(date_value) from cte group by reference_date,state
Oh man ! Superb !
Thanks a lot!
Used the row number to create the group without date add
**********************
WITH A
AS (SELECT
*,
ROW_NUMBER() OVER (PARTITION BY state ORDER BY date_value) AS r,
ROW_NUMBER() OVER (ORDER BY date_value) AS rn
FROM tasks
ORDER BY date_value)
SELECT
MIN(date_value) AS start_date,
MAX(date_value) AS end_date,
state
FROM (SELECT
date_value,
state,
rn - r AS parts
FROM A) B
GROUP BY state,
parts
Great 😊
Same thought!
@Koushik Super answer Koushik & Thanks to @Ankit 🙏🏼🙏🏼
hi Koushik , can you please tell the logic behind rn-r .
@@AshutoshSharma-df7kg its to get the unique groups
It cannot be applied to this script:
create table tasks (
date_value date,
state varchar(10)
);
insert into tasks values ('2019-01-01','success'),('2019-01-02','success'),('2019-01-03','success'),('2019-01-04','fail')
,('2019-01-05','fail'),('2019-01-06','fail'), ('2019-01-07','success')
with cte1 as (
select * ,
ROW_NUMBER() over(partition by state order by date_value) as r,
DATEADD(day, -1*ROW_NUMBER() over(partition by state order by date_value), date_value) as group_date
from tasks )
select MIN(date_value) as start, MAX(date_value) as stop, state
from cte1
group by group_date, state
order by start;
with cte as(
select date_value, state, row_number() over( partition by state order by date_value ) as rn ,
DATEPART(DAY ,date_value) as d ,DATEPART(DAY ,date_value) -row_number() over( partition by state order by date_value ) as diff
from tasks)
select min(date_value) as start_d ,max(date_value) as end_d , cte.state
from cte group by diff, state order by min(date_value)
My Solutionin mysql
with cte as (
select date_value, state, sum(partition_flag) OVER(order by date_value) as rolling_sum_partition from (
select date_value, state,
case when state = lag(state) over() then 0 else 1 end as partition_flag
from tasks)temp )
select min(date_value) as start_date, max(date_value) as end_date, state
from cte group by rolling_sum_partition, state;
with cte as(
select date_value,state,abs(ROW_NUMBER() over(partition by state order by date_value asc)-ROW_NUMBER() over(order by date_value)) diff,
ROW_NUMBER() over(partition by state order by date_value asc) rn,ROW_NUMBER() over(order by date_value) rn2
from tasks)
select min(date_value),max(date_value),state
from cte
group by state,diff