Forum Stats

  • 3,826,361 Users
  • 2,260,636 Discussions
  • 7,896,916 Comments

Discussions

Current and previous years sales self JOIN

User_G3NSO
User_G3NSO Member Posts: 34 Green Ribbon

I have the following test CASE, which works fine and produces the expected results when I use the lag function.


I'm curious, Is there a way this query can be rewritten to use a self JOIN? Any help would be greatly appreciated.


CREATE TABLE sales (yr,total_sale) AS
SELECT 2015, 23000 FROM DUAL UNION ALL 
SELECT 2016, 25000 FROM DUAL UNION ALL 
SELECT 2017, 34000 FROM DUAL UNION ALL 
SELECT 2018, 32000 FROM DUAL UNION ALL 
SELECT 2019, 33000 FROM DUAL; 

/* works perfectly */

select yr, total_sale, 
lag(total_sale) over (order by yr) as previous_total_sale,
total_sale - lag(total_sale) over (order by yr) difference
from sales;


Tagged:

Best Answer

  • L. Fernigrini
    L. Fernigrini Data Engineer Sr Consultant Member Posts: 3,836 Silver Crown
    edited Jun 14, 2022 5:52PM Answer ✓

    Yes, you can use LEFT JOIN for this:

    select curr.yr, curr.total_sale, prev.total_sale as previous_total_sale, 
    curr.total_sale - prev.total_saleas difference
    from sales curr
    LEFT JOIN sales prev ON curr.yr = prev.yr + 1
    ORDER BY curr.yr;
    

    But LAG is shorter to write and usually better for performance.


    <Edit: Added ORDER BY to make it clearer>

Answers

  • L. Fernigrini
    L. Fernigrini Data Engineer Sr Consultant Member Posts: 3,836 Silver Crown
    edited Jun 14, 2022 5:52PM Answer ✓

    Yes, you can use LEFT JOIN for this:

    select curr.yr, curr.total_sale, prev.total_sale as previous_total_sale, 
    curr.total_sale - prev.total_saleas difference
    from sales curr
    LEFT JOIN sales prev ON curr.yr = prev.yr + 1
    ORDER BY curr.yr;
    

    But LAG is shorter to write and usually better for performance.


    <Edit: Added ORDER BY to make it clearer>

  • User_G3NSO
    User_G3NSO Member Posts: 34 Green Ribbon

    @L. Fernigrini thanks for your help and expertise. I plan on sticking with the LAG. The self JOIN was for academic purposes

    L. Fernigrini
  • mathguy
    mathguy Member Posts: 10,539 Blue Diamond

    Another option you may want to consider:

    select  yr, total_sale, prior total_sale as previous_total_sale,
            total_sale - prior total_sale as difference
    from    sales
    start   with yr = (select min(yr) from sales)
    connect by   yr = prior yr + 1;
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,052 Red Diamond

    Hi, @User_G3NSO

    The solutions above are good answers to a different problem; they happen to get the same results with the sample data you posted, but get different results with other data. The query you posted compares a year to the last available year, no matter if that was 1, 2, 3 or more years earlier. The queries above compare year N to year N-1. If your table has missing years (e.g. if you include this row:

    SELECT 2013, 20000 FROM DUAL UNION ALL 
    

    ) then the results will be different. You can adapt the solutions above: don't use yr to compare, but use ROW_NUMBER based on your instead. For example:

    WITH  got_r_num  AS
    (
    	SELECT  yr, total_sale
    	,	ROW_NUMBER () OVER (ORDER BY yr) AS r_num
    	FROM	sales
    )
    SELECT    curr.yr
    , 	  curr.total_sale
    , 	  prev.total_sale                   AS previous_total_sale
    ,	  curr.total_sale - prev.total_sale AS difference
    FROM 	  got_r_num curr
    LEFT JOIN got_r_num prev ON curr.r_num = prev.r_num + 1
    ORDER BY  curr.yr;