Forum Stats

  • 3,770,480 Users
  • 2,253,121 Discussions
  • 7,875,481 Comments

Discussions

Count across multiple columns

nagornyi
nagornyi Member Posts: 1,056
edited Mar 9, 2020 9:38AM in SQL & PL/SQL

Oracle DB 11.2.

The table:

ID

PA

PM

1

TWO

2

TWO

ONE

3

THREE

TWO

4

ONE

TWO

5

TWO

Need to count the values in both PA and PM columns, like

VAL

PA

PM

ONE

1

1

TWO

2

3

THREE

1

But can do it only for one column, like

WITH T(ID, PA, PM) AS

(

SELECT 1, 'TWO', '' FROM DUAL UNION ALL

SELECT 2, 'TWO', 'ONE' FROM DUAL UNION ALL

SELECT 3, 'THREE', 'TWO' FROM DUAL UNION ALL

SELECT 4, 'ONE', 'TWO' FROM DUAL UNION ALL

SELECT 5, '', 'TWO' FROM DUAL

)

SELECT PA VAL, COUNT(PA) PA FROM T WHERE PA IS NOT NULL
GROUP BY PA

How to add the second column, please?

nagornyi

Best Answer

  • mNem
    mNem Member Posts: 1,380 Gold Trophy
    edited Mar 6, 2020 12:41PM Accepted Answer

    WITH T(ID, PA, PM) AS

    (

    SELECT 1, 'TWO', '' FROM DUAL UNION ALL

    SELECT 2, 'TWO', 'ONE' FROM DUAL UNION ALL

    SELECT 3, 'THREE', 'TWO' FROM DUAL UNION ALL

    SELECT 4, 'ONE', 'TWO' FROM DUAL UNION ALL

    SELECT 5, '', 'TWO' FROM DUAL

    )

    select * from

    (

      select val, col from t

      unpivot

      (

        val for col in (pa, pm)

      )

    )

    pivot

    (

      count(*) for (col) in ('PA' pa, 'PM' pm)

    )

    ;

    VAL           PA         PM

    ----- ---------- ----------

    ONE            1          1

    TWO            2          3

    THREE          1          0

    nagornyinagornyi

Answers

  • mNem
    mNem Member Posts: 1,380 Gold Trophy
    edited Mar 6, 2020 12:41PM Accepted Answer

    WITH T(ID, PA, PM) AS

    (

    SELECT 1, 'TWO', '' FROM DUAL UNION ALL

    SELECT 2, 'TWO', 'ONE' FROM DUAL UNION ALL

    SELECT 3, 'THREE', 'TWO' FROM DUAL UNION ALL

    SELECT 4, 'ONE', 'TWO' FROM DUAL UNION ALL

    SELECT 5, '', 'TWO' FROM DUAL

    )

    select * from

    (

      select val, col from t

      unpivot

      (

        val for col in (pa, pm)

      )

    )

    pivot

    (

      count(*) for (col) in ('PA' pa, 'PM' pm)

    )

    ;

    VAL           PA         PM

    ----- ---------- ----------

    ONE            1          1

    TWO            2          3

    THREE          1          0

    nagornyinagornyi
  • nagornyi
    nagornyi Member Posts: 1,056
    edited Mar 6, 2020 12:49PM

    Thank you, that works.

    I had a feeling there should be simpler solution, but... couldn't find it

  • mNem
    mNem Member Posts: 1,380 Gold Trophy
    edited Mar 6, 2020 1:13PM

    I think someone (cannot remember who it was) once showed a more efficient method in terms of performance for large number of rows, but cannot recall how it was done.

  • mathguy
    mathguy Member Posts: 10,167 Blue Diamond
    edited Mar 9, 2020 9:38AM

    If you have a lot of data and performance is an issue, you may want to try something like below.

    The strategy is to count the pairs of values, grouped by (PA, PM), but use the CUBE extension to GROUP BY. In addition to the individual counts, this also gives you subtotals rolled up any way you want. Here you want subtotals for each value in PA and separately subtotals for each value in PM. Using GROUP BY CUBE you can get both kinds of subtotals in a single aggregation over the base rows.

    Then, with that in hand, you can join the result to itself to get the final output. This strategy uses that extra self-join at the end, which you don't need in mNem's solution. That is the cost you pay for the following benefit: In the UNPIVOT plus PIVOT approach, the PIVOT operation is also an aggregation (like COUNT), but it is over twice as many rows as in the base data. This may not be an issue if your base table has 10,000 rows, but it may be if it has three million rows. In the latter case, grouping by CUBE may be faster, even with the overhead of CUBE and the extra join at the end. (Note that the self-join at the end is for a - presumably - much smaller number of rows, equal to the number of distinct values from the two columns.)

    @mNem - I don't know if this is what you were referring to in Reply 3. If it is, I didn't show it to you; I only thought about it now, and only because you suggested there may be a faster way. My immediate instinct was to do it exactly as you did it. (With one minor, mostly cosmetic difference: you are allowed to have a PIVOT and an UNPIVOT operation in the same query; here the ID is getting in the way, so we would still need a subquery to select only PA and PM from T, but I would do that upfront, and then use PIVOT and UNPIVOT at the same level.)

    So, here is the GROUP BY CUBE solution:

    with  t (id, pa, pm) as (    select 1, 'TWO'  , ''    from dual union all    select 2, 'TWO'  , 'ONE' from dual union all    select 3, 'THREE', 'TWO' from dual union all    select 4, 'ONE'  , 'TWO' from dual union all    select 5, ''     , 'TWO' from dual), c (pa, pm, ct) as (    select pa, pm, count(*)    from   t    group  by cube (pa, pm)    having grouping_id(pa) + grouping_id(pm) = 1   --  we only need the (0, 1) and (1, 0) combinations of GROUPING_ID
      )select nvl(c1.pa, c2.pm) as value, c1.ct as pa, c2.ct as pmfrom   (select * from c where pa is not null) c1 full outer join       (select * from c where pm is not null) c2                                              on c1.pa = c2.pm;VALUE         PA         PM----- ---------- ----------ONE            1          1TWO            2          3THREE          1        
    nagornyinagornyi
  • nagornyi
    nagornyi Member Posts: 1,056
    edited Mar 9, 2020 9:20AM

    Also very good solution. Thanks!