8 Replies Latest reply: Dec 7, 2012 12:34 PM by Frank Kulash RSS

    Grouping Counts - Can You Group WIDTH_BUCKET Counts?

    978455
      Hi,
      I'm using Oracle 11g.

      I am trying to get a count of customers who have a certain count of orders. I have a parent to child relationship between customer (parent) and orders (child). The output should be something like:
      0-100; 35 (35 customers have zero -100 orders)
      101-200; 55 (55 customers have this many orders)
      201-300; 12 (12 customers have this many orders)

      I tried using the WIDTH_BUCKET function but it doesn't 'group' the counts, and I don't know how it works over the join between the parent/child tables.

      Any help is GREATLY appreciated. My skill level in SQL is uh, well on the low end honestly.

      I posted my crude attempts below
      Thanks
      Alex

      ~~~~~~
      select rownum, width_bucket(order_count, 0, 2500, 11) from (
      select c.customer_id, count(o.order_id) as order_count
      from CUSTOMER c LEFT JOIN ORDERS o
      using (customer_id)
      where c.last_update_user = 'MDM_ETL' and
      o.last_update_user = 'MDM_ETL'
      group by c.customer_id
      order by order_count desc)
        • 1. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
          sb92075
          How do I ask a question on the forums?
          SQL and PL/SQL FAQ
          • 2. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
            Frank Kulash
            Hi,

            Welcome to the forum!
            user10183256 wrote:
            Hi,
            I'm using Oracle 11g.
            Thanks for including that. Sometimes, the difference between 11.1 and 11.2 can be significant. Why not post your complete version number, such as 11.2.0.3.0?
            I am trying to get a count of customers who have a certain count of orders. I have a parent to child relationship between customer (parent) and orders (child). The output should be something like:
            0-100; 35 (35 customers have zero -100 orders)
            101-200; 55 (55 customers have this many orders)
            201-300; 12 (12 customers have this many orders)
            Whenever you have a questtion, please post CREATE TABLE and INSERT statements for your sample data, so that the people who want to help you can re-create the problem and test their ideas. See the forum FAQ {message:id=9360002}
            I tried using the WIDTH_BUCKET function but it doesn't 'group' the counts, and I don't know how it works over the join between the parent/child tables.

            Any help is GREATLY appreciated. My skill level in SQL is uh, well on the low end honestly.

            I posted my crude attempts below
            Thanks; posting your efforts is often very helpful.
            Thanks
            Alex

            ~~~~~~
            select rownum, width_bucket(order_count, 0, 2500, 11) from (
            select c.customer_id, count(o.order_id) as order_count
            from CUSTOMER c LEFT JOIN ORDERS o
            using (customer_id)
            where c.last_update_user = 'MDM_ETL' and
            o.last_update_user = 'MDM_ETL'
            group by c.customer_id
            order by order_count desc)
            Here's one way:
            WITH     all_buckets     AS
            (
                 SELECT     0 AS low_val, 100 AS high_val  FROM dual  UNION ALL
                 SELECT     101,            200                FROM dual  UNION ALL
                 SELECT     201,           300            FROM dual
            )
            ,     got_order_cnt     AS
            (
                 SELECT       COUNT (o.order_id)     AS order_cnt
                 FROM       customer  c
                 LEFT JOIN orders    o  ON  o.customer_id  = c.customer_id
                 GROUP BY  c.customer_id
            )
            SELECT       ab.low_val
            ,       ab.high_val
            ,       COUNT (oc.order_cnt)     AS bucket_cnt
            FROM       all_buckets          ab
            LEFT JOIN got_order_count  oc  ON  oc.order_cnt  BETWEEN  ab.low_val
                                                            AND       ab.high_val
            GROUP BY  ab.low_val, ab.high_val
            ORDER BY  ab.low_val
            ;
            You might consider creating a real table like all_buckets.
            On the other hand, if there's a very regular pattern (as in this case, every bucket has a width of 100) then you could derive the bucket number on the fly (in this case, CEIL (order_cnt / 100), or using WIDTH_BUCKET), without a table or sub-query like all_buckets.

            A solution involving WIDTH_BUCKET might be
            WITH     got_bucket_num     AS
            (
                 SELECT       WIDTH_BUCKET ( COUNT (o.order_id)
                                    , 0
                                  , 300
                                  , 3
                                  )     AS bucket_num
                 FROM       customer  c
                 LEFT JOIN orders    o  ON  o.customer_id  = c.customer_id
                 GROUP BY  c.customer_id
            )
            SELECT       bucket_num
            ,       COUNT (*)     AS bucket_cnt
            FROM       got_bucket_num
            GROUP BY  bucket_num
            ORDER BY  bucket_num
            ;
            Of course, I can't test anything, because I don't have sample versions of your tables.
            • 3. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
              Syed Ullah
              If you want to try it by using only FLOOR and CEIL function, this could be a possible solution (may not be the most efficient though):
              select case when FLOOR(order_count/100) = 0 then '0 - 100'
                          else (FLOOR((order_count - 1)/100) * 100 + 1) || ' - '   || (CEIL(order_count/100) * 100)
                     end as range,
                     count(*) as total   
              from (
                      select c.customer_id, count(o.order_id) as order_count
                      from CUSTOMER c LEFT JOIN ORDERS o
                      using (customer_id)
                      where c.last_update_user = 'MDM_ETL' and
                      o.last_update_user = 'MDM_ETL' 
                      group by c.customer_id
                  )
              group by case when FLOOR(order_count/100) = 0 then '0 - 100'
                            else (FLOOR((order_count - 1)/100) * 100 + 1) || ' - ' || (CEIL(order_count/100) * 100)
                       end
              order by range;
              • 4. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
                Frank Kulash
                Hi,

                If you don't want to post sample data, then post a question that invloves commonly available tables, such as those in the hr or scott schemas. You still need to post your desired results, of course, and explain how you get them from the sample data.

                For example, the 4 departments in scott.dept have 0, 3, 5 and 6 employees. Say we want to group these in buckets of 4, then we can modify the last query I posted before to use the scott tables and the 0-12 range like this:
                WITH     got_bucket_num     AS
                (
                     SELECT       WIDTH_BUCKET ( COUNT (e.deptno)
                                        , 0
                                      , 12
                                      , 3
                                      )     AS bucket_num
                     FROM       scott.dept  d
                     LEFT JOIN scott.emp   e  ON  e.deptno  = d.deptno
                     GROUP BY  d.deptno
                )
                SELECT       bucket_num
                ,       COUNT (*)     AS bucket_cnt
                FROM       got_bucket_num
                GROUP BY  bucket_num
                ORDER BY  bucket_num
                ;
                Output:
                BUCKET_NUM BUCKET_CNT
                ---------- ----------
                         1          2
                         2          2
                Which is correct, but not as informative as it could be. Also, it doesnt show the highest bucket, which has a 0 count. This can be fixed, but I think the best solution in this case is to forget about WIDTH_BUCKET, and use a table or sub-query like all_buckets in my first query.
                • 5. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
                  978455
                  I apologize for not following the protocol. It does make answering questions easier. I am thankful for the answers. I will try some of them out, I can see it's not an easy creature to tackle.
                  • 6. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
                    978455
                    HI Everyone,
                    Thanks so much for your help - both in using the forum, and with query suggestions. I came up with an answer that
                    - let's YOU (OK, your business users) define the number and size of buckets in a frequency distribution

                    Here's my solution.
                    Thanks Again for Your Help!!
                    Alex


                    with starter as(
                    select cust.customer_id, count(o.oder_id) as row_count
                    from CUSTOMER cust LEFT JOIN ORDER o
                    using (customer_id)
                    where cust.last_update_user = 'MDM_ETL' and
                    o.last_update_user = 'MDM_ETL' -- and
                    group by cust.customer_id
                    order by row_count desc)
                    select 5 , '2000 or more' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count >= 2000
                    union
                    select 4, '1500 - 1999' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 1500 and 1999
                    union
                    select 3, '1000 - 1499' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 1000 and 1499
                    union
                    select 2, '500 - 999' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 500 and 999
                    union
                    select 1, '0 - 499' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 0 and 499
                    • 7. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
                      978455
                      Hi Everyone,
                      Thanks so much for your help in answering the question, and using the forum properly. I worked last night and the Grace of God - found a pretty simple solution - that give frequency distribution a boost.
                      This lets YOU (OK, your business partner) define the number of buckets and the bucket range.

                      Thanks Again!!
                      Alex



                      with starter as(
                      select cust.customer_id, count(o.order_id) as row_count
                      from CUSTOMER cust LEFT JOIN ORDER o
                      using (customer_id)
                      where cust.last_update_user = 'MDM_ETL' and
                      o.last_update_user = 'MDM_ETL' -- and
                      group by cust.customer_id
                      order by row_count desc)
                      select 5 , '2000 or more' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count >= 2000
                      union
                      select 4, '1500 - 1999' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 1500 and 1999
                      union
                      select 3, '1000 - 1499' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 1000 and 1499
                      union
                      select 2, '500 - 999' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 500 and 999
                      union
                      select 1, '0 - 499' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 0 and 499
                      • 8. Re: Grouping Counts - Can You Group WIDTH_BUCKET Counts?
                        Frank Kulash
                        Hi, Alex,

                        I'm glad you got it working.
                        user10183256 wrote:
                        HI Everyone,
                        Thanks so much for your help - both in using the forum, and with query suggestions. I came up with an answer that
                        - let's YOU (OK, your business users) define the number and size of buckets in a frequency distribution

                        Here's my solution.
                        Thanks Again for Your Help!!
                        Alex


                        with starter as(
                        select cust.customer_id, count(o.oder_id) as row_count
                        from CUSTOMER cust LEFT JOIN ORDER o
                        ORDER is an Oracle reserved word (as in "ORDER BY ..."); that means it's not a good table name. You should call your table something else, such as ORDERS with an S at the end.
                        using (customer_id)
                        where cust.last_update_user = 'MDM_ETL' and
                        o.last_update_user = 'MDM_ETL' -- and
                        group by cust.customer_id
                        order by row_count desc)
                        There's no point in having an ORDER BY clause in this sub-query. An ORDER BY clause in a sub-query does not guarantee the order of output rows in the final result set.
                        select 5 , '2000 or more' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count >= 2000
                        union
                        select 4, '1500 - 1999' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 1500 and 1999
                        union
                        select 3, '1000 - 1499' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 1000 and 1499
                        union
                        select 2, '500 - 999' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 500 and 999
                        union
                        select 1, '0 - 499' as cnt_of_orders, count(starter.customer_id) as nmb_customers_with_this_cnt from starter where starter.row_count between 0 and 499
                        All these UNIONs will be inefficient. Why not define the buckets in a table, or in a sub-query, like I did earlier, and join that to starter?
                        If you actually create a table that defines how many buckets there are, and what their ranges are, then an authorized user (who doesn't need to know anything about SQL programming) can maintain it.
                        If you make the bucket table a global temporary table, then users can change the bucket sizes at run-time. Several users could run the exact same query, each with their own customized bucket table.