6 Replies Latest reply: Feb 15, 2013 4:19 PM by 708388 RSS

    Summary by year sum or analytic

    708388
      I would like to find out how to summarize some data and do calculations without hardcoding it. Trying to sum counts by year and then show counts for 4 years and 6 years without hard code if possible. The 4 yr would be calculated by adding 390 to the cohortterm (200540 + 390 = 200930) The 6 yr would be calculated by adding 590 to the cohortterm (200540 + 590 = 201130).

      The last 2 cols, which I haven't done yet would be a percentage based on Gradyr4 divided by HeadCount and another for Gradyr6 divided by headcount. If the pcts are too tricky, I can copy the results to excel to do that part. Eventually, I will need to further break down the counts by Gender and Ethnicity, but hopefully, I can figure it out once I know the syntax for doing the overall numbers.

      BANNER
      Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

      Sample output based on the query I have built so far using sum and hard coding for one year only. The actual results would show counts rather than 0's in the year rows displayed below:
      COHORTTERM     HEADCOUNT     GRADYR4     GRADYR6     PCT4Yr  PCT6Yr
      200340          0          0     0
      200440          0          0     0
      200540          600          403     539     (403/600)
      200640          0          0     0
      200740          0          0     0
      SQL
      select cohortterm, 
               sum(case when cohortterm = '200540' then 1 else 0 end) as headcount,
               sum(
                      CASE WHEN cohortterm = '200540' and gradtermban < ''200930' then 1 else 0 end
                      ) as gradyr4,
                      
               sum(
                      CASE WHEN cohortterm = '200540' and gradtermban < '201130' then 1 else 0 end
                      ) as gradyr6
               null as pct4yr, null as pct6yr       
                     
      from cohorts
      where cohortterm between '200340' and '201240'
      and substr(ftpt,1,1) = 'F'
      group by cohortterm
      order by 1
        • 1. Re: Summary by year sum or analytic
          rp0428
          Since you are already grouping on COHORTTERM why are you hardcoding the values (e.g. sum(case when cohortterm = '200540' then 1 else 0 end) as headcount0 in the case statements?

          If you really have different conditions for those values
          CASE WHEN cohortterm = '200540' and gradtermban < '200930' then 1 else 0 end
                          ) as gradyr4,
          then just add more conditions for the different COHORTTERM values. How many distinct values of COHORTTERM are there?

          Also explain your calculations like this one: 'The 4 yr would be calculated by adding 390 to the cohortterm (200540 + 390 = 200930) '

          What do those different values indicate?
          • 2. Re: Summary by year sum or analytic
            708388
            Thanks for responding. I would like to get away from hard coding. I posted that query to show what I have done so far when I realized that there probably ought to be a more flexible method to summarize the data. The terms are made up of a year and a two digit code indicating spring summer and fall. We are looking to identify the start term and count how many graduated in 4 years and 6 years. Since I was given a sample query that showed 200540 as the start and < 200930 as the end (meaning it would include 200920 and not 200930 or 200940), I realized I could not simply add 4 to the 2005. I did a subtraction of 200930 from 200540 and that is where the 390 came from. For now, I've gone with your suggestion to continue adding more conditions and looks like that will provide the results.
            with c as (
            select cohortterm, 
                     sum(recordcount) as headcount,
                     sum(
                            CASE WHEN cohortterm = '200540' and gradtermban < '200930' then 1 
                                     WHEN cohortterm = '200640' and gradtermban < ''201030' then 1
                                     WHEN cohortterm = '200740' and gradtermban < '201130' then 1
            --...etc
            --...
                                     else 0 
                            end
                            ) as gradyr4,
                     sum(
                            CASE WHEN cohortterm = '200340' and gradtermban < '200930' then 1 
                                     WHEN cohortterm = '200440' and gradtermban < '201030' then 1
                                     WHEN cohortterm = '200540' and gradtermban < '201130' then 1
            --...etc
            --...
                                     else 0 
                            end
                            ) as gradyr6
            from cohorts
            where cohortterm between '200340' and '201240'
            and substr(ftpt,1,1) = 'F'
            group by cohortterm
            )
            select cohortterm, headcount, gradyr4, (gradyr4)/headcount as pct4yr 
            from c
            order by 1
            • 3. Re: Summary by year sum or analytic
              Stew Ashton
              I think you just need to "hard code" the meaning of the two digits at the end.

              What are the possible values and their meaning?

              Sample data (with CREATE TABLE and INSERT statements) would be most helpful.
              • 4. Re: Summary by year sum or analytic
                rp0428
                >
                The terms are made up of a year and a two digit code indicating spring summer and fall. We are looking to identify the start term and count how many graduated in 4 years and 6 years. Since I was given a sample query that showed 200540 as the start and < 200930 as the end (meaning it would include 200920 and not 200930 or 200940), I realized I could not simply add 4 to the 2005. I did a subtraction of 200930 from 200540 and that is where the 390 came from
                >
                You need to finishing nailing down the requirements before you start writing code. For that you need the answers to (post them if you have them):

                1. what are the two digit codes for: spring, summer and fall

                2. if the start term is for spring how do you calculate the end term that would represent 1 year? 2 years? 3 years? 'n' years?

                3. do you need to take into account a term (period/semester/whatever you call it) that a student skipped and did not attend? A student might attend 2005 in the fall and 2009 in the spring but that does that mean they graduated in 4 years when they didn't even attend any other periods? They really only took one 'logical' year; it just took 4 years to do it.

                The answers to the above will tell you how calculate the end period from the start period and then you don't need to hard code those values.
                • 5. Re: Summary by year sum or analytic
                  Stew Ashton
                  Based on your sample code, here's a guess at what you want. You just need to extract the year part, add 4 or 6, then put it back with the right value for the period. By the way, putting two bits of information in the same column is against "normal form" database theory.
                  DROP TABLE T;
                  CREATE TABLE T(COHORTTERM VARCHAR2(6), GRADTERMBAN VARCHAR2(6));
                  INSERT ALL
                  into t values('200304', '200620')
                  INTO T VALUES('200304', '200720')
                  INTO T VALUES('200304', '200820')
                  INTO T VALUES('200304', '200920')
                  INTO T VALUES('200404', '200620')
                  INTO T VALUES('200404', '200720')
                  INTO T VALUES('200404', '200820')
                  INTO T VALUES('200404', '200920')
                  SELECT NULL FROM DUAL;
                  
                  SELECT
                  COUNT(*) CNT,
                  SUM(CASE WHEN GRADTERMBAN < TO_NUMBER(SUBSTR(COHORTTERM,1,4))+4||'03' THEN 1 END) YEAR4,
                  sum(case when gradtermban < TO_NUMBER(SUBSTR(COHORTTERM,1,4))+6||'03' then 1 end) year6
                  FROM T
                  group by cohortterm;
                  
                         CNT      YEAR4      YEAR6
                  ---------- ---------- ----------
                           4          2          4 
                           4          1          3
                  • 6. Re: Summary by year sum or analytic
                    708388
                    Thanks Stew,

                    Your TO_NUMBER(SUBSTR(COHORTTERM,1,4))+4||'03' formula worked after changing '03' to '30' as the last two digits are '20','30','40'.