Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

AVG of un normalized data

a_yavuzJan 2 2011 — edited Jan 3 2011
Hi,

I have a table as TFORM_RESULTS. And a column VANSWERS in that table. I keep the answers in that column as like "12:3:5:18". Simply my table data is like this:
TFORM_RESULTS

ID      VANSWERS     NFORM_ID 
1       12:3:5:6:7       2
2       13:2:3:2:4       2
3       10:8:1:3:4       2
4       19:4:6:3:7       2
The reason that i keep data un-normalized is maybe VANSWERS may change. In this example we have 5 answers but for NFORM_ID = 6 we may have 8 answers. But for the same FORM_ID we have equal answer count.

I would like to get the average of this answers as like this:
AVG 
13,5:4.25:3,75:3,5:5,5
Is there a simple way to get average as like this? I use APEX_UTIL.STRING_TO_TABLE to parse the column. There maybe simple solutions.

Thanks.

yvz
This post has been answered by Solomon Yakobson on Jan 2 2011
Jump to Answer

Comments

Etbin
NOT TESTED!
select to_char(avg(to_number(substr(vanswers,1,instr(vanswers,':',1,1) - 1))))||':'||
       to_char(avg(to_number(substr(vanswers,instr(vanswers,':',1,1) + 1,instr(vanswers,':',1,2) - instr(vanswers,':',1,1) - 1))))||':'||
       to_char(avg(to_number(substr(vanswers,instr(vanswers,':',1,2) + 1,instr(vanswers,':',1,3) - instr(vanswers,':',1,2) - 1))))||':'||
       to_char(avg(to_number(substr(vanswers,instr(vanswers,':',1,3) + 1,instr(vanswers,':',1,4) - instr(vanswers,':',1,3) - 1))))||':'||
       to_char(avg(to_number(substr(vanswers,instr(vanswers,':',1,4) + 1)))) "AVG"
  from tform_results
Regards

Etbin
a_yavuz
Thanks for the reply

But i cant use static code. As i said answer count may change in the cell. In my example there are 5 answers but this can be 3 for NFORM_ID = 10. So its not static.
Etbin
But i cant use static code.
Then you need a function which would read a row from your table, count the number of colons in vanswers, generate the appropriate query, execute it and return the string of averages.
Will try to post a sketch just in case it may suit you.

Regards

Etbin
function averages return varchar2 is

  nanswers number;
  the_sql varchar2(32767) := q'{to_char(avg(to_number(substr(vanswers,1,instr(vanswers,':',1,1) - 1))))||':'||}';
  ret_val varchar2(4000);

begin

  select length(vanswers) - length(replace(vanswers,';',''))
    into nanswers
    from tform_results
   where rownum = 1;

  for i in 1 .. nanswers - 1
  loop
    the_sql := the_sql ||
               q'{to_char(avg(to_number(substr(vanswers,
                                               instr(vanswers,':',1,}' || to_char(i) || q'{) + 1,
                                               instr(vanswers,':',1,}' || to_char(i + 1) || q'{) - instr(vanswers,':',1,}' || to_char(i) || q'{) - 1
                                              ))))||':'||
                 }';
  end loop;

  the_sql := the_sql ||
             q'{to_char(avg(to_number(substr(vanswers,instr(vanswers,':',1,}' || to_char(nanswers) || q'{) + 1)))) "AVG"
               }';

/*  dbms_output.put_line(the_sql);  -- while testing check if the_sql is well formed before executing it */

  execute immediate the_sql into ret_val;

  return ret_val;

end;
Edited by: Etbin on 2.1.2011 15:06
NOT TESTED!
Solomon Yakobson
Answer
with tform_results as (
                       select 1 id,'12:3:5:6:7' vanswers,2 nform_id from dual union all
                       select 2,'13:2:3:2:4',2 from dual union all
                       select 3,'10:8:1:3:4',2 from dual union all
                       select 4,'19:4:6:3:7',2 from dual
                      )
-- end of data sample
select  ltrim(
              sys_connect_by_path(
                                  avg_val,
                                  ':'
                                 ),
              ':'
             ) vanswers,
        nform_id
  from  (
         select  avg(regexp_substr(vanswers,'[^:]+',1,column_value)) avg_val,
                 nform_id,
                 column_value answer_id
           from  tform_results,
                 table(
                       cast(
                            multiset(
                                     select  level
                                       from  dual
                                       connect by level <= length(regexp_replace(vanswers,'[^:]'))
                                    )
                            as sys.OdciNumberList
                           )
                      )
           group by nform_id,
                    column_value
        )
  where connect_by_isleaf = 1
  start with answer_id = 1
  connect by nform_id = prior nform_id
         and answer_id = prior answer_id + 1
/

VANSWERS                         NFORM_ID
------------------------------ ----------
13.5:4.25:3.75:3.5                      2

Elapsed: 00:00:00.06
SQL> 
SY.
Marked as Answer by a_yavuz · Sep 27 2020
Etbin
A nice one (a proof that avoiding list use does have it's drawbacks -> thinking of dynamic sql too soon)

Regards

Etbin
Aketi Jyuuzou
I like recursive with clause and ListAgg and using nested agg function B-)
with tform_results(id,vanswers,nform_id) as (
select 1,'12:3:5:6:7',2 from dual union all
select 2,'13:2:3:2:4',2 from dual union all
select 3,'10:8:1:3:4',2 from dual union all
select 4,'19:4:6:3:7',2 from dual),
rec(vanswers,extStr,LV) as(
select vanswers,RegExp_Substr(vanswers,'[^:]+',1,1),1
  from tform_results
union all
select vanswers,RegExp_Substr(vanswers,'[^:]+',1,LV+1),LV+1
  from rec
 where LV+1 <= RegExp_Count(vanswers,'[^:]+'))
select ListAgg(avg(to_number(extStr)),':')
       within group(order by LV) as "avg"
  from rec
group by LV;

avg
----------------------
13.5:4.25:3.75:3.5:5.5
Solomon Yakobson
Aketi Jyuuzou wrote:
I like recursive with clause and ListAgg and using nested agg function B-)
Well, you missed it is has to be grouped by nform_id, so "full" version would be something like:
with tform_results(id,vanswers,nform_id) as (
                                             select 1,'12:3:5:6:7',1 from dual union all
                                             select 2,'13:2:3:2:4',1 from dual union all
                                             select 1,'12:3:5:6:7',2 from dual union all
                                             select 2,'13:2:3:2:4',2 from dual union all
                                             select 3,'10:8:1:3:4',2 from dual union all
                                             select 4,'19:4:6:3:7',2 from dual
                                            ),
        rec(nform_id,vanswers,answer,lv) as (
                                              select  nform_id,
                                                      vanswers,
                                                      regexp_substr(vanswers,'[^:]+',1,1),
                                                      1
                                                from  tform_results
                                             union all
                                              select  nform_id,
                                                      vanswers,
                                                      regexp_substr(vanswers,'[^:]+',1,lv + 1),
                                                      LV+1
                                                from  rec
                                              where lv + 1 <= regexp_count(vanswers,'[^:]+')
                                            )
select  nform_id,
        ListAgg(lv_avg,':') within group(order by lv) as vanswers
  from (
        select  nform_id,
                lv,
                avg(to_number(answer)) lv_avg
          from  rec
          group by nform_id,
                   lv
       )
group by nform_id;

  NFORM_ID VANSWERS
---------- ------------------------------
         1 12.5:2.5:4:4:5.5
         2 13.5:4.25:3.75:3.5:5.5

SQL> 
SY.
Aketi Jyuuzou
Oh thanks "Solomon Yakobson"
I have fixed my previous solution.

This year,I will study English hard :-)
col "avg" for a30

with tform_results(id,vanswers,nform_id) as (
select 1,'12:3:5:6:7',1 from dual union all
select 2,'13:2:3:2:4',1 from dual union all
select 1,'12:3:5:6:7',2 from dual union all
select 2,'13:2:3:2:4',2 from dual union all
select 3,'10:8:1:3:4',2 from dual union all
select 4,'19:4:6:3:7',2 from dual),
rec(nform_id,vanswers,extStr,LV) as(
select nform_id,vanswers,
RegExp_Substr(vanswers,'[^:]+',1,1),1
  from tform_results
union all
select nform_id,vanswers,
RegExp_Substr(vanswers,'[^:]+',1,LV+1),LV+1
  from rec
 where LV+1 <= RegExp_Count(vanswers,'[^:]+'))
select nform_id,
ListAgg(to_char(avgVal),':') within group(order by LV) as "avg"
from (select nform_id,LV,
      avg(to_number(extStr)) as avgVal
      from rec
      group by nform_id,LV)
group by nform_id
order by nform_id;

NFORM_ID  avg
--------  ----------------------
       1  12.5:2.5:4:4:5.5
       2  13.5:4.25:3.75:3.5:5.5
MichaelS
Nice, but there's still some problem when lengths are not the same for different nform_id's:
SQL> with tform_results as (
  select 1 id,'12:3:5:6:7' vanswers,2 nform_id from dual union all
  select 2,'13:2:3:2:4',2 from dual union all
  select 3,'10:8:1:3:4',2 from dual union all
  select 4,'19:4:6:3:7',2 from dual union all
  select 1,'12:3:5:6:7:1',1 from dual union all
  select 2,'13:2:3:2:4:1',1 from dual 
)
-- end of data sample
select  ltrim(
              sys_connect_by_path(
                                  avg_val,
                                  ':'
                                 ),
              ':'
             ) vanswers,
        nform_id
  from  (
         select  avg(regexp_substr(vanswers,'[^:]+',1,column_value)) avg_val,
                 nform_id,
                 column_value answer_id
           from  tform_results,
                 table(
                       cast(
                            multiset(
                                     select  level
                                       from  dual
                                       connect by level <= length(regexp_replace(vanswers,'[^:]'))
                                    )
                            as sys.OdciNumberList
                           )
                      )
           group by nform_id,
                    column_value
        )
  where connect_by_isleaf = 1
  start with answer_id = 1
  connect by nform_id = prior nform_id
         and answer_id = prior answer_id + 1
/
VANSWERS                              NFORM_ID
------------------------------ ---------------
12,5:2,5:4:4:5,5                             1
13,5:4,25:3,75:3,5                           2

2 rows selected.
The final 1 is missing for nform_id = 1.
MichaelS
In 11gR2:
SQL> with tform_results as (
 select 1 id,'12:3:5:6:7' vanswers,2 nform_id from dual union all
 select 2,'13:2:3:2:4',2 from dual union all
 select 3,'10:8:1:3:4',2 from dual union all
 select 4,'19:4:6:3:7',2 from dual union all
 select 5,'16:4:6:3:7:6:1',1 from dual union all
 select 6,'1:2:4:3:8:8:1',1 from dual 
)
--
--
  select nform_id, listagg (av, ':') within group (order by rn) vanswers_av
    from (  select nform_id, avg (to_number (column_value)) av, rn
              from (select t.*, row_number () over (partition by nform_id, id order by rownum) rn, x.column_value
                      from tform_results t, xmltable (replace (vanswers, ':', ',')) x)
          group by nform_id, rn)
group by nform_id
/
       NFORM_ID VANSWERS_AV                   
--------------- ------------------------------
              1 8,5:3:5:3:7,5:7:1             
              2 13,5:4,25:3,75:3,5:5,5        

2 rows selected.
Solomon Yakobson
MichaelS wrote:
Nice, but there's still some problem when lengths are not the same for different nform_id's:
Oops, my bad. It should be:
connect by level <= length(regexp_replace(vanswers,'[^:]')) + 1
with tform_results as (
                       select 1 id,'12:3:5:6:7' vanswers,2 nform_id from dual union all
                       select 2,'13:2:3:2:4',2 from dual union all
                       select 3,'10:8:1:3:4',2 from dual union all
                       select 4,'19:4:6:3:7',2 from dual union all
                       select 1,'12:3:5:6:7:1',1 from dual union all
                       select 2,'13:2:3:2:4:1',1 from dual 
                      )
-- end of data sample
select  ltrim(
              sys_connect_by_path(
                                  avg_val,
                                  ':'
                                 ),
              ':'
             ) vanswers,
        nform_id
  from  (
         select  avg(regexp_substr(vanswers,'[^:]+',1,column_value)) avg_val,
                 nform_id,
                 column_value answer_id
           from  tform_results,
                 table(
                       cast(
                            multiset(
                                     select  level
                                       from  dual
                                       connect by level <= length(regexp_replace(vanswers,'[^:]')) + 1
                                    )
                            as sys.OdciNumberList
                           )
                      )
           group by nform_id,
                    column_value
        )
  where connect_by_isleaf = 1
  start with answer_id = 1
  connect by nform_id = prior nform_id
         and answer_id = prior answer_id + 1
  order by nform_id
/

VANSWERS                         NFORM_ID
------------------------------ ----------
12.5:2.5:4:4:5.5:1                      1
13.5:4.25:3.75:3.5:5.5                  2

SQL> 
SY.
Sven W.
a_yavuz wrote:
Hi,

I have a table as TFORM_RESULTS. And a column VANSWERS in that table. I keep the answers in that column as like "12:3:5:18". Simply my table data is like this:
The reason that i keep data un-normalized is maybe VANSWERS may change. In this example we have 5 answers but for NFORM_ID = 6 we may have 8 answers. But for the same FORM_ID we have equal answer count.
Which is exactly why you should normalize your data. There is no reason to keep this kind of data unnormalized. The number of rows might be different but the structure of the data will stay.

btw: Did you have a look at the table design of the "knowledge testing" packaged apex application? It might just give you all the table structure that you are looking for.

http://www.oracle.com/technetwork/developer-tools/apex/application-express/packaged-apps-090453.html#KT

Edited by: Sven W. on Jan 3, 2011 5:29 PM
1 - 12
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Jan 31 2011
Added on Jan 2 2011
12 comments
755 views