8 Replies Latest reply on Jul 23, 2008 10:38 PM by Aketi Jyuuzou

    Join using LIKE operator [SOLVED]

    Peter Gjelstrup
      Hi,
      BANNER                                                          
      -------------------------------------------------
      Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
      PL/SQL Release 10.2.0.3.0 - Production                         
      CORE     10.2.0.3.0     Production                                     
      TNS for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Productio
      NLSRTL Version 10.2.0.3.0 - Production
      I am given a log table tlog that has a few mio records, and need to group the messages based on their pattern. My problem is that the message are not defined anywhere, so I have created my own definition in table regx. Currently, this table has 65 records and I expect it to grow to perhaps 100-200.

      These are the tables:
      drop table tlog purge;
      drop table regx purge;

      create table tlog (id number(9) primary key, msg varchar2(55));
      create table regx (id number(9) primary key, pattern varchar2(55) not null unique);

      insert into tlog values(1, 'Error where item = 1234');
      insert into tlog values(2, 'Error where item = 2345');

      insert into tlog values(3, 'Discard due to code = 123 on customer');
      insert into tlog values(4, 'Discard due to code = 234 on customer');
      insert into tlog values(5, 'Discard due to code = 345 on customer');

      insert into tlog values(6, 'Discard due to code = 123 on vendor = 456');
      insert into tlog values(7, 'Discard due to code = 234 on vendor = 567');
      insert into tlog values(8, 'Discard due to code = 345 on vendor = 678');
      insert into tlog values(9, 'Discard due to code = 456 on vendor = 789');

      insert into regx values (100, 'Error where item =%');
      insert into regx values (200, 'Discard due to code =%on customer');
      insert into regx values (300, 'Discard due to code =%on vendor =%');
      commit
      /

      select * from tlog;

      select * from regx;
      /
      Now, I can do something like this:
      select regx.id, count(*)
        from regx, tlog
      where tlog.msg like regx.pattern
      group by regx.id
      order by regx.id;

              ID   COUNT(*)
      ---------- ----------
             100          2
             200          3
             300          4
      3 rows selected.      
      So far so good. My problem now is that I have messages that does not fall into
      any of the categories. Consider
      insert into tlog values(10, 'Other kind of message');
      insert into tlog values(11, 'Another kind of message');
      insert into tlog values(12, 'Yet, another');
      insert into regx values (400, '%');
      commit
      /

      select regx.id, count(*)
        from regx, tlog
      where tlog.msg like regx.pattern
      group by regx.id
      order by regx.id;

              ID   COUNT(*)
      ---------- ----------
             100          2
             200          3
             300          4
             400         12 wrong!
      4 rows selected.      
      The correct answer for group *400* would be 3.

      Any ideas on how to handle these other messages?

      Any changes to the regx table, including contents and definition is possible. Changes to tlog table is not.

      Regards
      Peter

      Message was edited by:
      Peter Gjelstrup (Added banner)

      Message was edited by:
      Peter Gjelstrup [SOLVED]
        • 1. Re: Join using LIKE operator
          Tomasz K.
          Perhaps code below will be sufficient:
          DELETE FROM regx WHERE id=400;
          COMMIT;

          select Nvl(regx.id,400), count(*)  from regx, tlog
          where tlog.msg LIKE regx.pattern(+)
          group by regx.id
          order by regx.id;
          Tomasz K.
          • 2. Re: Join using LIKE operator
            Peter Gjelstrup
            Thank you Tomasz,

            I will definately take that into account, also.

            I may have oversimplified my example since both tlog and regx has an additional category that needs to be included. Each category has it its own other, and I have apprx. 25 of these in tlog, but only 20 in regx.

            Thanks again
            Peter
            • 3. Re: Join using LIKE operator
              Tomasz K.
              Could you post your "real" tables structure with sample data?
              Is this "additional category" stored as separate column ?
              Maybe I can help then.

              Thanks
              Tomasz K.
              • 4. Re: Join using LIKE operator
                Peter Gjelstrup
                Thank you Tomasz,

                I'm working on it.
                • 5. Re: Join using LIKE operator
                  Aketi Jyuuzou
                  Maybe this.
                  select regx.id,
                  case regx.pattern when '%' then
                  count(*) - sum(decode(regx.pattern,'%',0,count(*))) over()
                  else count(*) end
                  from regx,tlog
                  where tlog.msg like regx.pattern
                  group by regx.id,regx.pattern
                  order by regx.id;
                  • 6. Re: Join using LIKE operator
                    Peter Gjelstrup
                    Tomasz, new column (condition) added

                    drop table tlog purge;
                    drop table regx purge;

                    create table tlog (id number(9) primary key, condition varchar2(9) not null, msg varchar2(55));
                    create table regx (id number(9) primary key, condition varchar2(9) not null, pattern varchar2(55) not null
                                      ,constraint uk unique(condition, pattern));



                    insert into tlog values(1, 'SPE00001', 'Error where item = 1234');
                    insert into tlog values(2, 'SPE00001', 'Error where item = 2345');

                    insert into tlog values(3, 'SPE00001', 'Discard due to code = 123 on customer');
                    insert into tlog values(4, 'SPE00001', 'Discard due to code = 234 on customer');
                    insert into tlog values(5, 'SPE00001', 'Discard due to code = 345 on customer');

                    insert into tlog values(6, 'SPE00001', 'Discard due to code = 123 on vendor = 456');
                    insert into tlog values(7, 'SPE00001', 'Discard due to code = 234 on vendor = 567');
                    insert into tlog values(8, 'SPE00001', 'Discard due to code = 345 on vendor = 678');
                    insert into tlog values(9, 'SPE00001', 'Discard due to code = 456 on vendor = 789');

                    insert into tlog values(10, 'SPE00001', 'Other kind of message');
                    insert into tlog values(11, 'SPE00001', 'Another kind of message');
                    insert into tlog values(12, 'SPE00001', 'Yet, another');

                    insert into regx values (100, 'SPE00001', 'Error where item =%');
                    insert into regx values (200, 'SPE00001', 'Discard due to code =%on customer');
                    insert into regx values (300, 'SPE00001', 'Discard due to code =%on vendor =%');
                    insert into regx values (400, 'SPE00001', '%');
                    commit
                    /

                    insert into tlog values(21, 'SPE00002', 'Error where item = 1234');
                    insert into tlog values(22, 'SPE00002', 'Error where item = 2345');

                    insert into tlog values(23, 'SPE00002', 'Discard due to reason = 123 on customer');
                    insert into tlog values(24, 'SPE00002', 'Discard due to reason = 234 on customer');
                    insert into tlog values(25, 'SPE00002', 'Discard due to reason = 345 on customer');

                    insert into tlog values(26, 'SPE00002', 'Discard due to code = 123 on invoice = 456');
                    insert into tlog values(27, 'SPE00002', 'Discard due to code = 234 on invoice = 567');
                    insert into tlog values(28, 'SPE00002', 'Discard due to code = 345 on invoice = 678');
                    insert into tlog values(29, 'SPE00002', 'Discard due to code = 456 on invoice = 789');
                    insert into tlog values(30, 'SPE00002', 'Discard due to code = 456 on invoice = 789');
                    insert into tlog values(31, 'SPE00002', 'Discard due to code = 456 on invoice = 789');

                    insert into tlog values(32, 'SPE00002', 'Other kind of message');
                    insert into tlog values(33, 'SPE00002', 'Another kind of message');
                    insert into tlog values(34, 'SPE00002', 'Yet, another');
                    insert into tlog values(35, 'SPE00002', 'Yet, another');
                    insert into tlog values(36, 'SPE00002', 'Yet, another');

                    insert into regx values (500, 'SPE00002', 'Error where item =%');
                    insert into regx values (600, 'SPE00002', 'Discard due to reason =%on customer');
                    insert into regx values (700, 'SPE00002', 'Discard due to code =%on invoice =%');
                    insert into regx values (800, 'SPE00002', '%');
                    commit
                    /

                    insert into tlog values(41, 'SPE00003', 'Error 1234');
                    insert into tlog values(42, 'SPE00003', 'Error 2345');

                    insert into tlog values(43, 'SPE00003', 'Reason = 123 on customer');
                    insert into tlog values(44, 'SPE00003', 'Reason = 234 on customer');

                    insert into tlog values(46, 'SPE00003', 'Other kind of message');
                    insert into tlog values(47, 'SPE00003', 'Another kind of message');

                    insert into regx values (900, 'SPE00003', '%');
                    commit
                    /

                    select * from tlog;

                    select * from regx;
                    /
                    select regx.id, tlog.condition, count(*)
                      from regx, tlog
                    where tlog.condition = regx.condition
                       and tlog.msg like regx.pattern
                    group by regx.id, tlog.condition
                    order by tlog.condition, regx.id;

                            ID CONDITION   COUNT(*)
                    ---------- --------- ----------
                           100 SPE00001           2
                           200 SPE00001           3
                           300 SPE00001           4
                           400 SPE00001          12 expect 3
                           500 SPE00002           2
                           600 SPE00002           3
                           700 SPE00002           6
                           800 SPE00002          16 expect 5
                           900 SPE00003           6
                    Note, there are approx 25 conditions, so I would like to avoid any decode/case stuff.

                    Regards
                    Peter
                    • 7. Re: Join using LIKE operator
                      Peter Gjelstrup
                      Thanks Aketi,

                      Looks very promising, will look into that.

                      Like that over(), would never had thougth of that :-)

                      Regards
                      Peter
                      • 8. Re: Join using LIKE operator
                        Aketi Jyuuzou
                        OK
                        Let us use below.
                        select regx.id,
                        case regx.pattern when '%' then
                        count(*) - sum(decode(regx.pattern,'%',0,count(*))) over(partition by regx.condition)
                        else count(*) end cn
                        from regx,tlog
                        where tlog.msg like regx.pattern
                           and regx.condition = tlog.condition
                        group by regx.id,regx.pattern,regx.condition
                        order by regx.id;