6 Replies Latest reply: Oct 22, 2012 2:22 AM by wkobargs-Oracle RSS

    2 queries should return the same result (but they dont...)

    UserMB
      hello
      i have a following query:
      select col1,extract(year from datum) yr, COUNT(*)
      from tableA@dblink where
      DATUM between '1-jan-1985' and '31-dec-2012'
      and col2 > 100 and col2 not in ('999999')
      and TRIM(TO_CHAR(col1)) in ('0','1')
      group by col1,extract(year from DATUM);
      the above query returns the count: 143 982 for year 1991
      however when i put the filter directly into this year the query returns a different number: 143 917
      select col1,extract(year from datum) yr, COUNT(*)
      from tableA@dblink  where
      DATUM between '1-jan-1991' and '31-dec-1991'
      and col2 > 100 and col2 not in ('999999')
      and TRIM(TO_CHAR(col1)) in ('0','1')
      group by col1,extract(year from DATUM);
      please can you help me understand why is this happening and why these 2 counts are different?
      id appreciate any tips
      thanks very much
      rgds
        • 1. Re: 2 queries should return the same result (but they dont...)
          908002
          Try these...
          select col1,extract(year from datum) yr, COUNT(*)
          from tableA@dblink where
          trunc(DATUM) between to_date('1-jan-1985','dd-mon-yyyy') and to_date('31-dec-2012','dd-mon-yyyy')
          and col2 > 100 and col2 not in ('999999')
          and TRIM(TO_CHAR(col1)) in ('0','1')
          group by col1,extract(year from DATUM);
          select col1,extract(year from datum) yr, COUNT(*)
          from tableA@dblink  where
          trunc(DATUM) between to_date('1-jan-1991','dd-mon-yyyy') and to_date('31-dec-1991','dd-mon-yyyy')
          and col2 > 100 and col2 not in ('999999')
          and TRIM(TO_CHAR(col1)) in ('0','1')
          group by col1,extract(year from DATUM);
          • 2. Re: 2 queries should return the same result (but they dont...)
            Frank Kulash
            Hi,

            If datum is a DATE, then don't try to compare it to strings, such as '1-jan-1985'. Compare it to other DATEs, such as the values returned by TO_DATE:
            WHERE   datum >= TO_DATE ('1-jan-1985', 'dd-mon-yyyy')
            AND     datum <  TO_DATE ('1-jan-2013', 'dd-mon-yyyy')
            or DATE literals:
            WHERE   datum >= DATE '1985-01-01'
            AND     datum <  DATE '2013-01-01'
            Also, remember that Oracle DATEs always include hours, minutes and seconds. "BETWEEN dt1 AND dt2" *excludes almost all of dt2.  It includes midnight (00:00:00) on dt2, but no other times on dt2.


             

            I hope this answers your question.
            If not, post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) that shows your problem, and the results you want from that data.
            Explain, using specific examples, how you get those results from that data.
            Always say what version of Oracle you're using (e.g. 11.2.0.2.0).
            See the forum FAQ {message:id=9360002}
            • 3. Re: 2 queries should return the same result (but they dont...)
              Kim Berg Hansen
              Kiran wrote:
              trunc(DATUM) between to_date('1-jan-1991','dd-mon-yyyy') and to_date('31-dec-1991','dd-mon-yyyy')
              Please remember, that a predicate on trunc(DATUM) prohibits the use of indexes on DATUM (unless you create a function based index on trunc(DATUM).)
              You will therefore enforce either full table scan or fast full index scan.

              Consider either:
              DATUM between to_date('1-jan-1991','dd-mon-yyyy') and to_date('31-dec-1991 23:59:59','dd-mon-yyyy hh24:mi:ss')
              Or:
              DATUM >= to_date('1-jan-1991','dd-mon-yyyy') and 
              DATUM <  to_date('1-jan-1992','dd-mon-yyyy')
              Do not, please, not use trunc(DATUM)...
              • 4. Re: 2 queries should return the same result (but they dont...)
                UserMB
                hi
                im running the query with the to_date function and it runs already more then half an hour, maybe more
                so i think this already prevented the index from being used;
                the source table i access via dblink is residing in RDB database, its not oracle
                thats why originally i had simply literal
                anyway once the query is returned i will post an update;
                this is one time check, i simply need to know one time if the exported data are the same or not;
                so i dont have to worry about indexes not being used on regular basis
                thx
                rgds
                • 5. Re: 2 queries should return the same result (but they dont...)
                  Sven W.
                  UserMB wrote:
                  please can you help me understand why is this happening and why these 2 counts are different?
                  id appreciate any tips
                  This is tricky. All others of cause are right. You must compare date columns with date values.
                  If you don't explicitly compare as a date value, then it might happen that a string comparison is made.

                  And if that happens both of your counts would return a wrong result. For example all dates that start with
                  '4' are not counted. Like '4-jan-1991'. The string '4-jan-1991' is greater then the string '31-dec-1991'. Therefore it would not be included in the count.
                  I don't think this happened. But the danger is there.

                  However I have a problem seeing, what the difference between your two counts can be.
                  The two important parts are the date between filter and the group by condition.
                  DATUM between '1-jan-1985' and '31-dec-2012'
                  ...
                  DATUM between '1-jan-1991' and '31-dec-1991'
                  ...
                  group by col1,extract(year from DATUM);
                  The second option returns less results than the first option. Some rows must be missing because of this different filter condition.
                  If this would be a text comparison then there shouldn't be a difference. Only if it is a date comparison then this could be explained.

                  Here is an example:
                  Dates like 31-dec-1991 17:53:14 are included in the first count+group, but not included in the second count.
                  Why? Because the string '31-dec-1991' is converted into a date. This date is 31-dec-1991 00:00:00 (midnight). Everything from this day, that is not exactly on midnight is greater than this value and therefore not included in your second query.

                  So as others already pointed out you need to either truncate the date or to compare it a little differently.
                  select col1,extract(year from datum) yr, COUNT(*)
                  from tableA@dblink  
                  where DATUM >= to_date('01-01-1991','dd-mm-yyyy')
                  and  DATUM < to_date('01-01-1992','dd-mm-yyyy')
                  and col2 > 100 and col2 not in ('999999')
                  and TRIM(TO_CHAR(col1)) in ('0','1')
                  group by col1,extract(year from DATUM);
                  This should give the same result as your first query.

                  Note that I changed the month from a month name to a number. This makes it independent from national language settings. E.g. DEZ = German, DEC = American

                  Edited by: Sven W. on Oct 9, 2012 2:04 PM
                  • 6. Re: 2 queries should return the same result (but they dont...)
                    wkobargs-Oracle
                    Hello,

                    which version of Rdb are you using? There is a known performance bug using the TO_DATE function if the database is Rdb:

                    Bug 9112403 - BAD PERFORMANCE USING TO_DATE FUNCTION

                    The bug is fixed in Rdb 7.2.5 and later.

                    Regards
                    Wolfgang