This discussion is archived
6 Replies Latest reply: Oct 22, 2012 12:22 AM by wkobargs RSS

2 queries should return the same result (but they dont...)

UserMB Explorer
Currently Being Moderated
hello
i have a following query:
select col1,extract(year from datum) yr, COUNT(*)
from tableA@dblink where
DATUM between '1-jan-1985' and '31-dec-2012'
and col2 > 100 and col2 not in ('999999')
and TRIM(TO_CHAR(col1)) in ('0','1')
group by col1,extract(year from DATUM);
the above query returns the count: 143 982 for year 1991
however when i put the filter directly into this year the query returns a different number: 143 917
select col1,extract(year from datum) yr, COUNT(*)
from tableA@dblink  where
DATUM between '1-jan-1991' and '31-dec-1991'
and col2 > 100 and col2 not in ('999999')
and TRIM(TO_CHAR(col1)) in ('0','1')
group by col1,extract(year from DATUM);
please can you help me understand why is this happening and why these 2 counts are different?
id appreciate any tips
thanks very much
rgds
  • 1. Re: 2 queries should return the same result (but they dont...)
    908002 Expert
    Currently Being Moderated
    Try these...
    select col1,extract(year from datum) yr, COUNT(*)
    from tableA@dblink where
    trunc(DATUM) between to_date('1-jan-1985','dd-mon-yyyy') and to_date('31-dec-2012','dd-mon-yyyy')
    and col2 > 100 and col2 not in ('999999')
    and TRIM(TO_CHAR(col1)) in ('0','1')
    group by col1,extract(year from DATUM);
    select col1,extract(year from datum) yr, COUNT(*)
    from tableA@dblink  where
    trunc(DATUM) between to_date('1-jan-1991','dd-mon-yyyy') and to_date('31-dec-1991','dd-mon-yyyy')
    and col2 > 100 and col2 not in ('999999')
    and TRIM(TO_CHAR(col1)) in ('0','1')
    group by col1,extract(year from DATUM);
  • 2. Re: 2 queries should return the same result (but they dont...)
    Frank Kulash Guru
    Currently Being Moderated
    Hi,

    If datum is a DATE, then don't try to compare it to strings, such as '1-jan-1985'. Compare it to other DATEs, such as the values returned by TO_DATE:
    WHERE   datum >= TO_DATE ('1-jan-1985', 'dd-mon-yyyy')
    AND     datum <  TO_DATE ('1-jan-2013', 'dd-mon-yyyy')
    or DATE literals:
    WHERE   datum >= DATE '1985-01-01'
    AND     datum <  DATE '2013-01-01'
    Also, remember that Oracle DATEs always include hours, minutes and seconds. "BETWEEN dt1 AND dt2" *excludes almost all of dt2.  It includes midnight (00:00:00) on dt2, but no other times on dt2.


     

    I hope this answers your question.
    If not, post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) that shows your problem, and the results you want from that data.
    Explain, using specific examples, how you get those results from that data.
    Always say what version of Oracle you're using (e.g. 11.2.0.2.0).
    See the forum FAQ {message:id=9360002}
  • 3. Re: 2 queries should return the same result (but they dont...)
    Kim Berg Hansen Expert
    Currently Being Moderated
    Kiran wrote:
    trunc(DATUM) between to_date('1-jan-1991','dd-mon-yyyy') and to_date('31-dec-1991','dd-mon-yyyy')
    Please remember, that a predicate on trunc(DATUM) prohibits the use of indexes on DATUM (unless you create a function based index on trunc(DATUM).)
    You will therefore enforce either full table scan or fast full index scan.

    Consider either:
    DATUM between to_date('1-jan-1991','dd-mon-yyyy') and to_date('31-dec-1991 23:59:59','dd-mon-yyyy hh24:mi:ss')
    Or:
    DATUM >= to_date('1-jan-1991','dd-mon-yyyy') and 
    DATUM <  to_date('1-jan-1992','dd-mon-yyyy')
    Do not, please, not use trunc(DATUM)...
  • 4. Re: 2 queries should return the same result (but they dont...)
    UserMB Explorer
    Currently Being Moderated
    hi
    im running the query with the to_date function and it runs already more then half an hour, maybe more
    so i think this already prevented the index from being used;
    the source table i access via dblink is residing in RDB database, its not oracle
    thats why originally i had simply literal
    anyway once the query is returned i will post an update;
    this is one time check, i simply need to know one time if the exported data are the same or not;
    so i dont have to worry about indexes not being used on regular basis
    thx
    rgds
  • 5. Re: 2 queries should return the same result (but they dont...)
    Sven W. Guru
    Currently Being Moderated
    UserMB wrote:
    please can you help me understand why is this happening and why these 2 counts are different?
    id appreciate any tips
    This is tricky. All others of cause are right. You must compare date columns with date values.
    If you don't explicitly compare as a date value, then it might happen that a string comparison is made.

    And if that happens both of your counts would return a wrong result. For example all dates that start with
    '4' are not counted. Like '4-jan-1991'. The string '4-jan-1991' is greater then the string '31-dec-1991'. Therefore it would not be included in the count.
    I don't think this happened. But the danger is there.

    However I have a problem seeing, what the difference between your two counts can be.
    The two important parts are the date between filter and the group by condition.
    DATUM between '1-jan-1985' and '31-dec-2012'
    ...
    DATUM between '1-jan-1991' and '31-dec-1991'
    ...
    group by col1,extract(year from DATUM);
    The second option returns less results than the first option. Some rows must be missing because of this different filter condition.
    If this would be a text comparison then there shouldn't be a difference. Only if it is a date comparison then this could be explained.

    Here is an example:
    Dates like 31-dec-1991 17:53:14 are included in the first count+group, but not included in the second count.
    Why? Because the string '31-dec-1991' is converted into a date. This date is 31-dec-1991 00:00:00 (midnight). Everything from this day, that is not exactly on midnight is greater than this value and therefore not included in your second query.

    So as others already pointed out you need to either truncate the date or to compare it a little differently.
    select col1,extract(year from datum) yr, COUNT(*)
    from tableA@dblink  
    where DATUM >= to_date('01-01-1991','dd-mm-yyyy')
    and  DATUM < to_date('01-01-1992','dd-mm-yyyy')
    and col2 > 100 and col2 not in ('999999')
    and TRIM(TO_CHAR(col1)) in ('0','1')
    group by col1,extract(year from DATUM);
    This should give the same result as your first query.

    Note that I changed the month from a month name to a number. This makes it independent from national language settings. E.g. DEZ = German, DEC = American

    Edited by: Sven W. on Oct 9, 2012 2:04 PM
  • 6. Re: 2 queries should return the same result (but they dont...)
    wkobargs Journeyer
    Currently Being Moderated
    Hello,

    which version of Rdb are you using? There is a known performance bug using the TO_DATE function if the database is Rdb:

    Bug 9112403 - BAD PERFORMANCE USING TO_DATE FUNCTION

    The bug is fixed in Rdb 7.2.5 and later.

    Regards
    Wolfgang

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points