This discussion is archived
13 Replies Latest reply: May 3, 2013 10:52 AM by 694514 RSS

SQL Retrieves Duplicate Rows Sometimes

694514 Newbie
Currently Being Moderated
Hello, I have a SQL that retrieves duplicate rows sometimes........but not most of the time. I cannot figure out what I am doing wrong. I am gathering data from 2 tables.

The first table (INC_FILES) contains these fields:
FileName
FileDesc
TYear
FileLoc
LocDate
RetDate
FileSize
DaysToRet

The second table (FILE_ACT) contains these fields with examples:
FileName
Code
ADate
CodeID
DateofLoad
AName

The FileName field in each table is stored using the exact same name. DateofLoad and RetDate fields are stored with the
exact same date with this format:

mm/dd/yyyy hh:mm

but retrieves like this example:
03/20/2012 11:11:00 AM

The seconds are always 00.

Here is the SQL that I am using:

Select Distinct INC_FILES., FILE_ACT.
from INC_FILES, FILE_ACT
where inc_files.filename = file_act.filename
and inc_files.retdate = file_act.dateofload
ORDER BY inc_files.tyear ASC, file_act.adate

I'm missing something but I don't know what. Please let me know what other information you may need.

Thanks in advance!!

Edited by: user8350636 on May 1, 2013 8:53 AM
  • 1. Re: SQL Retrieves Duplicate Rows Sometimes
    BluShadow Guru Moderator
    Currently Being Moderated
    user8350636 wrote:
    Hello, I have a SQL that retrieves duplicate rows sometimes........but not most of the time. I cannot figure out what I am doing wrong. I am gathering data from 2 tables.

    The first table (INC_FILES) contains these fields:
    FileName
    FileDesc
    TYear
    FileLoc
    LocDate
    RetDate
    FileSize
    DaysToRet

    The second table (FILE_ACT) contains these fields with examples:
    FileName
    Code
    ADate
    CodeID
    DateofLoad
    AName

    The FileName field in each table is stored using the exact same name. DateofLoad and RetDate fields are stored with the
    exact same date with this format:

    mm/dd/yyyy hh:mm

    but retrieves like this example:
    03/20/2012 11:11:00 AM
    That indicates that your RetDate column is of DATE datatype.
    The DATE datatype stores a full date from the year down to the second, in an internal format (that is most certainly not MM/DD/YYYY HH:MM or any other format like that - and note that MM is the month format mask and MI is used for minutes.). How you see the dates on the screen is up to the interface you use and/or your NLS_DATE_FORMAT database/session setting. Even if you change how the data is displayed this doesn't change the fact that there is data down to the second actually stored. So when you query dates the seconds will be taken into account unless you truncate the date/time down to the granularity you want. e.g. TRUNC(RetDate,'MI') will ensure that all of those date/time values have a 00 number of seconds. or TRUNC(RetDate,'DD') will ensure all the date/times have a time of 00:00:00 because you've truncated to the day. etc.
  • 2. Re: SQL Retrieves Duplicate Rows Sometimes
    sb92075 Guru
    Currently Being Moderated
    user8350636 wrote:
    Hello, I have a SQL that retrieves duplicate rows sometimes........but not most of the time. I cannot figure out what I am doing wrong. I am gathering data from 2 tables.

    The first table (INC_FILES) contains these fields:
    FileName
    FileDesc
    TYear
    FileLoc
    LocDate
    RetDate
    FileSize
    DaysToRet

    The second table (FILE_ACT) contains these fields with examples:
    FileName
    Code
    ADate
    CodeID
    DateofLoad
    AName

    The FileName field in each table is stored using the exact same name. DateofLoad and RetDate fields are stored with the
    exact same date with this format:

    mm/dd/yyyy hh:mm

    but retrieves like this example:
    03/20/2012 11:11:00 AM

    The seconds are always 00.

    Here is the SQL that I am using:

    Select Distinct INC_FILES.*, FILE_ACT.*
    from INC_FILES, FILE_ACT
    where inc_files.filename = file_act.filename
    and inc_files.retdate = file_act.dateofload
    ORDER BY inc_files.tyear ASC, file_act.adate

    I'm missing something but I don't know what. Please let me know what other information you may need.

    Thanks in advance!!
    How do I ask a question on the forums?
    SQL and PL/SQL FAQ


    post CREATE TABLE statements for both INC_FILES & FILE_ACT tables
  • 3. Re: SQL Retrieves Duplicate Rows Sometimes
    694514 Newbie
    Currently Being Moderated
    Thank you for responding!

    I'm sorry, but our Oracle administrator creates the tables (not my expertise---I am just a programmer) but maybe this will help:

    SQL> desc inc_files

    Name Null? Type
    ----------------------------------------- -------- --------------

    FILENAME VARCHAR2(50)
    FILEDESC VARCHAR2(50)
    TYEAR NUMBER(4)
    FILELOC VARCHAR2(50)
    LOCDATE DATE
    RETDATE DATE
    FILESIZE CHAR(10)
    DAYSTORET NUMBER(3)


    SQL> desc file_act

    Name Null? Type
    ----------------------------------------- -------- -------------
    FILENAME VARCHAR2(50)
    CODE CHAR(2)
    ADATE DATE
    CODEID VARCHAR2(50)
    DATEOFLOAD DATE
    ANAME VARCHAR2(50)

    Although it would be rare, I was concerned that if I used the "seconds" within the format that the moment I hit "Save" the seconds would change and the two files would not match up. Basically, the first table stores information about the file. The second table stores any activity changes for that file.
  • 4. Re: SQL Retrieves Duplicate Rows Sometimes
    davidp 2 Pro
    Currently Being Moderated
    "Although it would be rare, I was concerned that if I used the "seconds" within the format that the moment I hit "Save" the seconds would change and the two files would not match up. Basically, the first table stores information about the file. The second table stores any activity changes for that file."

    If you are storing the current time as you store into each of the two tables, you will have the same problem as the minute changes when seconds tick over from 59 seconds to 00 seconds, just less often. The only correct way to do it is to do something that guarantees that you save the same value into inc_files.retdate and file_act.dateofload. You can get the timestamp into a variable first, or use a DML RETURNING clause as you insert the first row, or there are a few other, more complicated, ways

    For your main issue, I think BluShadow's response is correct.
  • 5. Re: SQL Retrieves Duplicate Rows Sometimes
    694514 Newbie
    Currently Being Moderated
    Ok, I finally got back to this.........

    I've been researching how "Trunc" works. This is helpful since I did not even know about it. However, for some reason, I cannot get it to work. I'm still selecting the same number of duplicate records. I am using this SQL:

    Select Distinct Inc_files.*, file_act.*
    from inc_files, file_act
    where inc_files.filename = file_act.filename
    and trunc(inc_files.retdate, 'MI') = trunc(file_act.dateofload, 'MI')

    BluShadow pointed out that HH:MM is hour and month so i changed it to HH:MI in my program. I am using Delphi and their FormatDateTime function called for the HH:MM. My records saved fine when i used the MI without getting an error.

    Davidp.......you are right, I would have the same problem with the minute change if it happened.
  • 6. Re: SQL Retrieves Duplicate Rows Sometimes
    sb92075 Guru
    Currently Being Moderated
    use of UNIQUE index preclude duplicates from ever occurring.
  • 7. Re: SQL Retrieves Duplicate Rows Sometimes
    694514 Newbie
    Currently Being Moderated
    Thank you, I just tried UNIQUE but I still get the same number of duplicate rows.
  • 8. Re: SQL Retrieves Duplicate Rows Sometimes
    rp0428 Guru
    Currently Being Moderated
    >
    I'm still selecting the same number of duplicate records. I am using this SQL:

    Select Distinct Inc_files., file_act.
    from inc_files, file_act
    where inc_files.filename = file_act.filename
    and trunc(inc_files.retdate, 'MI') = trunc(file_act.dateofload, 'MI')
    >
    That query is not even valid since you are specifying 'INC_FILES.' and 'FILE_ACT.' so post the actual query you are using.

    It is IMPOSSIBLE for you to be getting duplicates if you use DISTINCT. For that query above you will ONLY get unique combinations of 'INC_FILES

    Post the duplicate data that you say you are getting.

    If you TRUNC the values to the minute then a row with ANY minute value from the one table will match a row of ANY minute value from the other table.

    So '03:17' will match '03:42'. How does that make sense?
  • 9. Re: SQL Retrieves Duplicate Rows Sometimes
    694514 Newbie
    Currently Being Moderated
    Oh, I'm sorry, I should have explained........the inc_files and the other table actually have a period and an asterick after it but in the plain text help I see that it sees the 2 astericks as "bold". Which is why the second table name shows bold.

    Here is an actual set of duplicate rows but it's hard to read because it's all smashed together. I don't see where i can enter actual code in posts??
    There should only be 2 rows returning instead of 4......one row with the code "D" and one row with the code "S".

    763AB.51.201311 RETAIL CODE 2011 C:\NEW FILES 03/27/2013 03/27/2013 3:00:00 PM 271904 60 (line from inc_files)
    763AB.51.201311 D 03/27/2013 received on 03/27/2013 3:00:00 PM LORI EVERSON (line from file_act)

    763AB.51.201311 RETAIL CODE 2011 C:\NEW FILES 03/27/2013 03/27/2013 3:00:00 PM 271904 60 (line from inc_files)
    763AB.51.201311 S 03/27/2013 sent on 03/27/2013 3:00:00 PM WILLIAM MUSER (line from file_act)

    763AB.51.201311 RETAIL CODE 2011 C:\LOADED FILES 03/27/2013 03/27/2013 3:00:00 PM 271904 60 (line from inc_files)
    763AB.51.201311 D 03/27/2013 received on 03/27/2013 3:00:00 PM LORI EVERSON (line from file_act)

    763AB.51.201311 RETAIL CODE 2011 C:\LOADED FILES 03/27/2013 03/27/2013 3:00:00 PM 271904 60 (line from inc_files)
    763AB.51.201311 S 03/27/2013 sent on 03/27/2013 3:00:00 PM WILLIAM MUSER (line from file_act)
  • 10. Re: SQL Retrieves Duplicate Rows Sometimes
    rp0428 Guru
    Currently Being Moderated
    >
    Here is an actual set of duplicate rows but it's hard to read because it's all smashed together. I don't see where i can enter actual code in posts??
    >
    You use \
     on the line before and on the line after to preserve formatting.
    {quote}
    There should only be 2 rows returning instead of 4......one row with the code "D" and one row with the code "S". 
    {quote}
    Those don't appear to be duplicates to me.
    
    One 'D' record is for C:\NEW FILES and one is for  C:\LOADED FILES.
    
    That value alone makes the rows different.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
  • 11. Re: SQL Retrieves Duplicate Rows Sometimes
    694514 Newbie
    Currently Being Moderated
    You are right. It does show that the rows are not duplicate but when I select that file name from one table at a time, it displays just two rows like it is supposed to but when i select from both tables, it appears like the way I just showed. Most of the other files do not. I was not aware that Oracle was also saving the seconds internally. I'm thinking that maybe somehow the seconds saved are different?? I'm not sure.

    I may try davidp's suggestion by putting the timestamp into a variable to see what happens. I'm working back and forth on 2 projects right now and will try to get to that later today.
  • 12. Re: SQL Retrieves Duplicate Rows Sometimes
    rp0428 Guru
    Currently Being Moderated
    Post the DDL for the tables, a small amount of sample data (in INSERT INTO . . format so we can use it) and the query you are using and maybe we can try to reproduce it.
  • 13. Re: SQL Retrieves Duplicate Rows Sometimes
    694514 Newbie
    Currently Being Moderated
    I hope the DDL and the inserts are correct. I do not have the authority to actually create and insert to try it
    at work. Please let me know if it's not correct.
    CREATE TABLE inc_files (
        FILENAME      VARCHAR2(50),
        FILEDESC      VARCHAR2(50),
        TYEAR         NUMBER(4),
        FILELOC       VARCHAR2(50),
        LOCDATE       DATE,
        RETDATE       DATE,
        FILESIZE      CHAR(10),
        DAYSTORET     NUMBER(3)
    );
    
    CREATE TABLE file_act (
        FILENAME      VARCHAR2(50),
        CODE          CHAR(2),
        ADATE         DATE,
        CODEID        VARCHAR2(50),
        DATEOFLOAD    DATE,
        ANAME         VARCHAR2(50)
    );
    
    
    
    
    INSERT ALL
      INTO inc_files (filename, filedesc, tyear, fileloc, locdate, retdate, filesize, daystoret) VALUES 
        ('763AB.51.201311', 'RETAIL STORE', 2013, 'C:\New Files', 10/01/2013, 10/04/2013 9:17:00 AM, '16212', 30)
      INTO inc_files (filename, filedesc, tyear, fileloc, locdate, retdate, filesize, daystoret) VALUES 
        ('654AA.51.201210', 'CONSTRUCTION', 2012, 'C:\New Files', 11/15/2012, 12/11/2012 9:27:00 AM, '16251', 60)
      INTO inc_files (filename, filedesc, tyear, fileloc, locdate, retdate, filesize, daystoret) VALUES 
        ('812AA.51.201111', 'BUSINESS', 2011, 'C:\Loaded Files', 02/10/2011, 09/23/2011 9:31:00 AM, '16089', 60)
      INTO inc_files (filename, filedesc, tyear, fileloc, locdate, retdate, filesize, daystoret) VALUES
        ('512AX.51.201005', 'RESIDENCE', 2010, 'C:\Loaded Files', 10/10/2010, 11/13/2010 9:33:00 AM, '15983', 45)
      INTO inc_files (filename, filedesc, tyear, fileloc, locdate, retdate, filesize, daystoret) VALUES
        ('763AC.51.201301', 'RETAIL STORE', 2013, 'C:\Sent Out Files', 03/11/2013, 09/01/2013 9:42:00 AM, '16158', 60)
      INTO inc_files (filename, filedesc, tyear, fileloc, locdate, retdate, filesize, daystoret) VALUES
        ('812AB.51.201308', 'BUSINESS', 2013, 'C:\New Files', 04/05/2013, 05/04/2013 9:45:00 AM, '16082', 90)
    SELECT * FROM dual;
    
    
    INSERT ALL
      INTO file_act (filename, code, adate, codeid, dateofload, aname) VALUES 
        ('763AB.51.201311', 'N', 10/01/2013, 'New', 10/04/2013 9:17:00 AM, 'Margaret Houston')
      INTO file_act (filename, code, adate, codeid, dateofload, aname) VALUES 
        ('654AA.51.201210', 'N', 11/15/2012, 'New', 12/11/2012 9:27:00 AM, 'Amy Whitter')
      INTO file_act (filename, code, adate, codeid, dateofload, aname) VALUES 
        ('812AA.51.201111', 'L', 02/10/2011, 'Loaded', 09/23/2011 9:31:00 AM, 'Bill Jones')
      INTO file_act (filename, code, adate, codeid, dateofload, aname) VALUES
        ('512AX.51.201005', 'L', 10/10/2010, 'Loaded', 11/13/2010 9:33:00 AM, 'Amy Whitter')
      INTO file_act (filename, code, adate, codeid, dateofload, aname) VALUES
        ('763AC.51.201301', 'S', 03/11/2013, 'Sent', 09/01/2013 9:42:00 AM, 'Amy Whitter')
      INTO file_act (filename, code, adate, codeid, dateofload, aname) VALUES
        ('812AB.51.201308', 'N', 04/05/2013, 'New', 05/04/2013 9:45:00 AM, 'Margaret Houston')
    SELECT * FROM dual;
    My SQL:
    There is a period and an asterick after the table names in the first line.

    Select Distinct inc_files.*, file_act.*
    from inc_files, file_act
    where inc_files.filename = file_act.filename
    and inc_files.retdate = file_act.dateofload
    ORDER BY inc_files.tyear ASC, file_act.adate

    Thanks!!

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points