13 Replies Latest reply on Jun 1, 2011 3:23 PM by Sven W.

    External table -> fetch location ?

    Sven W.
      Using Oracle 10.2.0.5

      An external table is a construct that gives me SQL access to a file.

      Is it possible to know the name of the file somehow inside the select? Like Add a column with the file name?

      pseudo example
      CREATE TABLE EXT_DUMMY
      (
          "RECORDTYPE" VARCHAR2(100 BYTE), 
          "COL1" VARCHAR2(100 BYTE), 
          "COL2" VARCHAR2(100 BYTE), 
          "FILE" VARCHAR2(100 BYTE) 
      )
      ORGANIZATION EXTERNAL
      (
          TYPE ORACLE_LOADER DEFAULT DIRECTORY "IMPORT_BAD_FILE" 
          ACCESS PARAMETERS ( 
                   records delimited BY newline 
                   FIELDS TERMINATED BY ';' 
                   MISSING FIELD VALUES ARE NULL 
                     ( RECORDTYPE CHAR
                     , COL1 CHAR 
                     , COL2 CHAR 
                     , FILE CHAR FILLER
                     ) 
                     ) 
          LOCATION ( 'Testfile1.txt, Testfile2.txt' )
      )
          reject limit 10
      ;
      The result could look like this:
      RECORDTYPE   COL1       COL2      FILE
      SAMPLE           DUMMY    DUMMY Testfile1.txt
      SAMPLE           DUMMY1   DUMMY Testfile1.txt
      SAMPLE           DUMMY2   DUMMY Testfile1.txt
      SAMPLE           DUMMY3   DUMMY Testfile1.txt
      SAMPLE           DUMMY1   DUMMY1 Testfile2.txt
      SAMPLE           DUMMY1   DUMMY2 Testfile2.txt
      SAMPLE           DUMMY2   DUMMY1 Testfile2.txt
      I would like to know from which file a certain row is read. Maybe I missed an option in the documentation. In this example I have two different files as the source for the external table.

      Another use case could be this:
      If I enable a user to switch the external table to a different file
      alter table EXT_DUMMY location ('Testfile3.txt' )
      . How can we know which file is read during the select on the table? When userA does the select, maybe userB just altered the location before the select was started. Therefore userA would read in a different file then expected.

      Edited by: Sven W. on May 26, 2011 4:48 PM

      Edited by: Sven W. on May 26, 2011 4:51 PM

      Edited by: Sven W. on May 26, 2011 5:11 PM
        • 1. Re: External table -> fetch location ?
          John Spencer
          Sven:

          I'm not sure about your first use case, but you can find the current location parameter for the external table from the XXX_EXTERNAL_LOCATIONS view.

          As to getting it in a select, with multiple files in the location, my guess would be not possible. I'm not sure that Oracle would even know itself which file it was porcessing, depending on which O/S calls it was using.

          John
          1 person found this helpful
          • 2. Re: External table -> fetch location ?
            riedelme
            Sven W. wrote:
            Using Oracle 10.2.0.5
            Is it possible to know the name of the file somehow inside the select? Like Add a column with the file name?
            A question from Sven would be a hard one ; - )

            Tricky, but probably not impossible.

            Start by creating a function to accept an external table name argument. The funciton would use DBMS_METADATA to get the DDL and extract the table name(s - can you base an external table on more than one file?) returning the filename (a concatenated string if more than one or use a pipelined function to get multiple rows ). Create a view on the original table including a call to the function in the SELECT list hard-coding the external table name. Use the view to get your data + external table file(s)

            I hope somebody suggests something easier: - )

            Edited by: riedelme on May 26, 2011 11:55 AM

            John said his answer is easier - if you find it so then use it!

            Edited by: riedelme on May 26, 2011 11:55 AM
            1 person found this helpful
            • 3. Re: External table -> fetch location ?
              John Spencer
              riedelme wrote:
              Sven W. wrote:
              Using Oracle 10.2.0.5
              Is it possible to know the name of the file somehow inside the select? Like Add a column with the file name?
              A question from Sven would be a hard one ; - )

              Tricky, but probably not impossible.

              Start by creating a function to accept an external table name argument. The funciton would use DBMS_METADATA to get the DDL and extract the table name(s - can you base an external table on more than one file?) returning the filename (a concatenated string if more than one or use a pipelined function to get multiple rows ). Create a view on the original table including a call to the function in the SELECT list hard-coding the external table name. Use the view to get your data + external table file(s)

              I hope somebody suggests something easie : - )
              I thought I did with the xxx_external_locations view {noformat}:-){noformat}, though with two files in the location (which iwork fine), you get two rows in the view.

              John
              • 4. Re: External table -> fetch location ?
                Solomon Yakobson
                John Spencer wrote:

                I thought I did with the xxx_external_locations view {noformat}:-){noformat}, though with two files in the location (which iwork fine), you get two rows in the view.
                Yes, you will get two rows back. But this is just static info saying rows are coming from two files. And OP wants to know, at SELECT time, what file this particular row was loaded from.

                SY.

                Edited by: Solomon Yakobson on May 26, 2011 3:58 PM
                1 person found this helpful
                • 5. Re: External table -> fetch location ?
                  John Spencer
                  Solomon Yakobson wrote:
                  John Spencer wrote:

                  I thought I did with the xxx_external_locations view {noformat}:-){noformat}, though with two files in the location (which iwork fine), you get two rows in the view.
                  Yes, you will get two rows back. But this is just static info saying rows are coming from two files. And OP wants to know, at SELECT time, what file this particular row was loaded from.

                  SY.

                  Edited by: Solomon Yakobson on May 26, 2011 3:58 PM
                  Solomon:

                  Yes, I realize what Sven was asking in the original post, and in my first response I did say
                  As to getting it in a select, with multiple files in the location, my guess would be not possible. I'm not sure that Oracle would even know itself which file it was porcessing, depending on which O/S calls it was using.
                  My response to riedelme was aimed more at the dbms_metadata suggestion than at the desire to get the filename in the select with multiple files in the location.

                  John
                  • 6. Re: External table -> fetch location ?
                    Sven W.
                    riedelme wrote:
                    Sven W. wrote:
                    Using Oracle 10.2.0.5
                    Is it possible to know the name of the file somehow inside the select? Like Add a column with the file name?
                    A question from Sven would be a hard one ; - )
                    I just want your grey cells to be used constantly ;)
                    Tricky, but probably not impossible.

                    ... can you base an external table on more than one file? ...
                    Yes that is my major point. In fact I quite like that possibility.
                    ... returning the filename (a concatenated string if more than one or use a pipelined function to get multiple rows ). Create a view on the original table including a call to the function in the SELECT list hard-coding the external table name. Use the view to get your data + external table file(s)
                    Unfortunatly this doesn't solve the problem.
                    I would like to know exactly which row was read from which file.

                    There are workarounds but they all depend other things.

                    If multiple files are the source of the external table then it seems as if they are always read in that exact order.
                    If the files have a header or footer row, this information can be used to state that now the next file is taken. The list of files can be fetched from the external_locations view that John already posted. But this depends on the file data itself and is no general solution.

                    For a short moment I was exploring the possibility to parse the log (or even bad and discard) file after the select is made. But this is way to complex and still doesn't solve certain timing issues.

                    I decided yesterday to go the easy way and simply show the name of the file (or multiple files) that were read on the Apex-Gui. This servs currently my purpose. And I added some extra indicators (simple logical checks) that will popup, if another user switched the table to a different file location.
                    • 7. Re: External table -> fetch location ?
                      Peter Gjelstrup
                      Hi Sven,

                      Would it matter much - for performance or maintenance - if you were to use multiple external files, one for each location.

                      And then a view on top them all, including location for each?


                      Regards
                      Peter
                      • 8. Re: External table -> fetch location ?
                        Sven W.
                        Peter Gjelstrup wrote:
                        Hi Sven,

                        Would it matter much - for performance or maintenance - if you were to use multiple external files, one for each location.
                        The background story behind this question is that certain applications that I'm working with exchange their data via a file interface. Sometimes there is an error meaning one application doesn't accept the file from the other application. in this case an error ("bad") file is written to some specific folder.

                        I now gave some powerusers the possibility to check and repair those files from an Apex Gui. However the file name is not known beforehand. I simply use an external table for certain file types (file type depends on prefix and suffix of the file). Those files constantly change and I wouldn't want to use multiple external tables for that. I let the user choose the file and switch the table to this file.

                        I fact nobody knows which file needs to be read in tomorrow. They will have new/different file names again.

                        The first implementation was simply using all files of one type as the source. So that one user could look at the issues in all those bad files at once. But this is too confusing if we can't track the exact file name. Hence the question.

                        Edited by: Sven W. on May 27, 2011 4:28 PM
                        • 9. Re: External table -> fetch location ?
                          John Spencer
                          Sven:

                          Are you using the bad file external tables to allow the users to correct the data, or are they correcting the data in the files then loading them into the application through the external tables? If the latter, then could you process them serially in a procedure?

                          I'm thinking you could pass an array of file names to the prodecure and have it do something like:
                          FOR i IN array.First .. array.Last LOOP
                             execute immediate 'alter table ext_dummy location('''||array(i)||''')';
                             < process file>
                          END LOOP;
                          Depending on how many users might be processing the same set of files, you may need to use DBMS_LOCK to serialize the individual users processes. If you really need the file name in the query from the external table, it would be a simple join to xxx_external_locations.

                          John
                          1 person found this helpful
                          • 10. Re: External table -> fetch location ?
                            Sven W.
                            John,

                            I show the users simply part of the content of the file and hint at some typical errors (often caused by timing issues, because data changed while the file was written). The users then manually correct the files and restart the import. This is a very manual process. I just gave them a small file checking mechanism, for some of the more often occuring errors.

                            I think I keep the thread open till monday, maybe somebody comes up with a different new idea. But for now I'm satisfied with the brainstorming I got.

                            Thanks all and have a nice weekend!

                            Edited by: Sven W. on May 27, 2011 5:08 PM
                            • 11. Re: External table -> fetch location ?
                              Solomon Yakobson
                              Wronmg post.

                              SY.

                              Edited by: Solomon Yakobson on May 27, 2011 12:35 PM
                              • 12. Re: External table -> fetch location ?
                                odie_63
                                Hi Sven,

                                I'm not sure how much we can rely on this, but let's consider the following :
                                create table test_xt (
                                  rec_id  number
                                , message varchar2(100)
                                )
                                organization external ( 
                                  default directory test_dir
                                  access parameters ( 
                                    records delimited by newline
                                    fields terminated by ';'
                                  )
                                  location (
                                    'marc5.txt'
                                  , 'test1.csv'
                                  , 'test2.csv'
                                  , 'test3.csv'
                                  )  
                                );
                                I always thought the ROWID doesn't hold much sense for an external table, but...
                                SQL> select t.rowid
                                  2       , dump(t.rowid) as rowid_dump
                                  3       , regexp_substr(dump(t.rowid,10,9,1),'\d+$') as file#
                                  4       , t.*
                                  5  from test_xt t
                                  6  ;
                                 
                                ROWID              ROWID_DUMP                                                FILE#      REC_ID MESSAGE
                                ------------------ --------------------------------------------------------- ------ ---------- -------------------------------
                                (AADVyAAAAAAAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,0,0,0,0,0,0,0,0,0     0               1 this is a line from marc5.txt
                                (AADVyAAAAAAAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,0,0,0,0,0,0,0,0,33    0               2 this is a line from marc5.txt
                                (AADVyAAAAAAAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,0,0,0,0,0,0,0,0,66    0               3 this is a line from marc5.txt
                                (AADVyAAAAAAAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,0,0,0,0,0,0,0,0,99    0               4 this is a line from marc5.txt
                                (AADVyAAAAAEAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,1,0,0,0,0,0,0,0,0     1               1 this is a line from test1.csv
                                (AADVyAAAAAEAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,1,0,0,0,0,0,0,0,33    1               2 this is a line from test1.csv
                                (AADVyAAAAAEAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,1,0,0,0,0,0,0,0,66    1               3 this is a line from test1.csv
                                (AADVyAAAAAEAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,1,0,0,0,0,0,0,0,99    1               4 this is a line from test1.csv
                                (AADVyAAAAAIAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,2,0,0,0,0,0,0,0,0     2               1 this is a line from test2.csv
                                (AADVyAAAAAIAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,2,0,0,0,0,0,0,0,33    2               2 this is a line from test2.csv
                                (AADVyAAAAAIAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,2,0,0,0,0,0,0,0,66    2               3 this is a line from test2.csv
                                (AADVyAAAAAMAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,3,0,0,0,0,0,0,0,0     3               1 this is a line from test3.csv
                                (AADVyAAAAAMAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,3,0,0,0,0,0,0,0,33    3               2 this is a line from test3.csv
                                (AADVyAAAAAMAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,3,0,0,0,0,0,0,0,66    3               3 this is a line from test3.csv
                                (AADVyAAAAAMAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,3,0,0,0,0,0,0,0,99    3               4 this is a line from test3.csv
                                (AADVyAAAAAMAAAAAA Typ=208 Len=17: 4,0,0,213,200,0,0,0,3,0,0,0,0,0,0,0,132   3               5 this is a line from test3.csv
                                 
                                16 rows selected
                                 
                                Then with a join to EXTERNAL_LOCATION$ :
                                SQL> with ext_loc as (
                                  2    select position-1 as pos
                                  3         , name as filename
                                  4    from sys.external_location$
                                  5    where obj# = ( select object_id
                                  6                   from user_objects
                                  7                   where object_name = 'TEST_XT' )
                                  8  )
                                  9  select x.filename,
                                 10         t.*
                                 11  from test_xt t
                                 12       join ext_loc x on x.pos = to_number(regexp_substr(dump(t.rowid,10,9,1),'\d+$'))
                                 13  ;
                                 
                                FILENAME       REC_ID MESSAGE
                                ------------ -------- --------------------------------
                                marc5.txt           1 this is a line from marc5.txt
                                marc5.txt           2 this is a line from marc5.txt
                                marc5.txt           3 this is a line from marc5.txt
                                marc5.txt           4 this is a line from marc5.txt
                                test1.csv           1 this is a line from test1.csv
                                test1.csv           2 this is a line from test1.csv
                                test1.csv           3 this is a line from test1.csv
                                test1.csv           4 this is a line from test1.csv
                                test2.csv           1 this is a line from test2.csv
                                test2.csv           2 this is a line from test2.csv
                                test2.csv           3 this is a line from test2.csv
                                test3.csv           1 this is a line from test3.csv
                                test3.csv           2 this is a line from test3.csv
                                test3.csv           3 this is a line from test3.csv
                                test3.csv           4 this is a line from test3.csv
                                test3.csv           5 this is a line from test3.csv
                                 
                                Seems to work... assuming the files are always read in the order specified through the LOCATION parameter, and the generated ROWID actually means what I think it means.
                                • 13. Re: External table -> fetch location ?
                                  Sven W.
                                  Extremly interesting!

                                  In fact this even makes sense from a logical perspective. Since a rowid is kind of the physical address of the data in a table. Then for an external table it should include the file number and maybe even the record/line number.

                                  If we have a writable table then the rowid is probaly the connection to the position in the file. Not sure if the multi file syntax is usable for writeable external tables...

                                  Edited by: Sven W. on Jun 1, 2011 5:19 PM