2 Replies Latest reply: Sep 25, 2013 1:45 AM by Barbara Boehmer RSS

    csv file processing using external table

    navshankar

      Dear All,

       

      Our database is oracle 10g r2 and OS is solaris

       

      We  would receive csv files to a particular directory on server each day.

       

       

      File Format look like:

      -------------------------

       

      H00,SOURCE_NAME,FILE_CREATED_DATE

      RECORD_TYPE,EMP_ID,EMP_NAME,EMP_DOB(DDMMYYYY),EMP_HIRE_DATE(DDMMYYYY),EMP_LOCATION

      T00,RECORD_COUNT

       

       

      EMPLOYEE TABLE STRUCTURE

      ---------------------------

       

      EMP_ID                   NOT NULL    NUMBER ,

      EMP_NAME            NOT NULL    VARCHAR2(10) ,

      EMP_DOB                                  DATE,

      EMP_HIRE_DATE   NOT NULL     DATE,

      EMP_LOCATION                VARCHAR2(80)

       

       

       

      Sample File:

      -------------

      H00,ABC,21092013

      "R01",1,"EMP1","14021986","06072010","LOC1"

      "R01",20000000000,"EMP2","14021-987","06072011",""

      ,***,"EMPPPPPPPPPPP3","14021988","060**012","LOC2"

      "R01",4,4,"14021989","06072013",

      T00,4

       

      we need to validate each record excluding header and trailer  for:

       

      DATATYPE, LENGTH,OPTIONALITY, and other date validations such as EMP_HIRE_DATE can not be less than EMP_DOB

       

      In case of any data errors we need to send a response file for corresponding source file.

       

      we have  predefined error codes to be sent in the response file.

       

      ERR001    EMP_ID can not be null

      ERR002    EMP_ID  exceeds 10 digits

      ERR003    EMP_ID is not a number    

      ERR004    EMP_NAME   has to be text

      ERR005    EMP_NAME  length can not exceed 10

      ERR006    EMP_NAME   can not be null

      ERR007    EMP_DOB is not a date

      ERR008    EMP_DOB is not in ddmmyyyy format

      ERR009    EMP_HIRE_DATE is not a date

      ERR010    EMP_HIRE_DATE is not in ddmmyyyy format

      ERR011    EMP_HIRE_DATE can not be null

      ERR012    EMP_LOCATION    has to be text

      ERR013    EMP_LOCATION   length can not exceed 80

      ERR014    EMP_HIRE_DATE can not be less than EMP_DOB

      ERR015    Field missing in the record

      ERR016    More number of fields than allowed

       

       

      1.Do I need to create external table before processing each file.(EMP1.txt,EMP2.txt)?

       

      2.How to generate these error codes in case of respective failure scenarios and to log into an exception table?

       

      3.response file needs to have entire record and a concatination of all the error codes in the next line.

       

      4.what would be a better approach among

      creating an external table with all char(2000) fields and writing a select statement

      such as select * from ext_table where (emp id is not null and length(emp_id)<=10 and....to select only proper data);

      or creating the external table to be same as employee table and creating a bad file? if this is the preferred how can I generate the custom error codes?

       

       

      Could you please help me in achieving this!

       

      Warm Regards,

      Shankar.

        • 1. Re: csv file processing using external table
          rp0428

          Why not just write a Java stored procedure and use Java to process the files?

           

          Pass the file path/name as a parameter, let Java peel off the header and trailer and do the other validations.

           


          • 2. Re: csv file processing using external table
            Barbara Boehmer

            You can do a one-time creation of an external table.  After that, you can either overwrite the existing text file or alter the location.  In the example below I have split your original sample file into two files, in order to demonstrate altering the location.  You can make the external table all varchar2 fields as large as they need to be to accommodate all possible data.  If you then create a staging table and rejects table of the same structure, you can create a trigger on the staging table such that, when you insert from the external table to the staging table, it inserts into either the employee table or rejects table with the concatenated error list.  If you want this in a file, then you can just spool a select from the rejects table or use some other method such as utl_file.  The following is a partial example.  Another alternative would be to use SQL*Loader to load directly into the staging table without an external table.

             

             

            SCOTT@orcl12c> HOST TYPE emp1.txt

            H00,ABC,21092013

            "R01",1,"EMP1","14021986","06072010","LOC1"

            "R01",20000000000,"EMP2","14021-987","06072011",""

            T00,4

             

            SCOTT@orcl12c> HOST TYPE emp2.txt

            H00,ABC,21092013

            ,***,"EMPPPPPPPPPPP3","14021988","060**012","LOC2"

            "R01",4,4,"14021989","06072013",

            T00,4

             

            SCOTT@orcl12c> CREATE OR REPLACE DIRECTORY my_dir AS 'c:\my_oracle_files'

              2  /

             

            Directory created.

             

            SCOTT@orcl12c> CREATE TABLE external_table

              2    (record_type       VARCHAR2(10),

              3     emp_id            VARCHAR2(11),

              4     emp_name          VARCHAR2(14),

              5     emp_dob           VARCHAR2(10),

              6     emp_hire_date     VARCHAR2(10),

              7     emp_location      VARCHAR2(80))

              8  ORGANIZATION external

              9    (TYPE oracle_loader

            10     DEFAULT DIRECTORY my_dir

            11     ACCESS PARAMETERS

            12       (RECORDS DELIMITED BY NEWLINE

            13        LOAD WHEN ((1: 3) != "H00" AND (1:3) != 'T00')

            14        LOGFILE 'test.log'

            15        FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM

            16        MISSING FIELD VALUES ARE NULL

            17        REJECT ROWS WITH ALL NULL FIELDS

            18          (record_type, emp_id, emp_name, emp_dob, emp_hire_date, emp_location))

            19     LOCATION ('emp1.txt'))

            20  /

             

            Table created.

             

            SCOTT@orcl12c> CREATE TABLE staging

              2    (record_type       VARCHAR2(10),

              3     emp_id            VARCHAR2(11),

              4     emp_name          VARCHAR2(14),

              5     emp_dob           VARCHAR2(10),

              6     emp_hire_date     VARCHAR2(10),

              7     emp_location      VARCHAR2(80))

              8  /

             

            Table created.

             

            SCOTT@orcl12c> CREATE TABLE employee

              2    (emp_id          NUMBER       NOT NULL,

              3     emp_name        VARCHAR2(10) NOT NULL,

              4     emp_dob         DATE,

              5     emp_hire_date   DATE,

              6     emp_location    VARCHAR2(80))

              7  /

             

            Table created.

             

            SCOTT@orcl12c> CREATE TABLE rejects

              2    (record_type       VARCHAR2(10),

              3     emp_id            VARCHAR2(11),

              4     emp_name          VARCHAR2(14),

              5     emp_dob           VARCHAR2(10),

              6     emp_hire_date     VARCHAR2(10),

              7     emp_location      VARCHAR2(80),

              8     error_codes       VARCHAR2(4000))

              9  /

             

            Table created.

             

            SCOTT@orcl12c> CREATE OR REPLACE TRIGGER staging_air

              2    AFTER INSERT ON staging

              3    FOR EACH ROW

              4  DECLARE

              5    v_rejects         NUMBER := 0;

              6    v_error_codes  VARCHAR2(4000);

              7    v_num          NUMBER;

              8    v_dob          DATE;

              9    v_hire         DATE;

            10  BEGIN

            11    IF :NEW.emp_id IS NULL THEN

            12      v_rejects := v_rejects + 1;

            13      v_error_codes := v_error_codes || ',' || 'ERR001';

            14    ELSIF LENGTH (:NEW.emp_id) > 10 THEN

            15      v_rejects := v_rejects + 1;

            16      v_error_codes := v_error_codes || ',' || 'ERR002';

            17    END IF;

            18    BEGIN

            19      v_num := TO_NUMBER (:NEW.emp_id);

            20    EXCEPTION

            21      WHEN value_error THEN

            22        v_rejects := v_rejects + 1;

            23        v_error_codes := v_error_codes || ',' || 'ERR003';

            24    END;

            25    IF :NEW.emp_name IS NULL THEN

            26      v_rejects := v_rejects + 1;

            27      v_error_codes := v_error_codes || ',' || 'ERR006';

            28    ELSIF LENGTH (:NEW.emp_name) > 10 THEN

            29      v_rejects := v_rejects + 1;

            30      v_error_codes := v_error_codes || ',' || 'ERR005';

            31    END IF;

            32    BEGIN

            33      v_dob := TO_DATE (:NEW.emp_dob, 'ddmmyyyy');

            34    EXCEPTION

            35      WHEN OTHERS THEN

            36        v_rejects := v_rejects + 1;

            37        v_error_codes := v_error_codes || ',' || 'ERR008';

            38    END;

            39    BEGIN

            40      IF :NEW.emp_hire_date IS NULL THEN

            41        v_rejects := v_rejects + 1;

            42        v_error_codes := v_error_codes || ',' || 'ERR011';

            43      ELSE

            44        v_hire := TO_DATE (:NEW.emp_hire_date, 'ddmmyyyy');

            45      END IF;

            46    EXCEPTION

            47      WHEN OTHERS THEN

            48        v_rejects := v_rejects + 1;

            49        v_error_codes := v_error_codes || ',' || 'ERR010';

            50    END;

            51    IF LENGTH (:NEW.emp_location) > 80 THEN

            52      v_rejects := v_rejects + 1;

            53      v_error_codes := v_error_codes || ',' || 'ERR013';

            54    END IF;

            55    IF v_hire IS NOT NULL AND v_dob IS NOT NULL AND v_hire < v_dob THEN

            56        v_rejects := v_rejects + 1;

            57        v_error_codes := v_error_codes || ',' || 'ERR014';

            58    END IF;

            59    IF :NEW.emp_id IS NULL OR :NEW.emp_name IS NULL OR :NEW.emp_dob IS NULL

            60       OR :NEW.emp_hire_date IS NULL OR :NEW.emp_location IS NULL THEN

            61        v_rejects := v_rejects + 1;

            62        v_error_codes := v_error_codes || ',' || 'ERR015';

            63    END IF;

            64    IF v_rejects = 0 THEN

            65      INSERT INTO employee (emp_id, emp_name, emp_dob, emp_hire_date, emp_location)

            66      VALUES (:NEW.emp_id, :NEW.emp_name,

            67              TO_DATE (:NEW.emp_dob, 'ddmmyyyy'), TO_DATE (:NEW.emp_hire_date, 'ddmmyyyy'),

            68              :NEW.emp_location);

            69    ELSE

            70      v_error_codes := LTRIM (v_error_codes, ',');

            71      INSERT INTO rejects

            72        (record_type,

            73         emp_id, emp_name, emp_dob, emp_hire_date, emp_location,

            74         error_codes)

            75      VALUES

            76        (:NEW.record_type,

            77         :NEW.emp_id, :NEW.emp_name, :NEW.emp_dob, :NEW.emp_hire_date, :NEW.emp_location,

            78         v_error_codes);

            79    END IF;

            80  END staging_air;

            81  /

             

            Trigger created.

             

            SCOTT@orcl12c> SHOW ERRORS

            No errors.

            SCOTT@orcl12c> INSERT INTO staging SELECT * FROM external_table

              2  /

             

            2 rows created.

             

            SCOTT@orcl12c> ALTER TABLE external_table LOCATION ('emp2.txt')

              2  /

             

            Table altered.

             

            SCOTT@orcl12c> INSERT INTO staging SELECT * FROM external_table

              2  /

             

            2 rows created.

             

            SCOTT@orcl12c> SELECT * FROM employee

              2  /

             

                EMP_ID EMP_NAME   EMP_DOB         EMP_HIRE_DATE   EMP_LOCATION

            ---------- ---------- --------------- --------------- ------------

                     1 EMP1       Fri 14-Feb-1986 Tue 06-Jul-2010 LOC1

             

            1 row selected.

             

            SCOTT@orcl12c> COLUMN error_codes NEWLINE

            SCOTT@orcl12c> SELECT * FROM rejects

              2  /

             

            RECORD_TYP EMP_ID      EMP_NAME       EMP_DOB    EMP_HIRE_D EMP_LOCATION

            ---------- ----------- -------------- ---------- ---------- ------------

            ERROR_CODES

            --------------------------------------------------------------------------------

            R01        20000000000 EMP2           14021-987  06072011

            ERR002,ERR008,ERR015

             

                       ***         EMPPPPPPPPPPP3 14021988   060**012   LOC2

            ERR003,ERR005,ERR010

             

            R01        4           4              14021989   06072013

            ERR015

             

             

            3 rows selected.