This discussion is archived
3 Replies Latest reply: Dec 31, 2012 7:19 AM by User286067 RSS

External Table Preprocessor

VS TRINAYANT Newbie
Currently Being Moderated
Dear Experts,

I was trying to explore the "preprocessor" command in external table, but I was not able to succeed.

I just followed the steps mentioned in, latest Oracle Magazine.
http://www.oracle.com/technetwork/issue-archive/2012/12-nov/o62asktom-1867739.html

But finally when I do the select from table df, I am getting the following error.

ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04095: preprocessor command /export/csrmgr/ripple_dms/run_df.sh encountered error "error during exec: errno is 2
"

PS: The output of v$version;

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
"CORE     11.2.0.2.0     Production"
TNS for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production


Could anyone help me to resolve this issue.
  • 1. Re: External Table Preprocessor
    BluShadow Guru Moderator
    Currently Being Moderated
    I'm not able to test Tom's example at the minute.

    However, that new feature was discussed in the March/April 2011 edition of Oracle Magazine too.

    -----

    Oracle Magazine March/April 2011 article on new feature in 11gR2 called the external table preprocessor.
    by Arup Nanda

    CHANGE IN PROCESS
    Now suppose that with your external table and its text file in place, the input file for the external table is compressed to reduce the volume of data transmitted across the network. Although compression helps the network bandwidth utilization, it creates a challenge for the ETL process. The file must be uncompressed before its contents can be accessed by the external table.

    Rather than uncompress the file in a separate process, you can use the preprocessor feature of Oracle Database 11g Release 2 with external tables to uncompress the file inline. And you will notneed to change the ETL process.

    To use the preprocessor feature, first you need to create a preprocessor program. The external table expects input in a text format but not necessarily in a file. The external table does not need to read a file; rather, it expects to get the file contents “fed” to it. So the preprocessor program must stream the input data directly to the external table—and not
    create another input file. The input to the pre-processor will be a compressed file, and the output will be the uncompressed contents.

    The following is the code for your new preprocessor program, named preprocess.bat:
    @echo off
    C:\oracle\product\11.2.0\dbhome_1\BIN\unzip.exe -qc %1
    The first line, @echo off, suppresses the output of the command in a Windows environment. The remaining code calls the unzip.exe utility located in the Oracle home. The utility needs an input file, which is the first (and only) parameter passed to it, shown as %1. The options q and c tell the utility to uncompress quietly (q) without producing extraneous output such as “Inflating” or “%age inflated” and match filenames case-insensitively (c), respectively.

    Next you need to create the directory object where this preprocessor program is located. Logging in as SYS, issue
    create directory execdir as 'c:\tools';
    And now grant EXECUTE permissions on the directory to the ETLADMIN user:
    grant execute on directory execdir to etladmin;
    Finally, create the new external table:
    create table indata1
    (
      cust_id number,
      cust_name varchar2(20),
      credit_limit number(10)
    )
    organization external
    (
      type oracle_loader
      default directory etl_dir
      access parameters
      (
        records delimited by newline
        preprocessor execdir:'preprocess.bat'
        fields terminated by ","
      )
      location ('indata1.txt')
    )
    /
    It calls the preprocess.bat executable in the directory specified by EXECDIR before the external table accesses the indata1.txt file in the location specified by the ETL_DIR directory. Remember, indata1.txt is now a compressed file. So, in effect, the external table reads not the actual specified input file but rather the output of preprocess.bat, which is the uncompressed data from the indata1.txt file.

    If you select from the external table now, the output will be similar to that of the earlier select * from indata1; query. The preprocessor passed the uncompressed contents of the indata1.txt (compressed) file on to the external table. There was no need to uncompress the file first—saving significant time and the intermediate space required and making it unnecessary to change the ETL process.

    This inline preprocessing unzip example uses a script, but that is not always necessary. An executable can be used instead. For example, in Linux you can use /bin/gunzip. However, the utility can’t accept any parameters. So if you pass parameters (as in this article’s example), you must use a script.

    Security Concerns
    =================
    The EXECUTE privilege on a directory is a new feature introduced in Oracle Database 11g Release 2. It enables the DBA to grant EXECUTE permissions only for certain directories and only to certain users. Without WRITE privileges, users will not be able to
    update the executables inside a directory to insert malicious code, but users will be able to execute the “approved” code accessible in a single location. The DBA can put all the necessary preprocessor tools into a single directory and grant EXECUTE privileges there to the users who may need them. And, of course, the executables and data should be in different directories.

    Preprocessing also requires some special precautions on the part of the DBA. Because the executables called by preprocessing programs will be executed under the privileges of the Oracle software owner and malicious executable code can cause a lot of damage, the DBA should be extremely careful in monitoring executables for potentially harmful code.

    The directory containing the preprocessor executables needs to be accessible to the Oracle software owner for EXECUTE operations only, not for WRITE activity. Therefore, as an added precaution, the system administrator can remove WRITE access to that directory from all users, including the Oracle software owner. This significantly reduces the chance of damage by malicious code.

    Other Uses
    ==========
    Compression is not the only use for inline preprocessing, although it certainly is the most widely used. You can, for example, use this preprocessing technique to show the output of a program as an external table. Consider, for instance, the dir command in Windows for listing the contents of a directory. How would you like to get the output as a table so that you can apply predicates?

    Getting and using this output is quite simple with the preprocessor functionality. Remember, the preprocessor does not actually need a file but, rather, requires the output of the preprocessor program. You can write a preprocessor program to send the
    output of a dir command. The new preprocessor program, named preproc_dir.bat, has only the following two lines:
    @echo off
    dir
    You will also need a file for the external table. The contents of the file are irrelevant, so you can use any file that the Oracle software owner can read in a directory to which that owner has read access. For this example, the file is dirfile.txt, and although the contents of the file are immaterial, the file must exist, because the external table will access it. Listing 1 shows how to create the table.

    Because the dir command displays output in a prespecified manner, the external table easily parses it by reading the fields located in specific positions. For example, positions 1 through 10 display the date, 11 through 20 display the time, and so on. The dir command produces some heading and preliminary information that the external table has to ignore, so there is a skip 5 clause in Listing 1 that skips the first five lines of the output. The last few lines of the output show how many files and directories are present and how much free space remains. This output must be skipped as well, so the external table displays records only when the date column has a value.

    Listing 1 also shows the result of a query against the external table. Because the MOD_DT column is of the date datatype, you can also apply a WHERE condition to select a specified set of records.

    Code Listing 1:
    ===============
    create table dir_tab
    (
      mod_dt date,
      mod_time char(10),
      file_type char(10),
      file_size char(10),
      file_name char(40)
    )
      organization external
    (
      type oracle_loader
      default directory etl_dir
      access parameters
      (
        records delimited by newline
        preprocessor execdir:'preproc_dir.bat'
        skip 5
        load when (mod_dt != blanks)
        fields
        (
          mod_dt position (01:10) DATE mask "mm/dd/yyyy",
          mod_time position (11:20),
          file_type position (21:29),
          file_size position (30:38),
          file_name position (39:80)
        )
      )
      location ('dirfile.txt')
    )
    reject limit unlimited
    /
    
    -- select from this table
    SQL> select * from dir_tab;
    
    MOD_DT        MOD_TIME       FILE_TYPE       FILE_SIZE      FILE_NAME
    ————————————— —————————————— ——————————————— —————————————— ————————————————————————————
    16-DEC-10     10:12 AM       <DIR> .
    16-DEC-10     10:12 AM       <DIR> ..
    22-MAY-10     09:57 PM       <DIR> archive
    22-MAY-10     10:27 PM                                2,048 hc_alap112.dat
    05-DEC-10     07:07 PM                                   36 indata1.txt
    22-DEC-05     04:07 AM                               31,744 oradba.exe
    16-DEC-10     09:58 AM                                1,123 oradim.log
    28-SEP-10     12:41 PM                                1,536 PWDALAP112.ora
    16-DEC-10     09:58 AM                                2,560 SPFILEALAP112.ORA
    
    9 rows selected.
    
    -- select a file not updated in last 1 year
    SQL> select * from dir_tab where mod_dt < sysdate - 365;
    
    MOD_DT        MOD_TIME       FILE_TYPE       FILE_SIZE      FILE_NAME
    ————————————— —————————————— ——————————————— —————————————— ————————————————————————————
    22-DEC-05     04:07 AM                               31,744 oradba.exe
  • 2. Re: External Table Preprocessor
    APC Oracle ACE
    Currently Being Moderated
    Have you checked that you can run the shell script standalone, as the oracle os user? For instance by using the SQL*PLus host access as Tom shows in that article?

    Cheers, APC
  • 3. Re: External Table Preprocessor
    User286067 Journeyer
    Currently Being Moderated
    VS TRINAYANT wrote:
    KUP-04095: preprocessor command /export/csrmgr/ripple_dms/run_df.sh encountered error "error during exec: errno is 2"
    Error 2 on unix is
    #define ENOENT           2      /* No such file or directory */
    ... which means the command/file you referenced on that line is not accessible ...

    try providing a fully qualified path for df, did you check by running 'whch df' to make sure that it is */bin/df* in your environment ?

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points