12 Replies Latest reply on Mar 18, 2010 7:03 PM by user11971257

    sql loader load multi-lines into one column

    733391
      I got a txt file, which is like:

      the first record : |111|,,|aaa|bbb|,||,12/01/2009,||

      the second record : |222|,|xx|,|ccccwwww|,||,12/02/2009,|PA|

      the third record : |333|,|yy|,|abcd
      2345 9989098099899 asasjkhsjkd

      yoyiyoi : yyyyy
      jjjjjjjjj|,
      11/30/2009,|NY|

      The field is terminated by ",", and is enclosed by "|".

      I am trying to use SQL loader to load into oracle database.
      The issue for me is
      1. the Vertical bar(|, pipe)
      2. multi-lines should be in one column.

      ANy suggestion for the control file?

      Thanks!

      Edited by: user1665291 on Mar 15, 2010 11:55 AM
        • 1. Re: sql loader load multi-lines into one column
          Jim Stenoish
          Here's what's going on. When SQL Loader parses data out of the data file, it first searches for the end of the next record. Once it finds the end of the next record, it parses data out of that record. I'm guessing you're records are delimited by new lines. So, it interprets the new lines in the middle of your data fields as the end of a record.

          There aren't any great solutions to this problem. Any solution involves a change to how the data file is generated and chances are that is out of your control. One solution is to use a record delimiter other than new line; it can even be multiple characters that end with newline, but the bottom line is that it record delimiter also appear in the middle of the record.
          1 person found this helpful
          • 2. Re: sql loader load multi-lines into one column
            732412
            Hello user1665291.
            The only available option in SQL*Loader for this scenario is the use of CONTINUEIF.
            CONTINUEIF LAST PRESERVE = '|'
            Based on the sample that you provided this would work, however if the multi-row column data had a pipe on the end of one of the lines within its body, that record would load in part or error.

            http://www.mcs.csueastbay.edu/support/oracle/doc/10.2/server.102/b14215/ldr_control_file.htm#i1005509

            Hope this helps,
            Luke

            Please mark the answer as helpful or answered if it is so. If not, provide additional details.
            Always try to provide create table and insert table statements to help the forum members help you better.
            1 person found this helpful
            • 3. Re: sql loader load multi-lines into one column
              733391
              Thanks for the responses.

              What I have in mind is that:
              1. concatenate multi-lines into one line.
              2. replace "|," ",|" ",||" "||," to outstand the "|" in the data content.

              so, is there any way (maybe text editor or any global replace) to remove the LF/CR (line feed/carriage return), chr(13)chr(10) in a txt file?

              Thanks!
              • 4. Re: sql loader load multi-lines into one column
                TheOtherGuy
                you can use multiple characters as a records delimiter

                here is small sample for external table (sqlloader is similar), as you can see my records are delimited by RECORDS DELIMITED BY '\r\n' which is CHR(10)+CHR(13) - classic DOS end of line character, you can replace it with whatever set of 2 or more characters you want. In your case it would be probably combination of | and CHR(10)+CHR(13)




                CREATE TABLE JIRI.SAMPLE_01_EXT
                (
                PRODUCT VARCHAR2(60),
                NUMBER_OF_UNITS NUMBER(10,0),
                TOTAL_PRICE NUMBER(10,2)
                )
                ORGANIZATION EXTERNAL
                ( TYPE ORACLE_LOADER
                DEFAULT DIRECTORY DIR_USERS_JIRI_DATA
                ACCESS PARAMETERS
                ( RECORDS DELIMITED BY '\r\n'
                SKIP 1
                LOGFILE DIR_USERS_JIRI_LOG:'sample_01.log'
                BADFILE DIR_USERS_JIRI_BAD:'sample_01.bad'
                NODISCARDFILE
                FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
                MISSING FIELD VALUES ARE NULL
                )
                LOCATION (DIR_USERS_JIRI_DATA:'sample_01.dat')
                )
                REJECT LIMIT 0
                NOPARALLEL
                NOMONITORING;
                1 person found this helpful
                • 5. Re: sql loader load multi-lines into one column
                  732412
                  Hello user1665291.
                  If you
                  concatenate multi-lines into one line
                  you will not need to
                  replace "|," ",|" ",||" "||," to outstand the "|" in the data content
                  In order to
                  concatenate multi-lines into one line
                  you will need to determine how to identify which lines are multi-lines. Why not just let SQL*Loader do this for you, then for any rejected records, address those after the fact?

                  The application or program that you'd use to
                  remove the LF/CR (line feed/carriage return), chr(13)chr(10) in a txt file
                  will be greatly dependant on what OS you are using. If you were to remove all record delimiters from your input file, you'll have to configure your control file to identify the logical records contained on the single physical record.

                  The question I'd suggest asking yourself is whether you can work with the person/company providing the input file to provide the file to meet a definable format. If they haven't been told that they were allowed to provide multi-line columns or delimiters that could also exist within the data itself, an unpredictable combination, they should incur the cost of correcting their mistake.


                  Hope this helps,
                  Luke

                  Please mark the answer as helpful or answered if it is so. If not, provide additional details.
                  Always try to provide create table and insert table statements to help the forum members help you better.
                  1 person found this helpful
                  • 6. Re: sql loader load multi-lines into one column
                    733391
                    Thanks for the input.

                    I tried using external table and it came out :

                    ORA-29913: error in executing ODCIEXTTABLEFETCH callout
                    ORA-30653: reject limit reached

                    I think it is because there is pipe ("|") in between of two "|" for certain columns.
                    It is supposed to be enclosed by "|", but it turned out that there is another "|" in between two "|".

                    Thank you so much anyway.
                    • 7. Re: sql loader load multi-lines into one column
                      TheOtherGuy
                      that should not matter, optionally enclosed should take care of it

                      FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'


                      are you sure the reject is due the optional quotes? If the file is not confidential or too large I can help you with the external table.
                      1 person found this helpful
                      • 8. Re: sql loader load multi-lines into one column
                        733391
                        Thank you so much for your kindness.

                        Here is the script I used :

                        CREATE TABLE topic(
                             SID int,
                             uniqID varchar2(60) ,
                             topicID varchar2(3) ,
                             topic varchar2(50) NULL,
                             Specifictopic varchar2(4000) NULL,
                             year varchar2 (4) NULL
                        )
                        organization external
                        (type oracle_loader
                        default directory data_dir
                        access parameters
                        (records delimited by '\r\n'
                        skip 1
                        logfile data_dir:'topic.log'
                        badfile data_dir:'topic.bad'
                        nodiscardfile
                        fields terminated by ',', optionally enclosed by "|"
                        missing field values are null
                        )
                        location (data_dir:'topic.txt')
                        )
                        reject limit 0
                        noparallel
                        nomonitoring
                        ;



                        and here is portial of the data file (topic.txt):
                        ********************************************************
                        15,|0370A01D-DC1E-4534-8176-A08A1E2F82E4|,|EDU|,|Education|,|Appropriations and authorization regarding higher education issues.|,|2008|
                        16,|03A8F7BB-9716-4494-BF41-013C27B5ECA6|,|GOV|,|Government Issues|,|issues affecting local government including appropriations|,|2003|
                        17,|04696109-082B-4EF6-9AA8-A6DB1013D15D|,|TEC|,|Telecommunications|,|RUS Broadband Applikcation|,|2008|
                        18,|04FA0BA7-E9D2-4F1E-8193-45F023065C89|,|DOC|,|District of Columbia|,|HUD Appropriations FY2009, CDBG
                        Financial Services Appropriations FY2009, District of Columbia
                        Commerce, Justice, Science Appropriations, Juvenile Justice, Byrne Grant|,|2008|
                        19,|04FA0BA7-E9D2-4F1E-8193-45F023065C89|,|HOU|,|Housing|,|HUD Appropriations FY2009, CDBG
                        Financial Services Appropriations FY2009, District of Columbia
                        Commerce, Justice, Science Appropriations, Juvenile Justice, Byrne Grant|,|2008|
                        20,|04FA0BA7-E9D2-4F1E-8193-45F023065C89|,|LAW|,|Law Enforcement & Crime|,|HUD Appropriations FY2009, CDBG
                        Financial Services Appropriations FY2009, District of Columbia
                        Commerce, Justice, Science Appropriations, Juvenile Justice, Byrne Grant|,|2008|
                        64043,|57C869AF-B535-430D-A74D-BB4F1F7B4F4C|,|HOM|,|Homeland Security|,|
                        FY 06 and FY 07 Department of Justice and Department of Homeland Security appropriations - emergency communications interoperability
                        Automotive safety, research and development, technology, interoperability and emergency communications issues, manufacturing competitiveness, SAFETEA-LU implementation, FY 06 and FY07 DOT appropriations - automotive research and development, technology, safety and communications issues|,|2007|
                        64044,|DFBE0690-0F55-4A7E-B1C9-9B166373346D|,|HCR|,|Health Issues|,||,|2002|
                        64045,|B48AF167-C7D2-4B3C-AB6E-5D13D81E500B|,|DEF|,|Defense|,||,|2004|
                        64046,|9F0F6618-D321-4C46-8666-76C634403736|,|AGR|,|Agriculture|,|Monitor and advocate regarding school lunch program issues, WIC issues, and agriculture appropriations (H.R. 5384).|,|2006|
                        64047,|1998/01/000/050/000050955-3|,|AVI|,|Aviation, Airlines & Airports|,,|1998|
                        64048,|1998/01/000/084/000084215-4|,|FOR|,|Foreign Relations|,,|1998|
                        64049,|F91E8447-C727-4541-85BB-018924458613|,|FOR|,|Foreign Relations|,||,|2000|
                        64050,|48AE237D-2C09-4F8D-8B93-22F4076DF659|,|DEF|,|Defense|,||,|2000|
                        64051,|2817D232-B223-47D9-930B-DCE3CFBA062F|,|ENV|,|Environment & Superfund|,|U.S. Forest Service Wilderness Permitting Policy|,|2001|
                        64052,|38EBFC47-D13A-4777-8965-35325CD7E617|,|ENV|,|Environment & Superfund|,|HR 4818 - The FY2005 Consolidated Appropriations Bill
                        HR 4614 - FY2005 Energy and Water Appropriations Bill
                        FY2005 VA/HUD/EPA Appropriations Bill
                        Water Resources Development Act, and related proposals
                        |,|2004|

                        **********************************************************
                        P.S. Last line of topic.txt file should be "|,|2004|"
                        for some reason, the vertical bar can't be showed when I posted it.

                        Edited by: user1665291 on Mar 17, 2010 11:27 AM
                        • 9. Re: sql loader load multi-lines into one column
                          TheOtherGuy
                          what is end-of-line character? is it CRLF or just LF ?
                          • 10. Re: sql loader load multi-lines into one column
                            733391
                            It is in Windows system, so, it should be LFCR.

                            Thanks!
                            • 11. Re: sql loader load multi-lines into one column
                              TheOtherGuy
                              before I even go to the solution... I would highly recommend to go back to the source and change it. The best way is to add one more dummy static field at the end and use that as end-of-line character for example DUMMYFIELD with no pipes around it, then the records delimiter will look like records delimited by 'DUMMYFIED\r\n'


                              if you cannot modify the source file, then you will have very very hard time. Please take this as a workaround for your problem. It worked on my computer on the small set, but I think you might run into some issues

                              Basically I change the delimiter for all fields and removed the optional delimiter. This allows to read the pipe at the end as part of the end-of-line delimiter. You will have to remove | from SQL statement you run on the table

                              note: the file you sent me had no header, I change SKIP 1 to SKIP 0 in the sample below


                              CREATE TABLE topic
                              (
                              SID number,
                              uniqID varchar2(60) ,
                              topicID varchar2(4) ,
                              topic varchar2(4000) NULL,
                              Specifictopic varchar2(4000) NULL,
                              year varchar2(4) NULL
                              )
                              organization external
                              (type oracle_loader
                              default directory data_dir
                              access parameters
                              (records delimited by '|\r\n'
                              skip 0
                              logfile data_dir:'topic.log'
                              badfile data_dir:'topic.bad'
                              nodiscardfile
                              fields terminated by ',|'
                              missing field values are null
                              )
                              location (data_dir:'topic.txt')
                              )
                              reject limit 0
                              noparallel
                              nomonitoring
                              ;
                              • 12. Re: sql loader load multi-lines into one column
                                user11971257
                                Oracle® Database Utilities
                                11g Release 1 (11.1)
                                Part Number B28319-02
                                Ch. 6 SQL*Loader Concepts


                                Stream Record Format

                                A file is in stream record format when the records are not specified by size; instead SQL*Loader forms records by scanning for the record terminator. Stream record format is the most flexible format, but there can be a negative effect on performance. The specification of a datafile to be interpreted as being in stream record format looks similar to the following:

                                INFILE datafile_name ["str terminator_string"]

                                The terminator_string is specified as either 'char_string' or X'hex_string' where:

                                'char_string' is a string of characters enclosed in single or double quotation marks

                                X'hex_string' is a byte string in hexadecimal format

                                When the terminator_string contains special (nonprintable) characters, it should be specified as a X'hex_string'. However, some nonprintable characters can be specified as ('char_string') by using a backslash. For example:

                                \n indicates a line feed

                                \t indicates a horizontal tab

                                \f indicates a form feed

                                \v indicates a vertical tab

                                \r indicates a carriage return

                                If the character set specified with the NLS_LANG parameter for your session is different from the character set of the datafile, character strings are converted to the character set of the datafile. This is done before SQL*Loader checks for the default record terminator.

                                Hexadecimal strings are assumed to be in the character set of the datafile, so no conversion is performed.

                                On UNIX-based platforms, if no terminator_string is specified, SQL*Loader defaults to the line feed character, \n.

                                On Windows NT, if no terminator_string is specified, then SQL*Loader uses either \n or \r\n as the record terminator, depending on which one it finds first in the datafile. This means that if you know that one or more records in your datafile has \n embedded in a field, but you want \r\n to be used as the record terminator, you must specify it.


                                Note:74719.1 How to specify alternate end-of-record terminators in SQL*Loader