I have an external table that reads from a fixed length file. The file is expected to contain special characters. In my case the word containing special character is "Göteborg". Because "ö" is a special character, looks like Oracle is considering it as 2 bytes. That causes the trouble. The subsequent fields in the files get shifted by 1 byte thereby messing up the data. Has anyone faced the issue before. So far we have tried the following solution:
Changed the value of NLS_LANG to AMERICAN_AMERICA.WE8ISO8859P1
Tried Setting the Database Character set to UTF-8
Tried changing the NLS_LENGTH_SYMMANTIC to CHAR instead of BYTE using ALTER SYSTEM
Tried changing the External table characterset to: AL32UTF8
Tried changing the External table characterset to: UTF-8
Nothing works. Other details include:
• File is UTF-8 encoded
• Operating System : RHEL
• Database: Oracle 11gR2
Any thing else that I might be missing? Any help will be appreciated. Thanks in advance!
By fixed length file do you mean fixed-length records or a fixed-length fields with records separated by new line? By fixed, do you mean fixed number of bytes or fixed number of characters? Have you looked at the file in hex (in an editor with hex display or using the Unix 'od' utility)? What is the code of "ö" in the file?
CREATE TABLE "APP_XXX_XX_XXXX"(
"XXX_XXX_XXXXX" VARCHAR2(1 CHAR),
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "DIR"
( RECORDS DELIMITED BY NEWLINE
KEY_NO POSITION(1:9) INTEGER EXTERNAL,
XXX_XXX_XXXXX POSITION(10:10) CHAR,
XXX_XXX_XXXXX POSITION(11:19) INTEGER EXTERNAL,
XXX_XXX_XXXXX POSITION(11:14) INTEGER EXTERNAL,
XXX_XXX_XXXXX POSITION(15:16) INTEGER EXTERNAL,
XXX_XXX_XXXXX POSITION(17:19) INTEGER EXTERNAL)
REJECT LIMIT UNLIMITED
The data forms part of a major driver table, which I may not be able to provide as per the NDA with my company.
The Script I have shared is standard template for creating external table as part of our applications.
We majorly cater for EU regions with this project.(European Characterset needed to be supported)
Edited by: RGRAM on Feb 11, 2011 12:44 PM
Please post the exact content of the record from the input file that contains the word "Göteborg" as seen in a hex editor. If there is any sensitive information there, you can first mask it out by replacing certain ASCII letters with 'X' (but do not change the non-ASCII letters, such as "ö").
000000000NX X X XXX SCOTLAND 0000000000 0001200448705325325 19940000000000523180000000000020000488G00313559901376 AB Göteborg_ XXXXXX 717640967881X X X XXX 785353952559XX XXXXX Göteborg_ XXXXXX 717320101204
Edited by: RGRAM on Feb 11, 2011 12:47 PM
Edited by: RGRAM on Feb 11, 2011 12:51 PM