Try setting the "Automatic CSV Encoding" to "Yes".
To do this, navigate to "Shared Components", then click "Edit Definition" on the right-hand side of the page; the "Automatic CSV Encoding" item is under the "Globalization" tab.
The "Automatic CSV Encoding" setting was already set to "Yes"; I didn't change anything, and still the characters are not exported correctly!
Application Express : 4.0.2.00.07
Database : 10.2.0.4.0
NLS_CHARACTERSET : AL32UTF8
Using: OHS 10g
Has anyone tried this combination and seen whether non-Latin characters are exported correctly?
As I see it, this looks like a bug, and the issue is very serious!
What is also very interesting is that the HTML export has no such problem: there the characters are exported correctly, as expected!
1. I tried to reproduce the issue on apex.oracle.com, which has the same character set, by putting some non-Latin characters into field data, but I was not able to; there the characters are exported correctly!
2. The settings in nls_session_parameters and nls_database_parameters are the same between my database and the apex.oracle.com database.
3. I checked the download-as-CSV functionality against a non-UTF-8 database character set and hit the same problem. I also checked both the EPG 11g and OHS 10g configurations: still the same problem.
Could somebody please also test, on a database other than the one behind apex.oracle.com, whether non-Latin characters in VARCHAR2 fields are exported (downloaded) to CSV correctly?
It would also be very helpful if somebody from the APEX team could explain how they managed to avoid this issue on the apex.oracle.com database...
Edited by: Dionyssis on Jan 17, 2011 4:55 PM
Well, I think I have located the problem. The Download as CSV process creates a text file with ANSI encoding. When my database character set is UTF-8 and the data contains characters that belong neither to Latin nor to the Application Primary Language character set, they cannot be exported to CSV correctly.
I tested this in apex.oracle.com and the issue is reproducible there.
How can I solve this? It looks like a bug in APEX to me!
Any suggestions ?
This is a demonstration page in apex.oracle.com for the issue (Page 8):
Edited by: Dionyssis on Jan 21, 2011 1:43 PM
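The diagnosis above hinges on the fact that a single-byte "ANSI" code page cannot hold text drawn from several alphabets at once, while UTF-8 can. A minimal Python sketch (with illustrative sample data, not data from the thread) makes the point:

```python
# Illustrative: mixed Latin/Greek/Cyrillic text survives UTF-8 but
# cannot be represented in a single-byte "ANSI" code page.
row = "English, Ελληνικά, Български"

# UTF-8 round-trips the whole string without loss.
assert row.encode("utf-8").decode("utf-8") == row

# windows-1252 (the default "ANSI" page for English sessions) has no
# slots for the Greek or Cyrillic characters, so encoding fails.
try:
    row.encode("cp1252")
    print("encoded OK")
except UnicodeEncodeError as exc:
    print("cp1252 cannot encode:", exc.reason)
```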
Fortunately, this is not a bug in APEX.
Automatic CSV encoding in Application Express was ultimately implemented to work with MS Excel's localization behavior. If you're downloading Japanese characters in CSV to Excel, then the localized version of Excel expects the characters to be encoded in the local Windows character set (Shift JIS, I think). If you're downloading German characters in CSV to Excel, then the localized version of Excel expects windows-1252 encoding.
In Application Express, Automatic CSV Encoding is used in conjunction with the user's session language. In your case, looking at the Globalization Attributes of your application 20695, I see that it's always English (en), which in APEX defaults to an encoding of windows-1252. I changed the application primary language to Greek.
Now when I download the CSV, it still shows corrupted characters because my American Excel is expecting windows-1252, but I confirmed that the downloaded file is properly encoded in windows-1253, which should work with a Greek localized Excel.
Let me know if this works for you.
Thanks for taking the time to answer my issue. Unfortunately, this does not work in my case, because the source of the data (the database character set) is UTF-8. The data in the database, which is shown in the IR on screen, is UTF-8 and is displayed correctly; you can see this in my example. The actual data in the database comes from multiple languages (English, Greek, German, Bulgarian, etc.), which is why I selected the UTF-8 character set when creating the database; the requirement covered all character data. Unicode is also the character set Oracle recommends when you create a database that has to support data from multiple languages.
The requirement is that whatever I see in the IR (i.e. on the display) must be exported to the CSV file correctly, and this is what I expect the Download as CSV feature to achieve. I understand that you had Excel in mind when implementing this feature, but a CSV is just an easy way to export the data as comma-separated values, not necessarily to open directly in Excel. I also want to add that Excel can import UTF-8-encoded data from a CSV, which is fine for my customer. Moreover, Excel 2008 and later understands a UTF-8 CSV file if you place the UTF-8 BOM character at the start of the file (well, it drops you into the wizard, but that is almost the same as importing).
Since the feature you describe, if I understood correctly, always creates an ANSI-encoded file, even when the database character set is UTF-8, it is impossible to export correctly when my data is neither Latin nor among the 128 country-specific characters chosen via the Globalization attributes, and that data is exactly what I see on the display and need to export to CSV. I believe that when the database character set is UTF-8 this feature should create a UTF-8-encoded CSV file and export correctly what I see on the screen, and I suspect others would also expect this behaviour. Or at least you could allow/implement(?) this behaviour when Automatic CSV Encoding is set to No. But I strongly believe, especially through the eyes of a user, that to have different things on screen and in the resulting CSV file is a bug, not a feature.
I would like to have comments on this from other people here too.
I modified your application, changing the Application Primary Language back to English, and set Automatic CSV Encoding to No.
Now when I run page 8 in your application and I download to CSV, it is properly encoded in UTF-8. Of course, when my version of Excel opens it up directly, the characters appear corrupted again because my version of Excel expects windows-1252 encoding. However, if I import the data in Excel (Data -> From Text), and I choose File Origin of 65001: Unicode (UTF-8), all of the data appears correct.
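The round trip described here can be sketched in a few lines of Python (the rows are illustrative sample data, not the application's): with the download produced as UTF-8 bytes, importing it back as UTF-8 recovers every character.

```python
# Round-trip sketch: a UTF-8-encoded CSV preserves multilingual rows exactly.
import csv
import io

rows = [["id", "name"], ["1", "Ελληνικά"], ["2", "Български"], ["3", "Köln"]]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
csv_bytes = buf.getvalue().encode("utf-8")   # bytes the download would contain

# Importing with File Origin 65001 (UTF-8) corresponds to decoding as UTF-8.
back = list(csv.reader(io.StringIO(csv_bytes.decode("utf-8"))))
assert back == rows
```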
>> to have different things on screen and in the resulting CSV file is a bug, not a feature.
I have now adjusted the settings of your application so that what is shown on screen matches what ends up in the CSV, and thus, this is not a bug in APEX.
Thank you a lot for your response. In my case too, I now managed to export correctly by setting Automatic CSV Encoding to No; it just doesn't open directly in Excel with the right characters, but that is acceptable. By the way, it does not matter in my case, but if you try to export to PDF in the same example I provided, it is not exported correctly either.
Thanks again A LOT.
Is there any way in APEX to incorporate the BOM (Byte Order Mark) into the Excel download? Or must I write my own custom Excel download to include the BOM?
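As far as I know, the APEX versions discussed here have no declarative switch for this, so a custom download process would need to emit the three UTF-8 BOM bytes (EF BB BF) ahead of the CSV body. A language-neutral Python sketch (illustrative content) of the idea:

```python
# Sketch: prepend the UTF-8 BOM so Excel auto-detects the encoding
# when it opens the downloaded CSV directly.
BOM = b"\xef\xbb\xbf"                      # the bytes EF BB BF
csv_text = "id,name\n1,Ελληνικά\n"
bom_payload = BOM + csv_text.encode("utf-8")

assert bom_payload.startswith(BOM)
# Python's utf-8-sig codec strips the BOM transparently on the way back in.
assert bom_payload.decode("utf-8-sig") == csv_text
```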