I'm not aware of any FAQ for this subject. What kind of information are you seeking (or issues you're encountering)?
A specific question was posted: WE8MSWIN1252 database, APEX Listener, and UTF-8 support.
Specifically, we want to implement an application that can handle people's names from all over the world, so UTF-8 will be needed.
I considered using the NVARCHAR2 data type, but I found posts saying that, with APEX, this is not sufficient to implement a UTF-8 application. We have a non-UTF-8 database (WE8MSWIN1252).
There seem to be further considerations, as data is passed through APEX Listener (in our case), EPG, or mod_plsql.
So, can you somehow summarize the requirements for implementing an APEX UTF-8 application?
Spring greetings, Tom
The short answer is "no". There isn't an easy (or even feasible) way to accomplish this without converting your database character set to AL32UTF8.
As you have already identified, there are utf-8 characters which cannot be stored in a database of character set WE8MSWIN1252. While a data type of NVARCHAR2 could be considered to store a utf-8 character in a WE8MSWIN1252 database, you have no practical way of moving those bytes from the browser to the Web server to the database to the NVARCHAR2 data type, at least certainly not in the context of an APEX application.
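To get a feel for which names are affected, here is a minimal sketch in Python (not Oracle code). It assumes Python's `cp1252` codec is a reasonable stand-in for the WE8MSWIN1252 repertoire; the helper name `fits_in_cp1252` and the sample names are made up for illustration.

```python
def fits_in_cp1252(text: str) -> bool:
    """Return True if every character of text exists in Windows-1252.

    Assumption: Python's cp1252 codec approximates Oracle's WE8MSWIN1252.
    """
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample names: Western European ones fit, others do not.
names = ["Grüße", "François", "Łukasz", "Владимир", "李小龍"]
for name in names:
    print(name, "fits" if fits_in_cp1252(name) else "does NOT fit")
```

Western European names pass, while Polish, Cyrillic, and Chinese names do not, which is why a worldwide name field needs a Unicode database character set.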
The character set of the "client" to the database (in this case, either OHS/mod_plsql or APEX Listener) must be AL32UTF8. The page encoding of all pages served by APEX must be utf-8. This is for a variety of reasons, but let's just treat this as a requirement for now.
You now have an AL32UTF8 client talking to a WE8MSWIN1252 database. The character conversion will happen between these two components. So by the time your utf-8 character makes it to the database, the character is already changed. Also, the entry points into the APEX engine are all primarily VARCHAR2. PL/SQL variables, by definition, are in the character set of the database. So even if you conjured a way to avoid the character conversion between the client and the database, by the time it "touches" the APEX engine, it would be to a VARCHAR2 local variable, not NVARCHAR2, and once again, be converted and ultimately corrupted.
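The lossy conversion described above can be imitated outside the database. The following is only a rough simulation in Python, assuming the `cp1252` codec stands in for WE8MSWIN1252; the function name `simulate_round_trip` is hypothetical, and Python substitutes `?` for unmappable characters, whereas the replacement character Oracle uses may differ.

```python
def simulate_round_trip(name: str) -> str:
    """Imitate browser -> AL32UTF8 client -> WE8MSWIN1252 database -> back.

    Assumption: cp1252 approximates WE8MSWIN1252, and errors="replace"
    approximates the substitution done during character set conversion.
    """
    utf8_bytes = name.encode("utf-8")         # bytes sent by the browser
    client_text = utf8_bytes.decode("utf-8")  # the client reads them as UTF-8
    # Conversion to the database character set: unmappable characters are lost.
    stored = client_text.encode("cp1252", errors="replace")
    return stored.decode("cp1252")            # what a later query would return


print(simulate_round_trip("Müller"))    # survives unchanged
print(simulate_round_trip("Łukasz"))    # first letter is replaced
print(simulate_round_trip("Владимир"))  # every letter is replaced
```

The point of the sketch is that the damage happens at the conversion step, before any column data type comes into play, so the original character cannot be recovered afterwards.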
I realize you're probably faced with a legacy database, and/or a third-party system which may require WE8MSWIN1252, or numerous other systems which may not be able to endure a conversion to AL32UTF8, but there really isn't a practical way to support utf-8 data through an APEX application, if the underlying database character set cannot support that data.
Many years ago, the performance of an AL32UTF8 database was dramatically worse than a US7ASCII or WE8MSWIN1252 or WE8ISO8859P1 database. But those problems have long been rectified. And, as a colleague of mine said before, "the language of the Internet is utf-8". Basically, if you're creating a new Oracle database today, there is no compelling reason to not create it in character set AL32UTF8.
Sorry I don't have better news for you.
Kind regards,
Thanks for that enlightening explanation. We will have to discuss internally which way to go.