I am running a script that reads data from a local text file, parses it, and then does an INSERT with the parsed values.
The inserts run ok.
But when I retrieve the values from the database, I see strange characters instead of our Norwegian characters æ, ø, å.
- The NLS_CHARACTERSET in the database is AL32UTF8.
- The database is Oracle XE 10.
- The OS is CentOS 5.3
- The file is saved as UTF in the Gedit Gnome editor.
- I open the file with open('filename.txt')
- Read all lines into a list with readlines()
- I parse all lines using [from:to] string slicing notation.
- And add to the INSERT using "INSERT into mytable values (%d, %s)" % (int(pn), ps).
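
The steps above can be sketched roughly as follows. This is only an illustration: the column offsets, the helper names and the quoting of the string value are assumptions, not taken from the actual script. It uses io.open so that the file is decoded as UTF-8 explicitly.

```python
# -*- coding: utf-8 -*-
import io

# Hypothetical fixed-width layout: columns 0-4 hold the number,
# columns 5-24 hold the text. The real offsets are not given above.
PN_SLICE = slice(0, 5)
PS_SLICE = slice(5, 25)


def parse_line(line):
    """Cut one line into (pn, ps) using [from:to] slicing."""
    pn = int(line[PN_SLICE])
    ps = line[PS_SLICE].strip()
    return pn, ps


def build_insert(pn, ps):
    # Same % formatting as described above; the quotes around %s are
    # an assumption so that the printed statement is valid SQL.
    return u"INSERT into mytable values (%d, '%s')" % (pn, ps)


def load_statements(path):
    # Decode the file as UTF-8 explicitly instead of relying on defaults.
    with io.open(path, encoding="utf-8") as handle:
        return [build_insert(*parse_line(line)) for line in handle.readlines()]
```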
I also print out the INSERT string just to make sure everything looks ok, and it does, but once inside the database the characters æøå are weird.
Oracle allows the database client (e.g. Python) character set to be different to the database character set. Oracle libraries will attempt to map data between the two character sets. If the mapping isn't possible, then characters will often appear as question marks.
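
To see what such a mismatch looks like, here is a small sketch using only Python's own codecs. It is not Oracle's conversion code, just an illustration of the two typical symptoms: bytes interpreted in the wrong character set, and characters that have no mapping in the target character set.

```python
# -*- coding: utf-8 -*-
text = u"\u00e6\u00f8\u00e5"  # the characters æ, ø, å

# What a UTF-8 file actually contains: two bytes per character.
utf8_bytes = text.encode("utf-8")

# Symptom 1: the UTF-8 bytes are treated as a single-byte charset
# (Latin-1 here), so each character shows up as two odd characters.
misread = utf8_bytes.decode("latin-1")

# Symptom 2: the target charset (ASCII here) cannot represent the
# characters, so each one is replaced by a question mark.
unmappable = text.encode("ascii", "replace")

print(repr(misread))
print(repr(unmappable))
```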
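Set the character set used for the client with the environment variable NLS_LANG, for example NLS_LANG=.AL32UTF8 for a UTF-8 encoded input file. The variable needs to be set before the Oracle client libraries are loaded.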
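
If you would rather set it from inside the script than in the shell, something like the following should work. This is a minimal sketch: set_client_charset is a hypothetical helper, and cx_Oracle is only assumed to be the driver in use, since the question does not say which one it is.

```python
import os


def set_client_charset(charset="AL32UTF8"):
    """Hypothetical helper: tell the Oracle client which charset we send."""
    # NLS_LANG has the form LANGUAGE_TERRITORY.CHARSET; the leading dot
    # sets only the character set and leaves language/territory alone.
    os.environ["NLS_LANG"] = "." + charset
    return os.environ["NLS_LANG"]


# Call this before the Oracle driver is imported, because the client
# libraries read NLS_LANG when they are initialised.
set_client_charset()

# import cx_Oracle  # assumed driver; import only after NLS_LANG is set
```

The shell equivalent is to export NLS_LANG=.AL32UTF8 before starting Python.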