2 Replies Latest reply: Apr 19, 2012 8:33 AM by pl_sequel

    Language detection

      Hi all,

      Running on:

      Oracle Database 11g Enterprise Edition Release - 64bit Production
      With the Partitioning, Oracle Label Security, OLAP, Data Mining,
      Oracle Database Vault and Real Application Testing options

      I'm aware of AUTO_LEXER in Oracle Text, but I'm wondering whether there is a way to determine the language of a given text input and extract its language code, so that we can tag the content in our table on insert/update.

      We have a public web form where users can enter content in their language of choice (French or English). There are multiple input textareas, and some of them may contain a mix of both languages. (Some content is provided by third-party sources; users simply copy and paste it into the form.)

      Our clients need to determine which submissions contain bilingual content and which are unilingual. We could prompt users to specify the language for each input, but I'm wondering whether there is a more "automated" way of doing this, without cluttering our input forms with language drop-downs?
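
      To make the requirement concrete, here is a rough sketch (in Python, purely for illustration) of the kind of tagging we're after. It is not an Oracle feature: the stopword lists, the `min_hits` threshold, and the function names `detect_languages` and `classify` are all made up for this example, and a naive stopword count like this would likely misfire on short inputs.

```python
import re

# Tiny hand-picked stopword lists: illustrative assumptions, far from exhaustive.
EN_WORDS = {"the", "and", "is", "of", "to", "in", "that", "with", "for", "this", "are", "was"}
FR_WORDS = {"le", "la", "les", "et", "est", "des", "une", "un", "dans", "pour", "que", "avec", "du"}


def detect_languages(text, min_hits=2):
    """Return the set of language codes whose stopwords occur at least min_hits times."""
    # Lowercase the text and split it into alphabetic tokens (including French accents).
    words = re.findall(r"[a-zàâçéèêëîïôûùüÿœ]+", text.lower())
    en_hits = sum(1 for w in words if w in EN_WORDS)
    fr_hits = sum(1 for w in words if w in FR_WORDS)
    found = set()
    if en_hits >= min_hits:
        found.add("en")
    if fr_hits >= min_hits:
        found.add("fr")
    return found


def classify(text):
    """Tag one textarea value as 'en', 'fr', 'bilingual', or 'unknown'."""
    langs = detect_languages(text)
    if len(langs) == 2:
        return "bilingual"
    if len(langs) == 1:
        return next(iter(langs))
    return "unknown"
```

      Each textarea value would go through something like `classify` on insert/update, and the result would be stored in a language column next to the content. What I'm hoping is that Oracle already offers something more robust than this.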