Chinese Character Detection

User_19BPU Member Posts: 1,086 Blue Ribbon
edited August 2017 in New To Java

Hi,

I have a user registration text box in which users may enter locale-specific characters such as Chinese, Japanese, or Korean. How can I detect whether a character entered in the text box is Chinese, Japanese, or Korean? Is there an API in Java for this? I am using JDK 1.6. Please let me know the best way to detect these characters.

Thanks

Answers

  • handat
    handat Member Posts: 4,688 Gold Crown
    edited August 2017

    You can't really do that unless the character is one that exists in only one of the languages. The three languages share the CJK Unified Ideographs blocks in Unicode, since they have many characters in common. Java itself does not distinguish between them, so it needs to be something custom.

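    For example, the following code snippet checks whether a string contains at least one Han (CJK ideograph) character. Note that Character.UnicodeScript was added in Java 7, so it is not available on JDK 1.6; there, Character.UnicodeBlock.of(int) is the closest equivalent.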

    public static boolean containsHanScript(String s) {
        for (int i = 0; i < s.length(); ) {
            int codepoint = s.codePointAt(i);
            i += Character.charCount(codepoint);
            if (Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN) {
                return true;
            }
        }
        return false;
    }

    For reference, see the following list of Unicode scripts: https://docs.oracle.com/javase/7/docs/api/java/lang/Character.UnicodeScript.html
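To illustrate the point about shared characters, here is a minimal sketch (Java 7 or later) that records which CJK-related scripts appear in a string and makes a rough guess from that. The class name `CjkScriptGuess`, the `Guess` enum, and the decision rule are assumptions made for this example and are not part of any Java API. The rule is only a heuristic: kana suggests Japanese and Hangul suggests Korean, but a string made up solely of Han characters could be Chinese, Japanese kanji, or Korean hanja, so it is reported as ambiguous.

```java
import java.lang.Character.UnicodeScript;

public class CjkScriptGuess {

    /** Hypothetical result labels for this sketch; not part of the JDK. */
    public enum Guess { JAPANESE, KOREAN, HAN_ONLY, NOT_CJK }

    public static Guess guess(String s) {
        boolean han = false;
        boolean kana = false;
        boolean hangul = false;

        // Walk the string by code point so supplementary characters are handled.
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (UnicodeScript.of(cp)) {
                case HAN:
                    han = true;
                    break;
                case HIRAGANA:
                case KATAKANA:
                    kana = true;
                    break;
                case HANGUL:
                    hangul = true;
                    break;
                default:
                    break; // Latin, digits, punctuation, etc. are ignored
            }
        }

        // Heuristic only: kana is checked first, then Hangul.
        if (kana) return Guess.JAPANESE;
        if (hangul) return Guess.KOREAN;
        if (han) return Guess.HAN_ONLY; // ambiguous: shared ideographs
        return Guess.NOT_CJK;
    }
}
```

With this rule, a name written only in kanji is classified the same way as a Chinese name, which is exactly the limitation described in the answer above.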

  • Unknown
    edited August 2017
    I have a user registration textbox in which the user will enter locale specific characters like Chinese, Japanese, Korean, etc. How I can detect the enter character in the textbox is Chinese , Japanese or Korean?

    They can ONLY enter characters supported by the character set you are using for the textbox. You should already know what character set that is.
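As a rough sketch of that idea, the standard `java.nio.charset` API (available on JDK 1.6) can report whether a string is representable in a given character set. The thread does not say which character set the form uses, so the class name `CharsetCheck`, the method `fitsCharset`, and the charset names passed to it are assumptions for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class CharsetCheck {

    /**
     * Returns true if every character in text can be encoded in the named
     * charset. Charset.forName throws if the name is not supported.
     */
    public static boolean fitsCharset(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(text);
    }
}
```

For example, `fitsCharset("abc", "US-ASCII")` is true, while a string of Chinese characters checked against `"ISO-8859-1"` is false. If the form is submitted as UTF-8, every valid string passes, so this check only narrows things down when a legacy, language-specific character set is in use.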

This discussion has been closed.