Tutorial: Character encoding for Thai characters


I have a requirement to read an RTF file containing Thai characters and write it to a text file. I tried TIS-620, MS874, and ISO-8859-11, but the Thai characters do not display properly when I open the resulting output file in Notepad or TextPad. It works well in WordPad, though. Please guide me.

Thanks and Regards, Ramya.

Code that solved the problem (posted in comment, adding here to make it readable!):

import java.io.DataInputStream;
import java.io.FileInputStream;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

FileInputStream fin = new FileInputStream(fileName);
DataInputStream din = new DataInputStream(fin);

// Creating a default blank styled document
DefaultStyledDocument styledDoc = new DefaultStyledDocument();

// Creating an RTF editor kit
RTFEditorKit rtfKit = new RTFEditorKit();

// Populating the blank styled document with the RTF contents
rtfKit.read(din, styledDoc, 0);

// Getting the root document
Document doc = styledDoc.getDefaultRootElement().getDocument();

// Printing out the contents of the RTF document as plain text
System.out.println(doc.getText(0, doc.getLength()));


From a little Googling, I don't think Notepad handles all character encodings. Could you try re-encoding the characters as UTF-8 (or some other Unicode format), since Notepad does handle that correctly? You'll want to include the BOM (byte order mark) at the start of the file.
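To illustrate the UTF-8 suggestion, here is a minimal sketch of writing a string to a file as UTF-8 with a leading BOM. The class name Utf8BomWriter, the method writeWithBom, and the file name thai-output.txt are made up for this example; in the asker's case the string would presumably be the text extracted from the RTF document rather than the hard-coded sample used here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original thread.
public class Utf8BomWriter {

    // Writes the text to the given path as UTF-8, preceded by a byte order mark.
    public static void writeWithBom(Path path, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(path);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            // U+FEFF is encoded as the bytes EF BB BF in UTF-8; Java does not add it automatically.
            writer.write('\uFEFF');
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample Thai text written with Unicode escapes so the source file encoding does not matter.
        String thai = "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35";
        // Assumed output file name, for illustration only.
        writeWithBom(Paths.get("thai-output.txt"), thai);
    }
}
```

Because a Java String is already Unicode internally, no explicit conversion from TIS-620 is needed at this point; the only encoding decision is the one made when the bytes are written out.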

I also stumbled across a tool for converting files in Thai into various other encodings.

Finally, is there a requirement that the files can be opened in Notepad? It's not as if Notepad is the last word in text editing.
