Ubuntu: Convert a (txt|srt) document from Western(ISO-8859-15) to UTF-8


I am having problems with subtitles in my language. They are encoded as Western (ISO-8859-15), so some characters are not displayed correctly. I am tired of fixing them manually with find-and-replace (Ctrl+H) in gedit and then saving as UTF-8. How can I automate this process?


You can use iconv:

If the file is named chapter1.srt, then run:

iconv -f ISO-8859-15 -t UTF-8 chapter1.srt > outputfile.srt  

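This writes the converted text to a new file and leaves the original untouched. Do not redirect the output to the input file itself: the shell truncates that file before iconv gets to read it. If you want to keep the original names, write the converted files to another directory and then move them back over the originals.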
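To automate this for many files, a loop along the following lines may work. It is only a sketch: the function name convert_to_utf8 and the .utf8.tmp suffix are made up for illustration, and it assumes every file passed in really is ISO-8859-15.

```shell
# Hypothetical helper (not part of the original answer): convert each given
# file from ISO-8859-15 to UTF-8, replacing it only if iconv succeeds.
convert_to_utf8() {
    for f in "$@"; do
        tmp="$f.utf8.tmp"    # assumed temporary name, next to the original
        if iconv -f ISO-8859-15 -t UTF-8 "$f" > "$tmp"; then
            mv -- "$tmp" "$f"
        else
            rm -f -- "$tmp"
            echo "conversion failed: $f" >&2
        fi
    done
}
```

It could then be called as convert_to_utf8 *.srt from the directory holding the subtitles. Running it twice on the same file would re-encode text that is already UTF-8 and garble it, so keep a backup copy of the originals.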


Another option is konwert (install the konwert package):

konwert isolatin1-utf8 inputfile.srt > outputfile.srt  

Besides conversion, konwert can also be used as an encoding detector:

konwert any/en-test inputfile.srt  

This is a good way to find the input file encoding needed for the conversion, since both konwert and iconv require it as an argument. You must, however, provide a language parameter: the en in any/en-test means English.

It also has an option for in-place conversion, so you don't have to move and rename files afterwards:

konwert isolatin1-utf8 -O inputfile.srt  
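Because an in-place conversion overwrites the original, it may be worth checking first that a file is not already UTF-8. One possible check, sketched here with a made-up function name is_utf8, is to ask iconv to convert the file from UTF-8 to UTF-8 and see whether it reports an invalid byte sequence.

```shell
# Hypothetical helper (not part of the original answer): succeeds if the
# file decodes cleanly as UTF-8, fails if iconv hits an invalid sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

For example, is_utf8 inputfile.srt || konwert isolatin1-utf8 -O inputfile.srt would convert only files that fail the check. Note that a plain ASCII file also passes, since ASCII is a subset of UTF-8.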


Also, since you're dealing with .srt files, you should really check out pysrt. It has many features for manipulating subtitles, such as shifting and rescaling times.


sudo pip install pysrt  

To convert to UTF-8 (it auto-detects the input file encoding using either chardet or charade):

srt -i --encoding 'utf-8' shift 0s mysubtitle.srt  
