[wellylug] utility to determine text file encoding?
Stephen Judd
stephen at vital.org.nz
Tue Mar 3 07:24:00 NZDT 2009
On Mon, 2009-03-02 at 17:21 +1300, Joe Mahoney wrote:
> Hi All
>
> Is there a nice little command line app that, given a text file, will
> tell me the encoding/charset of the file?
There is a Python module called "chardet" that guesses the encoding of a
byte string. It gets it right a lot of the time, but not always:
http://chardet.feedparser.org/
stephen at lung:~$
>>> import urllib
>>> urlread = lambda url: urllib.urlopen(url).read()
>>> import chardet
>>> chardet.detect(urlread("http://google.cn/"))
{'encoding': 'GB2312', 'confidence': 0.99}
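
For a local file rather than a URL, the same call can be wrapped in a small command-line script. What follows is only a sketch for Python 3, not a polished tool. The names guess_encoding, guess_file_encoding and FALLBACK_ENCODINGS are made up for illustration. The trial-decoding fallback is an added assumption for machines where chardet is not installed; it is not part of chardet and is much cruder.

```python
import sys

try:
    import chardet  # third-party module, not in the standard library
except ImportError:
    chardet = None

# Assumption: a short, arbitrary candidate list used only by the fallback.
FALLBACK_ENCODINGS = ("ascii", "utf-8")


def guess_encoding(data):
    """Return a dict with at least 'encoding' and 'confidence' for bytes."""
    if chardet is not None:
        # chardet.detect takes raw bytes and returns its best guess.
        return chardet.detect(data)
    # Crude fallback (not chardet): first candidate that decodes cleanly.
    for name in FALLBACK_ENCODINGS:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return {"encoding": name, "confidence": None}
    return {"encoding": None, "confidence": None}


def guess_file_encoding(path):
    """Read the file as raw bytes and guess its encoding."""
    with open(path, "rb") as f:
        return guess_encoding(f.read())


if __name__ == "__main__":
    # Print one line per file named on the command line.
    for path in sys.argv[1:]:
        result = guess_file_encoding(path)
        print("{}: {} (confidence {})".format(
            path, result["encoding"], result["confidence"]))
```

Saved under some name of your choosing and run with one or more file names as arguments, it prints one guess per file. As with the session above, treat the answer as a guess and check the confidence value, especially for short files.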