[wellylug] utility to determine text file encoding?

Peter Lambrechtsen plambrechtsen at gmail.com
Mon Mar 2 18:50:08 NZDT 2009


After some Googling: would cpdetector on SourceForge do what you want?
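
For what it's worth, the simplest kind of "best guess" can be sketched in a
few lines of Python. This is not how cpdetector or the ICU detector work
(those use statistical models); it only looks for a byte order mark and then
tries a few strict decodes in order. The name guess_encoding, the candidate
list, and the latin-1 fallback are my own assumptions for illustration.

```python
import codecs

# BOM signatures. UTF-32 is checked before UTF-16 because the UTF-32 LE
# mark begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, candidates=("ascii", "utf-8")):
    """Return a best-guess codec name for the given bytes.

    This is a guess, not knowledge: without metadata the real
    encoding cannot be determined with certainty.
    """
    # 1. A byte order mark is the closest thing to embedded metadata.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name

    # 2. Try strict decoders, most restrictive first.
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue

    # 3. Hypothetical fallback: latin-1 accepts every byte sequence, so
    #    this answer means "nothing stricter matched", not "it is latin-1".
    return "latin-1"


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, guess_encoding(handle.read()))
```

Saved as a script it takes file names as arguments and prints one guess per
file. It cannot tell the single-byte legacy encodings apart, which is exactly
where the statistical approaches come in.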



On 2/03/2009, at 6:26 PM, Daniel Pittman <daniel at rimspace.net> wrote:

> Joe Mahoney <joe at cheerschopper.com> writes:
>> On Mon, Mar 2, 2009 at 5:33 PM, Daniel Pittman <daniel at rimspace.net> wrote:
>>
>>> No, because this is an impossible task.  Unless the file contains
>>> embedded or external metadata you can /guess/, but not actually know.
>>>
>> Yeah, I knew it was black magic, I just wondered if anyone had had a
>> crack at a best guess app.
>
> Well, the ICU classes used to implement coding conversion include a
> statistical and algorithmic model.  I don't know of anything that has
> implemented that in a command-line wrapper:
>
> http://icu-project.org/docs/papers/Automatic_Charset_Recognition_IUC29.ppt
>
> Regards,
>        Daniel
>
>
> -- 
> Wellington Linux Users Group Mailing List: wellylug at lists.wellylug.org.nz
> To Leave:  http://lists.wellylug.org.nz/mailman/listinfo/wellylug


