[wellylug] utility to determine text file encoding?
Peter Lambrechtsen
plambrechtsen at gmail.com
Tue Mar 3 08:38:53 NZDT 2009
Or were you more interested in finding out whether a file is XML/HTML/PS/
something else, because it's missing a file extension and you want to figure
out its contents? That's a different problem from detecting the code page of
the text in the document, which is what that sf app does.
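
For the code page side of things, here is a minimal sketch of the kind of
best guess discussed further down the thread. It is not how cpdetector or
ICU work: it only checks for a byte order mark and then tries strict
decodes, and the function name guess_encoding and the latin-1 fallback are
assumptions made up for illustration. As Daniel says below, the result is a
guess, not knowledge.

```python
import codecs

# Byte order marks, longest first so a UTF-32 LE BOM is not
# mistaken for the UTF-16 LE BOM it happens to start with.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes) -> str:
    """Return a best-guess codec name for data (hypothetical helper)."""
    # 1. A BOM is the closest thing to embedded metadata in plain text.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Try strict decodes, narrowest encoding first.
    for name in ("ascii", "utf-8"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    # 3. Assumed fallback: latin-1 accepts every byte sequence, so this
    #    only means "some 8-bit encoding", not a positive identification.
    return "latin-1"
```

Something like guess_encoding(open(path, "rb").read()) would give a starting
point, but any single-byte code page will fall through to the last step, and
telling those apart is where a statistical model comes in.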
On 2/03/2009, at 10:31 PM, Joe Mahoney <joe at cheerschopper.com> wrote:
> I'll give it a crack. Thanks!
>
> Joe
>
> On Mon, Mar 2, 2009 at 6:50 PM, Peter Lambrechtsen
> <plambrechtsen at gmail.com> wrote:
>> Googling around, would cpdetector on SourceForge do what you want?
>>
>>
>>
>> On 2/03/2009, at 6:26 PM, Daniel Pittman <daniel at rimspace.net> wrote:
>>
>>> Joe Mahoney <joe at cheerschopper.com> writes:
>>>> On Mon, Mar 2, 2009 at 5:33 PM, Daniel Pittman
>>>> <daniel at rimspace.net> wrote:
>>>>
>>>>> No, because this is an impossible task. Unless the file contains
>>>>> embedded or external metadata you can /guess/, but not actually
>>>>> know.
>>>>>
>>>> Yeah, I knew it was black magic, I just wondered if anyone had had a
>>>> crack at a best guess app.
>>>
>>> Well, the ICU classes used to implement coding conversion include a
>>> statistical and algorithmic model. I don't know of anything that has
>>> implemented that in a command-line wrapper:
>>>
>>> http://icu-project.org/docs/papers/Automatic_Charset_Recognition_IUC29.ppt
>>>
>>> Regards,
>>> Daniel
>>>
>>>
>>> --
>>> Wellington Linux Users Group Mailing List: wellylug at lists.wellylug.org.nz
>>> To Leave: http://lists.wellylug.org.nz/mailman/listinfo/wellylug