Archive: Unicode string in binary file


Hello,
A while ago I built an installer that appended a string of data at the end of the uninstaller file. Something like this:

CUSTDATA:C:\Gallery;1;1033;TIG;0;07022013010946;TN;0;0

It worked fine, but lately I've been thinking about the string. Shouldn't it be converted to Unicode in a Unicode installer/uninstaller?

It would make sense, in my view, to convert the string to UTF-8. The problem is that the uninstaller file seems to be ANSI.

I wonder if there's a way around this. My main worry is that the path data might contain non-Latin-A characters.

Any suggestions would be welcome.

So far I've only tried manually editing text in the uninstaller. Maybe the file can be saved as Unicode. Is that possible or advisable, or is it a dead end?

String conversion could also be a dead end when it comes to reading the data back for parsing. I'm unsure how FileRead handles UTF-8 in a binary file.
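
To make the byte-level idea concrete, here is a rough sketch in Python (not NSIS script, and it says nothing about how FileRead itself behaves). It assumes the data is appended as raw UTF-8 bytes after the ASCII marker CUSTDATA: and read back by searching for the last marker. The helper names and the stand-in "uninstaller" bytes are made up for illustration.

```python
import os
import tempfile

MARKER = b"CUSTDATA:"  # ASCII marker taken from the example string above


def append_custom_data(path, text):
    """Append the marker plus the UTF-8 encoded text to the end of a binary file."""
    with open(path, "ab") as f:
        f.write(MARKER + text.encode("utf-8"))


def read_custom_data(path):
    """Find the last marker in the file and decode everything after it as UTF-8."""
    with open(path, "rb") as f:
        blob = f.read()
    pos = blob.rfind(MARKER)
    if pos < 0:
        return None  # no custom data was appended
    return blob[pos + len(MARKER):].decode("utf-8")


def demo():
    # Stand-in for the uninstaller: arbitrary binary bytes, not a real executable.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"MZ\x00\x01\xff\xfe binary payload")
        append_custom_data(path, "C:\\Gal\u00e9rie\\\u5199\u771f;1;1033;TIG;0")
        return read_custom_data(path)
    finally:
        os.remove(path)
```

The point is only that the appended bytes are opaque to the binary file: as long as the writer and the reader agree on the encoding, the file itself does not need to be "ANSI" or "Unicode".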

Can anyone help?

Thanks.


The Unicode fork of NSIS has FileWriteUTF16LE and FileReadUTF16LE, which could solve your problem. The official version of NSIS can only hold ANSI strings (not necessarily Latin, but whatever the current codepage is), so conversion might not work out.
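
As a rough illustration of why the codepage matters, here is a small Python sketch (again, not NSIS script). It round-trips a made-up sample path through two ANSI codepages and through UTF-16LE. The sample path and the choice of cp1252 and cp932 are assumptions picked for the example.

```python
def roundtrip(text, encoding):
    """Encode then decode text; characters the encoding cannot hold become '?'."""
    raw = text.encode(encoding, errors="replace")
    return raw.decode(encoding)


# Hypothetical install path containing two Japanese characters.
SAMPLE_PATH = "C:\\\u5199\u771f\\Gallery"

results = {
    enc: roundtrip(SAMPLE_PATH, enc)
    for enc in ("cp1252", "cp932", "utf-16-le")
}

for enc, value in results.items():
    status = "intact" if value == SAMPLE_PATH else "damaged"
    print(f"{enc}: {status}")
```

The path survives the Japanese codepage (cp932) and UTF-16LE, but not the Western one (cp1252), which is the risk with an ANSI string: whether it survives depends on the codepage of the machine.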


FileWriteUTF16LE is useful for writing the custom string to a UTF-16 file.

I'm just wondering whether the custom data string would be converted to the user's system codepage, and whether it would then be safe to write to the uninstaller file.

As I understand it, if the installer is non-Unicode, the codepage converts Latin1 to, for instance, Japanese. With Unicode, in practice, Latin1 characters are unchanged: in Japanese, 'C:\Gallery' is still 'C:\Gallery', because Unicode symbols are in use. So if I install with Unicode and write 'C:\Gallery' to my data string, the file I write to must be a Unicode file, or else the codepage will convert the string.

It kind of raises the question of whether 'C:\Gallery' is a valid path in Japanese. I really don't know. I have never worked on a Japanese version of Windows. I'd like to.
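
One thing that can be checked without a Japanese system is what bytes 'C:\Gallery' turns into under different encodings. The Python sketch below is only a byte-level check, assuming cp1252 stands in for a Western codepage and cp932 for the Japanese one; it says nothing about how NSIS performs the conversion.

```python
ASCII_PATH = "C:\\Gallery"


def encoded_forms(text):
    """Return the raw bytes of text under a few encodings."""
    return {
        enc: text.encode(enc)
        for enc in ("cp1252", "cp932", "utf-8", "utf-16-le")
    }


forms = encoded_forms(ASCII_PATH)

# Pure ASCII text yields identical bytes in both codepages and in UTF-8.
same_single_byte = forms["cp1252"] == forms["cp932"] == forms["utf-8"]

# UTF-16LE is the odd one out: every character takes two bytes.
utf16_length = len(forms["utf-16-le"])
```

So for a path made only of ASCII characters the codepage makes no difference to the stored bytes. The conversion only becomes a concern once the path contains characters outside ASCII.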