Archive: Unicode


Unicode
Jim Park has issued a patch to add Unicode support to NSIS. Currently it's a completely different build that can create nothing but Unicode installers (no Windows 9x support).

Please help me test it for speedier integration.

https://sourceforge.net/tracker/?fun...group_id=22049

You can download a pre-built version from:

http://www.scratchpaper.com/nsis-05-...code-setup.exe


Thanks for posting that kichik.

A few things to note. The Unicode NSIS installer is COMPLETELY Unicode. This means that your NSI script must also be Unicode (UTF-16LE). This can easily be done by using Notepad. Just change the encoding to "Unicode" and that will do it.

Note that any plugins you rely on will not work unless they are also built with Unicode in mind. This means that the global stack used to transmit information back and forth between a plugin and NSIS must contain Unicode (wchar_t) strings.

*** Developers ***
If you are a plugin author, it shouldn't be too difficult to make the changes. You can take a look at what I did for the standard plugins that ship with NSIS. The source is here:

http://www.scratchpaper.com/source.zip

If you look at ExDLL, you will notice new functions that will help you with the translation of the stack as well. For instance, if your DLL only cares about ASCII codesets, then you may not need to make many modifications at all. It just needs to call the new functions that automatically translate the stack's strings from Unicode to ANSI and ANSI to Unicode. But most likely, if you are dealing with user-entered strings, you will need to make your DLL completely Unicode aware as well.

The source uses TCHARs so you can target building both ANSI and Unicode versions of NSIS. It is based on NSIS 2.29 with some bug fixes and minor tweaks. If you are interested, the ANSI version of the binary built with the same source code is also here:

http://www.scratchpaper.com/nsis-05-....cvs-setup.exe

Please look at the tracker submission for more information about building the source yourself. I developed on VS2005 and I would be surprised if it builds on the other platforms. Even if it does, I did not change the SCons config files for the other platforms to define _UNICODE and UNICODE, which is what's needed for the source to be built as Unicode.


Source code link:

http://www.scratchpaper.com/source.zip

ANSI version link:

http://www.scratchpaper.com/nsis-05-....cvs-setup.exe


Will this affect future versions? I mean, I'm not a Unicode fan, but is it going to be...


Well, the niceness of this approach is that those who want the ANSI version for Win95/98/ME compatibility can use the ANSI version, and those who need support for languages that only Unicode can provide can use the Unicode version. It gives you a choice. And we can give you that choice with one code base.


I have just adapted my installer to use this Unicode version for a Russian-language installer, and everything seems fine in Russian except that, for some reason, the welcome page has loads of question marks instead of the correct characters in the introduction text. It's weird because all the other panels are fine. Does anyone know what is going on here?

Cheers,

Hayden Devlin


The introduction text should ALSO be Unicode. So open it up in Notepad and then save as Unicode encoding. That should work. So remember EVERYTHING must be Unicode. If you have an ASCII file, you need to make sure it's now UNICODE. Hope this helps.


Sounds great. One quick question: what happens when one of these builds gets run on 9x? Is it possible to detect in .onInit and show a MessageBox to the user, or would it just crash immediately? If the latter, then I guess the build could be wrapped by a non-Unicode build that does this detection and messaging.


Originally posted by dienjd
Sounds great. One quick question: what happens when one of these builds gets run on 9x? Is it possible to detect in .onInit and show a MessageBox to the user, or would it just crash immediately? If the latter, then I guess the build could be wrapped by a non-Unicode build that does this detection and messaging.
I decided to not be so lazy and just try for myself. I compiled Examples\bigtest.nsi and tried it on Windows 98. It crashed immediately, which is expected. However, there are ways to gracefully handle this, as I mentioned in my last post.

how about Microsoft Layer for Unicode on Win95/98/ME (MSLU)?

http://www.microsoft.com.nsatc.net/g..._announce.mspx

sounds easy to integrate.


yeah, but you have to ship the MSLU dll along with the installer


well, at least nsis may get compiled against mslu, so it still works normally on win2k/xp/vista, but also MAY work on win95/98/me IF the user has mslu installed.


When converting ANSI files to Unicode, I've noticed that the conversion works fine with Notepad, but converting using Textpad or the iconv program on FreeBSD results in the Unicode file not working with Unicode NSIS.

While testing this patched version of NSIS, I want to be able to automatically convert files as part of a build process instead of maintaining separate copies of the same files but just in different encodings. Notepad doesn't seem to have a command line interface other than 'notepad.exe filename.txt'. I guess I could make something that launches Notepad and hooks the window to do a Save As :)

This isn't a big deal, but it would be nice to have. Anyone have ideas on how to do this other than the ugly window hooking method?


There are a few possible reasons. The first is conversion to the wrong encoding. Even Unicode has multiple encodings available. Make sure you convert to UTF-16LE.

Another reason could be a missing BOM. Try prepending the file content with FF FE (the actual bytes, not text).
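If you want to script that fix, the BOM check amounts to a few lines. Here's a minimal sketch in Python; the function name is mine and not part of any NSIS tooling:

```python
# Sketch: prepend the FF FE byte-order mark to a file that is already
# valid UTF-16LE but lacks the BOM, as suggested above.

BOM_UTF16LE = b"\xff\xfe"

def ensure_utf16le_bom(path: str) -> None:
    with open(path, "rb") as f:
        data = f.read()
    # Only touch the file if the BOM is actually missing.
    if not data.startswith(BOM_UTF16LE):
        with open(path, "wb") as f:
            f.write(BOM_UTF16LE + data)
```

Running it twice is harmless, since the BOM is only added when it is missing.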


My thoughts are exactly the same as kichik's. Here's a little utility I wrote called a2u.exe that converts an ANSI codepage text file to Unicode. You provide the codepage to use, so it can be run regardless of your OS codepage setting. This should fit the bill for making the whole process automated. (I didn't convert all the include files, examples, etc. by hand, you know?) It should also be helpful in converting your current NSIS scripts to Unicode.

I included the source as well, so you can modify it to work any way you want.

Look in http://www.scratchpaper.com/a2u.zip or I'm attaching the file here as well.

Enjoy.
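For readers who would rather not depend on a Windows binary, the core of what an a2u-style converter does can be sketched in a few lines of Python. This is not Jim's actual implementation; the file names and the default codepage are illustrative:

```python
# Sketch of an a2u-style conversion: read a text file in a caller-supplied
# ANSI codepage and rewrite it as UTF-16LE with a BOM, the encoding the
# Unicode NSIS build expects.

def ansi_to_utf16le(src_path: str, dst_path: str, codepage: str = "cp1252") -> None:
    with open(src_path, "r", encoding=codepage) as f:
        text = f.read()
    with open(dst_path, "wb") as f:
        f.write(b"\xff\xfe")               # UTF-16LE byte-order mark
        f.write(text.encode("utf-16-le"))  # little-endian, no BOM repeated
```

For example, `ansi_to_utf16le("installer.nsi", "installer-u.nsi", "cp1251")` would convert a Russian-codepage script.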


Thanks for the utility and patch. The Unicode support will hopefully save me from having to use InstallShield!


Jim -

We'll give this thing a good workout soon.


Any more news about this?

Will it be integrated into the main NSIS branch someday? Any extremely rough timeframes if so? (Something like 3 months, 6 months, a year, etc.?)

I suggest making this sticky since it's pretty important news.

Thanks heaps for the patch work Jim. Your efforts are greatly appreciated.

Sidenote question: If you are using the System plugin for API calls, will you have to use the Unicode calls, or can you use the normal ANSI calls too?


Should be integrated, no timeframes and System can already handle both ANSI and Unicode so nothing should change.


The only change made to System is that all the calls default to Unicode strings rather than ANSI strings. So for example, if you specified MessageBox(...), it will actually call MessageBoxW(...). As long as you provide the right function signature, the System plugin will work as you expect. I just had to rework some of the interfacing back and forth through the NSIS global stack, but using it should be the same.

You WANT to use the Unicode functions by default because while it will automatically convert Unicode strings to ANSI codepage and back when you call ANSI functions, you may actually lose vital information or get bad behavior if the user's computer's code page setting is different from what you'd expect. And definitely things will not work if you are trying to provide an installer for a Unicode only language.
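The loss Jim describes is easy to demonstrate. In the Python sketch below, cp1252 stands in for a Western-European system codepage: a Unicode string survives a round trip through its matching codepage but degrades to question marks through a mismatched one:

```python
# Illustration of Unicode -> ANSI -> Unicode data loss under a
# mismatched codepage. The codepage names are examples only.

text = "Привет"  # Russian for "hello"

# Through a matching codepage the round trip is lossless:
assert text.encode("cp1251").decode("cp1251") == text

# Through a mismatched codepage every unmappable character
# is replaced, and the original text is unrecoverable:
lossy = text.encode("cp1252", errors="replace").decode("cp1252")
assert lossy == "??????"
```

This is exactly the junk-character behavior users see when an ANSI installer runs on a machine whose codepage differs from the developer's.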

If you are converting your script from the ANSI NSIS to Unicode, and you are using the System plugin, look for functions that are ANSI only. Most of these end in "A". Change them to use Unicode. Or just specify their "TCHAR" versions -- i.e. MessageBox instead of MessageBoxW or MessageBoxA. Then the correct one will be picked up by your NSIS installer.


Kichik, I was also wondering what it would take for it to be incorporated into the NSIS trunk. Since the source for Unicode NSIS can generate both Unicode and ANSI versions of NSIS, bug fixes in this version of the source will benefit both the Unicode and the ANSI version. I would hate to see the Unicode NSIS code get orphaned. If we wait much longer, the code bases for NSIS and Unicode NSIS will diverge too much and someone's going to have to redo the 1.5 months of full-time work to make it work again.


As I see it, the first step is a completely separate build for Unicode (like the 8192-string and logging builds) which will minimally affect the main build. Later, the compiler can start moving towards being able to create either type of installer. Finally, the stubs could be made common so ANSI will be automatically used on 9x and Unicode on any capable system.

I, too, would hate to see the code get orphaned, requiring more and more work with each passing version. To get the first step done and integrate it as a separate build, I'd want to make it as hassle-free as possible. Currently it seems there are more than a few changes that'd require extra maintenance and will be useful only for the Unicode build. All of the SConscript changes are a good example. There is no need for duplication of examples and such - makensis.exe should be able to read ANSI scripts and differentiate Unicode scripts according to the BOM.

Another minor example is wininit.ini handling. Since that's only supported on 9x, the code could be completely removed for Unicode builds and there's no need for the conversions and Unicode specific code paths.

Lines 716-726 in Source\exehead\util.c could be reduced to a cast to char*.

NSIS Menu doesn't have to be Unicode. But I guess it'd be useful for translated packages of NSIS itself. In that case, it'd be easier to have wxUSE_UNICODE defined externally or according to _UNICODE instead of having two setup.h.

In short, every #ifdef _UNICODE or separate Unicode file is more work down the line. I can't ask you to make all the changes, but that's what I'll do when I integrate it. I will probably do it gradually, but I'll definitely start with automatically generating the duplicate files or getting rid of the need for them.

If you do, however, wish to help with this, I can create a branch for this in the soon-to-live SVN repository where you can check-in the changes. I will also do my best to sync it with trunk for every released version (which hopefully won't be too many).


Just had a look at Benski's Unicode branch and saw he used a nice trick to get around the CHAR4_TO_DWORD optimization - QTCHAR.

http://nsis.cvs.sourceforge.net/nsis...athrev=UNICODE


Actually, I disagree that the examples and NSIS include files should be generated from a single source for both. The number of languages supported by Unicode will continue to grow, while the number of ANSI languages has already stopped growing. The common subset (e.g. Latin I) may be able to share a common source, but even then there are things like System calls that will be subtly different for ANSI and Unicode. Much like the C++ source itself, we may have to add #ifdef UNICODE or some equivalent into the NSIS scripts if we pursue having one source script for both. But again, I imagine the Unicode languages will continue to grow and will never have an ANSI counterpart. And Win9x/ME OSes will continue to dwindle in significance.


But currently there are none and so it generates an unjustified work overhead. Updating an example requires updating two files. That's always a cause for nice little "oops" bugs.


That is true. So for now, I do have that a2u.exe that I've uploaded here. We can use it as part of the build process to take the ANSI scripts and convert them to Unicode ones for those which are easily derivable. For those that do need hand tweaks (there may be a few), we can keep separate copies. For scripts that use the System plugin, for example, we need only use the API name without the "W" or "A" suffix -- i.e. don't use MessageBoxA or MessageBoxW, just MessageBox -- and the correct version will be picked up by the System plugin. So I think it's quite possible to achieve a unified single source for most scripts.


I may have misunderstood your post. If you mean that there are no languages that are Unicode-only, then here are some:
1. Amharic
2. Armenian
3. Assamese (India)
4. Bengali (India)
5. Divehi (Maldives)
6. Georgian
7. Gujarati
8. Hindi
9. Inuktitut (Canada, Syllabics) -- not the Latin-based one
10. Kannada
11. Khmer (Cambodia)
12. Konkani
13. Marathi
14. Nepali
15. Punjabi
16. Sanskrit
17. Sinhala (Sri Lanka)
18. Telugu (India)

The list is growing. Also, many people including myself run apps localized to a different language than the codepage of the OS that they're running. So the desire for Unicode support for an installer is there and the need is getting stronger.


There are none in NSIS. And even if language files are added for those, it still doesn't justify having every example in Unicode as most examples don't demonstrate multiple languages and those that do, usually do this with a small collection of languages.


Originally posted by kichik
There are none in NSIS. And even if language files are added for those, it still doesn't justify having every example in Unicode as most examples don't demonstrate multiple languages and those that do, usually do this with a small collection of languages.
Hi,

I find the idea of a unicode capable NSIS compiler great.
I've built a (standard NSIS) installer for network drivers which currently supports 25 languages, including 2 RTL languages.

If we had to add some extra dialogs or messages, all translation services would deliver Unicode files.
I haven't yet understood or found a detailed explanation of how to safely convert unicode texts into NSIS codepage-based strings.
Being able to directly use unicode strings would significantly ease localization.

In my work I normally use PERL and the automated generation of Windows driver .inf files (UTF-16) is done by PERL scripts.
But NONE of the sources/templates, except for the NLS-dependent parts (strings and readme files), are in UTF-16. As PERL uses UTF-8 internally and ASCII is a proper subset of UTF-8, there is no need to have program sources in UTF-16.
When including a NLS file, which is in UTF-16, this is detected at read time.

I don't know if the Microsoft C++ API only supports UTF-16, but I think it should not be a great problem to have a smart "read file" function which does detection and conversion, even if internally all is handled in UTF-16.

I don't know if the Microsoft C++ API only supports UTF-16, but I think it should not be a great problem to have a smart "read file" function which does detection and conversion, even if internally all is handled in UTF-16.
They support both ANSI and UTF-16. There is even a function for determining if a buffer is Unicode or not. The standard even defines special markers for files that specify if they're ANSI, UTF-8, UTF-16LE or UTF-16BE. They are called BOMs.
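The BOM check kichik describes amounts to inspecting the first few bytes of a file. A rough sketch follows; the function and its return labels are mine, purely illustrative:

```python
# Sketch of BOM-based encoding detection: distinguish UTF-8, UTF-16LE
# and UTF-16BE by their byte-order marks, falling back to ANSI when
# no BOM is present.

def sniff_encoding(data: bytes) -> str:
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return "ansi"  # no BOM: assume the system codepage
```

Note that FF FE is also the prefix of the UTF-32LE BOM (FF FE 00 00), so a complete detector would check the four-byte UTF-32 marks first.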

It shouldn't be too hard to make the compiler support both ANSI and UTF-16. But I'm just saying it won't serve too much of a purpose to have the examples in SVN in both ANSI and UTF-16. There are currently no examples whatsoever that need Unicode. If there will ever be, they can be specific to the Unicode build.

As I mentioned before, for the languages that have codepage support, I simply ran a2u.exe to convert the ANSI script to the Unicode script. I did not hand edit all of those. I don't know 25 languages, after all. :) So you could have one source file for each language and then GENERATE ANSI and Unicode versions of the example files. In essence, you never have "oops" mistakes because there is only one source.


From your reply, Kichik, it seems as though there IS going to be a Unicode build of some form. Is this correct? Does that mean that this Unicode build is going to be integrated into the main release sometime soon?

If so, that will be great. I for one have need for an installer in one of the languages mentioned above (Georgian), and next year I may have need of one of the others. PLUS, the benefit of not having to switch codepages in order to have the install show up with intelligible text is also very nice.

Additionally, how many people have tried this version of the installer? I'd like some idea of how stable it is and how it is supported on different OSes. My experience so far has been very good. I was able to build my install with the version made available here without any changes to my script (other than saving the file in Unicode), and it is a pretty large script (1K lines). I'm now in the process of localizing the Georgian version, and I'll make sure to post back with how successful the results were.

Unicode seems to be the wave of the future, and I'll be really happy to see NSIS make the path available for those of us who want it. After all, the product I'm shipping does not support any of the ANSI versions of Windows, and Microsoft doesn't support them anymore either.


KrisAster, there's no time frame and my bet won't be on soon.


For those using the Unicode version of NSIS, I took the latest 2.33 source and merged the changes onto the Unicode version. The new binaries and source are posted to http://www.scratchpaper.com. Let me know if you find any problems with this release. Also let me know if you find the Unicode version useful.


I modified the Unicode language files in the installer to contain vernacular names for the various languages. So you can do something like this:
http://www.scratchpaper.com/Unicode-NSIS-Propaganda.gif


Thanks for this work. The Unicode build works well.
I'm Chinese, and I'm using a Japanese edition of Windows to study Japanese. Many Chinese-language-only NSIS installers bother me because they can't display correctly on my system.
Really great. I hope this Unicode support feature will be merged into the official NSIS.


Thanks for your patch, it seems to be working great.

I'd love to see this included in the official source.


Unicode NSIS has been updated to support Georgian (a Unicode only language). Updated Unicode Simplified Chinese language files. Also fixed some bugs in
Unicode Turkish language file and Unicode zip2exe. Please give it a whirl and let me know if you see any more problems. As always, the site is http://www.scratchpaper.com


I'm trying to use your Unicode version. It works well until I use some plugins.
I tried to convert the GetVersion plugin by AfroUK.
I dropped in your ExDLL and recompiled with no success: the compiler gives me some errors. I think the problem is the definition of _UNICODE, but I don't know where to put it (I don't know C++ or its compilation process).
If someone tells me how to do it, I think I'll be able to translate the other plugins I need for my installer.

Thank you


Well, it will be a bit difficult not knowing C++ if you want to convert other plugins. But you could try downloading the source from the NSIS branch, then download the source from my site. And you could compare the source files for a smaller plugin like LangDLL and see what the differences are.

BTW, other than the GetVersion plugin, what other plugins do you use?


Yes, C++ isn't the perfect language for making small changes when you don't know it.
Anyway, I'll try to do it (is this the right time to study C++ too :D ?).

These are the plugins that I use in my installer:
- GetVersion (to know the Windows and IE versions and the OS architecture)
- GetSize (to know the size of the data folder)
- HwInfo (to know the processor name)
- nsProcess (to kill a process before uninstalling)

Bye


Yes, C++ isn't the perfect language for making small changes when you don't know it.
No language is.

Originally posted by CodeSquid
No language is.
You're right, but C++ isn't the easiest one.

Originally posted by jimpark
Well, it will be a bit difficult not knowing C++ if you want to convert other plugins. But you could try downloading the source from the NSIS branch, then download the source from my site. And you could compare the source files for a smaller plugin like LangDLL and see what the differences are.

BTW, other than the GetVersion plugin, what other plugins do you use?
I converted GetVersion and it works well, but only under the ANSI version (like the original one). Could the problem be the definition of _UNICODE (I can't find it)?

Thanks

Originally posted by prz
I converted GetVersion and it works well, but only under the ANSI version (like the original one). Could the problem be the definition of _UNICODE (I can't find it)?

Thanks
I added the _UNICODE string to the pre-processor definitions, but the output of the plugin is the same as the ANSI version's.
Is something else needed to make a Unicode version of the plugin?

Bye

Thanks for the great work on Unicode NSIS, and for hosting it. I just used it for an english/french/czech installer (and I should be adding more languages soon). It still needs to be more thoroughly tested but it seems to have worked fine. :)

A note though: I didn't convert the install script itself to UTF-16LE (I did at first, but NIS Edit wouldn't display it correctly). It worked anyway. It was completely ANSI, and ANSI is not a subset of UTF-16, so I'm (happily) surprised.

I consider Unicode as the way to go: displaying multiple or Unicode-only languages is more and more needed, and I'm really glad that NSIS is moving to complete Unicode support. :up:


Thank you for the kind words of encouragement. I've never used NIS Edit, but I take your word for it that it doesn't understand UTF-16. If you didn't convert the script to Unicode, perhaps the ANSI one was run and it created an ANSI installer? One thing you could do is check that the ANSI one works first, then convert it to Unicode using Notepad or a2u.exe and then create the installer again using the Unicode version. If it worked for the ANSI version, it will most likely just work for the converted Unicode version (barring the use of 3rd party plugins that have not been converted to support Unicode).


Actually, the ANSI script was compiled with classic NSIS until now. It works correctly. But I wondered if making a unicode installer was possible, and found out about Unicode NSIS.
I don't use any 3rd party plugin, only language files, which I converted to UTF-16LE. And I updated the link to the compiler in NIS Edit to Unicode NSIS. The only issue was that NIS Edit displays only the BOM of a unicode script.

Classic NSIS wouldn't compile even the ANSI script (using unicode language files), but Unicode NSIS produces a working installer.

However, I just compiled the Unicode script with Unicode NSIS and did a byte-to-byte comparison of the two "unicode" .exe installers: they are indeed not exactly the same.
I couldn't spot any difference by running them though, even with czech texts (usually requiring CP1250 while I use CP1252 myself). I'll have to test with russian or chinese.

Anyway, I'll use the unicode script from now on to be on the safe side.
I described everything here in case it is of some help to you. I'll be happy to make some more tests or give some more details if you wish.


What about making a Unicode version that supports both Windows 95/98/Me as well as Windows 2000/XP/Vista? I think it should be possible to have scripts and internal NSIS strings in Unicode but convert the user interface strings to ANSI for 95/98/Me.

Because there is no simple one-to-one conversion of ANSI NSIS scripts to Unicode (multiple codepages can be used in a single file for e.g. LangStrings), we can just call it NSIS 3.0 and require scripts to be in Unicode.


Actually the problem is during loading of the exe, at dynamic link time. By using Unicode-based Win32 calls, the program fails to even run. I'm not sure how to solve that except by distributing unicows.dll with the installer on Win9x. But seriously, I think we should just drop Win9x support soon. Looking at browser OS stats at http://www.w3schools.com/browsers/browsers_os.asp, only 1% of web users use Win98. And the demographic for users of products that ship with NSIS installers are probably netizens. Every year that passes, the need to support Win98, and therefore both ANSI and Unicode, will dwindle. We will eventually only need Unicode. :D


Merged in 2.34 changes into the Unicode version of NSIS. Go to the usual place: http://www.scratchpaper.com for the binaries and source.

Added Pashtu support. Thank you Zabeeh khan.


I've tried to compile your patch on Linux but it utterly failed.

Can you please update your patch so that it compiles on Linux?
If you don't have Linux, you can download it from http://www.ubuntu.com/


Originally posted by jimpark
Actually the problem is during the loading the exe and dynamic link time. By using Unicode-based Win32 calls, the program fails to even run.
It's possible to call these APIs on run-time depending on the Windows version. A framework for this method is already included in the NSIS source and is used for various functions that are not available in all Windows versions.

CodeSquid, it's not that I don't have Linux. I did this project because of a need for a Unicode installer on Windows for the project I'm working on, and I'm giving the source back to the community as-is. I'm only updating the code so that my code base doesn't go out of sync with the latest changes.

I'm offering it to the public, and especially the NSIS developer community, so they can potentially use it to create an official Unicode version of NSIS. Until there is one, developers who need a Unicode solution for their installer will have something that works right now. I could just update it internally and never let anyone else use it or see it, but that would not be following the spirit of open source.

I don't have the time or resources to make sure the source builds on Linux, Mac and whatever else NSIS usually builds on. So if it's important to someone that the Unicode source builds on Linux or the Mac, I hope that someone steps up to the plate and makes it work for those OSes.


I'd very much like this in the official NSIS code. Unfortunately I don't have the time to work on this either. :(


I think it will require quite some effort to combine a non-Unicode and a Unicode version in the same source. This would also cause a lot of trouble with plug-ins. Having a major new release (NSIS 3.0) with only Unicode support and Unicode script files would probably be a better solution.

If only Windows 2000 and later need to be supported, the current Unicode version is mostly complete (except for some issues like Linux compilation). Windows 95/98/Me cannot be supported forever. Microsoft dropped support for all these versions some time ago, so there are also no more security fixes. The C++ runtime of Visual Studio 2008 doesn't even support them anymore.

So maybe this would be the time to remove legacy stuff and introduce some breaking changes.


I'd like to suggest support for UTF-8 scripts, with and without BOM. I exclusively do UTF-8 without a BOM, as I feel it hurts more than it helps: it only serves as identification (even though in almost all cases, you'd do just fine by assuming UTF-8, and using ANSI as a fallback), and excluding it allows for a neater handling by non-Unicode programs.

If the file is BOM-less and contains invalid UTF-8, I think it should be read as ANSI and internally converted to Unicode for the actual installer. Such behavior would help reduce the effort needed to migrate installers to Unicode, as it should allow most existing scripts to continue compiling with no changes (as long as they don't use any plugins not supplied with NSIS).

I do realize there's a potential problem when a file in a different codepage is included from the script. This could be solved in various ways, such as using a stack to keep track of codepage information. Whenever inclusion starts, the current info is pushed onto the stack. When inclusion stops, pop it back.

Although there is a risk of incorrectly interpreting an ANSI script as UTF-8, it is extremely slim, due to the fact that it requires some very specific byte patterns as the only non-ASCII text in the entire script.
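The fallback behavior proposed here can be sketched as follows (Python; "cp1252" stands in for the compiler host's default ANSI codepage, and the function name is illustrative):

```python
# Sketch of the proposed reader: decode a BOM-less script as UTF-8 when
# it is valid UTF-8, otherwise fall back to an ANSI codepage.

def read_script(data: bytes, ansi_codepage: str = "cp1252") -> str:
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(ansi_codepage)
```

A BOM-less ANSI script such as the bytes `caf\xe9` fails strict UTF-8 decoding at the stray lead byte and falls back to the codepage, while any valid UTF-8 input passes through unchanged.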


The real crux of the problem is the conversion from ANSI to Unicode. What codepage should you use? Hopefully, the script has that information. But even if it does, it may mean you have to parse a significant portion of the scripts to figure out which code page the text is in which means writing a lot of code that is not there currently. Or I guess you could default to using the system codepage but that may not be the intent of the developer. If someone develops a Chinese installer and gives it to someone who uses an English codepage on their machine, the same installer would produce junk characters on its messages.


Originally posted by jimpark
Or I guess you could default to using the system codepage but that may not be the intent of the developer.
I'm not sure you understood me correctly, but my idea is to use the system codepage to convert it at compile-time; allowing us to minimize issues. It would likely be best to issue a warning about ANSI scripts being deprecated when it has to do this, so the developer knows to use UTF-8 or UTF-16 instead of relying on the system codepage.

If someone develops a Chinese installer and gives it to someone who uses an English codepage on their machine, the same installer would produce junk characters on its messages.
As long as they only get the compiled installer, no problem, as it would have been handled by then. If they distribute the actual ANSI script, without realizing this change had happened, then yes, they'd have a problem, but this should not happen very frequently - even less if we issue that warning - and the fix is simple; that particular script just needs to be converted into a Unicode format.

For installers creating a dynamic script, nothing prevents us from using UTF-8 as a default output codepage for file writing to minimize issues; as long as they are able to change it from the script (for when they need a specific codepage). Yes, that one would require a change, but you'd be affecting fewer scripts, thus easing migration.

Yes, I do understand you. I am talking about developers sharing scripts with different codepages set on their machines. The problem you are trying to fix -- converting their ANSI scripts to Unicode -- is trivial. Open it up in notepad.exe and save as Unicode, or use the a2u.exe program I provided. All the proposed changes you are talking about only save the tiny headache of converting your script once to Unicode. The real problem is that people write scripts that are shared like libraries, like the language files. They all still need to be converted to Unicode, whether UTF-8 or UTF-16, before they are shareable with others. There's no way around that. We can do some intelligent guessing as NSIS parses ANSI scripts, but it may not always be correct.


I don't think there is any possibility to convert existing scripts automatically.

There is no way to identify the codepage of a script file; the bytes just have a different character representation depending on the user's codepage. Script files can even contain multiple codepages (for example, if language strings are combined in a single file).

At run-time, texts that are sent to the dialogs can come from any source on the system, completely outside the installer itself. Again, no automatic conversion is possible. Things become even more complicated when external DLLs (including NSIS plug-ins) are called that return ANSI strings.

Taking this all into account, I think the only realistic thing is to release a new major version that will be Unicode-only.


Of course not, and you're right - those scripts would need to be converted. But we're not going to get around such an issue no matter what we do; and I don't see it as a good idea to cause extra hassle for everyone writing non-English single-language installers, compared to causing a little more work for those that would actually benefit greatly from converting it to Unicode - at least not for the first official Unicode release.

Also, are the amount of "non-native-codepage"-dependent scripts really that big? I'm under the impression that many people stick to the bundled stuff, which would already be Unicode, or have already switched to the Unicode build.

EDIT: Good point, Joost - I hadn't considered external calls. Alright, never mind the ANSI then.


IMHO, two articles developers or NSIS script creators should read (I mean really READ...):

http://www.microsoft.com/globaldev/h..._announce.mspx

http://www.joelonsoftware.com/articles/Unicode.html

Note that this message is NOT meant to insult; its purpose is to help get Unicode support into NSIS via a decent, smooth upgrade path. Lots of people have lots of good ideas, but somehow we have to build it... and that will mean some fierce pro/con discussions. But take for example Jim Park's initiative - it is *great*. He got things rolling on Unicode because he had a need and just created his own build!

It's also not that I personally still need Win9x/ME support, but a lot of third-party software uses NSIS, e.g. the Firefox installer on Win32; they are affected if there is a Unicode-only build with no backwards-compatibility layer.

For those who do not know: "kichik" possibly knows best what path to choose, and if he doesn't, he still does ;) - after all, he does a major amount of the NSIS development.

But I also see Joost's points: a Unicode-only major release avoids confusing people over which NSIS to use, and after all, 2.x can still be used for non-Unicode releases.

And as for the plugins, most of them come with source. Should we already ask the plugin creators nicely to make them Unicode compatible, if needed and not done so already?

...so now MY first step will be to make the plugins I've created Unicode compatible, even if not used yet...


Windows 95/98/Me support would require NSIS itself and all plug-ins to provide both an ANSI and a Unicode user interface. If external APIs are used to modify the interface, scripts would also have to check whether ANSI or Unicode is used and do the required conversions.

All scripts need to be updated anyway if NSIS uses Unicode internally, even if the user interface is still ANSI. The fact that an ANSI interface is still available would only improve backwards compatibility a little.

Because this is all very complicated, I'm afraid we don't have the manpower to implement it within a reasonable time.

Almost all new applications that are developed (e.g. Firefox 3) don't support 95/98/Me anymore. If legacy Windows versions still need to be supported for an installer, NSIS 2 can be used.


As an interested outsider here, I can say I would largely prefer having an officially maintained Unicode NSIS (NSIS3), with an unofficial ANSI legacy version (for example if new features would be backported to NSIS2). That is to say, of course, if a choice has to be made, since limited resources always push to make choices.


I think both NSIS 2 (ANSI) and NSIS 3 (Unicode) can remain official versions. However, we won't really have enough resources to backport features.


If we use TCHAR style macros in the C/C++ source, we can use one set of source files to generate both the ANSI and Unicode binaries. I know some people don't like the use of TCHARs but that's the choice I made because of the convenience of having one common set of source files for both binaries. So I know I can make changes to the source that will be reflected in both the ANSI and the Unicode versions. This should help in keeping NSIS ANSI and NSIS Unicode features in sync.

The other problem is the script files -- it's painful keeping two sets of script files as kichik pointed out. But I think most of the behavior enhancing script files are plain ASCII files which can easily be converted to Unicode and back to ANSI.

The language files, however, I think we need to keep separate because the two sets will have to be different. For one, the number of Unicode language files will be bigger than the ANSI. In fact, it already is in my build. And two, the Unicode script files can have the language name itself in its vernacular because the Unicode binaries can show them all at the same time.

Consider the multi-language installer demo picture on my website. It's a very powerful argument for being able to generate Unicode installers. I know of one case where a product targeted at the Chinese audience had two installers -- one in Traditional Chinese and the other in Simplified Chinese. And they were able to use the NSIS Unicode multi-language installer to show the instructions / license agreement in both Traditional and Simplified at the same time. That's over a billion people that can benefit from that sort of flexibility. India also has a huge set of scripts for its languages. With China and India together, you've got almost half of the world's population! :)


A mixed ANSI/Unicode (ANSI on 95/98/Me and Unicode on NT/2000/XP/2003/Vista) stub is almost impossible as I pointed out, so building ANSI-only and Unicode-only stubs from the same source may indeed be a good option!

If the compiler would automatically convert Unicode scripts to ANSI (using multiple codepages in a script if necessary, just like the situation is right now), language files and example scripts not using special features would be compatible with both versions (solving the issue kichik pointed out). A simple define (!ifdef UNICODE ...) can allow more advanced scripts (such as the Modern UI) to support both stubs. A script command (independent of the script encoding) can then be used to select what stubs should be used.

We would need two folders for the plug-ins, because the ANSI stubs can only use ANSI plug-ins and the Unicode stubs can only use Unicode plug-ins. All standard NSIS plug-ins can of course support both (using the TCHAR thing).

You'll have to ask kichik about his opinion. Maybe you can work on implementing Unicode in a different branch of the official NSIS SVN repository.


What about the FileOpen function in the Unicode version? If a file does not exist, does it create an ANSI or Unicode file? There should be a way of creating a Unicode file, and maybe there is, but I didn't find it. Sorry if my question is out of place here, but I did try the general forum and got no replies.


Use FileOpen and then FileWriteUTF16LE. If you want to write a BOM at the beginning of the file, you should be able to with FileWriteWord file_handle "65279", which is U+FEFF.
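Putting that together, a minimal sketch (untested; the path and the text written are just placeholders) of creating a UTF-16LE file with a BOM from a Unicode NSIS script:

```nsis
Section
  ; Create a new file, write the UTF-16LE BOM, then write wide text.
  ; "$INSTDIR\hello.txt" and the string below are placeholders.
  FileOpen $0 "$INSTDIR\hello.txt" w
  FileWriteWord $0 "65279"                  ; BOM: U+FEFF as a 16-bit word
  FileWriteUTF16LE $0 "Hello, world!$\r$\n"
  FileClose $0
SectionEnd
```

Notepad and other editors use the BOM to recognize the file as Unicode, which is why writing it first is worthwhile.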


Did that and it works perfectly, thanks. However, if I try the same approach with a header I'm tinkering with to make it work with the Unicode version of NSIS, I get garbage in the file. I'm no expert in NSIS, but I have tried to find the cause, and I think the usage of FileJoin from TextFunc.nsh is producing the garbage in the final file, since the header uses that function to join two files. The files are UTF-16LE, and here is a part of the function:


!define ${_TEXTFUNC_UN}FileJoin `!insertmacro ${_TEXTFUNC_UN}FileJoinCall`
...
FileOpen $3 $4 a
IfErrors error
FileSeek $3 -1 END
FileRead $3 $5
StrCmp $5 '$\r' +3
StrCmp $5 '$\n' +2
FileWrite $3 '$\r$\n'
FileOpen $0 $1 r
IfErrors error
FileRead $0 $5
IfErrors +3
FileWrite $3 $5
goto -3
FileClose $0
FileClose $3

...

Maybe the bold part is causing problems? Should there be FileWriteUTF16LE and FileReadUTF16LE calls? Or maybe not, because of the define? I tried to put in the UTF functions, but then the function produces no file at all. Any thoughts, Jim?


If you want this function to work for Unicode files, I think you'll have to create a Unicode version. Copy the macro and as you suspected, you need to convert the FileRead and FileWrite calls to the UTF16LE versions. You might have to open the files as binary as well.


Got it working. Thanks for all the help and sorry for bugging you. The function does not work with Unicode files - I had to change the calls, that was obvious. But the script with the UTF calls produced errors, and I am not sure why - so I had to do this (changes are bolded):

FileOpen $3 $4 a
;IfErrors error
FileSeek $3 -1 END
FileRead $3 $5
StrCmp $5 '$\r' +3
StrCmp $5 '$\n' +2
FileWrite $3 '$\r$\n'
FileOpen $0 $1 r
;IfErrors error
ClearErrors
FileReadUTF16LE $0 $5
IfErrors +3
FileWriteUTF16LE $3 $5
goto -3
FileClose $0
FileClose $3

The header in question is Advanced Uninstall Log, and I'm going to send the changes to RedWine. I also had to fix Serbian.nlf, as the UTF-16LE one is a pile of rubbish.
This
Ďđčőâŕňŕě óńëîâĺ äîăîâîđŕ î ďđŕâó ęîđčřžĺśŕ
should be, in fact, this
Прихватам услове договора о праву коришћења
but I guess it's all Greek to you. :D
Anyway, thanks, Jim - it sure is great to have a Unicode installer which produces Unicode log, and I can install in Unicode named folder, and uninstall files automatically.
Going to send you correct Serbian.nlf.


I appreciate that. Looking forward to getting those files. It's good to know that the Unicode version of NSIS is actively being used.


NSIS Unicode has been updated to 2.35. Please go to the usual spot to download the binaries and the modified source: http://www.scratchpaper.com.


Is there any particular reason why there are no registry entries at HKLM -> Software -> NSIS for VersionMajor and VersionMinor? The ANSI release writes the current release values there. Is there another way to get the version of NSIS with which the script was compiled?
Regards.


It's because I've been building without specifying revision number and build number. I'm surprised you're the only one who brought this up. Thank you. The files have been updated on my site. Now NSIS Unicode will put the version info under Software/NSIS/Unicode.


Thanks. I was using this to read the version of NSIS I used for building the installer, so I noticed it was missing...


Is there a way to detect whether the compiler is Unicode? Meaning, is there a defined symbol like __NSIS_UNICODE__ in the Unicode build? If not, you should add one so that 3rd party header files can support both ANSI and Unicode in one file.


That's not a bad idea. I will add it to the next version.




I just made the tweak and modified the binary on my website to include the NSIS_UNICODE predefine for the Unicode build. Enjoy.
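For header authors, guarding on that define might look like the following sketch (the MYHEADER_* names are hypothetical; NSIS_UNICODE is the predefine mentioned above):

```nsis
; In a shared .nsh header, so one file serves both builds.
!ifdef NSIS_UNICODE
  ; Unicode build: plugin stack strings are wchar_t
  !define MYHEADER_CHARSET "Unicode"   ; hypothetical define name
!else
  ; ANSI build: plugin stack strings are char
  !define MYHEADER_CHARSET "ANSI"
!endif
```

A script that !includes the header can then branch on `${MYHEADER_CHARSET}` (or directly on NSIS_UNICODE) wherever the two builds need different handling.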


7zip will not open the installer archive properly. I get rubbish filenames. Not sure if it's something that 7zip would need to fix, though.


7-zip expects a very specific format from the installer. One of its assumptions is that the string table is saved as ANSI. It's not such a faulty assumption when you add in the no-real-nsis-binary-specification factor.


So does the installer still work? Is it just that a 3rd party app can't open the Unicode-NSIS-generated file? If so, then I don't think we can do much about that. If the installer doesn't work with the 7zip compression algorithm, that's something I can look into. But the notion I had was that NSIS used those compression algorithms for its own nefarious purposes, not that it had to adhere to some binary format.


The installer works great. It's just that I often use 7zip's ability to open the installer in order to extract the binaries for crash dump debugging. Not a big deal, but I miss the feature :) Maybe there's a way for the 7zip author to add special handling for Unicode NSIS.


A helpful hint to anyone converting their NSIS scripts to Unicode: be sure to change your System::Call parameters to use w. instead of t. for string types!
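For example, a call that fills a string buffer needs the wide type in the Unicode build. A sketch, to be placed inside a Section (GetWindowsDirectoryW is a real kernel32 export; treat the exact System syntax as an assumption based on this thread):

```nsis
Section
  ; ANSI build would use:
  ;   System::Call 'kernel32::GetWindowsDirectoryA(t .r0, i ${NSIS_MAX_STRLEN}) i .r1'
  ; Unicode build: pass the buffer as a wchar_t string with "w"
  System::Call 'kernel32::GetWindowsDirectoryW(w .r0, i ${NSIS_MAX_STRLEN}) i .r1'
  DetailPrint "Windows directory: $0"
SectionEnd
```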


But is that the way it should work? The System plugin should probably be changed so that "t" = "w" in the Unicode build, and you can use "a" for forcing ANSI strings. That should help keep things portable, but some stuff might break.


Yeah, I agree with that idea. Although 'a' is already taken as a shortcut for the chosen language.


"a" is not taken as a param type, only source/destination, but I'm not sure if the parser could handle using "a" for two different things. using @ might be a little weird, maybe "z" for zero terminated or whatever unless someone has a better idea

OT: also, adding support for "p" for pointers might ease a future move to 64 bit


Hello jimpark,

Your Unicode NSIS has been of immense help to me. There seems to be a slight problem with the language selection dialog - the languages that aren't supported by the system also show up, and I'm NOT using MUI_LANGDLL_ALLLANGUAGES.

For example, I'm running the installer on Windows XP with English as the OS language (and I haven't installed the files required for displaying CJK languages). On running the installer, Japanese (actually garbled boxes) also appears in the drop-down menu.

It works as expected on my system where I have installed the files for Japanese, i.e., both English and Japanese are displayed.

It will be great if you can help.


I don't know of a way to enumerate all the supported language scripts on a given user's computer. I can do it per locale or language but that's not sufficient since the user's computer is usually set to one language/locale but may be able to support scripts of other languages. If someone knows how to list all the scripts supported by a given system, I think it's just a matter of calling GetStringScripts() and VerifyScripts().

Anyway, this shouldn't be an issue--everyone should have the East Asian Language support turned on! :)


Unfortunately, those two methods don't exist on versions prior to Vista (and there, East Asian Language support is always turned on) - you can get support on XP SP2 and 2003 SP1, but that requires a separate download (and doesn't help with Win2000).

You can use EnumSystemLocales with LCID_INSTALLED to check which locales are "ready for use" (if you know specifically which one to check for, you can just use IsValidLocale), or you could check the possible languages groups with IsValidLanguageGroup and LGRPID_INSTALLED.
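As a rough sketch of that check from a script (assumptions: the Unicode build's System plugin, LCID_INSTALLED = 1, and that the LANGID in $LANGUAGE is acceptable where an LCID is expected, which holds for the default sort order):

```nsis
; Nonzero in $0 means the locale's support is actually installed,
; not merely supported. The constant 1 is LCID_INSTALLED.
System::Call 'kernel32::IsValidLocale(i $LANGUAGE, i 1) i .r0'
```

A language selection dialog could run this per candidate language and hide entries where $0 comes back 0.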

However, all of this requires that you have a mapping between the languages and the locales (or language group) - I can't remember if that information is currently available to NSIS.


I don't think it is available, hence my initial thought to use GetStringScripts(). Which makes me think that using the native language to display the language name may not be a good idea. Or maybe I should combine them, so that there's a legible English version of the name in the drop-down menu along with the name in the native script.


Couldn't it be a missing font issue, rather than an unsupported language?


Technically, on NT-based OSes, it's ONLY a matter of a missing font (and to some degree, the version of Unicode they support), but that's why LCID_/LGRPID_INSTALLED should be used (instead of _SUPPORTED). Using that flag makes the operating system report if it has everything it needs to display a string based on a given locale.

The problem is, checking against a specific font name is less than optimal, since font names may be localized (the Japanese fonts MS Gothic and MS Mincho do that, to give examples), and Windows might know of a font you don't. That's why you need to check on the locale or language group if you're going to filter on it - but to the best of my knowledge, no WinAPI function exists to detect the locale, language group, or script for a given string (except those Jim already mentioned, which are only built into Vista and require a separate download for XP and 2003, therefore not really being usable here).


Latest 7-zip alpha build supports opening Unicode NSIS installers :up:

http://sourceforge.net/forum/forum.p...forum_id=45797


Our Unicode NSIS installer calls a Java program using nsExec and all the output from the Java program gets directed into the scrolling text box. If we choose English for the installation language, everything in the text box looks fine, but when an exception stack trace is printed, a part of the stack trace gets cut off. If we choose Japanese for the installation language, weird characters are displayed in place of forward slashes. What could be the reason?

I've hacked the RealProgress plugin source to get it working with the unicode-nsis. Could it be the reason for this behaviour? (seems unlikely though). I'm attaching a snapshot.
Look at the second line in the image. It is actually supposed to be C:/PROGRA~1/Java/.. so on. But all "/" are replaced with some weird character. Similarly, both the stack traces are only partially displayed.
Any help will be greatly appreciated. Thanks in advance.


I can't say why it cuts off part of it, but the reason \ is replaced by the ¥ sign is for historical reasons: Japan (and Korea) put their currency symbol in their code page back in the pre-Unicode days. When Unicode was introduced, it was believed that people were too used to seeing the currency symbol as the path separator, so they kept it that way. (There's no real "right" answer to that problem, so you can't really blame them, even if it does seem a bit foolish these days)

As a result, all Japanese fonts (MS Gothic, etc.) show a Yen sign where the backslash is supposed to be, just like Korean fonts (GulimChe, etc.) show a Won sign (₩).


sridhare, when an exception is thrown and the stack trace is printed by your Java app, there are potential problems with buffered IO that is probably causing the cutoff of the text output. I don't know how much control over the IO you have with the Java program but you might want it to be unbuffered when outputting or at least flush the IO when it exits. The other issue is that I think NSIS will also buffer the input as well so it will display everything if it thinks the IO has been closed or is flushed.

BTW, I have not been able to work on the updates to the Unicode NSIS for a while. I will be very busy until probably sometime in June. My apologies.


Thanks for all the responses. This Java exception stack trace cut-off never happened when we were using the (non-Unicode) NSIS. Are you saying that the buffering issue could arise in the Unicode NSIS?


No, there isn't anything different about how NSIS handles the IO buffers with Unicode. When an exception happens, does the output look like junk? I don't know how it is in Java but in C++, the exception strings need to be ASCII and I wonder if something similar is happening in Java.


Originally posted by jimpark
No, there isn't anything different about how NSIS handles the IO buffers with Unicode. When an exception happens, does the output look like junk? I don't know how it is in Java but in C++, the exception strings need to be ASCII and I wonder if something similar is happening in Java.
No, the output doesn't look like junk, but it gets displayed incompletely. This is the actual exception message that used to get displayed fine in the log window when I was using the official NSIS:


INFO: Not started: http://localhost:80/identity
java.net.ConnectException: Connection refused: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.Socket.connect(Socket.java:516)
at sun.net.NetworkClient.doConnect(NetworkClient.java:152)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:365)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:477)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:214)
at sun.net.www.http.HttpClient.New(HttpClient.java:287)
at sun.net.www.http.HttpClient.New(HttpClient.java:299)
at sun.net.www.protocol.http.HttpURLConn...tNewHttpClient(HttpURLConnection.java:795)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:747)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:672)
at com.pramati.bfly.install.InstallOperations.getStatus(InstallOperations.java:335)
at com.pramati.bfly.install.InstallOperations.dasStatus(InstallOperations.java:190)
at com.pramati.bfly.install.BaseInitializer.start(BaseInitializer.java:301)
at com.pramati.bfly.install.offline.StandaloneInstaller.start(StandaloneInstaller.java:488)
at com.pramati.bfly.install.offline.StandaloneInstaller.start(StandaloneInstaller.java:475)
at com.pramati.izpack.DekohProcessPanel.invokeInstaller(DekohProcessPanel.java:479)
at com.pramati.izpack.DekohProcessPanel.installDekoh(DekohProcessPanel.java:346)
at com.pramati.izpack.DekohProcessPanel.access$300(DekohProcessPanel.java:43)
at com.pramati.izpack.DekohProcessPanel$2.run(DekohProcessPanel.java:262)


Now, after switching to the Unicode NSIS, the same exception stack trace started appearing as follows!


INFO: Not started: http://localhost:80/identity
java.net.ConnectException: Connection refused: connect
nnection.java:747)


Also, when this cut-off happens, it also swallows some other messages printed to the window after the stack trace. Having the full exception stack traces is very critical for debugging; it'd be great if you could help.

Just released Unicode NSIS 2.37 which merges in the changes from the trunk into the Unicode build. You can find it at the usual place: www.scratchpaper.com.


sridhare, try the new version to see if it fixes your issue. If not, I do have another idea that might fix it. So let me know.


sridhare, I found the bug and I've finally fixed it. Sorry it took so long. I didn't have any time to work on the Unicode NSIS until recently. I just released new binaries and fixed source at the usual place. http://www.scratchpaper.com The version is 2.37.1. Everyone should get this new one instead.


Just released 2.37.2, which has initial Vietnamese support thanks to Clytie Siddall of the OpenOffice translation team.


What's the status of this Unicode branch? Ready to be merged into trunk?

I ask because Debian would like to use it for our win32-loader program used to boot the Debian installer on Windows machines:

http://bugs.debian.org/489218


Because a Unicode build will break backwards compatibility with existing scripts, it will probably be a NSIS 3.0 release.

However, Win9x support is still not available. It is possible to add support for these platforms (even with the internal installer data being in Unicode) by converting the data back to ANSI for the user interface.

The other option would be to completely drop support for Win9x in NSIS 3.x.


I would love to see the code get merged into the trunk as well, as I've mentioned many times already.

As for Win9x support, it is not on my list of priorities. I personally don't have a need for it since all the software we are shipping requires Win2k+ anyway. I think most developers have similar requirements nowadays.

The "unofficial" status of the Unicode NSIS branch hasn't stopped high profile projects like Winamp, OpenOffice, Flickr, Filezilla, etc. from using it. And in the near future Mozilla's products like Firefox and Thunderbird will use it as well (Unicode NSIS is already incorporated into their build environment since MozillaBuild 1.2). It just goes to show that there is definitely a need for an Open Source Unicode Installer.


And I'll mention this point again as well. My codebase generates both the Unicode version and the ANSI version that DOES run on Win9x. So I really don't see what you can lose from adopting the codebase as-is and then working forward from there to make the Unicode version Win9x compliant via unicows.dll or some other method as mentioned by Joost.


I don't think it's a good idea to have an ANSI and a Unicode version at the same time. Many scripts will not be compatible with both versions, so this will cause a lot of confusion. And there is also the issue of plug-in compatibility.

Regarding Win9x, we cannot use Unicows or something like that because this file does not exist on all systems. The only option for real Win9x support would be to create ANSI dialogs and convert the internal Unicode data to ANSI based on the current code page. In this case, scripts, examples, plug-ins etc. will all use Unicode internally and there will be no compatibility issues.

Personally I think it's OK to drop Win9x support in NSIS 3.0 (with NSIS 2.x still being available for Win9x users) and move to Unicode scripts. But I don't know whether other developers agree with this.

I'd definitely like to have Unicode support in the official version.


I see three potential options other than the one I have which is two versions of NSIS, meaning two sets of plugins (but one codebase) and two sets of scripts, which I concede can be unwieldy.

1. One makensis.exe that can generate both an ANSI or a Unicode installer. This means that there will still need to be two sets of plugins and exeheads, but potentially just one set of scripts. If we go this route, I'd suggest that all scripts are Unicode. If we start with ANSI codepage scripts, mixed scripts in a single file becomes very difficult if not impossible to support. (Imagine a single string with mixed scripts.) And we would alienate Unicode-only languages.

2. We generate one exehead but with the ability to convert, on the fly, the internally stored Unicode strings to ANSI codepage strings in order to call the ANSI Windows APIs on Win9x machines. This requires that we have a small but portable implementation of a wide-string to multi-byte-string conversion function. Maybe we can leverage some code in the Wine project to do this. But it does mean making the exehead, and therefore the final installer, bigger. The plugins would also need to do this conversion as well, since they would likely get the Unicode strings on the stack. So this method requires more complexity in the exehead and the plugins.

3. Drop ANSI support in NSIS 3.0 entirely. We have one version of everything going forward, but we lose Win9x support. But as Joost mentioned, NSIS 2.0 can cover that small and rapidly shrinking market. (The Win9x crowd is ~0.2% of the web-surfing public according to w3schools: http://www.w3schools.com/browsers/browsers_os.asp)

I think the logical choice is 3. But it isn't my decision.


Originally posted by jimpark

2. ...This requires that we have a small but portable implementation of wide string to multi-byte string conversion function. Maybe we can leverage some code in the Wine project to do this
Win9x has support for a limited set of wide APIs out of the box, including WideCharToMultiByte, IIRC.


Originally posted by jimpark

3. ...according to w3school
That is a dev-specific site; yes, the % of people using 9x is small, but not THAT small.


I prefer option 1 myself: old scripts would stay in whatever ANSI codepage they have always been in, and Unicode scripts would need a way to specify that they are Unicode, with either a BOM or an NSIS instruction at the top.

You might be right about the percentage of people using 9x being too small but I would still argue that around 1% is right. Is there a different number you are aware of?

I wouldn't mind getting option 1, either. We might be able to do it by inserting both the Unicode and ANSI exeheads into every installer (option controlled, of course). It also means two copies of every plugin as well. So your installer overhead would be twice what it is now if you wanted a "universal installer."

Unfortunately, this change is not something I can do. Time constraints are a problem for one. But more importantly, it would fork the code too much and I wouldn't be able to maintain the code to be in sync with the main build. It could easily end up being an abandoned fork.

But we are all eager to see the main NSIS developers start work on an official Unicode version. Is there any work underway for this support currently? If not, are there any tentative plans for Unicode support in the near future?


Personally, I see no reason to continue win9x development. I know that kichik and other NSIS developers disagree, but the userbase has dropped off dramatically. Based on winamp.com web stats, Winamp's win98/ME userbase went from 3% in 2006 to less than 1% by January 2007 which triggered us to drop support for the platform entirely.


I agree that developers should drop 9x support. It's like the idea of shipping dual 16-bit/32-bit software builds a few years back ;)

Keeping the compatibility would require too much effort, effort that could be spent on other things.

And if you don't believe it, you can run a survey; just don't forget to phrase the questions right ;)


If we want to continue Win9x support, the best option would be to have all scripts and plug-ins in Unicode but add support to the exehead to create ANSI dialogs on Win9x.

If this is not possible, the scripts should move to Unicode anyway but there will be two sets of plug-ins and exeheads.

However, it may indeed be better to drop Win9x support for NSIS 3.0.


The scripts should stay like they are: a BOM means a Unicode file, and no BOM means ANSI. Otherwise, tricks played with !system, where generated output gets !included again, would have to be Unicode as well, and the console is not Unicode by default.


But then you also need two sets of language files etc., which will all be way too complicated.


Originally posted by Joost Verburg
But then you also need two sets of language files etc., which will all be way too complicated.
You can't convert the lang file on the fly from ANSI to Unicode when you know the codepage?



You can, when you know the codepage. But you can't tell programmatically right now. Some languages use two scripts and have two different language files. The codepage can be read from the NLF files, but not from the NSH files. It would require that we add something to the NSH files to store the codepage. Maybe a special comment?

Also, no, the console is not Unicode by default, but output sent to the console through the stdlib is converted to ANSI before being displayed. Hence, you really want to use makensisw.exe to see the Unicode characters.

All new scripts for the Unicode-only NSIS should be UTF-16LE, but we can make makensis.exe read ANSI scripts and 1. convert them to Unicode using the system codepage, or 2. provide a parameter that specifies the codepage to use for the conversion.


But there are also scripts that contain strings with different code pages. This is impossible to convert to Unicode.


Yes. I was looking at some just today. They cannot be automatically converted.


There is nothing that can't be done. It's complicated, I'll give you that. But at the very minimum, we could have two string tables - ANSI and Unicode. Those strings that can't be translated could be saved in the ANSI table.


Okay, we are talking about automating the conversion of a script into Unicode. There isn't a language that ANSI supports that Unicode cannot, which I know you know, but someone might misread your statement and think that.

The complication comes in when a script has multiple ANSI codepage strings. The strings themselves do not tell you what codepage they are encoded in. But it may be possible, if the script always explicitly uses the $LANGUAGE variable, for makensis itself to deduce the codepage for conversion. If someone just encodes a string in their native codepage and never says in the script what language it is in, then you really can't do anything except make assumptions, such as that it's the system default codepage, which can be wrong.

Whereas the Unicode-to-ANSI conversion can be done whenever an ANSI codepage that supports the language exists, because the script is inherently implied by the Unicode codepoint. So it's much safer to store the scripts as Unicode and then convert them to ANSI if desired and possible.

So just to be clear:

Unicode can encode a superset of the scripts that are supported by ANSI codepages. Each codepoint implies a script. Therefore, for any shared subset of scripts, a Unicode string can be converted to the ANSI codepage encoding.

An ANSI codepage string does not imply any codepage / script. So unless hints are given, we cannot automatically convert to Unicode.
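The asymmetry is easy to demonstrate with any codepage-aware string library. A small Python sketch; the specific codepages are illustrative choices, not anything NSIS itself uses:

```python
# Unicode -> ANSI: works whenever the target codepage covers the script.
korean = "\ud55c\uae00"            # "Hangul" written in Hangul
encoded = korean.encode("cp949")   # round-trips cleanly through the Korean codepage
assert encoded.decode("cp949") == korean

# ANSI -> Unicode: the same bytes also decode "successfully" under the
# wrong codepage, yielding garbage -- nothing in the bytes says which
# interpretation is the right one.
as_latin = encoded.decode("cp1252", errors="replace")
assert as_latin != korean
```

The decode in the wrong codepage raises no error at all, which is why an unhinted ANSI script cannot be converted safely.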


Of course. And so the very minimum we can do is have two tables and have the ANSI table act as it does now. This way, you could still build these old scripts.

Most language files already provide codepage identifiers and some string paths can give you a good enough hint on the codepage for the ANSI string. But as a first very basic solution, two tables is enough.


Can you clarify what you mean by two tables? Do you mean that the installer executable will have two tables? Then yes, I think that was the proposal -- two exeheads that access two different string tables.

So we have this path:

ANSI script generates:
1. best guess at Unicode string tables.
2. ANSI string tables.

Unicode script generates:
1. Unicode string tables (of course).
2. ANSI string tables. (Multiple languages in the Unicode script will not work if they are meant to be shown together. Not a big surprise because ANSI NSIS can't do this either.)


I do mean two string tables in the installer executable. But there's no need for two different exeheads/stubs for the job. The ANSI table could simply be empty for installers generated from Unicode scripts.


And all variables will also be in ANSI when an ANSI script is compiled? Otherwise it won't be backwards compatible anyway.


NSIS 2.38
NSIS 2.38 released.
What about Unicode version of NSIS 2.38? ;)


vcoder, I'm on vacation right now. So I won't be on it for a few weeks. :) But I'm sure you can survive in the interim.


Originally posted by jimpark
vcoder, I'm on vacation right now. So I won't be on it for a few weeks. :) But I'm sure you can survive in the interim.
Thank you for your response.

Have a good vacation! :up:

Bug in FileWrite?
This script works well on the ANSI version of NSIS but fails on the Unicode version:

OutFile "GetVersion.exe"

!define File "Resources.dll"

RequestExecutionLevel user

Function .onInit
  GetDllVersion "${File}" $R0 $R1
  IntOp $R2 $R0 / 0x00010000
  IntOp $R3 $R0 & 0x0000FFFF
  IntOp $R4 $R1 / 0x00010000
  IntOp $R5 $R1 & 0x0000FFFF
  StrCpy $R1 "$R2.$R3.$R4.$R5"

  FileOpen $0 "$EXEDIR\Version.txt" w
  FileWrite $0 '!define Version "$R1"$\n'
  FileClose $0
  ;MessageBox MB_OK "$R1"
  Abort
FunctionEnd

Section
SectionEnd

Result file:
ANSI: !define Version "3.4.0.96"
Unicode: !def

I'll look into it when I get back from vacation.


I am back from vacation today.

Just released Unicode NSIS 2.38. Fixed vcoder's FileWrite bug as well. And modified the Unicode Mongolian translation to use some characters that do not exist in the ANSI Cyrillic codepage as per request by the OpenOffice project.


Just modified the System plugin so that the 't' type specifier acts like a TCHAR*, that is, it will be an ANSI string in the ANSI version of NSIS, but will act as a wide-char string (Unicode string) in the Unicode NSIS.

I introduced an 'm' type specifier to specify an ANSI string. Why 'm'? Well, I can't do 'a' or 's'. So 'm' stands for multi-byte string which ANSI strings can be (although, usually not). Also 'm' looks like an upside down 'w' which stands for wide-char string. :)

This means that the conversion from your ANSI NSI script to the Unicode one will be mostly straightforward file conversion unless you were naughty and used the ANSI Windows API specifically, such as MessageBoxA. Some Windows API like GetProcAddress only take ANSI strings, so you should still look at your System calls carefully.

The new version is dubbed 2.38.1. You know where to get it.

I've also added an FAQ page (http://www.scratchpaper.com/unicodensisfaq). If there are any missing topics you'd like to see or there's some erroneous info, please e-mail me.


There was already a feature request for this (http://sourceforge.net/tracker/index...49&atid=373088). It would be great if you could submit a patch so the System plugin stays in sync.


I didn't know there was already a feature request. The feature request mentions 'z' as the potential type specifier. I did consider that but favored 'm'. Can we settle with 'm'?


BTW, I can't add the modified source to the feature request. I can only submit comments. (Permissions issue?) Or do I have to submit a new patch request?


yeah, go with 'm', and you would have to submit it as a patch; not sure why it's not possible to attach files


2.39
We are waiting for the Unicode version of NSIS 2.39... ;)


I'll probably work on it this weekend.


This code works well on the ANSI version of NSIS 2.38 but fails on the Unicode 2.38.1 version:

!define HAVE_UPX
!ifdef HAVE_UPX
!packhdr tmpexe.tmp "UPX --best -f -q -v --ultra-brute --all-methods --all-filters --compress-icons=0 tmpexe.tmp"
!endif


Have you tried packhdr just on its own to see whether it works on Unicode NSIS generated installers?


I tried using a2u.exe on a Windows 2003 Server (SP1, Standard English) it gives me an error: "This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem."

It works fine on XP Pro. (SP2, German).


You most likely need a newer version of the C++ redistributables installed: http://www.microsoft.com/downloads/d...DisplayLang=en

(If that doesn't work, try the 2008 one)


Well, if it is because of the redistributable, then it's a silly mistake on my part. I tried rebuilding the project. Can you try downloading a2u.zip and trying the executable inside it again?

As for the Unicode NSIS update, I don't think it's happening this week. I've got way too much other work to do. Hopefully, next week some time.


It works now, thanks a lot!


Have you tried packhdr just on its own to see whether it works on Unicode NSIS generated installers?
I found this code in the ANSI NSIS help file. I converted my ANSI script file to Unicode with a2u.exe and tried to compile it with Unicode NSIS. The result is an error when the UPX part of the code is processed.

Okay, so can you try calling UPX itself to see if it works on Unicode NSIS installers? What does the error say? I just don't have a lot of time to work on Unicode NSIS these days, so you're going to have to help me find the problem a bit.

- Jim


I found this error on WinXP SP3, and Vista SP1 gives the same error.

This is report:
Error name: APPCRASH
Application name: makensis.exe
Version: 0.0.0.0
Time stamp of application: 4885ef1f
File with error: makensis.exe
Version: 0.0.0.0
Time stamp of file: 4885ef1f
Code of exception: c0000005
Stack of exception: 00006c6b
OS version: 6.0.6001.2.1.0.768.2
Language: 1049
Advanced info1: 8a1e
Advanced info2: 9e3d13911997afb9af7c20e45db763b3
Advanced info3: 3aba
Advanced info4: 1beeb7f8f0c5e3f3f9c6c0f5ff97433e


1. In the script: CRCCheck off
2. Generated the installer ('example.exe') without the UPX code.
3. Command line: upx -9 example.exe
4. The installer works! Files are copied to their folders. But the error occurs when uninstall.exe is being created.


Is there a plugin you are using to get this UPX capability? Or are you calling the UPX tool directly as part of your script? I only support the default plugins that come with NSIS. If you are using a third-party plugin, you will need to contact its authors for help or convert the third-party plugin to interface with Unicode NSIS using Unicode strings. It seems like either a plugin that has not been converted for Unicode support or a UPX issue. Either way, I don't think I can help you in this case. I apologize.


I call the UPX tool directly as part of the script.

See in file NSIS.chm:
"NSIS Users Manual" -> "Chapter 5: Compile Time Commands" -> "5.1 Compiler Utility Commands" -> "5.1.10 !packhdr"

Sorry, there is new code now. I will try the new code and will write the result here.


Same result.

Script file contents:


;...<code before>...
page licensepage
;...<pages>...

!packhdr ....

Section First
;Some data
SectionEnd
;...<code after>...


Output in makensis:
;...<code before>...

!packhdr

Section First

That's all! Why?
The application error occurs at this place.

I'm an engineer on Second Life (www.secondlife.com). We've been using NSIS for years and love it.

Recently we've noticed that installers built with NSIS 2.39 (and earlier) fail to launch on east Asian systems when the path to the installer has a non-roman character in it. For example, if a Korean user has a Windows username with Korean characters in it (í•œêµ_ì–´), and the installer is on the desktop, it fails to launch with the error:

NSIS Error: Error launching installer

I've noticed that I don't get this error if I build the installer using NSIS Unicode.

Is this a known issue with the mainline NSIS? Do I have to switch to NSIS Unicode if I want my Asian users to be able to run the installer off their desktops?

Thanks,

James Cook
james@lindenlab.com


If the user had used MBCS (ANSI codepage) for their path names, it might have worked. But usually when Windows creates the User's documents and settings directory using the user name, that user name is going to be Unicode, not ANSI codepage, even if the operating system codepage is Korean in this case. So that's why ANSI NSIS has trouble with it. However, Unicode NSIS has no problems at all.


Not entirely true, Jim. In NTFS, all path names are Unicode (actually, this also applies to LFNs in FAT), so Windows will convert to ANSI when applications call the relevant *A APIs.

However, the ANSI codepage used is dependent on the setting selected for non-Unicode applications in Regional and Language settings in the Control Panel. If you've selected a codepage that doesn't contain the required characters, it'll fail to make a proper conversion, because the unmapped characters are converted to a ?, which in turn means the path won't be found.

Now, although I kind of doubt James posted the actual text in question, it would appear to contain 2 characters that aren't Korean (at least they appear as good old "I don't know what this character looks like" to me). That might indicate characters outside of CP949, the Korean codepage.

(There's also an underscore, but that's in the 0-127 area, where they've only changed \ to ₩ - just like CP932, Japanese, has ¥ at that position.)


Ah, thanks for the clarification Pidgeot. So the user name issue then must be that people like to have their user names in their native language which naturally causes Unicode characters to be included into the desktop path which is something like c:\documents and settings\<user name>\desktop or on Vista c:\Users\<user name>\desktop.

Now what if the system codepage is not Korean but the person still wanted to have a Korean user name? How does the system convert that to the codepage for NSIS to manipulate, and back to Unicode? I think that's also a legitimate issue.

So perhaps recently, there was a change in NSIS which required that it needed to know where itself was not just where the user wanted to install the program.


Just to elaborate how it all works with an example:

If the system codepage is not Korean, and you use ANSI functions to fetch a path containing Korean characters, you'll get a question mark in place of each non-convertible character.

Having your username in your native language is usually not an issue, because you'll usually have the same language set for non-Unicode applications, and when you aren't using that language in path names, it's usually English anyway, which can be expressed in all code pages.

To give an example, the issue can pop up if I (being a Danish user) wanted to run a Japanese non-Unicode program. Running it with the non-Unicode language set to Danish would result in mojibake for text, so I'll need to set it to Japanese to run that program.

(There's a tool by Microsoft, called AppLocale, which can set this for a single application - and because changing the global setting requires a reboot, that's a good thing - but that's a different story.)

If my username then included a Danish letter, such as Å, any non-Unicode application attempting to get the username would read that character as a question mark, because Å does not exist in the Japanese codepage. The folder "Ål" (Eel) would thus become "?l" - but because the folder is called "Ål" and not "?l", it can't access the folder.

You might argue that it could try to guess - after all, ? is a wildcard - however, there are issues with that:

1) If there are multiple matches (let's say we also have a folder named "Øl", meaning beer), you don't know which one you're supposed to get. Acting differently in the case of "only one match" is not consistent behavior, so that's not that good of an idea.
2) Strictly speaking, NTFS does allow a file to be called "?l" - the only character not really allowed is NUL, much like in Linux. Win32 adds several characters to the "disallowed" set due to backwards compatibility.
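The Danish-letter example above can be reproduced directly with any codepage-aware library. A short Python sketch; cp932 is the Japanese codepage discussed above:

```python
# A Danish folder name as seen by a non-Unicode app running under the
# Japanese codepage: the letter A-with-ring (U+00C5) has no cp932
# mapping, so the *A APIs substitute '?', and the resulting path "?l"
# no longer matches the real folder name.
folder = "\u00c5l"                               # "Ål" (eel)
seen = folder.encode("cp932", errors="replace")  # what the ANSI app gets
assert seen == b"?l"
```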




Thanks for the details, guys. I agree the issue is people with paths to the installer like C:\Documents and Settings\<username-with-extended-chars>\Desktop. I presume this means I'm not doing anything wrong with my plain-old-NSIS installer?

I wish there was a workaround for plain-NSIS because we've got it installed on a plethora of developer and build machines.

Could it be because we're using CRC check?

Can you guys think of a workaround other than switching to NSIS Unicode?

James

(I attached our installer script below in case you're curious.)


I made a very small installer to test, and I get the same result with CRCCheck On and CRCCheck Off - so that's not it.

Now, this is just my opinion, but I would probably go Unicode now. It's not something you can really avoid, since the Unicode branch will eventually be moved into the main build, so not doing it now would only postpone the migration.

The alternative would probably be an addition to a readme or FAQ or whatever, with a guide on how to change to the appropriate codepage, or suggesting the workaround to place it in a path only containing ASCII.

I'd like to stress again: The problem is not with paths that contain characters <= U+007F, since those are unchanged in all Windows codepages (except the issue with \, but that one is handled gracefully). The problem is with characters that are >= U+0080, and are not representable in the user's codepage.
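That <= U+007F guarantee is easy to verify programmatically. A small Python check over a few representative Windows codepages (chosen for illustration):

```python
# Characters below U+0080 encode to the same single byte in every
# Windows ANSI codepage, so a plain-ASCII installer path is always safe.
path = "C:/Users/james/Desktop/Setup.exe"
for codepage in ("cp932", "cp949", "cp1252", "cp1250"):
    assert path.encode(codepage) == path.encode("ascii")
```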


Thanks guys. I really appreciate the help. I think we'll switch the NSIS Unicode for the time being.

Enjoy the coffee, Jim! :-)

James


I'm having another problem...

An installer built with NSIS Unicode 2.38.1 does not show the "unpacking data: ##%" dialog during load. The same installer built with NSIS 2.39 does show the dialog.

My installer is 21 MB compressed with LZMA, so the unpack takes a noticeable amount of time. Is there a way to get that dialog back, or put up some other sort of progress display?

Thanks,

James


Does that happen with any compression setting? I'm imagining that it does. And also, does it print anything at all like "unpacking data:?" or is there no analogous printed line at all?


Jim,

Should I run the installer from the command line to look for output? I get no window open of any kind during unpack. The first window I get is language select (from our .onInit() function).

I've only tried it with SetCompressor /solid lzma

Tomorrow in the office I'll try it with other compression settings. Any particular ones to try?

James

(I attached the shell of our installer -- all the %%FOO%% sections are automatically populated with our lists of files.)


It doesn't look like installers print anything to the console -- running from the command line shows nothing.

Also, I have the same problem (no "unpacking" dialog) with:
SetCompressor zlib
SetCompressor /solid zlib
SetCompressor /solid bzip2

Hope that helps.

James


A workaround could be to add all files in the right order in the solid archive (using ReserveFile). Then there will be no need to extract a lot of data at startup.


Thanks for the suggestion, Joost. I ended up reordering my .nsi script so all the functions are at the start of the file, followed by all the files. We used to have the .onInit function at the end of the .nsi file, which I think was the cause of the slow unpack. This seems to eliminate the startup delay that caused the "unpacking data" dialog to appear and solves my problem.

I didn't see this documented anywhere -- does NSIS interleave the packed file data and the function definitions? If so, it seems important to put all your user-written functions before the first big list of files. Should that go in the docs somewhere?

Jim, just so you know, NSIS Unicode definitely doesn't show the "unpacking data" dialog, though, even on slow unpacks.

Thanks again, guys. James


2.40?
Are you going to update NSIS' Unicode version up to 2.40?
There are some important (for me) improvements and bug fixes.

Thank you.


I'm currently at the busiest time of the year for development. We have what we need right now as far as an installer is concerned so I'm not going to be able to update this any time soon. I'm hoping if the NSIS guys are really going to support Unicode, they pick it up soon.


kichik, what's the status of this Unicode support? Needs reviewing? Needs something fixed? The Debian folks want this too.


I've been looking at the code changes just now.

First thing that got me was all the ANSI/Unicode dirs all over the place with the files in UTF-16. I didn't think that was a good idea, so I converted all the Unicode directories into UTF-8 and looked at the differences between them.

Based on that, here are some suggestions:

Don't duplicate stuff that doesn't have non-ANSI characters in it.

Where the only change between Unicode and ANSI is to re-encode the © (Copyright) symbol, just replace it with the word "Copyright", since that is as legally valid as "©".

Where differing parameters to System::Call are needed, add an !ifdef UNICODE or something.

Some more comments:

Don't add/remove whitespace where it isn't needed.

Menu/images/Unicode/create_header.py isn't needed, Scripts/release.py does that stuff.

Please don't include proprietary formats like Paint.NET in nsis, standardised PNG, BMP or ICO are best.

For the languages, store everything in UTF-8 in the source code and convert to ANSI at makensis time if the user sets a specific code page (and error out if not all the characters are available in that CP). On Linux the iconv function can help, and for people running makensis on Win9x the unicows library could be useful:

http://libunicows.sourceforge.net/

Switching to gettext style translations and the PO file format for translations might be useful too, dunno how well Windows supports it though.

Use ISO 639-1 codes for language type instead of English words for the languages.

For the Contrib/UIs, add ifdefs, compile them twice and install them into Unicode and ANSI directories. Alternatively makensis could detect a "RichEdit20" class and replace it with "RichEdit20A" or "RichEdit20W" as appropriate.

Examples/Unicode/bigtest.nsi doesn't need to use the ¢ character.

Examples/Unicode/makensis.nsi doesn't need ANSI/Unicode in the registry/etc does it? Just make one version of makensis that does both ANSI and Unicode depending on a command-line switch or a script command. Same for a lot of other stuff.

Will make some of these changes myself, look at the rest of the changes and add more comments.


I just filed issue 2230926 related to NSIS 2.40 (and earlier) installers throwing an error on startup if the path to the installer contains wide characters.

https://sourceforge.net/tracker/inde...49&atid=373085

NSIS Unicode doesn't seem to have this problem. I wonder if the parts of the Unicode patch that fix this issue could be ported over separately, or as an initial step.

James
james@lindenlab.com


Hello, I was using the ANSI version of NSIS until I faced the issue with Unicode user names. Then I found Unicode NSIS and it solved the issue.

I just read through this entire thread, and now I'm a bit confused. It looks like I was supposed to convert my scripts to Unicode?

I did not convert my scripts to Unicode; they are still ANSI. But the installer now works fine with Unicode user names, so I definitely compiled the Unicode version.

Why was I able to compile ANSI scripts? Shouldn't converting them to Unicode be mandatory?

I did nothing else but reinstall NSIS and recompile under Unicode NSIS. Everything looks fine.
What am I missing?


What it does is try to convert the ANSI script to Unicode using the system's codepage. That's probably okay if you are the only developer and you never change your system codepage to something else but technically this is incorrect. In order to make your NSIS script portable, you should convert your script to UTF-16 (using notepad.exe, for instance). If you don't, you might find that later down the line, users report that your installer produces junk text and you might not remember why. It's simple to do, so just load it up with notepad and save as Unicode and save yourself future grief.


Thanks. One more question, if you please.
We are planning to do some localization.
A question regarding the existing language files: I've noticed that you ship non-Unicode language files with Unicode NSIS. Correct?

How do non-Unicode language files survive with a Unicode installer? What shall I do about it? We support some nine languages: German, French, Spanish, Simplified Chinese, Traditional Chinese, Korean, Dutch, Italian, Japanese, Portuguese.


No, the language files that come with Unicode NSIS are indeed Unicode. I'm not sure why you think that wasn't the case, unless you've been looking at the "regular" NSIS files.


The files in ".\nsis\Contrib\Language files" in the Unicode NSIS installation are not Unicode. See attachment.

/edit: attachment deleted. I was looking into the wrong NSIS installation folder.


Those aren't the Unicode NSIS files. I've looked at the exact same files here (2.38.1-Unicode), and they are Unicode here.

Note that, by default, Unicode NSIS installs itself into Program Files\NSIS\Unicode - it does not overwrite the ANSI NSIS files (but the context menu is changed to point to the Unicode NSIS instead). The path you give suggests you're looking at the ANSI NSIS files.


Oops. Please disregard my last question, you were absolutely right, files are unicode. I was looking into the wrong NSIS installation folder. Sorry for the confusion.


Is it possible to declare and use (as plugin input) an ANSI string within the Unicode version of a plugin?


Not really.


I'm too lazy to rewrite plugins...
Do you know of any encryption plugin that works with Unicode?
Like SHA1 or MD5?
I'm stuck on this. During the installation I have to verify the downloaded package. It was easy under the ANSI version.


NSIS installers already do CRC checks on themselves as I understand it. If that's not good enough, you will have to port some plugin over to work with Unicode. It's not too hard. Look at how the standard plugins were ported.


Jim,

How hard do you think it would be for me to write a Unicode-path-to-installer patch for ANSI NSIS? If I wrote it, do you think the mainline NSIS would accept it?

The inability of ANSI NSIS installers to run from the Desktop and Download folders of Windows users with Unicode account names is a major blocker for us. We use your Unicode version from www.scratchpaper.com, but it would be nice to stay up to date with the recent ANSI NSIS versions.

I'm way too new to the NSIS code to attempt to update the Unicode NSIS version to the latest ANSI codebase. But I might be able to hard-code all the file open commands to use the FunctionW() versions and convert the path to the installer to WCHAR.

What do you think?

James
james@lindenlab.com


James,

I was no expert when I started. I never heard of NSIS nor cared about installers until out of necessity I had to. I just downloaded the source and dove right in.

The problem of trying to make NSIS partially Unicode might be even harder than making NSIS completely Unicode. For example, you'll need to be able to specify Unicode strings in the ANSI script files. From a UI standpoint, there are a million ways to get path strings from the user, and all those paths need to be covered as well. So supporting two methods of data entry might be a very daunting task.

And even if you were able to shoehorn it -- I don't know how easy it would be to keep it up-to-date. And it probably won't be accepted by the mainline NSIS.

There are two bright sides, though. One, I'm starting to get freed up from my current project so I might be able to spare some time in the coming weeks to update the Unicode NSIS. And two, the main NSIS project needs to adopt some strategy of supporting Unicode soon or it will really lose relevancy -- the only thing saving NSIS is that there is no other open source option. But that can change. Some big project out of necessity can create one that supports Unicode and that would be the end of NSIS. If that happens, I'm not going to keep up Unicode NSIS. I'll just adopt the new open source installer.

- Jim


There are other open source installers, Inno Setup etc.

It would also be possible to create a plugin that can call ansi nsis plugins from the unicode build


Inno Setup I believe is free but not open source. They may have changed that recently.

Also, I don't think they support Unicode either.

I guess a plugin that calls ansi plugins might be doable, but it would not work in the general case. How do you convert to ANSI? Do you assume that the system default codepage is the right one? That would be wrong on many machines. And it won't help you if you went to the Unicode NSIS because you need Unicode in the first place.

Really, if you need Unicode and you need a particular plugin, you really should create a Unicode version of the plugin. It's not that hard.


http://cvs.jrsoftware.org/view/issrc/


So they did open it up. But Pascal? They should have kept it closed. :) Just kidding. Still, no support for Unicode. :/


Open it up? It has been open source for years


Yes, it seems like you are correct, Anders. They have been open source for years. I misremembered reading somewhere that Inno Setup was free but not open source. Maybe I was just remembering that there was no easy way to build the source: you needed to purchase Delphi development tools from Borland. Are there any free development tools that can build Inno Setup currently? Not that I care to learn Delphi or Object Pascal at the moment...


Using Unicode NSIS in Firefox
Hi all,

I'm trying to convert Firefox's installer from NSIS to Unicode NSIS. I'm facing a problem which I don't know how to solve.

When I build the Firefox installer using Unicode NSIS, it appears that some of the images/strings/formatting don't make it into the final installer. For example, the header and watermark which the Firefox installer uses do not appear on the screen; the welcome page strings don't show up; and formatting such as the bold header for MUI pages is lost.

I'm not really sure what causes the problems. The same source files can be successfully built using NSIS 2.22 (which Mozilla uses by default). I'm trying the Unicode NSIS version 2.33, but I have tried this with the latest version as well, without any luck.

Any help is very much appreciated!

Thanks!
Ehsan


Are all the files you are using now UTF-16LE? The NSI files as well as all the readme text files you are using should be UTF-16LE. Make sure that's the case. (You can use a2u.exe or notepad.exe to make the conversion.)


Yes, the text files are all encoded in UTF-16LE. Do you have any ideas on what else could be wrong?


Are you generating these text files using a script of some kind? I remember looking at some of the Mozilla build stuff and it seems heavily scripted. Any possibility that some of these files are generated? I've noticed that while Python, and to some extent Perl, does internally support Unicode strings, printing them out to a file needs to be done correctly or they get converted to ANSI inadvertently.


Some of these text files are generated by a script, and some others are simply copied using cp. All those files are in UTF-8 encoding. For ANSI NSIS, we would use iconv to convert them to the correct code page, but with Unicode NSIS, I have used iconv to convert them to UTF-16LE. I prepend the UTF-16LE BOM bytes to the files manually using a very simple Perl script:

my $line;
print "\xFF\xFE"; # UTF-16LE BOM
while ( $line = <STDIN> ) {
    print $line;
}

I have opened all these files in Notepad++ and they are all shown properly with the correct encoding.

I assume I don't have to run the BMP files through iconv, right?

Would you say that iconv's output might be different to that of your a2u program? Does a2u support UTF-8 as the input format?
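For what it's worth, the iconv-plus-Perl pipeline can be collapsed into a single step. A hypothetical Python helper (not part of a2u or the NSIS tooling) that converts UTF-8 input to UTF-16LE with the BOM already attached:

```python
def utf8_to_utf16le_bom(data: bytes) -> bytes:
    """Decode UTF-8 (tolerating an existing UTF-8 BOM) and emit
    UTF-16LE prefixed with the FF FE byte-order mark."""
    text = data.decode("utf-8-sig")    # "utf-8-sig" strips a UTF-8 BOM if present
    return b"\xff\xfe" + text.encode("utf-16-le")

utf8 = '!define Name "Pr\u00f8ve"\n'.encode("utf-8")
utf16 = utf8_to_utf16le_bom(utf8)
```

Doing the decode and BOM in one place avoids the failure mode where the BOM gets prepended to bytes that are not actually UTF-16LE yet.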


Nevermind, I found out the problem. In our nsi files, we were !defining MUI_INSERT, which, in newer versions of NSIS, caused the MUI_INSERT macro not to expand to anything useful. Removing that !define did the trick!

By the way, did I mention that Unicode NSIS is *awesome*? Keep up the good work, Jim!


I'm glad things worked out. Thank you for the encouragement.




Originally posted by jimpark
Yes, it seems like you are correct, Anders. They have been open source for years. I misremembered reading somewhere that Inno Setup is free but not open source.
This is partially right. Inno Setup is free (as in money) and the source is available, but they are using a license that is not approved by OSI (opensource.org). Inno Setup uses its own unique license:
http://www.innosetup.com/files/is/license.txt

The sticking point is point 3: "All redistributions in binary form must retain all occurrences of the above copyright notice and web site addresses that are currently in place (for example, in the About boxes)."

This requirement doesn't appear to be compatible with the GPL and doesn't appear to exist in other OSI-approved open source licenses.

I'm not sure if the actual installer itself has an about box, but the version info has an Inno Setup link in one of the extra items; no big deal IMHO (how far off topic are we now? :) )


I just wanted to inform everyone that I'm actively working on the Unicode port of 2.42. I've got most of it done and working, but I'm having a bit of trouble with the System plugin samples. More code got turned into assembly code since 2.38 and I had to brush up on my assembly skills. So expect a release sometime next week.

Also, if anyone wants to help debug my System plugin woes, please drop me an e-mail and I can provide you with the latest code and binaries.


We'll be using Unicode NSIS 2.33 in Mozilla-based software installers. What would we gain if we update Unicode NSIS to a newer version?


From what I can see (and others may feel free to chime in), they've made the writing of the install script easier. For example, when you include an NSH file, you no longer need to declare all the macros you will use. When using a plugin, you no longer have to remember to put /NOUNLOAD. When writing the uninstall section and using macros, you don't have to remember to put "un" in front of everything. While this probably won't help you if you already have a working NSI script, for someone creating a new one, I can see how it can be a real benefit.


Pardon me for sounding like a broken record here, but um... Is there any plan for this Unicode branch to be incorporated into the trunk?

Due to the sheer popularity of this thread (#1 in replies and views), it seems that I'm not the only one who is a user of this branch. And, it also seems as though I'm not the only one a bit frustrated by the delay in the Unicode branch receiving the latest features added to NSIS.

Some of the reasons first proffered for this not being added to the trunk when first released were questions about its stability, but inasmuch as nearly every major open source project is ALREADY using it (Winamp (the host of this site), Mozilla, FileZilla, OpenOffice), I think the question of its stability has been answered.

NSIS Unicode works, and it works well. I'd like to voice my request again to have regular builds of this from the NSIS site.


Hello folks. Unicode NSIS 2.42 is now out. Get your copy at http://www.scratchpaper.com. If you find bugs, please let me know.


Well, it's been two days and now Unicode NSIS 2.42.1 is out. My sincere thanks to everyone who reported bugs and worked with me to get the fixes done! Go get the latest from http://www.scratchpaper.com.


Differences in 2.42.1

I took the liberty of adding a few more features to this admittedly short release.

Added the ${NSIS_CHAR_SIZE} macro, which will be 1 in ANSI and 2 in Unicode. (This only exists in my versions of NSIS, but it would be nice to get it into the trunk as well.) This allows the installer developer to write code that uses IntOp to manipulate string pointers the same way for both Unicode and ANSI scripts.

I've also added System::StrAlloc which allocates strings of N characters not bytes. This means that System::StrAlloc 1024 in ANSI will be 1024 bytes but in Unicode it will be 2048 bytes.

Related to the above, I've made changes to the System plugin such that for &tN and &wN, N refers to characters, not bytes. And I've made documentation changes in System.html to that effect.
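Taken together, these features let a script do character-size-independent pointer arithmetic. A minimal sketch, assuming the 2.42.1 features described above (buffer size and registers are arbitrary):

```nsis
; Allocate a 1024-character buffer: 1024 bytes under ANSI NSIS,
; 2048 bytes under Unicode NSIS.
System::StrAlloc 1024
Pop $0                            ; $0 holds the buffer pointer

; Advance the pointer by exactly one character on either build:
IntOp $1 $0 + ${NSIS_CHAR_SIZE}   ; +1 on ANSI, +2 on Unicode

System::Free $0                   ; release the buffer
```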

I've also changed the names of the shell commands to reflect which version of NSIS you will be compiling against. So instead of getting the generic "Compile NSIS Script" in the context menu when you right click on an NSI file, now you get "Compile Unicode NSIS Script" or "Compile ANSI NSIS Script" or both if you have both installed. You may have noticed that Unicode makensisw.exe is smart enough to try to convert an ANSI NSI file to Unicode and compile it anyway. So if you are sure that your ANSI script needs no modifications, but just a file conversion to Unicode, then this is a nice feature. But it can also be a source of confusion so by making the shell commands more descriptive, you'll know what you are getting.

Enjoy.


I would suggest putting a SystemStrAlloc macro somewhere so that the ansi version can continue to call Alloc and save a couple of bytes in size


Well, you might be surprised to find that adding the function did not grow System.dll by a single byte. The code section grows in some increments and it just fits nicely into the space available. And if you really want to fuss over bytes, if you compare my version of the ANSI System.dll, it's actually 512 bytes smaller. And the Unicode version is the same size as the official one.


Auto conversion of text encodings:

Here's basically what the Unicode NSIS does as far as auto-encoding/decoding of text types.

  • If the file has no BOM in the beginning, it assumes it's an ANSI text file. (It will probably use the system codepage to convert it internally to UTF-16LE -- but I'm not sure about this. It may just strip out anything beyond ASCII and replace them with '?').
  • If the file starts with a 16 bit encoding of the BOM (the usual kind 0xFEFF), it will treat the file as a UTF-16 file and correct it for endianness, internally storing everything as UTF-16LE.
  • If the file starts with an 8 bit encoding of the BOM (0xEF 0xBB 0xBF), then it treats the file as a UTF-8 file and not an ANSI file. So for those who desire to store their scripts in UTF-8, that's fine as long as the file starts with a UTF-8 BOM. I understand that it's not usual for UTF-8 files to have a BOM, but since in the Windows world ANSI/ASCII files are more prevalent than UTF-8, this is what I chose. If people want UTF-8 to be the default, I'm open to that. Just let me know what your opinion is. Incidentally, notepad.exe, when saving files as UTF-8, will add a BOM to the start of the file for you.

The way I would prefer codepage detection to work is like this:

BOM Present: Use the corresponding encoding/codepage.
No BOM present: Check to see if the file can be fully parsed as UTF-8 (that is, no invalid byte sequences exist). If this succeeds, then it is very safe to assume that the file is indeed UTF-8. If it fails, fall back to ANSI/system codepage. Although we could, at least in theory, try to figure out UTF-16/UTF-32 without a BOM, due to the nature of the files, I suggest not doing so (since they are exceedingly rare, unlike BOM-less UTF-8).

To use this information efficiently, MakeNSISW should be able to perform this check (not only the actual MakeNSIS command-line compiler), and the command-line compiler should have a parameter to tell it the codepage in advance (or, at the very least, a parameter that tells it to skip the "valid UTF-8" check if no BOM is present). The parameter would probably be a good idea anyway, as it would more easily allow for an "Always assume UTF-8 if encoding unknown" option in MakeNSISW.


Well, there is no BOM for ANSI. There never has been and never will be. Basically, what we can do is make Unicode NSIS never work with ANSI with the exception of ASCII only scripts. If there is no BOM, assume UTF-8. If there is a BOM, assume UTF-16. This is the only other option in my mind.


Older unix tools might not support BOM for UTF-8, but that's not our problem ;)

If you really want UTF-8 without BOM, maybe makensis could check the first line for a comment like #NSIS: encoding=UTF-8


Well, my guess is, if they don't like putting 3 bytes in front of their file, they will like having to put "#NSIS: encoding=UTF-8" even less.


but the header is just ascii text that any editor can write, you don't need any special support


I think a lot of these guys are scripting their NSI files and so it's not a big deal to emit three bytes at the front. If they are writing their scripts via a text editor, they may be able to simply save as UTF-8 and it may embed the BOM for them (e.g. notepad.exe does). Or they may have a way to type hex codes: EF BB BF (decimal: 239 187 191). For example in vim, I just do this command:


:dig b: 239 o: 187 m: 191


And then I hit 'i' to insert text and type ctrl-k b: ctrl-k o: ctrl-k m:. And there I have it.
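From a Unix shell, the same three bytes can be prepended without any editor support at all (a sketch with hypothetical file names):

```shell
echo 'Name "Example"' > script.nsi      # some plain ASCII script
printf '\xEF\xBB\xBF' > with-bom.nsi    # write the UTF-8 BOM first
cat script.nsi >> with-bom.nsi          # then append the script body
```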

Or paste this guy: 

Long post ahead...

Originally posted by jimpark
Well, there is no BOM for ANSI. There never has been and never will be.
Well, no, but there doesn't need to be. I've yet to encounter an ANSI file (with characters > 128) that could be mistaken for a BOM-less UTF-8 file, because the odds of an ANSI file not containing any invalid UTF-8 byte sequences are astronomically low, due to the way UTF-8 works at the byte level: not only do all bytes > 128 have to be placed together, but all bytes in the range 128-192 must come after a valid lead byte (194-244), and a very specific number of trailing bytes has to follow. It can be expressed as a very simple state machine that checks each byte (the worst case, execution-wise, is that the file is UTF-8, so we end up running through the entire file, but even so, that process shouldn't take more than a second).

That's why I prefer not having to choose between "allow ANSI, but require UTF-8 BOM" or "disallow ANSI, always assume UTF-8", instead getting the best of both worlds by checking for UTF-8 validity if there is room for doubt: it generates the highest possible amount of backward compatibility, while still allowing people to save their UTF-8 the same way they've always been doing. (That doesn't mean we can't issue a warning or something if they use an ANSI file, but that's still a hell of a lot better than temporarily breaking all existing scripts by forcing a conversion.) Yes, it requires a bit of extra work to read the file into memory and check it before starting the compilation, but I think the value in that feature would be pretty great compared to the time it takes to develop it. (Of course, that's easy for me to say, since I'm not likely to be the one programming it. :p )

Basically, what we can do is make Unicode NSIS never work with ANSI with the exception of ASCII only scripts.
I'd certainly prefer this to not supporting BOM-less UTF-8 (partially because I no longer use Notepad for editing scripts, but gVim, and having to remember to add the BOM will be a pain - particularly since not getting these useless BOMs is one of the reasons I switched), but see above.

If there is no BOM, assume UTF-8. If there is a BOM, assume UTF-16.
I assume you mean "If there is a UTF-16 BOM". ;)

This is the only other option in my mind.
Originally posted by Anders
Older unix tools might not support BOM for UTF-8, but that's not our problem ;)
Except NSIS can do cross-compilation on *NIX systems and, to the best of my knowledge, most *NIX editors don't place a BOM unless explicitly told to (the only advantage is for identification, and most Unicode-aware tools try UTF-8 anyway, so no real benefit is observed).

Forcing people to do this also means forcing them to change their usual editing routines beyond just having to select a different encoding - they need to know "does my editor add a BOM" (and how they can add it if it doesn't), and I think that could potentially put some people off from going with Unicode NSIS.

If you really want UTF-8 without BOM, maybe makensis could check the first line for a comment like #NSIS: encoding=UTF-8
It's not really logical to have to force a Unicode-aware application to use Unicode like this. You don't have to use files with signatures for Notepad to recognize UTF-8, to give an example.

This sort of thing could be more useful if extended to allow specifying non-Unicode code pages, such as sjis or big5, in which case Unicode NSIS would convert from that particular codepage, since that would allow Unicode NSIS compilation of an ANSI script on a random system - but I'm not convinced this is a relevant feature for very many people.

Originally posted by jimpark
For example in vim, I just do this command: <snip>
Just a comment on this: check out :h bomb.

Or paste this guy: 
If you use the same codepage as Jim, that is. ;)

Just a comment on this: check out :h bomb.
Wow. I did not know that. Very cool. Vim rocks. Now, is there a universal font that is available in Vim that can work on a lot of different scripts? For example, I'd love to see Arial Unicode work for me but I can't seem to choose that font for my guifontwide setting. Is there a font like that out there that I can download and have it work for Vim or somehow get Arial Unicode to work?

If you use the same codepage as Jim, that is. ;)
True but everyone should have Latin-1 as their codepage! What else could we possibly need? :p

Originally posted by jimpark
Wow. I did not know that. Very cool. Vim rocks. Now, is there a universal font that is available in Vim that can work on a lot of different scripts? For example, I'd love to see Arial Unicode work for me but I can't seem to choose that font for my guifontwide setting. Is there a font like that out there that I can download and have it work for Vim or somehow get Arial Unicode to work?
The help file states that all GUI versions of Vim, with the sole exception of the GTK2 version (you might be able to compile a GTK2 version for yourself and use that, but even then), allow only monospaced fonts. Personally, since my editing is pretty much restricted to Danish, English and Japanese, I use MS Gothic: it contains all the characters I need (although not all of the non-CJK characters are particularly pretty). A Chinese font like SimHei would probably give better Kanji/Hanzi support, though, since the Chinese use a much larger set of characters (I haven't tested this, though...)

We're getting pretty off-topic, though, so let's try to get back to NSIS... :blah:

Problem with non-Unicode InstallOptions INI files
Another problem which we are facing in order to use Unicode NSIS in Mozilla is about InstallOptions INI files. We use WriteINIStr, which seems to write the INI file as ANSI and convert to a Windows codepage on the fly.

Is there any way to write Unicode INI files from within the Unicode NSIS, or any other similar effects? Without this, our custom installer pages will not benefit from being Unicode...

Thanks!
Ehsan


Your best bet is to move to nsDialogs

IIRC, NT will write unicode ini files if the .ini already has a UTF-16LE BOM


http://blogs.msdn.com/michkap/archiv...15/754992.aspx

..so I guess you can use the nsis FileOpen/FileWriteByte commands to write the unicode BOM before you use the ini functions
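Anders' suggestion might look like the following sketch (the file name is hypothetical; the UTF-16LE BOM bytes 0xFF 0xFE are written in decimal, as FileWriteByte expects, and the assumption is the MSDN-documented behavior that WritePrivateProfileString keeps an already-Unicode file Unicode):

```nsis
; Seed the INI file with a UTF-16LE BOM before any INI functions touch it.
FileOpen $0 "$PLUGINSDIR\custom.ini" w
FileWriteByte $0 "255"    ; 0xFF  } the UTF-16LE BOM
FileWriteByte $0 "254"    ; 0xFE  }
FileClose $0

; Subsequent INI writes should now stay Unicode instead of ANSI:
WriteINIStr "$PLUGINSDIR\custom.ini" "Field 1" "Text" "Ünïcödé label"
```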


Well, it actually wouldn't be too hard to support WriteINIStrUTF16LE. And after thinking about Pidgeot's idea, and taking a moment to write the ValidateUTF8 function myself, I think it's probably best to not require a UTF-8 BOM. I'm starting to see a potential for 2.42.2 already.


I wonder if I shouldn't just make the INI files it generates as UTF-16LE instead of creating a new function. I think Windows supports INI files as UTF-16LE with no problems.


I dug a little deeper about WriteINIStr and this is what I found.

WriteINIStr uses WritePrivateProfileString() in kernel32.lib. If you look at the documentation for this function it states: "If the file was created using Unicode characters, the function writes Unicode characters to the file. Otherwise, the function writes ANSI characters."


did I not already say this?


You may have. This thread is long and my memory is short. Ah, I see your post now. I just missed it.


I've semi-secretly released a prototype to 2.42.2 which can read UTF-8 NSI files. It does not require a UTF-8 BOM. If it finds a BOM in the beginning of the file, it uses the BOM as a cue to how to read the file. If it does not find a BOM, it checks the file to see if it looks like a valid UTF-8 file. If it's valid UTF-8, the file is read as UTF-8 and internally converted to UTF-16LE so that it can call the Win32 API. If it does not have a BOM and it does not validate as UTF-8, then it is opened as ANSI. But if this latter thing happens, anything that is > 0x7f probably gets converted to question marks.

http://www.scratchpaper.com/nsis-2.4...code-setup.exe

So basically this version of NSIS officially supports:
UTF-8
UTF-16LE

It may work with UTF-16BE with the BOM (I'm relying on Microsoft on this one). But it does not work with UTF-32 of any endian flavor. Not sure there's a big demand for it anyway.

Let me know if you run into any problems. If it looks solid, I will release it to the general public some time next week or so.


Now that it can use UTF-8, wouldn't it be best to use that everywhere instead of having ANSI and Unicode versions of almost every file? If so, please be sure to strip the BOM character from the beginning of each UTF-8 file.

${NSIS_CHAR_SIZE} sounds like a bad idea, nsis shouldn't assume any specific size for characters - each character in UTF-8 can be multiple bytes long and same for UTF-16, that can have characters that are more than two bytes long. Only UCS-2 and UTF-32 have constant-sized characters.


you are missing the point with NSIS_CHAR_SIZE, some windows api structures have hardcoded buffer sizes (GetVersionEx etc.)


Pabs, NSIS_CHAR_SIZE is the internal char size representation. You absolutely need it. Unicode NSIS does NOT internally store things as UTF-8. It can read UTF-8 files but internally it works with strings as UTF-16LE which is the native Windows encoding of Unicode.

Also UTF-8 is not a superset of ANSI. UTF-8 is a superset of ASCII, but as soon as you hit codepages, UTF-8 and ANSI are foreign to each other. So anything with localizations cannot be shared. But a lot of the NSH files and samples CAN be reduced. You still have to read the code very carefully to see if they are strictly the same. If it uses the System plugin, for example, the strings you allocate need to be adjusted for Unicode vs. ANSI: Unicode should be double the width. They must call the appropriate API -- the wide versions vs. ANSI. They must use the right messages -- sometimes there are two messages with the same macro name but different values, because one message ID returns ANSI string values while the other returns UTF-16LE values. Using things like System::StrAlloc, ${NSIS_CHAR_SIZE}, and ${NSIS_UNICODE} can help you write code that will port easily between the ANSI and Unicode NSIS, but it will require quite a bit of care to make sure it is done right.

Here's an example where it's not so simple to share the same code for both Unicode and ANSI NSIS.

http://forums.winamp.com/showthread....hreadid=300271

Albeit, if you only do the basic stuff in your installer, you probably won't see these kinds of problems, and keeping your installer just UTF-8 may be quite doable. (If you don't have any localized strings in the NSI script and rely solely on the Language Files distributed by NSIS, then even a non-Latin installer may be able to go this route.)

If you really want to give it a try, please feel free. One thing I've noted is that the further you stray from the trunk, the harder it is to merge the changes for each release.

BTW, I definitely appreciate any help I can get. So I do value your input (and enthusiasm).


OK, fair enough on the size thing.

Are there any changes that can be merged now? Perhaps the conversion of the plugins/makensisw/etc to TCHAR instead of char?

Is there a need to duplicate the .nsh, .nlf etc files? This creates a maintenance headache IMO.

For example, the Unicode/ANSI versions of COPYING and nsisconf.nsh are identical after conversion to UTF-8. Same for the following directories containing Unicode & ANSI versions: Contrib/zip2exe/ Contrib/InstallOptions/ Contrib/Splash/ Contrib/Banner/ Contrib/AdvSplash/ Contrib/BgImage/ Contrib/VPatch/ Contrib/Modern UI/ Contrib/MultiUser/ Contrib/UserInfo/ Contrib/StartMenu/ Contrib/nsDialogs/ Contrib/Modern UI 2/.

I think it would be best to have one version of them (encoded in UTF-8 for compatibility with legacy text editors and Unix tools).

The differences due to System::Call or structure/string sizes can be wrapped in !if ${NSIS_UNICODE} or similar.
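Such a wrapper might look like the following sketch (the suffix define is illustrative, !ifdef is used since the symbol may simply be absent in an ANSI build, and it assumes the System plugin's t type maps to the build's native character width):

```nsis
!ifdef NSIS_UNICODE
  !define _API_SUFFIX "W"   ; call the wide Win32 entry points
!else
  !define _API_SUFFIX "A"   ; call the ANSI entry points
!endif

; Only the branch matching the compiler survives into the installer:
System::Call 'kernel32::GetTempPath${_API_SUFFIX}(i ${NSIS_MAX_STRLEN}, t .r0)'
DetailPrint "Temp dir: $0"
```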

makensis can be made to convert .nlf & .nsh strings to ANSI/UTF-16LE at runtime.

Not sure how to deal with the Contrib/UIs/ differences, presumably you could set the class to RichEdit20, use a define of some sort or makensis could set the class at installer creation time.

I think the differences due to registry key names and such are gratuitous and not necessary.


I agree with you Pabs. I'd like to see that done as well. When they are the same, we should keep them the same. Only fork them if they need to be different like the Language Files.

One thing about the !if ${NSIS_UNICODE} checks is that I'm afraid they might unnecessarily make the scripts harder to read. I know I hate it when I see a lot of #ifdef's in C/C++ code. Also, the !if ${NSIS_UNICODE} checks may bloat the resulting installer with extra code as well. I'm not sure I like that.

As for the NSIS source code, I'd still like to be able to generate both the Unicode and ANSI builds of NSIS until everyone decides that ANSI is dead. Then we can do what the guys at PortableApps are thinking [1] -- eventually, wrap up a final build of ANSI NSIS for those writing installers for legacy systems and continue on with Unicode only. :)

[1] http://portableapps.com/node/17291


I was under the impression that !if was an installer-build-time statement like #ifdef in C, if it isn't, I mean whatever is available for installer-build-time statement conditionals.


Looking at the documentation, !if and !ifdef are exactly what is needed; they are conditional compilation, the ignored bits don't end up in the executable.

There aren't that many differences between the Unicode and ANSI examples and include files in the Unicode NSIS for you to be worried about making them harder to read.


UAC + Unicodebuilt
Hi

its my first post here ;)

Anders + jimpark

you did both a great job ;)

BUUUUUUUUUUUUUT

i use Anders UAC plugin together with your unicode nsis

with the ANSI build everything works

but in unicode mode...
during install for all users:
${UAC.RunElevatedAndProcessMessages}
only opens an adminpassworddialog at first run of the setup :(

if u start the setup again for all users u wont see that dialog anymore :(



next problem is:
if i call the uninstaller from the unicodebuild this line makes troubles:
${UAC.U.Elevate.AdminOnly} "${UNINSTALLER_NAME}"

error pops up:
unable to elevate, error 05536


hope u can help me :(

ty


Software:
OS: XP Pro
NSIS: 2.42.1 ANSI + UNICODE (jimpark)
UAC: v0.0.10a - 20081004 (Anders)


"if u start the setup again for all users u wont see that dialog anymore" does not make any sense, the UAC plugin does not save any state, so it should work the same every time. Can you explain what you mean by "for all users"?

Could you post a minimal example script with these problems?


Quote:


I took your UAC RealWorld example script
there u show a dialog where u can choose between single user and ALL USER installation

All user means -> NEED ADMIN RIGHTS

if a normal user choose that
the admin PWD dialog pops up

if i try this with ANSI build everything works
I can start the setup as often as i want and i get allways this ADMINPWD dialog

if i try the same script build with the UNICODE NSIS
the dialog will only be showen at first start :(

UAC::IsAdmin returns $0 >=1
if i run it as normal user!

thats the strange thing for the installation

maybe you can contact me in skype
ing30er

than i can show u my script

Originally posted by Anders
"if u start the setup again for all users u wont see that dialog anymore" does not make any sense, the UAC plugin does not save any state, so it should work the same every time. Can you explain what you mean by "for all users"?

Could you post a minimal example script with these problems?




I only see one version of the UAC.dll in the UAC plugin package. UAC will need to be built linking to the Unicode NSIS library using the Unicode string NSIS stack.


v0.0.10a is not the latest version (and like Jim said, you need to recompile for use with nsis unicode)

I just tested the UAC RealWorld example and it worked fine (XP SP2 (non admin), NSISU 2.42.1)


....nsis.sourceforge.net/UAC_plug-in....

i see u updated that page anders ;)

last week it showed v0.0.10a ;)

can u tell me how i can convert your UAC.dll to unicode build?

sorry im noob in building dlls :(

would be nice if u can put both dlls in your UAC.zip Anders

or offer a unicode version

that would be veeeeeeeeeeeeeeeery nice :)


yeah I updated the page, but v0.0.11 has been on stashbox for months. To build the unicode version, just make sure UNICODE is defined (remove the comment in uac.h, or set the define in the project options)

I am planning to include the unicode version in the "official" package, I just need to automate the build


ok thx for your help

pls tell us than if u have done your new release with the unicode build included ;)


I just released 2.42.3 which has support for UTF-8 with and without a BOM for all the NSI, NSH, and license files. Enjoy.


What variable should be used in the .nsi/.nsh files? !ifdef NSIS_UNICODE? I'd like to commit the needed changes to the .nsh files that need to be different between Unicode and ANSI builds.

PS: could you please post future versions as a patch against latest SVN rather than against the latest release?


I also think that the changes to the plugins and makensisw code could be committed now since we don't yet compile with _UNICODE defined. Would that be OK kichik?

I note you use nsis_tchar.h instead of the Win32 tchar.h, why is that?

No need to add your name to every file, just credits.but is enough.

I notice you add some function documentation, please split stuff unrelated to the Unicode changes into other patches, so we can commit them earlier.

In some cases you comment out string definitions or code and add new ones afterwards, please just replace the old ones with new ones.

Please remove the ANSI/Unicode dirs and convert the files in the Unicode ones to UTF-8 without a BOM and have them replace the current ANSI files. Please also remove the related changes to the SConscripts.

I'm unsure about the changes to the cross-platform tools, on Linux at least, we just use UTF-8 everywhere.

Please don't refactor functions or add comments in the Unicode patch, but instead create a separate patch so we can commit that earlier.

Best not to remove existing comments or change whitespace either; that's another change that is unrelated to the Unicode stuff.

TypeLib.cpp seems to use PopStringW, doesn't it need to work in ANSI mode too?

Don't need to add comments like // reviewed for unicode support.

Don't need to add .aps files, ascii2utf16.py, Visual Studio 2008 directory, create_header.py, header-unicode.pdn, *.bat, zipsource.pl, modify_copyright.pl etc

Docs don't need to link to the unicode-nsis website.

s/th ANSI/the ANSI/ in intro.but

history.but doesn't need to be changed.

Scons directory should be named SCons.

Why the MSVS_VERSION change in mstoolkit.py?

2.42.3 fails to build under Debian, some of the errors:


Using GNU tools configuration
In file included from Contrib/Library/LibraryLocal/LibraryLocal.cpp:11:
Contrib/Library/LibraryLocal/../../../Source/tstring.h:24:21: error: windows.h: No such file or directory
In file included from Contrib/Library/LibraryLocal/LibraryLocal.cpp:10:
Contrib/Library/LibraryLocal/../../../Source/Platform.h:692: error: 'TCHAR' does not name a type
scons: *** [build/debug/Library/LibraryLocal/LibraryLocal.o] Error 1



In file included from Source/manifest.h:22,
from Source/build.h:30,
from Source/build.cpp:26:
Source/tstring.h:24:21: error: windows.h: No such file or directory
In file included from Source/build.h:22,
from Source/build.cpp:26:
Source/strlist.h: In member function 'int SortedStringListND<T>::find(const TCHAR*, int, int, int, int*)':
Source/strlist.h:377: error: there are no arguments to '_stricmp' that depend on a template parameter, so a declaration of '_stricmp' must be available
Source/strlist.h:377: error: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
Source/strlist.h:385: error: there are no arguments to '_strnicmp' that depend on a template parameter, so a declaration of '_strnicmp' must be available
In file included from Source/build.cpp:26:
Source/build.h: At global scope:
Source/build.h:99: error: 'TEXT' was not declared in this scope
Source/build.cpp: In member function 'int CEXEBuild::add_string(const TCHAR*, int, WORD)':
Source/build.cpp:461: error: '_strdup' was not declared in this scope
Source/build.cpp: In member function 'int CEXEBuild::preprocess_string(TCHAR*, const TCHAR*, WORD)':
Source/build.cpp:606: error: '_strdup' was not declared in this scope
Source/build.cpp:615: warning: suggest parentheses around + or - in operand of &
Source/build.cpp:615: warning: suggest parentheses around + or - in operand of &
Source/build.cpp:647: error: '_strdup' was not declared in this scope
Source/build.cpp: In member function 'int CEXEBuild::add_label(const TCHAR*)':
Source/build.cpp:963: error: '_strdup' was not declared in this scope
Source/build.cpp: In member function 'int CEXEBuild::add_function(const TCHAR*)':
Source/build.cpp:1025: error: '_strnicmp' was not declared in this scope
Source/build.cpp: In member function 'int CEXEBuild::add_section(const TCHAR*, const TCHAR*, int)':
Source/build.cpp:1181: error: '_strnicmp' was not declared in this scope
Source/build.cpp:1187: error: '_stricmp' was not declared in this scope
Source/build.cpp: In member function 'void CEXEBuild::warning(const TCHAR*, ...)':
Source/build.cpp:3253: error: '_vsnprintf' was not declared in this scope
Source/build.cpp: In member function 'void CEXEBuild::warning_fl(const TCHAR*, ...)':
Source/build.cpp:3274: error: '_vsnprintf' was not declared in this scope
Source/build.cpp: In member function 'void CEXEBuild::ERROR_MSG(const TCHAR*, ...) const':
Source/build.cpp:3298: error: '_vsnprintf' was not declared in this scope
Source/strlist.h: In member function 'int SortedStringListND<T>::find(const TCHAR*, int, int, int, int*) [with T = uservarstring]':
Source/uservars.h:76: instantiated from here
Source/strlist.h:377: error: '_stricmp' was not declared in this scope
Source/strlist.h:385: error: '_strnicmp' was not declared in this scope
scons: *** [build/debug/makensis/build.o] Error 1

Pabs, I can see how without knowing the motivations for the changes I made, you might think I'm an ego maniac. For example:

No need to add your name to every file, just credits.but is enough.
Don't need to add comments like // reviewed for unicode support.
I used a Perl script to keep status on what files I've modified and which ones still need to be reviewed. That string with my name in it is what the script looks for -- it was my own sign off that the file was finished. It was natural to use my name in such a comment. Those comments are still useful for me. I'm not removing the comments. But when they get into the official release, feel free to purge the comments.

Don't need to add .aps files, ascii2utf16.py, Visual Studio 2008 directory, create_header.py, header-unicode.pdn, *.bat, zipsource.pl, modify_copyright.pl etc
They are useful for me. Most of these scripts are helper scripts. If they are not useful for you, don't include them in your build.

Docs don't need to link to the unicode-nsis website.
If my stuff doesn't link to the unicode-nsis website, what will? My site is the only place to get the Unicode source and build at the moment.

history.but doesn't need to be changed.
Ah, but it should have been changed even more. It should be kept up to date with all the changes I've made and features I've added on top of what's happened in the official build. I've just been too lazy to keep it up to date.

Best to not remove existing comments or change whitespace either, thats another change that is unrelated to the Unicode stuff.
I found that the files were inconsistent. Some were saved as DOS text files, others Unix style files. Same with white space. I just made them consistent because I use one text editor and me touching the file made the diffs too different. Again, it's a change I made to make updating the build with the changes of the official release easier.

Please don't refactor functions or add comments in the Unicode patch, but instead create a separate patch so we can commit that earlier.
You are just teasing me with future promises. :) I refactor when my engineering sensibilities say I should. (It's probably what happens when you work on something by yourself for 2 years.) Personally, I think more refactoring and restructuring would do the source code good. But again, I do this because it helps me keep updating the build.

Why the MSVS_VERSION change in mstoolkit.py?
I made the change to be able to build on MSVS 2005 and later MSVS 2008.

2.42.3 fails to build under Debian, some of the errors:
Sorry, I don't cross build on Debian. It's difficult enough to get this working on MSVS and update the code every release all by myself. So if you'd like to figure it out and let me know what needs to be changed, I'd appreciate it.

having been reading through this thread for some time, have to say that how jim is running the unicode package pretty much works fine as far as end-users are concerned and that really is the main thing overall. yes it'd be nice as pabs says to have patches, etc but considering the work done and other things that need to be changed there's not much point until it is decided for definite how things are going to go as an official unicode nsis package.

with the thing about refactoring, i know kichik has done loads of that over the process of the nsis project so there's no real issue with it being done in a branched version of the code (as this ultimately is) and is no more different than the noobjs branch or the one justin did a few months back - is good to try things out especially in a side-project before/if it is rolled out into the main distro.

-daz


The list was merely a bunch of things I thought that should be changed before including the Unicode stuff in trunk and releasing it as part of NSIS officially. I do understand you made those changes because they were useful as part of a separate branch though. I certainly didn't take you for an ego maniac Jim :)

I'd like to start the merge of the patch with the changes to the plugins, then some of the changes to the include scripts and examples. These changes should have zero impact on the current ANSI version of NSIS (because _UNICODE isn't defined anywhere yet) and will make the diff afterwards much smaller and easier to review. I'd like to hear some devs opinions on that first though, kichik, anders etc?

After that, I'm not sure what was decided about what the Unicode support should look like.

The Debian build failure looks like some of the cross-platform code is trying to include windows.h through tchar.h/tstring.h, which obviously fails on non-Windows platforms. Not sure where to start on fixing that though.


My offer of developing this as a branch on the main SVN still stands. This will allow for better logging so there will not be any need for comments on changed code. It will also be much easier to back port refactored code to trunk.

As for modifying code in the plug-ins that won't affect the ANSI build, I wanted to keep that for later because I feel we're going to need some kind of nchar_t, at the very least in makensis, so it'd all go smoothly on all four flavors of win32/posix and ansi/unicode mix. Paul's problems with tchar.h are part of what I want to avoid.


I'm thankful for the offer but I think you have the roles reversed. I'm offering you my modifications to enable Unicode support in NSIS. I've been offering it to you for a year and a half. So I would be more thankful if you said that you will take the code and create a branch that you and others in the development team will henceforth maintain as being an official build of NSIS. Then I can be out of the picture knowing that NSIS will officially support Unicode and in good hands. And you can do whatever you'd like with it.

So as Pabs mentioned, I think I've done quite a bit to make it as easy as possible for you to take it and run with it. With UTF-8 support, most of the scripts don't even need to be converted to anything. Just a dab of !ifdefs in there to make sure things are calling the right API. But as long as this does not become part of the official NSIS releases maintained by the official team, as long as I have a personal need for it, I will have to maintain my fork as long as I can. To that end, I will do whatever is necessary to make that task easier for me even if it means offending some sensibilities of the official developers.

So to be absolutely blunt: take the code, do what you like with it. If you don't plan on working on it, then you forfeit the right to complain about things that don't affect you.

But I sincerely hope you take it.


I'm not offended. I'm very grateful you keep maintaining this branch. I'm simply suggesting a method that would make it easier for the both of us. It'll be easier for me to track changes, back port non-Unicode related changes and insert my own changes to the Unicode branch that can later help with the merge. It will be easier for you to keep a log of changes, affect the trunk where needed and make new version merges easier. Overall, I think it'll save time for the both of us and benefit the community.


Okay, I think that would be meeting in the middle. So how do we proceed? Does this mean taking the most updated source I've got and committing it to the official repository as a fork? And then we all work with that? Or does it mean, I have to do another month of work so that it's in the form of patches to the current official source code? If it's the latter, I don't think I've got the time and resources to do that.


I give you access to SVN, you open a branch and commit your changes in any way you like.


Sure. Let's do it.


SVN already contains a UNICODE branch that hasn't been updated in 2 years; perhaps that should be deleted and a new one created?


I don't know about deleting it but the plan is to add a new one. I've been busy again so I haven't looked at the newest release. My plan is to check the Unicode source in once the code is updated to the current release of NSIS.


Sounds good.


variables
How can I make variables in Unicode NSIS also store Unicode stuff? Is there any special declaration required for this?
Here is the example code:


var StorageDir
!define MUI_DIRECTORYPAGE_VARIABLE $StorageDir
!insertmacro MUI_PAGE_DIRECTORY

section
fileOpen $0 "$INSTDIR\bin\StorageDir.bat" w
fileWrite $0 $StorageDir
fileClose $0
sectionEnd


When the user selects a StorageDir with a non-ASCII name, its value is replaced by question marks later in the script, and question marks end up in the output file. Is there a way to make it work? I tried recoding the script to UTF-8 and UTF-16LE, but that does not help.

Thanks,
Alex

Use FileWriteUTF16LE to write Unicode text to the file.
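Applied to the earlier script, the section might become (a sketch against the Unicode build, reusing the original paths and variables):

```nsis
Section
  FileOpen $0 "$INSTDIR\bin\StorageDir.bat" w
  ; FileWriteUTF16LE writes the string as UTF-16LE instead of
  ; converting it down to ANSI the way plain FileWrite does.
  FileWriteUTF16LE $0 $StorageDir
  FileClose $0
SectionEnd
```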


Awesome! What about matching strings? I need to replace a string in a file, and I am using a solution from
http://nsis.sourcefor...ext_File:


StrCmp $2 "somestring$\r$\n" 0 +2
StrCpy $2 "$storageDir$\r$\n"


How do I match end of line there?

I think at least a short list of these new commands deserves a place in the FAQ...

I figured it out by splitting file to chunks.

But it turns out UTF-16LE would not work for me, I need output file in UTF-8. There is an NSIS plugin which converts between ASCII and various Unicodes, but not between UTF-16 and UTF-8. Is there a way to make it work without using an external converter?


Currently, there is no way to do it without a plugin or additional functionality written to the NSIS exehead. We currently only support the default Unicode encoding of Windows which is UTF16LE.


you can call WideCharToMultiByte with the system plugin, CP_UTF8 is supported on XP and later IIRC


Anders is right but unfortunately, all the strings are actually stored as UTF-16LE internally. And when FileWrite is called, I do an internal UTF-16 to ANSI conversion to write it out. So after the conversion, you can't just use FileOpen and FileWrite to write out this new text. You would have to use the System plugin to open your own file and then write out the new UTF-8 buffer you just created using the Windows API yourself.


there should really be a FileWrite /ascii ... flag or something like that


Error compiling example

Originally posted by jimpark
I just released 2.42.3 which has support for UTF-8 with and without a BOM for all the NSI, NSH, and license files. Enjoy.
Hi Jim,

I'm playing around with your Unicode build. Nice work.
While trying to compile all the examples, compilation of gfx.nsi leads to an exception upon the first !insertmacro BIMAGE....

Compilation using ANSI-NSIS V2.41 works, although the images are weirdly stretched.

New Unicode.dll

Originally posted by akopts
I need output file in UTF-8. There is an NSIS plugin which converts between ASCII and various Unicodes, but not between UTF-16 and UTF-8.
Since I had to cope with the Unicode.dll last week, I added the missing routine FileUnicode2UTF8 to Unicode.dll.


int WideCharToMultiByte(
__in UINT CodePage,
__in DWORD dwFlags,
__in LPCWSTR lpWideCharStr,
__in int cchWideChar,
__out LPSTR lpMultiByteStr,
__in int cbMultiByte,
__in LPCSTR lpDefaultChar,
__out LPBOOL lpUsedDefaultChar
);

According to Microsoft's documentation, WideCharToMultiByte should return the buffer size needed if the cbMultiByte parameter is 0, but I always got 0.

So to convert a file to UTF-8, I make the buffer as large as the input file (UTF-16), and if the function returns ERROR_INSUFFICIENT_BUFFER, I reallocate the buffer with double the size and repeat.

I hope it will be of use.
Axel

Thanks for reporting the bug, AxelMock. I've found the problem and with a few other fixes, I've bundled it up as 2.42.4 which is now available on my site. I have yet to work on the 2.43 version though.


Thanks for the advice! What I ended up doing, though, was writing a small Java program to do the conversion and substitution. Since I am installing Java software, Java is always available on the system...


How can I disable this output from makensis.exe?
----
File 'C:\Program Files\NSIS\Unicode\nsisconf.nsh' has a BOM marked as UTF-16LE.
Opening 'C:\Program Files\NSIS\Unicode\nsisconf.nsh' as UTF-16LE.
Opening 'C:\Users\tolan.ALAWAR\AppData\Roaming\nsisconf.nsh' as UTF-16LE.
File 'bigtest.nsi' has a BOM marked as UTF-16LE.
Opening 'bigtest.nsi' as UTF-16LE.
----


Why is that output a problem?


I'm employing a home-brewed make_installer script to build the installer. One thing it does is check the NSIS output for errors with the /V1 option. If there is something, the script grabs it and displays it to the user in a nice MessageBox with big red buttons.

Sure, I can check for errors other ways.

However, it seems weird to me that even the /V0 option is ignored.


Well, I considered it necessary to know what you were getting. makensis.exe will now pretty much read anything so without that message, you won't know what sort of encoding was used. If everybody feels that /V0 and /V1 should remove these messages, it can be done.

As for your script, you can probably grep for "Error:" or "error" and get your error messages.


I think those messages are OK for /V3 or /V2. No need to completely remove them.


MakeNSIS.exe returns an exit code that can be used to determine if there were errors. You can check it with batch or vbscript commands (whichever way you run makensis).

I do my builds with /V3 and check the exit code for 0.

Don
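Don's check can be sketched as a batch fragment (the script name installer.nsi is hypothetical):

```bat
@echo off
makensis /V3 installer.nsi
rem makensis returns 0 on success and non-zero on error,
rem so the build result can be checked regardless of what
rem informational messages were printed to the console.
if errorlevel 1 (
    echo Build failed!
    exit /b 1
)
echo Build OK
```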


I believe some messages are already printed regardless of your verbose level. So how do you deal with those tolan?


jimpark, such cases should be reported as bugs. Everything makensis prints should be controlled by the verbosity level. As for those specific messages, I believe SCRIPT_MSG would be best.



Yes, that's what I was thinking but it's not worth making another release. It should be fixed in the next release when I get around to it.


Hi Jimmy,

Unicode NSIS is awesome. I have just compiled Quick AVI Creator with Unicode NSIS and everything seems fine. The only problem I seem to have is unrelated to Unicode NSIS: it is with the Unicode plugin I used for converting ANSI text subtitles, but I guess I will find some workaround.

Unicode NSIS definitely deserves to be merged into the main NSIS project.

Congrats Jimmy.


It appears that somebody did add Unicode support to the AccessControl plug-in but the download is missing from the plug-in page. http://nsis.sourceforge.net/AccessControl_plug-in
Does anybody have a copy of this somewhere or know of something else I can use from the Unicode install to change file user permissions?


Unicode + WriteIniStr?
I've got a legacy installer that uses InstallOptions, and WriteIniStr to choose exactly what that dialog will display at runtime. Doing i.e.

StrCpy $R1 "${ALREADY_TEXT}"
MessageBox MB_OK $R1
WriteIniStr "$PLUGINSDIR\already.ini" "Field 2" "Text" $R1


I see the right text in the message box, but I see munged text in the installer. The .ini file contains the munged text full of question marks.

Does WriteIniStr not handle Unicode? Is there an alternative that does, or a workaround?

Unfortunately, the Windows API that deals with INI files saves everything as ANSI even when the wide-char versions of the API are called.


did we not cover this already? the .ini file needs a UTF16LE BOM before you start writing to it with ini functions IIRC


Oh yeah, we did cover this. Sorry Anders. My memory is damaged or something. Yeah, so as Anders says, if you create the file yourself and add the BOM, then you can use the WriteINIStr afterwards and things work. Somewhere in this thread or elsewhere in this forum, we have the same conversation where Anders corrects me.
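A minimal sketch of the BOM trick, reusing the already.ini path from the earlier post (FileWriteWord writes a 16-bit little-endian value, so 65279 is the 0xFEFF BOM):

```nsis
; Create the .ini with a UTF-16LE BOM first; afterwards the
; WritePrivateProfileString API keeps the file in Unicode.
FileOpen $0 "$PLUGINSDIR\already.ini" w
FileWriteWord $0 "65279"   ; 0xFEFF, the UTF-16LE byte order mark
FileClose $0
WriteIniStr "$PLUGINSDIR\already.ini" "Field 2" "Text" $R1
```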


That was exactly it, thanks. I tried to search, and tried to wade through all the pages here, but missed it. Everything's working wonderfully now.


Does anybody know of an unzip plugin for Unicode NSIS?
NsisUnz and ZipDll do not work ;(


Guys, please anybody help me!
I'm trying to migrate my NSI script to Unicode NSIS, but I can't unzip my data file. I download it from a web site, but I can't extract it. Please, help me! :(


Each plugin must be modified and recompiled to work with Unicode NSIS. If those plugins are a must for you, you should contact their authors or modify the plugins yourself. What is it that you are actually needing to do? Perhaps, there's another way to do it without using the plugins.


I tried to rebuild the nsisunz and ZipDll plugins, but all of my attempts failed :( I'm not sure the authors still support their plugins.

Depending on the file name, I want to download a different archive file from the server in real time and extract it on the user's computer.
The archive is created manually, so I can use any archiver.
Do you see other way to fix this problem?

Thanks.
PS: Sorry for my english.


How can a plugin detect whether it has been called from Unicode NSIS or from ANSI NSIS? Maybe add some new parameter to the "extra_parameters" structure.


@Instructor, funny you should ask that, I was thinking about it the other day. When hwndNSIS is non NULL you can check that, but if you also support .onInit or silent installers, the only thing I could come up with was checking $pluginsdir:
if the 2nd BYTE is 0, you can be (pretty) sure it's UNICODE

It will be true for any default $pluginsdir:
C:\foo
.\foo
\foo

now if someone changed $pluginsdir to a relative path like "unicodefoldername\foo" it could break, but I don't see a valid reason to do that

$pluginsdir is INST_LANG+2 and not documented, but that's another issue (the same goes for $temp)

The final option is to check the import table of the .exe at runtime, but that is a bit more work


V2.45 as Unicode build??
Hi,

I might get into the situation having to support a bunch of message box texts, and customized texts in 25 languages.

Translation will be done by a translation service which will surely deliver all results as Unicode.

So the Unicode build of NSIS will become a MUST.

The only problem is that we must support Windows 7, and as I understand it, only NSIS 2.45 has the necessary changes to support Win7.

Therefore: Is a 2.45 build of Unicode NSIS in sight or will Unicode build be merged to the standard build of NSIS?

Sorry, I couldn't follow all the discussions for the last 2 or 3 months.

Regards,
Axel


I, too, hope that NSIS will go Unicode officially at some point in the near future.


I agree. Jimpark's work is great. Why is NSIS still using the non-Unicode branch as the default? I know, I know - Win9x doesn't support Unicode. But who the hell uses Win9x anymore? It's more than 10 years old, and frankly the only people using it are using it to support old technology (not to download and install the latest program from the internet).

And if NSIS is going to continue supporting old technology that no one uses anymore, why doesn't it support Windows 3.1? Or, heck, Windows 1.0? I'm sure someone uses those.

The first mainstream Windows to support unicode was Windows 2000 - nearly 10 years old itself. So this is hardly newfangled technology.

Kichik, what would it take to make jimpark's code the default branch? And what's stopping this changeover?


Originally posted by vbguy
I agree. Jimpark's work is great.
Fully agree!
And I agree with you that Win9x support nowadays is a bit difficult to understand.
Even W2K drivers can only be certified by MS as an addendum to XP drivers.

Is there any good reason why NSIS support for Win9x could not stop with some version (e.g. 2.45), with all newer versions only supporting Unicode-capable OSes?

Perhaps we should start a poll in the NSIS forum WHO is still developing for Win9x and why, and if these developers could live with a frozen ANSI NSIS.

To start with myself:
W2K is the minimum required OS.

Originally posted by vbguy
And what's stopping this changeover?
time and resources i'd go with.

-daz

Originally posted by DrO
time and resources i'd go with.

-daz
Right, but that's why I was asking. Tell us what needs to be changed or fixed so the community can chip in and get the Unicode branch officially supported.

In other words, Jimpark already did most of the work. What can the rest of us do to get this thing finalized?

I'm not saying "fix it for me". I'm offering my help. But I don't want to waste my time if there are no plans to officially support Unicode.

Also, I know it was mentioned elsewhere, but I don't think we should wait for an arbitrary version (e.g. version 3.0) to switch to Unicode. I think it should be done in the next 1 or 2 point releases.

Kichik: Can we get a comment from you?

2.45 is out!
I've just released the Unicode version of NSIS built around NSIS 2.45 along with the source code and an ANSI version built from the same source. You can find it at http://www.scratchpaper.com as usual.


And I recently created a plugin that can call an ANSI plugin in a Unicode installer. It is very much in the testing stage, so get it @ http://nsis.sourceforge.net/CallAnsiPlugin_plug-in and let me know of any problems


Re: 2.45 is out!

Originally posted by jimpark
I've just released the Unicode version of NSIS built around NSIS 2.45 along with the source code and an ANSI version built from the same source. You can find it at http://www.scratchpaper.com as usual.
Congratulations for the new release. I don't understand why this is not merged into mainline.

The code is using the generic-text mappings found in tchar.h (http://msdn.microsoft.com/en-us/libr...8VS.71%29.aspx). This is the documented way of having the same code compiled for ansi and unicode on Windows platforms.

Cheers,
Cristian.

I have a small problem with NSISu: it always prints messages about included files.

The makensis log is full of inclusions like the one below, and I discovered that using /V does not help. Is there any way to hide these and still be able to log any warnings or errors?

Opening 'C:\dev\pj\src\contrib\nsis\Contrib\Language files\English.nsh' as UTF-16LE.
File 'C:\dev\pj\src\contrib\nsis\Contrib\Language files\English.nsh' has a BOM marked as UTF-16LE.


Fonts
I'm writing a multilingual install script that needs to support English, Chinese, Japanese, and Turkish among others. I've been able to convert my nsi files to utf-16.

However it seems that the font that the installer is displayed in doesn't have the Japanese characters that I need, instead I just see boxes where the characters should be (the Turkish seems to render fine). I know the computer must have some font that displays these Japanese characters because when I open the .nsi file in wordpad.exe I can view the characters.

Any suggestions? Is it possible to make the installer force display in a certain font? Thanks.


Either create a Unicode installer, or set your Windows default codepage for non-Unicode applications to Japanese. (Or run the installer through AppLocale, which is equivalent.)


ngroman, are you posting your problem here because this is a problem you are seeing in the Unicode NSIS? Or are you trying to use the standard NSIS to achieve this? With the standard NSIS, you aren't going to be able to create one installer that can display all those languages together. You'll need Unicode NSIS for that, as MSG stated.


jimpark:
I'm using nsis-2.45-Unicode to compile my .nsi file. My nsi script file is encoded in utf-16LE (I saved it as unicode in wordpad). I verified with chardet in python that it is in fact saved as utf-16LE. Also the output from makensis showed that my file was being opened in utf-16 mode.

The computer I'm running this has XP with the MUI packs installed for French, German, and Spanish (I do not have the MUI for any Asian languages). However the fact that I can see the Japanese characters in wordpad means there is something on my computer that can display these characters.


Have you tried installing the Asian language support? If not, why not?


I realize that I could do that, and it would probably solve my issue...for me; however, most of my users will have standard XP installs that don't have Asian language support either. If I'm going to show a drop-down menu of languages it wouldn't look very good to have some of the languages show up as a bunch of blank boxes to some users.


The NLF files in "Contrib\Language files" specify which font to use. These have been chosen by the translators so that they look right. But if you want to change the font for a language, you do have the option of using SetFont /LANG=lang_id.


Originally posted by ngroman
I realize that I could do that, and it would probably solve my issue...for me; however, most of my users will have standard XP installs that don't have Asian language support either. If I'm going to show a drop-down menu of languages it wouldn't look very good to have some of the languages show up as a bunch of blank boxes to some users.
Don't show the drop down list. Just detect locale or Windows UI language and auto set installer language to be consistent with locale/UI language.

Originally posted by grzech
Don't show the drop down list. Just detect locale or Windows UI language and auto set installer language to be consistent with locale/UI language.
Autosetting the language to the Windows locale is very bad practice. I for one am running in Japanese locale, while I can't read Japanese without a dictionary (and copy-paste, at that). I keep Windows in Japanese locale because of our translation projects, but it is also very common practice in general in communities of Asian pop-culture fans.

Installers such as nVidia's drivers are extremely annoying in doing everything in Japanese automatically, so the user ends up having to guess which buttons to push.

Yeah, autosetting to the Windows UI language is safer. Guys who can't read Japanese generally won't be running Windows with a Japanese UI. :D


I agree with MSG on that one. I hate it when the installer thinks it knows in which language I'd prefer to read my installation instructions. I may have the locale set to some other language that I can read and write but not comfortable enough to speak "geek" in. I'd much rather be given a choice: at the very least, the locale or English? (Okay, maybe I'm being English-centric here.) Or maybe at the download site, I should be given a choice as to which language installer I want.


Originally posted by sealite
I have a small problem with NSISu: it does always print messages about including files.

The makensis log is full on inclusions like the one below and I discovered that using /V does not help. Is there any way to hide this and still be able to log any warnings or errors?

Opening 'C:\dev\pj\src\contrib\nsis\Contrib\Language files\English.nsh' as UTF-16LE.
File 'C:\dev\pj\src\contrib\nsis\Contrib\Language files\English.nsh' has a BOM marked as UTF-16LE.
I have this problem too... It's very annoying... I want to turn off these messages because they spam my command window. How can I do it?

jimpark:

All of your downloads have a wrong file extension. .exec instead of .exe.

You did a great job. Except for some old plugins, like delay.dll, which won't work, it's just great. I've converted all my scripts to Unicode, including the Essentials Pack.

The abilities with the Unicode version are just awesome.

Keep on your good work, I hope you'll provide updates in future too.

-Chris


In the next release I will try to address both of the issues you've listed above. The "exec" extension is just because of Google Sites, which keeps me from uploading anything with the .exe extension, as I've explained on the site. I may have to find another server, though, because it is highly annoying.


Originally posted by scully13
It appears that somebody did add Unicode support to the AccessControl plug-in but the download is missing from the plug-in page. http://nsis.sourceforge.net/AccessControl_plug-in
Does anybody have a copy of this somewhere or know of something else I can use from the Unicode install to change file user permissions?
I just made a NSIS-Unicode port of it and uploaded it to
http://nsis.sourceforge.net/AccessControl_plug-in

Originally posted by Anders
When hwndNSIS is non NULL you can check that, but if you also support .onInit or silent installers, the only thing I could come up with was checking $pluginsdir:
if the 2nd BYTE is 0, you can be (pretty) sure its UNICODE
Hi Anders, I don't see how this could work:
To reach the variable at INST_LANG+2, you need to compute g_variables[(INST_LANG+2)*g_stringsize],
but you don't know whether g_variables holds CHARs or TCHARs (which would require multiplying the index by 2).


@Wizou: If you check my BgWorker plugin, you can see that I'm checking for kernel32::lstrcatW in the import table.

I also have some sample code in there that checks $pluginsdir, but like you said, you don't know the offset, so it has to use IsBadReadPtr() and there could be false positives.

Unfortunately, it turns out that writing hybrid plugins is a huge pain in the ass, and I can't recommend doing it on anything except very simple plugins


For the next release, I plan on adding to pluginapi.h:

// True if NSIS is built with Unicode support.
bool NSISCALL IsUnicodeNSIS();

Also, Olivier noted that you could use IsWindowUnicode(hwnd). But this may not work on silent installers.

However, I would still encourage plugin writers to use TCHARs or a similar strategy and to compile and link with the right Unicode declarations and definitions. I know there is a temptation to try to generate a single binary that could work on both the ANSI and the Unicode NSIS variants, but that seems a bit dangerous to me.


@Anders: IsBadReadPtr won't help you here, as it will be a valid memory range (the .bss variables area); you're just not checking a valid second byte of a Unicode character.
Checking 'lstrcatW' import from main module is however a good idea :)

@jimpark: I still wonder how you would create such an IsUnicodeNSIS, unless you pass a special exec_flags_t value from NSIS to the DLL.


If we have such a way for the DLL to detect the version of NSIS, I'm thinking about creating a new pluginapi.h/pluginapi.lib that could transparently convert the variables and the stack from ANSI or Unicode to the plugin's choice, depending on whether it is called from Unicode NSIS or not.
This would make plugin creators' lives easier.


@jimpark: What about turning on/off the "messages about including files"? Unicode NSIS is a great tool, and I ask you to add this option, please.


Noooooooooo, we don't need an IsUnicodeNSIS() function or any kind of pluginapi helpers to do both.

It has already been done with the unicode layer for win9x, just use that.

I'm with Jim on this, plugin devs should use TCHAR and compile two versions. There is also my CallAnsiPlugin plugin for older stuff with no source.

@Wizou: well, I'm sure it can be done without checking the import table. The IsBadReadPtr usage was just a big hack and I never finished that type of detection.


@Wizou: That's a good point. It's very difficult to write an IsUnicodeNSIS(). The only thing I could do is try validating the string (if anything) to see if it's good UTF16LE. Look in validateunicode.cpp which is in Source. You'll see a class there that can validate UTF16LE/BE as well as UTF8.

@Anders: I agree with you in that this API, even if accurate, is a bad idea. I'm pulling IsUnicodeNSIS() from pluginapi.h. It will NOT be in the next release.


Does anyone know why I always get empty Unicode files when using:

unicode::FileUnicode2Ansi "${_HDDS_countFile}" "${_HDDS_countFile}_tmp" AUTO
Pop $R0
${If} "$R0" != 0
# error handling
${EndIf}

The source file's type is "UTF-16LE|UCS-2LE".

-- Problem solved ---

I was using Unicode plugin v1.1.
I've tried version 1.0 and everything is fine now.

If we are going to have any chance of a merge, the examples and header files need to be 100% the same, meaning that both the current A and U builds need some changes:

*The ansi version needs to change its handling of &wXX so that XX is a count of characters and not bytes (and update the docs) (The unicode version already has this fixed)

*The unicode system plugin needs to default to the W version of functions when autodetecting (foo >> fooW and not fooA)


To reduce the NSIS_MAX_STRLEN confusion, we should probably make it really clear in the docs that NSIS_MAX_STRLEN is a count of TCHAR's (Or maybe even rename it). We should also add a predefine for the count of bytes, I suggest we name it NSIS_MAX_STRCB

Both versions currently fail this little test script


Originally posted by Anders
*The unicode system plugin needs to default to the W version of functions when autodetecting (foo >> fooW and not fooA)
We actually do default to the "W" version. Let me explain a little with how the "T" versions of the functions work for those who don't know.

We write in our code:


::CreateFile(...);


When you look at the definition of CreateFile, you see that there really is no function called "CreateFile" but two functions called "CreateFileW" and "CreateFileA." This is true for MOST of the functions that have both the A and the W versions.


#ifdef UNICODE
#define CreateFile CreateFileW
#else
#define CreateFile CreateFileA
#endif // !UNICODE


There are functions which only have one name, with no W or A suffixes.

So the logic that is in the system plugin currently is to try to load the function with the name as is, then failing that, try to add "W" for the Unicode build (or "A" for the ANSI build) and try again.

Unfortunately, for older string functions like lstrlen, the lib you are trying to load DOES have an lstrlen which is ANSI only. It's probably there for backward compatibility, since it mimics the standard library string functions. So for lstrlen, you need to specify kernel32::lstrlenW. But as far as I've seen, it's only these string functions that look and behave like the standard library ones that have this problem.

A possible change could be that we always suffix with a "W" or "A", see if that loads something, and try that function first before we try the name of the function unmodified. But I'm uncomfortable with that, since I'd rather give the programmer the benefit of the doubt and let him/her decide what s/he really wants to load.

I don't mind the plugin second guessing after failures but I don't want it to think it's smarter than the programmer.
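As a concrete example of the lstrlen case (a hedged sketch of System plugin usage):

```nsis
; lstrlen resolves to an ANSI-only export with no suffix, so in
; the Unicode build name the wide version explicitly:
System::Call 'kernel32::lstrlenW(w "some text") i .r0'
; $0 now holds the character count
```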

Originally posted by Anders
To reduce the NSIS_MAX_STRLEN confusion, we should probably make it really clear in the docs that NSIS_MAX_STRLEN is a count of TCHAR's (Or maybe even rename it). We should also add a predefine for the count of bytes, I suggest we name it NSIS_MAX_STRCB

Both versions currently fail this little test script
I agree that we should note that NSIS_MAX_STRLEN is in TCHARS. But going through the documentation, they all state that NSIS_MAX_STRLEN means a number of characters, not bytes. So I'm actually okay with the documentation as-is.

As for adding NSIS_MAX_STRCB, I think that's a great idea, although it can easily be calculated as NSIS_MAX_STRLEN * NSIS_CHAR_SIZE. I like NSIS_MAX_STR_BYTES better as a name, though.

Unless the NSIS trunk has NSIS_MAX_STR_BYTES, it would be useless for me to add it just to the Unicode version as it's meant for portability across the two versions.

So perhaps we can request NSIS_MAX_STR_BYTES to be added to the NSIS trunk?

For your information, I'm starting to merge the Unicode port of NSIS into the NSIS repository.

I will try to take the most of what has been discussed here into account.
However, my plan is first to make it as transparent as possible for the user and the plug-in developer.


That's excellent news! Is there an ETA on when this will be complete?


No ETA yet.. I just started the SVN branch I will work on..
I guess the first merge/port will be easy, especially as your work will greatly simplify the effort.
But then there will be a phase of discussion/arbitration/tweaking/testing as to what the user and developer experience should be for this new version.
I will keep you updated on my progress.


Well, if we don't look for "functionW" before "function", we are going to need some kind of define that is "A" or "W", and maybe a list of APIs where it is required. IMHO that is the wrong approach, and the reverse is needed: look for the char-type-specific version first unless a (new) option is provided in the system call.


My opinion is to check whether the parameter type contains a "t"; then we know we are facing an API that has two variants,
and only in this case do we start by checking for the 'functionW' or 'functionA' variant first. If that entry point is not found, then we check for 'function'.


@Wizou: the problem with that check is that if the user calls CreateFile() with 'w' in the Unicode version of NSIS, that's not wrong either. CreateFile should still resolve to CreateFileW.

Like I said, the only problems I've seen are with the string functions that mimic standard library calls. There aren't that many, and they are all lowercase like C standard library functions. If the person has programmed C at all, functions like strlen should jump out as taking char* and not wchar_t*. So it should raise a warning flag, and the user should realize he should put a W at the end.


Windows programmers don't think about that. I'm so used to just calling lstr* that I don't think about the char type. What's wrong with defaulting to the W versions and only falling back if W does not exist or an option to disable it was set?

Do you have a single example of a function that needs to be called without W when running as unicode (meaning foo should be called with WCHAR* even if fooW exists)?


Originally posted by jimpark
@Wizou: the problem with that check is that if the user calls CreateFile() with 'w' in the Unicode version of NSIS, that's not wrong either. CreateFile should still resolve to CreateFileW.
I don't agree with you. This is wrong.
If the user explicitly wants to call the Unicode version using 'w' as the parameter type, he has to specify CreateFileW as the function name.

Summary:
* the user wants to call CreateFileA explicitly => he uses function name CreateFileA and parameter type 'm'
* the user wants to call CreateFileW explicitly => he uses function name CreateFileW and parameter type 'w'
* the user wants to call whichever CreateFile suits the NSIS variant best => he uses function name CreateFile and parameter type 't'
* absent the parameter type 't', we can't assume the existence of an A/W suffix, and it seems to me it would be wrong to try to add one (it could resolve to another, unrelated API)

Yeah, I agree with Wizou


That would be nice if everyone did that. But remember, when people read the Windows documentation, what do they read? The documentation says to use CreateFile(). It does not say to use CreateFileW() for Unicode. We're just lucky that Microsoft has been largely consistent with these W and A suffixes.

If you write C/C++, you will also be calling CreateFile() with wchar_t*. Because that's what the documentation tells you to do. You just have to make sure you define some macro like _UNICODE or UNICODE. If you read the fine print you get the "is implemented as CreateFileW() and CreateFileA()" but no example tells you to use these directly.

And remember, there's really no bearing on type safety here. I can use any old function signature, and as long as the name of the function matches, the System plugin will push your arguments, whatever types they may be, onto the stack and happily call the function. The arguments can be totally wrong types and have bogus values. So it's not even useful in that respect either.


To the best of my knowledge, there IS no CreateFile-function, only CreateFileA and CreateFileW. CreateFile is #defined to CreateFileA or CreateFileW in WinBase.h, depending on whether or not UNICODE is defined.

As such, if the System plug-in allows you to just write CreateFile, then it must either already be checking for *A, or it has some predefined mappings.


Originally posted by Anders
Windows programmers don't think about that, I'm so used to just calling lstr*, I don't think about the char type. Whats wrong by defaulting to the W versions and only falling back if W does not exists or a option to disable it was set?

Do you have a single example of a function that needs to be called without W when running as unicode (meaning foo should be called with WCHAR* even if fooW exists)?
I believe there are calls in the Windows API with no ANSI counterpart, so the "W"-suffixed version would fail. But I don't know of a case where the "W"-suffixed version would also be a different function. That doesn't mean such cases don't exist or will never exist.

However, remember, this plugin architecture can call other libraries than those provided by Microsoft. And who knows what other people do.

Basically, as I mentioned before, it is wrong to second guess the user from the onset. Don't you hate it when your word processor "corrects" your typing when you type some acronym thinking it's SMARTER than you? I don't believe programs should behave that way. And I certainly don't want a programming tool to behave that way.

User: "I want to call Foo()."

Computer: "No, you don't. You want Bar(). Bar is nicer."

User: "No, I want Foo()!"

Computer: "Hey, I found Bar()! Look at that. It's a function that exists. Let's just call Bar()."

User: "I said, call Foo()!!!"

Computer: "Silly user. I called Bar() for you which is much nicer and that caused a booboo. You better go fix it."

What a nightmare!

@Pidgeot: Yes, you are right. There is no CreateFile() function in the library you link to. There's a macro. But the documentation tells you to call CreateFile().


Originally posted by jimpark
That would be nice if everyone did that. But remember, when people read the Windows documentation, what do they read? The documentation says to use CreateFile(). It does not say to use CreateFileW() for Unicode. We're just lucky that Microsoft has been largely consistent with these W and A suffixes.

If you write C/C++, you will also be calling CreateFile() with wchar_t*. Because that's what the documentation tells you to do. You just have to make sure you define some macro like _UNICODE or UNICODE. If you read the fine print you get the "is implemented as CreateFileW() and CreateFileA()" but no example tells you to use these directly.
You will not be using wchar_t*, you will be using LPTSTR. t is the System plugin's version of LPTSTR. The minute you start using w / wchar_t*, you already know you are working with Unicode only. In 99.99% of cases, if there is a t, you need to look for the W/A versions first. This is the most compatible approach (and the only way to achieve A/W compatibility with existing code).

@Anders: What you say is true. But if you are coding a Unicode only app, you will notice that people use wchar_t with CreateFile() and not CreateFileW().

Either way, I don't like the system to think it's smarter than the programmer.


But we sort of have to be. In the Windows/C world, this stuff happens at compile time; for us it happens at runtime. So t is our clue that the user trusts us and expects us to do the right thing.

(OT: Why don't you two hang out in IRC, this discussion would go a lot faster that way)


Unfortunately, my access to the internet is very limited. So I can't IRC. This is also what keeps me from being able to work on the NSIS repository directly (and hence that's why I haven't created a branch in the repository for my stuff here). So once the group actually takes my code into the repository, I can't work on it anymore. Anyway, that's my situation and will be for the foreseeable future.


Well, there are HTTP IRC "clients" that work just fine, try mibbit


You are both right...
The user doesn't want to care or know about CreateFileA or W; he will just use CreateFile... with parameter type 't', and there will be no problem.

I think the few users who care to write their own System::Call (rather than copy/pasting it from somewhere) should know about TCHAR/LPTSTR and the A/W variants.


About recompiling plug-ins ?
Hi jimpark,

Lately, I've been studying some C, and got the hang of recompiling plug-ins. I've done 2 so far, and almost/basically got the registry plug-in done.
Using an NSIS .nsh script instead of the plug-ins just ended up being too slow.

My question is about including <nsis/pluginapi.h>: if I include just this, the build is missing the pop & push string/integer functions.

As I read topics about this issue in this forum, one stated I should include pluginapi.c, another stated I should use the pluginapi.lib.

Myself, I only got the plug-ins recompiled by including <nsis/pluginapi.c>, which includes pluginapi.h anyway. And as I understand it, including ExDLL.h isn't necessary anymore?

Is this the way you advise me to compile Unicode NSIS plug-ins?

Or should I link against pluginapi.lib somehow?
I only found this file in the installed version of Unicode NSIS, in Examples\Plugins.

I am using Microsoft Visual Studio 2008 express edition.

See my forum topic at : portableapps.com/node/21879
EnumINI.7z, is using the pluginapi.c
NewTextReplace still uses its own pop & push functions, but maybe I should change that at some point.


It's very admirable of you to take the initiative. I'd advise that you download the source code for the Unicode NSIS and look at the Contrib folder. All of these are plugins. For example, a simple one to imitate is the Banner plugin and see what I did there to support Unicode.


Done all of that a while ago. I needed the examples.

I also got pluginapi.lib linked by VC++ now, so it's all fine! But including pluginapi.c worked as well.

Finally I got a reply from the developer of the MD5dll plug-in; he will do this plugin soon!


@Jim Park
Are you going to compile an NSIS 2.46-based Unicode build?

@All
Could someone with the skills compile these ANSI NSIS plugins to UNICODE versions?
1. CrcCheck.dll
http://nsis.sourceforge.net/CRCCheck_plug-in
2. Linker.dll
http://nsis.sourceforge.net/Linker_plug-in

-Pawel


GJ with new version jimpark!

However I still get messages about BOMs with /V0 and /V1 switches.

Details here: http://forums.winamp.com/showthread....08#post2497708
Mr. kichik opinion: http://forums.winamp.com/showthread....55#post2498455

Hope it will be included some day :)


nsisunz+CallAnsiPlugin
@Anders:
I use your plugin to unzip a file to a folder, but it consumes 2 GB of memory with no result.
Can you show me how to unzip a file?


NSIS Unicode with "m" and "t" System types
Hi Jim, Anders and others,

I have a question about NSIS Unicode (I'm using v2.45.1):

I have a DLL that I need to call with System::Call, and some of the function prototypes in that DLL have parameters that are explicitly "char*" because the DLL expects some strings in UTF-8 format and also sometimes returns strings in UTF-8 format.

From what I understand, I have to use the "m" type in these System::Call prototypes to explicitly specify char*, but what happens when I actually call them and pass an NSIS variable or register?

From my testing it just works (incidentally, this is probably because the strings I pass in contain only English characters), but consider if I did the following:

StrCpy $0 "some chinese characters"
System::Call 'SomeDLL::SomeFunction(m)i(.r0).r1? c'

Does the Unicode string contained in $0 get converted in any way before the DLL function is called or is it just called with a typecast to (char *) with whatever data is in $0?

Since my calls are working, I suspect it's being converted to multibyte, but how and when ?

I'm asking because I'm having trouble with this when doing it the other way around: i.e. the DLL returns some UTF-8 to me that I need to display in NSIS.

For example if there is a function in that DLL defined as:

int SomeDLLGetMessage(char *buffer);

Which expects you to pass a pre-allocated char* buffer of at least 1024 bytes in order to place a utf8 string in it...

...and if I call this function this way:

System::Call 'SomeDLL::SomeDLLGetMessage(m)i(.r0).r1? c'

I find that it completes and that $0 contains something that I am able to display with a simple MessageBox MB_OK "$0".

Since the DLL call returns UTF-8, I would have expected $0 to contain single-byte data and therefore to be mangled when I attempt to display it. However, it displays (the UTF-8 chars are displayed "as is", but it displays nonetheless).

Is there some kind of conversion happening for me inside NSIS ? What is the proper way of doing this?

I'd be more than happy to convert the UTF-8 data I get back to Unicode myself by calling MultiByteToWideChar, but it doesn't seem to work. When I do this, I am able to convert $0 to Unicode, but when I try to display it, NSIS only displays the first character (as if it were treated like a single-byte string and stopped at the first 0 byte)...

I guess I am confused at how this needs to be handled. Is there a way to call a DLL, retrieve a UTF8 string and convert it to Unicode to display it ? or do I need to have the DLL modified to return Unicode data instead ?

Sorry for the long post :-(

Cheers,
Damien.


Yes, m will convert the buffer. In your case, you should allocate 1024 bytes with System::Alloc and pass it to SomeDLLGetMessage as i, not m. When the call returns, you can pass it to MultiByteToWideChar yourself (UTF-8 is only supported by MultiByteToWideChar on XP and later).


There is an implicit conversion happening with 'm'; it uses the system default codepage for the conversion. If the return value is also 'm' (ANSI codepage), it will convert it back to UTF-16.

Are you sure the API sets the buffer with a UTF-8 string? People often mix up UTF-8 and ANSI. UTF-8 is a byte-based encoding of Unicode; UTF-8, UTF-16LE, UTF-16BE, and UTF-32 are all Unicode, just different encodings of it. ANSI is a different beast altogether, requiring codepage information to figure out which 'character'* each code represents. This means that 'm' does not do UTF-8 to UTF-16 conversion at all; it would only appear to work for English, since the ASCII set is shared by all ANSI codepages and UTF-8.

So make sure the API really does set the string with UTF-8 encoding. (This may be unlikely, since it takes real work to produce a UTF-8 string on Windows.) If it really is UTF-8, what Anders suggests may be the way to do it. Otherwise, if it's ANSI, then 'm' is fine as long as you are okay with the assumption that the returned string is in the system default codepage -- if it's not, it won't display correctly on the system anyway, so it's really not a big concern.

* not necessarily a character but really a 'code', since diacritics and other character-modifying codes exist


Yes the DLL really does return UTF-8 but thanks to you guys I got it working !

I was breaking the process down into two functions, one to call the DLL and one to convert to Unicode, and along the way the results were transiting through NSIS variables and therefore being converted all around. Now that I do both the DLL call and MultiByteToWideChar in the same block of code, using allocated buffers and the 'i' type, everything is pretty :-)

Thank you guys you saved my friday ;-)

Cheers,
Damien.


Hi Jim, there seems to be an issue with myatoi and myatou in the pluginapi library (and possibly myatoi_or too). If the plugin is built as Unicode then they all work fine, but they fail to parse a valid integer in multi-byte builds. Sometimes they return 0, or they return some hideously large number (as if an unsigned value has been negated). The same goes for popint().

I will probably look into fixing it myself but I'd rather you had a look. If you want me to test, let me know.

Stu




@Afrow: the unicode merge has started, so whatever changes you make, add them to the official branch as well


Stu, I'm looking at the code right now and I don't see anything that could cause the behavior you are seeing. But the behavior is in line with a Unicode string being fed into an ANSI version of myatoi. Is there a simple NSI file that would test the problem?


I'll see if I can knock something up.

Stu


Ok here is the problem being reproduced:

test_myatoi: Pushed: 999, Result: 0
test_myatou: Pushed: 999, Result: 0
test_popint : Pushed: 999, Result: 0
test_my_atoi: Pushed: 999, Result: 999
Completed
This only occurs when using the ANSI plugin build (Compile NSIS Script). Note that my_atoi is a different implementation.

I've attached the plugins and NSIS script.

Stu

Cleanup command line output by removing/disabling file encoding messages.

Originally posted by dariann
I have this problem too... It's very annoying... I want to turn off these messages because they spam my command window. How can I do it?
I have observed that, over time, lots of users have complained about the file encoding messages.

Please drop the command-line spam regarding file encoding, or move those messages to /V3. This is my only request before I'd consider NSIS Unicode perfect.

I've been using NSIS Unicode for more than two years, and my only complaint is about this so-called feature. A good command-line utility should not spam you with several pages of useless text.

This "feature" can make you miss real warnings or even errors when compiling. Redirecting STDOUT is not an option.

Also, I hope that NSIS will switch to NSISu as the default instead of keeping the ancient ANSI version.

In my merge of Unicode NSIS into the NSIS sources trunk, I removed most of these "File has a BOM"-type messages, keeping them only where they are a useful warning (no BOM).
So this will be solved in the future NSIS release that includes my Unicode merge.

As for making Unicode installers the default, I'm thinking more about adding a command-line option (which can be set as the default, like the compressor in MakeNSISw) and an equivalent attribute command (like SetCompressor) for choosing the ANSI or Unicode target.


@Wizou: I posted an unofficial NSIS installer ( http://forums.winamp.com/showthread.php?t=316842 ) that includes both ANSI and Unicode, plus a loader that guesses based on the BOM or can be forced with a command-line parameter (this required a hacked makensis.exe to allow different stub dirs). Once the Unicode branch is fully merged, we could of course implement that functionality in a less hacky way. (We need StubA/StubW and PluginsA/PluginsW folders, and maybe the same for languages, unless we can transform the ANSI versions on the fly. The include dir should be shared; some Unicode fixes need to be added to make that come true, IIRC.)


2.46 RC1 Is Out!

The Unicode port of NSIS 2.46 is out as a release candidate. Apart from the straight port, I've made some improvements.

New African languages are supported:
•Cibemba (ANSI and Unicode)
•Efik (Unicode only)
•Igbo (Unicode only)
•Malagasy (ANSI and Unicode)
•Yoruba (Unicode only)

As you can see, some of these languages are supported in my new ANSI build as well. Please note that the translations aren't complete, but are "sufficient" as the translators told me, so further enhancements may be forthcoming in a later release.

Also, System calls to the lstr* family of functions in kernel32.dll have been modified so that there is no longer any need to suffix the lstr* functions with "W" in the Unicode build.

The problem stemmed from the fact that in kernel32.dll, lstr* functions such as lstrlen have three callable versions: lstrlenW, lstrlenA, and lstrlen. (This is different from other win32 functions, which only have two callable versions: the A- and W-suffixed calls.) However, in kernel32, lstrlen is the same as lstrlenA, presumably for backward compatibility. So an NSI script that used System::Call kernel32::lstrlen would get lstrlenA in both the Unicode and the ANSI build of NSIS! Note that this only happens with kernel32::lstr* functions.

This has been remedied by special code in the System plugin that looks for calls to kernel32::lstr* functions and adjusts them according to the build of NSIS, so there is no longer any confusion. The TCHAR-like names should now work for all win32 calls, which should make porting standard NSIS scripts to the Unicode version much easier.

I've also removed the "chatty" info messages regarding the Unicode encoding of the compiled NSI and NSH files. This was bothering some people.

In my personal testing, everything looked decent. Please take some time to check if your project works well with it and let me know if you find any problems.


Stu,

I've looked at your TestMyatoi zip file and found only one version of pluginapi.lib there. Since you are saying this fails for ANSI, you must be using the Unicode version of the lib. If you are building for ANSI, you need the ANSI version of the lib. You can't link the same lib into both the ANSI and Unicode versions of your plugin.

If this is something that is desired, I can create a unified lib with both the ANSI and Unicode versions of the functions, with pluginapi.h having #defines that map the different function calls to the right functions. This would basically mimic what win32 does.


Are you saying you have a hardcoded check for "kernel32::lstr"? How do you know those are the only functions that are like that? I still say that if the "t" type is used, the Unicode version needs to try FooW() > Foo() > fail.


I'm pretty sure those are the only functions like that. Run the strings command on the various DLLs and look for yourself. As for using 't' as the hint, would that work if the return type is what's different? It also won't work if the string is actually a member of a struct and a pointer to the struct is what's being passed in. It's an interesting idea, but right now I'm not sure that is enough of a check.


@Jim: Yes, it would fail for structs. Not sure about return, I don't think you can have ...(...)t.s etc

Maybe we have to settle for some sort of flag. With the check for lstr, we can probably get by with your code until a problem actually shows up.


Jim,

Bother, I didn't think about that. If you could make a define, that would be handy.

Stu


Stu,

It is possible, but probably not this release. It requires quite a bit of reworking of pluginapi.h, and I don't want to delay the 2.46 release any longer. But I'm sure 2.47 is around the corner, and I will incorporate it then.


Semi-on-topic...

First.. As I understand it, plugins written for the ANSI version of NSIS won't work in the Unicode version, and vice-versa.
Am I correct in that understanding - or is this limited to plugins that perform specific tasks?

Second.. As I understand it, the two compiles would typically result in two plugin files (Anders' BgWorker notwithstanding).
Am I correct in that understanding - or would this again be limited to plugins that perform specific tasks?

Just to use a particular example.. the GetVersion plugin by Stu (Afrow UK) has as part of its version notes "Better Unicode build support". However, the download only contains a single DLL.
Is there anything specific one might look out for in a plugin's source - if available - that would tell a person that the plugin it would build would be ANSI, Unicode, or Hybrid (a la BgWorker)?

The main reason I'm asking is because I'm working on compiling a list of plugins with some basic information - including whether it supports ANSI, Unicode or both - and GetVersion is the first one that made me realize that it might not be as trivial as simply checking if there's two DLLs (BgWorker makes its Hybrid method clear in both the wiki page and in the source).


You can pretty much assume that no other plugin is a hybrid, it is a big ass pain to do.

If you look at the GetVersion source, you see that it uses the TCHAR type and the _T/TEXT macros for strings. This is an indicator that it supports both Unicode and ANSI output. If UNICODE and _UNICODE are defined when building, you end up with a Unicode DLL. (@Stu: GetProcAddress does not take a Unicode string!)

If all you have is a compiled DLL, open it in Dependency Walker (dependencywalker.com). If a lot of the used functions in kernel32/user32 end with A, it's ANSI; if they end with W, it's Unicode.


Cool - thanks Anders. I've used dependency walker before - just not that part of it.

Sounds like, although the GetVersion plugin is Unicode-ready in terms of source code, no compiled Unicode version is available in the official download at this time.

I'll see about finishing that list later (a fair bit later) and doing something with it in the wiki.


Yeah, I did make sure the source was ready for Unicode (except GetProcAddress, thanks Anders) but never built it (I don't think we had Unicode NSIS when I wrote that plug-in). I will build a Unicode version when I have time; in fact, quite a few of my plug-ins need updating, but I'm stuck for time at the moment.

Stu


CallAnsiPlugin and nsisXML
Hi All

For my installer I use Anders' plugin "CallAnsiPlugin" with NSIS Unicode and everything's OK except with nsisXML by Wizou.

OutFile "test_xml_UNICODE.exe"

!define TempDir "C:\TEMP"
!define nsisXML "${TempDir}\nsisXML.dll"

Function .onInit
SetOutPath $INSTDIR
;Get All ANSI dll to the local machine
File /ONAME=${nsisXML} "${NSISDIR}\Plugins\nsisXML.dll"
FunctionEnd

Function test_xml
CallAnsiPlugin::Call ${nsisXML} create 0
CallAnsiPlugin::Call ${nsisXML} load 1 "sample.xml"
CallAnsiPlugin::Call ${nsisXML} select 1 '/main/child[@attrib="value2"]'
IntCmp $2 0 notFound
CallAnsiPlugin::Call ${nsisXML} getAttribute 1 "active"
DetailPrint "Attribute 'active' is $3"
CallAnsiPlugin::Call ${nsisXML} getText
DetailPrint "Tag <child> contains $3"
CallAnsiPlugin::Call ${nsisXML} parentNode
CallAnsiPlugin::Call ${nsisXML} removeChild
CallAnsiPlugin::Call ${nsisXML} save 1 "SampleU.xml"

notFound:
FunctionEnd

Section myFoo
Call test_xml
SectionEnd
Is there anybody to tell me what's wrong in my code please?

Originally posted by webtubbies
Is there anybody to tell me what's wrong in my code please?
You are not checking that the nsisXML calls don't fail, if you did that you could maybe narrow down to the line that fails. What happens exactly, does it crash for example?

(There is also a chance that CallAnsiPlugin has a bug, it was never tested with this plugin)

Originally posted by Anders
You are not checking that the nsisXML calls don't fail, if you did that you could maybe narrow down to the line that fails. What happens exactly, does it crash for example?

(There is also a chance that CallAnsiPlugin has a bug, it was never tested with this plugin)
Hi Anders,

Actually, the first call (create) seems to be OK, and then the exe crashes.

nsisXML has a unicode version in the zip file, why can't you use that?


@Anders:
Yeah, you're right, I was using an older version. I'll try the Unicode version tomorrow.
Thank you for your time and sorry for the inconvenience.


I also found a varsync bug in CallAnsiPlugin, so I will update the wiki later today with the fixed version


Ok, the unicode nsisXML plugin works like a charm! thank You


Originally posted by webtubbies
Ok, the unicode nsisXML plugin works like a charm! thank You
you're welcome ;-)

Unicode NSIS 2.46 is finally released!


Jim, thanks!
-Pawel


Hi JimPark, first of all I want to thank you for the recent updates!
I should also let you know about TextFunc.nsh:
it is missing support for UTF-16LE files.

I did a rewrite of ${ConfigRead/Write} myself for now (see the attachment). I added the UTF-16LE support within the same instruction by having it look for a BOM and handle the file accordingly. Hopefully all the other instructions can be done in a similar way.

[edit: I just realized my rewrite was a little buggy, as I forgot to reset the variable that indicates whether there is a BOM or not. I re-uploaded, but just see it as an example!]


I'm done porting the NSIS sources to Unicode. It is now possible to compile the NSIS sources (from the Sourceforge repository) as either ANSI or Unicode.
I'm now working on making the Unicode version able to generate both ANSI and Unicode installers, so we can make the switch and have everybody use a single (Unicode) version of makensis.exe.


Very nice. On the plug-in side of things, will ANSI plug-ins be defunct now?

Stu


No, because people wanting to target Windows 9x and the like still need ANSI installers and ANSI plugins.
However, I have created a new version of the plugin API, mostly compatible with the current version, that allows plugins to work with both ANSI and Unicode installers (whether the plugin itself is compiled as ANSI or Unicode), making the conversions transparent.
I will commit that to SVN soon.


Hold it. Are you saying that there could be an official release from the main NSIS page with both ANSI and Unicode versions in the same package?

If this is true, this is HUGE for us, as we are only allowed by our legal department to use the versions of NSIS that come from the main page/official site. We are not allowed to use other versions (Unicode NSIS from scratchpaper, etc.)


@CrushBug: Official SVN has working unicode build (It is pretty much the same code as the fork, at some point the merge will be complete hopefully)


Originally posted by Wizou
No, because people wanting to target Windows 9x and the like still need ANSI installers and ANSI plugins.
However, I have created a new version of the plugin API, mostly compatible with the current version, that allows plugins to work with both ANSI and Unicode installers (whether the plugin itself is compiled as ANSI or Unicode), making the conversions transparent.
I will commit that to SVN soon.
Wizou,
Is it possible to write the src\Docs\src\*.but files with UTF-8 encoding (and have Halibut parse them and create the output HTML files with UTF-8 as the default encoding, which HTML Help Workshop could compile to a chm)?
It would be much simpler for me if it used this codepage rather than the default Windows one.
Could someone with the skills look into it? Or is this too much work and not worth it...
-Pawel

I'm not very familiar with Halibut, and your request is quite independent from porting the NSIS code to Unicode... I guess I won't be doing that soon.
Maybe someone else will look into it.


In Unicode NSIS this plug-in works incorrectly:
http://nsis.sourceforge.net/IP_plug-in
Is there any other plug-in to get the IPs of the computer, or could anybody help with a solution for this plug-in?


If the plug-in is not Unicode then you will need to rebuild it.

How about this? http://www.codeproject.com/KB/vbscript/ipaddress.aspx

Stu


Originally posted by Afrow UK
If the plug-in is not Unicode then you will need to rebuild it.
Does rebuilding mean anything more than compiling an nsi script via the MakeNSISW (Unicode) program? This plug-in is a DLL; how can I rebuild it?
I also failed to understand how, and what for, to use the a2u.exe program.

Originally posted by Afrow UK
How about this? http://www.codeproject.com/KB/vbscript/ipaddress.aspx
Actually, I'll create the C++ DLL with code for determining the IP interfaces of the computer myself. I was just interested in why the old plug-in doesn't work.

To get the IP address, have a look at the IpConfig plugin:
http://nsis.sourceforge.net/IpConfig_plugin




Here is the NSIS Armenian translation!


Building under Linux
I have downloaded nsis-2.46-Unicode-src.zip and am attempting to build the makensis utility under Linux. The command I am using is:

scons UNICODE=yes SKIPSTUBS=all SKIPPLUGINS=all SKIPUTILS=all SKIPMISC=all NSIS_CONFIG_CONST_DATA_PATH=no PREFIX=/xp-e/Program\ Files/NSIS PREFIX_DEST=/tmp/nsis install-compiler
One minor problem was that I had to rename the 'Scons' directory to 'SCons'.

A more serious problem is these errors:

In file included from Source/build.cpp:24:
build/urelease/config/nsis-version.h:3:19: error: tchar.h: No such file or directory
In file included from Source/manifest.h:22,
from Source/build.h:30,
from Source/build.cpp:26:
Source/tstring.h:24:21: error: windows.h: No such file or directory
In file included from Source/build.h:22,
from Source/build.cpp:26:
tchar.h and windows.h are presumably MS Windows files, which wouldn't make a lot of sense for building a Linux executable.

I can build a Linux version of makensis using the official (non-Unicode) version of nsis 2.46 without any problems.

You could try the official SVN, it (should) compile as unicode now


Thanks Anders. I've just tried that, but a Linux build fails with the error "No version of Visual Studio compiler found".

There's a note in the TODO.txt file: "Make Unicode version compile on other compilers / platforms.", so I guess that it currently only builds on Windows.


I don't really see how a VS compiler error message is related to unicode. Are you able to compile the ansi version? (TODO.txt has not been touched in years AFAIK)


I get the same error if I try to compile the SVN ANSI version (i.e. without the 'UNICODE=yes' option).

The note in TODO.txt was added by 'wizou' on 2010-03-26.


The problem might be with scons and not nsis (scons did some major changes recently, try both 1.2 and 1.3 and whatever the latest version is) It might also help if you post the full scons command you used (I can't build the unicode version myself, so I can't really help there)


I was using scons 1.3.0, but I've just downloaded and tried the latest version (2.0.1) - unfortunately with the same symptoms. I don't have a copy of 1.2 to try. Version 1.3.0 works fine with nsis 2.46 - which makes it seem likely that something has changed in nsis.

The full scons command that I am using is in my original post.


Originally posted by timufea
The full scons command that I am using is in my original post.
Doh!

I know I had to revert to scons 1.2 to build, but that is on windows. Why it tries to use the VS toolchain on POSIX, I have no idea. IIRC there is some scons switch you can use to force the toolchain, TOOLSET= maybe... (There are several older forum threads about building on POSIX, have you checked those for hints?)

It looks to me that the "No version of Visual Studio compiler found" issue was introduced by the "added MSVS_VERSION option to scons command-line to specify which compiler to use" change committed into the subversion repository as revision 6110.

nsis.svn.sourceforge.net/viewvc/nsis/NSIS/trunk/SConstruct?r1=6102&r2=6110

A Microsoft Visual C++ build environment is explicitly defined in line 180 of the SConstruct file.
In my humble opinion the build environment should be detected automatically by scons;
an explicit definition of the build environment is counterproductive.

I assume the same problem also occurs if you try to natively build on a Windows platform with mingw32 (www.mingw.org).


Yeah, I don't think wizou has done any testing on anything other than recent versions of VS


I tried with VS 6.0, VS 7.1 Toolkit, VS 8.0 Express.
Scons seemed to work with them.

However, note that only VS 8.0 can be used to compile the Unicode version of NSIS, as only VS 8.0 contains a C++ Standard Runtime Library with the necessary Unicode classes.


Plans to Merge Unicode NSIS with Mainstream NSIS?
Quick question: Are there any plans to merge Unicode NSIS with mainstream NSIS?


Originally posted by nsnb
Quick question: Are there any plans to merge Unicode NSIS with mainstream NSIS?
The merging is almost complete; with the right setup, you can compile Unicode NSIS from the official SVN today

Greetings :)
Tell me please how to solve this problem:

In the script I call:
Code:
System::Call 'msvfw32.dll::MCIWndCreate(i 0, i 0, i 0x0070, t "C:\1.mp3") i .r0'

Depending on the line used when compiling the setup file, the function either does not find the file 1.mp3, cannot identify the device driver via MCI, or compilation does not complete and displays a message that the file test.exe cannot be created. :(

In the ANSI version of NSIS this function works correctly. :rolleyes:
I am using NSIS Unicode (2.46)


MfG MaGoth, WoG.ru-Community.


MCIWndCreateW maybe?


Originally posted by Anders
MCIWndCreateW maybe?
Tnx.
No, it's not the Win API; it's a generic driver for sound without using external players, only the operating system.
That option will not work. It was tested the day before, like many others.

The function works correctly; it was tested in parallel in NSIS ANSI 2.46, where compilation produced a fully working test.exe. In that version there is sound!

When compiling the same file with NSIS Unicode 2.46, errors occur.
Hence the conclusion that the Unicode script itself does not pass parameters correctly to this function. :(


MfG MaGoth, WoG.ru-Community.

does not work in 2.46 ansi either for me, so what do I know


OutFile "$%temp%\temp.exe"

Section
System::Call 'msvfw32.dll::MCIWndCreate(i 0, i 0, i 0x0070, t "c:\1.mp3") i .r0'
SendMessage $0 0x0490 0 0 $1

IntCmp $1 0 nosup
SendMessage $0 0x0465 0 "STR:play"
Sleep 1000 /* match to sound length otherwise the sound immediately stops playing? */
nosup:
SectionEnd

        
Edit: the 'sleep' only applies to Sections, presumably.. stick it in .onInit or .onGUIinit and it'll work fine (but that little window will be somewhere back in z-order)
                

Originally posted by Anders
does not work in 2.46 ansi either for me, so what do I know
Test example:

PHP Code:
OutFile "test.exe"
SetCompressor /solid lzma
!include "MUI2.nsh"
!insertmacro MUI_PAGE_INSTFILES
!insertmacro MUI_UNPAGE_INSTFILES
!insertmacro MUI_LANGUAGE "English"

Name "Test"
ShowInstDetails nevershow
ShowUnInstDetails nevershow
Var hmci

!define APP_NAME "mci.sound"
!define SND_NAME "1.mp3"

Function .onInit
SetOutPath $PLUGINSDIR
File "${SND_NAME}"
System::Call 'msvfw32.dll::MCIWndCreate(i 0, i 0, i 0, t "$TEMP\${SND_NAME}") i .r0'
StrCpy $hmci $0
SendMessage $hmci 0x0490 0 0 $0
IntCmp $0 0 nosup
ShowWindow $hmci SW_HIDE
SendMessage $hmci 0x0465 0 "STR:play"
nosup:
FunctionEnd

Section "DummySection" SecDummy
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
SendMessage $hmci ${WM_CLOSE} 0 0
SectionEnd

For tests:
1. Place the file 1.mp3 in the same folder as the *.nsi script.


PS. Everything works fine in ANSI, but it does not work in UNICODE. :cry:


MfG MaGoth, WoG.ru-Community.

Well, Anders' 'MCIWndCreate' suggestion is at least correct - and it should load the file just fine.
A problem pops up getting it to actually play.. I have no idea what (combination of) flags 0x0465 is supposed to be.. but it's possible that it is an ANSI-specific (combination of) flags, with Unicode equivalents somewhere.

But eyeing the MCI docs at MSDN tells me that it's probably better to just send MCI_PLAY for a simple single play anyway.

But just to take things from the top...

1. Your Script is setting the output path to $PLUGINSDIR, but you don't have InitPluginsDir anywhere before that. Add that first, just to be safe.

2. Your system call then tries to load your sound file from $TEMP... replace that with $PLUGINSDIR.

So now on to the actual commands... If you compile this as is, it will still say it can't find the file. That's because of using MCIWndCreate instead of MCIWndCreateW.

3. Replace MCIWndCreate with MCIWndCreateW

If you try now, you'll see that the file does get loaded okay, but it can't understand the "play" command ("The driver cannot recognize the specified command."). I don't know why not. But, as mentioned above, we can use MCI_PLAY instead.

4. Replace...
PHP Code:


SendMessage $hmci 0x0465 0 "STR:play"
with
!define MCI_PLAY 0x806
SendMessage $hmci ${MCI_PLAY} 0 0

        
The sound file should then play.
                

Originally posted by Animaether
Well, Anders' 'MCIWndCreate' suggestion is at least correct - and it should load the file just fine.
A problem pops up getting it to actually play.. I have no idea what (combination of) flags 0x0465 is supposed to be.. but it's possible that it is an ANSI-specific (combination of) flags, with Unicode equivalents somewhere.

But eyeing the MCI docs at MSDN tells me that it's probably better to just send MCI_PLAY for a simple single play anyway.

But just to take things from the top...


1. Your Script is setting the output path to $PLUGINSDIR, but you don't have InitPluginsDir anywhere before that. Add that first, just to be safe.

2. Your system call then tries to load your sound file from $TEMP... replace that with $PLUGINSDIR.

So now on to the actual commands... If you compile this as is, it will still say it can't find the file. That's because of using MCIWndCreate instead of MCIWndCreateW.

3. Replace MCIWndCreate with MCIWndCreateW

If you try now, you'll see that the file does get loaded okay, but it can't understand the "play" command ("The driver cannot recognize the specified command."). I don't know why not. But, as mentioned above, we can use MCI_PLAY instead.
Yes, oddly enough, the function does need to be written with a "W".
MCI_PLAY turned out to be useless for me, because I want to play multiple files at once. :rolleyes:

1, 2. It was just an example; even as posted, the program finds the necessary files. ;)

4. Replace...
PHP Code:
SendMessage $hmci 0x0465 0 "STR:play"
with
PHP Code:

!define MCI_PLAY 0x806
SendMessage $hmci ${MCI_PLAY} 0 0

The sound file should then play.
This code isn't useful for me as-is.
But thank you very much for the sound driver change. I modified your code a bit, using only the 0x806, and now everything works fine. :p


@Animaether, Anders:
Thank you guys. :up:


MfG MaGoth, WoG.ru-Community.

Greetings, :)
I have a small problem again. I'm using this plug-in to create additional buttons that display a message box:
http://nsis.sourceforge.net/ButtonEvent_plug-in

Also, I changed the file "modern.exe", adding the following as the last row of the 105th dialog resource:
PHP Code:


CONTROL "", 1190, BUTTON, BS_PUSHBUTTON | WS_CHILD | WS_VISIBLE | WS_TABSTOP, 7, 201, 50, 14

Example script:
PHP Code:
!include "MUI.nsh"
!include "LogicLib.nsh"
!define IDC_BUTTON_TRYME 1190

Name "ButtonEventExample"
OutFile "ButtonEventMUI.exe"

!define MUI_PAGE_CUSTOMFUNCTION_SHOW InstFilesShow
!insertmacro MUI_PAGE_INSTFILES
!insertmacro MUI_LANGUAGE "English"

Function TryMe
MessageBox MB_OK "Bla-Bla-blaaaa"
Abort
FunctionEnd

Function InstFilesShow
GetFunctionAddress $R0 TryMe
ButtonEvent::AddEventHandler /NOUNLOAD ${IDC_BUTTON_TRYME} $R0
GetDlgItem $R0 $HWNDPARENT ${IDC_BUTTON_TRYME}
SendMessage $R0 ${WM_SETTEXT} 0 "STR:Info"
EnableWindow $R0 1
FunctionEnd

Function .onGUIEnd
ButtonEvent::Unload
FunctionEnd

Section "Dummy" SecDummy
Sleep 1000
SectionEnd

When compiled with NSIS ANSI, the message is displayed when you click the button.
When compiled with NSIS Unicode, there is no message, although compilation succeeds.

Can somebody adapt this plugin from ANSI to Unicode?! :rolleyes:

PS. Sorry for my double post.


MfG MaGoth, WoG.ru-Community.

String returned from install location text box
I think the string returned from the install location text box used to be UTF16. But now it seems to be UTF8. Is this assumption correct?


compilation problem
uname -a
FreeBSD nw3 7.3-STABLE FreeBSD 7.3-STABLE #0: Fri Oct 15 18:26:08 MSD 2010 root@nw3:/usr/src/sys/i386/compile/GREENH i386
nsis unicode nsis-2.45.1 and 2.46

# scons
scons: Reading SConscript files ...

scons: warning: Ignoring missing SConscript 'SCons/utils.py'
File "/root/nsis3/SConstruct", line 79, in <module>

scons: warning: Ignoring missing SConscript 'SCons/config.py'
File "/root/nsis3/SConstruct", line 98, in <module>
KeyError: 'NSIS_CPPDEFINES':
File "/root/nsis3/SConstruct", line 122:
if 'NSIS_CONFIG_CONST_DATA_PATH' in defenv['NSIS_CPPDEFINES']:
File "/usr/local/lib/scons-1.3.0/SCons/Environment.py", line 411:
return self._dict[key]
The files SCons/utils.py and SCons/config.py exist.
Help me please. What can it be?

My HwInfo plugin returns junk now that I switched to unicode. is there a unicode build of HwInfo available?


I don't know if there's a unicode version, but have you tried using the CallAnsiPlugin plug-in?


no, but I solved that by writing my own .net dll, where getting this info is pretty easy.
Had to modify this plug-in to make it work with Unicode though. Added a link to the modified version to the wiki page.


Mind you that using .NET requires .NET to be installed. Your installer would never work on any of my PC's.


the product I'm installing requires .net 2.0, so I check for that first thing in the install and install it if it's not there.


is logging enabled in the Unicode nsis build?


7-zip expects a very specific format from the installer. One of its assumptions is that the string table is saved as ANSI. It's not such a faulty assumption when you add in the no-real-nsis-binary-specification factor.




Hey guys,

Does anyone of you might know where I can find the source code for unicode build version 2.35? It's the version we are currently using. Unfortunately, it has the PCA problem. We don't really want to spend the time upgrading to newer version because it's working fine for us except for this one problem. So we'd like to just apply the manifest change and rebuild NSIS. But it seems that all the history got lost when the source code repository was moved to Google Code. You can only get version 2.46 from there. So I'd really appreciate if somebody still has the source code for 2.35.


I see no reason to keep using that version. Upgrading takes a couple of seconds and you are guaranteed backwards compatibility unless you have modified NSIS's own files yourself.

Stu


Originally posted by Afrow UK
I see no reason to keep using that version. Upgrading takes a couple of seconds and you are guaranteed backwards compatibility unless you have modified NSIS's own files yourself.

Stu
The old versions will compile with vc6, the new versions do not

How about VS 2010 Express?

Stu


Logging support for Unicode version
Hello,

Thank you for this port. I am interested in enableing logging using LogSet and then logging using LogText. It seems that the Unicode distributions don't turn on logging. The non-unicode distributions have a "special build" for makenisis which has logging enabled. Is there a similar "special build" with logging enabled for the unicode port?

Thanks,

mzd


@Wizou: If you check my BgWorker plugin, you can see that I'm checking for kernel32::lstrcatW in the import table.

I also have some sample code in there that checks $pluginsdir, but like you said, you don't know the offset, so it has to use IsBadReadPtr() and there could be false positives.

Unfortunately, it turns out that writing hybrid plugins is a huge pain in the ass, and I can't recommend doing it on anything except very simple plugins





_______________

[ADMIN EDIT]
Warning: Spam(?) links removed!


@mrphantuan: I don't know which post you are referring to exactly..
However, regarding hybrid ANSI/Unicode plugins, I have changed my mind and consider it now also a bad idea to try to create such plugins.
And so I don't plan anymore on committing a plugin API that would work transparently with ANSI & Unicode.
It is just simpler for plugin authors to build 2 separate DLL variants, one for ANSI, one for Unicode.

And my recommendation for compatibility with Unicode NSIS automatic plugin variant detection, is to name the ANSI version "MyPlugin.dll" and the Unicode version "MyPluginW.dll" and offer them in the same folder.


Originally posted by Wizou
@mrphantuan: I don't know which post you are referring to exactly..
I believe that is something I posted, why he posted it now, I have no idea.

And yes, a single DLL for both versions is possible, but it's just too much work in practice; just use TCHAR and compile two versions...



Logging support for unicode version
Hello All,

It appears to me that all queries regarding logging support in the Unicode port are being ignored. So, here goes again: is there any support for logging using LogSet in the Unicode port? I have not yet understood the motivation behind turning off such a basic feature in any scripting language, especially in an install tool.

Thanks,

-mzd


It is turned off because it adds extra overhead to the installer. You'll have to wait for kichik or someone with greater knowledge of the internals of NSIS to explain why such a feature cannot be toggled dynamically at installer compile time though. Perhaps when logging was implemented the developers didn't think of allowing dynamic enabling, or the amount of work required seemed unnecessary. Sometimes using C compile-time directives (#if ... #endif) with separate builds of a product is much simpler. Now that the code base is so large it'd probably be a lot of work to change.

Edit: When the next version of NSIS is released which is Unicode, you will have a logging build then (http://nsis.sf.net/Special_Builds). For now you will just have to rebuild. You just need to install a few things to build; nothing that will cost you anything.

Stu


@afrowuk.

Appreciate the response. Rebuild is an option, but not for me, since I don't own the build tools. The NSIS distribution is shared with other products and I am not at liberty to replace it with a custom build. It needs to be a published release ( a published special build distribution is ok ).

You mentioned that the next version of NSIS will be unicode enabled. Is that 2.47+?
Any insights in to the timeframe? Anyway, these are queries for the general NSIS Discussion forum.

Thanks,

-mzd


Go, go, NSIS Unicode version!


Originally posted by mzd
It needs to be a published release ( a published special build distribution is ok ).
As this Unicode build isn't really an official version, it's not too surprising that no logging build was offered. I'd assume any 'official' Unicode version would have a logging build provided along with the large-string build (unless plans, etc have changed in that respect).

Originally posted by mzd
You mentioned that the next version of NSIS will be unicode enabled. Is that 2.47+? Any insights in to the timeframe?
it looks like it'd be 2.50 based on what's in cvs / help docs though i guess that might change and i think the timeframe is when it's done and ready. as work keeps starting / stopping on it, things seem to be taking longer than i guess it was hoped to have been.

-daz

Originally posted by DrO
it looks like it'd be 2.50 based on what's in cvs / help docs...
I think that number was just a placeholder used by the fork

status?
Whats the status of the Unicode support in SVN?

Actually, what is the status of SVN itself, right now I can't even seem to build makensis from SVN on Linux, it complains about tchar.h being missing.


mzd, sorry about not replying to this forum earlier. For some reason, my auto notification to a new message got lost. I just uploaded a logging version for you. You should see it in the download section of www.scratchpaper.com.


Originally posted by pabs
Whats the status of the Unicode support in SVN?

Actually, what is the status of SVN itself, right now I can't even seem to build makensis from SVN on Linux, it complains about tchar.h being missing.
Nobody has touched the unicode stuff in a while, don't get your hopes up about this ever getting fixed. (You might be able to compile with VS2005+, otherwise you are out of luck)

So, I guess it's unlikely that there will ever be a new version of NSIS, then?


Originally posted by MSG
So, I guess it's unlikely that there will ever be a new version of NSIS, then?
We can still do ansi builds...

I'm all for that, I'm sure, but I'm thinking that would require some amount of decisive effort on the devs' part, what with the year and a half hiatus?


Originally posted by MSG
I'm all for that, I'm sure, but I'm thinking that would require some amount of decisive effort on the devs' part, what with the year and a half hiatus?
All the unicode changes are inside #ifdef UNICODE blocks...

Originally posted by Anders
We can still do ansi builds...
Except for on Linux (needs tchar.h)... hmm, mingw-w64 has tchar.h, I guess the cross-compile checks are in need of an update for the new toolchains.

Originally posted by pabs
Except for on Linux (needs tchar.h)... hmm, mingw-w64 has tchar.h, I guess the cross-compile checks are in need of an update for the new toolchains.
I'm going to try to fix all the ansi problems, but I'm not sure if we should move to mingw-w64 or stick with the original. Anyways, we probably need our own tchar.h or hacks in platform.h since not all compilers/CRT agree on how swprintf works.

NSIS Unicode High-DPI Awareness?
Hello all,

Is there any way to enable High-DPI awareness in an NSIS Unicode installer? Currently it seems to rely on display scaling in Windows 7 which does not look really good on high-DPI screens.

High-DPI awareness requires a manifest as describe in the article at:
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx (Declaring DPI Awareness)

I tried to use mt.exe to apply the DeclareDPIAware.manifest but after that I get a message that the installer integrity check failed.

If this is not currently supported, could this be considered for an upcoming version? Thank you.


Originally posted by noeld
Hello all,

Is there any way to enable High-DPI awareness in an NSIS Unicode installer? Currently it seems to rely on display scaling in Windows 7 which does not look really good on high-DPI screens.
You could try system::call 'user32::SetProcessDPIAware()' in .onInit; if that does not work, you have two other options:

A) Use !packhdr and resource hacker to change the manifest (Or maybe mt.exe, who knows)

B) Recompile from source

Originally posted by pabs
Except for on Linux (needs tchar.h)... hmm, mingw-w64 has tchar.h, I guess the cross-compile checks are in need of an update for the new toolchains.
I'm now able to compile (ansi) with MinGW, get latest source from SVN and add these changes:

Quote:
And you need to rename Call.S to Call.sx if you want to compile system.dll

I did not add the substart change to SVN since it is a bit of a hack, it would be better if we actually fixed the SCons stuff.

Not sure what to do about the Call.S issue, it might be SCons/MinGW version specific.

--- SConscript-HEAD
+++ /trunk/Contrib/SubStart/SConscript Thu Jun 16 19:42:13 2011
@@ -9,8 +9,7 @@
libs = Split("""
""")

-Import('BuildUtil')
-
-substart = BuildUtil(target, files, libs)
-
-env.DistributeBin(substart, names=['makensis.exe'], alias='install-compiler') # install as makensis
+if env['PLATFORM'] == 'win32': #Using cross_platform = True just to force console app in PE header...
+ Import('BuildUtil')
+ substart = BuildUtil(target, files, libs, cross_platform = True)
+ env.DistributeBin(substart, names=['makensis.exe'], alias='install-compiler') # install as makensis

Thanks Anders,

system::call 'user32::SetProcessDPIAware()' in .onInit and un.onInit seems to work just fine. Some text looks a bit small but this should be ok for now.


Changing default font size
Now that I am able to build a High-DPI aware NSIS, I'd like to be able to make the default fonts larger (and consequently the size of the dialog boxes) so that the text (all statics, buttons, etc.) is easier to read on large screens with high DPI.

Is it possible to change the default font size when the installer starts so that the size of the default font is adjusted based on the current custom DPI setting in Windows Control Panel? If yes, how can I do that?

Thank you.


For Win95 to 2003, the correct font to use is MS Shell Dlg @ 8pt and that is what we are using. MS messed things up with Vista (Segoe needs to be 9pt)


Could you recommend a way to handle this? Is this enough to correctly handle both low and high DPI displays? Thank you.


Originally posted by Instructor
How plugin can detect that it has been called from Unicode NSIS or from Ansi NSIS? Maybe add some new parameter in "extra_parameters" structure.
Found the workaround, maybe for someone it will be useful:
stack_t **g_stacktop;
char *g_variables;
unsigned int g_stringsize;
extra_parameters *g_pluginParms;
BOOL g_unicode;

#define EXDLL_INIT() \
{ \
g_stacktop=stacktop; \
g_variables=variables; \
g_stringsize=string_size; \
g_pluginParms=extra; \
{ \
wchar_t wszPath[]=L"C:\\"; \
g_pluginParms->validate_filename((char *)wszPath); \
g_unicode=(wszPath[2] == L'\\')?FALSE:TRUE; \
} \
}

Or more optimized:

stack_t **g_stacktop;
char *g_variables;
unsigned int g_stringsize;
extra_parameters *g_pluginParms;
BOOL g_unicode=-1;

#define EXDLL_INIT() \
{ \
g_stacktop=stacktop; \
g_variables=variables; \
g_stringsize=string_size; \
g_pluginParms=extra; \
if (g_unicode == -1) \
{ \
wchar_t wszPath[]=L"C:\\"; \
g_pluginParms->validate_filename((char *)wszPath); \
g_unicode=(wszPath[2] == L'\\')?FALSE:TRUE; \
} \
}

Hello.

The unicode installer works great for me, but in some locations strings are not displayed correctly.

All the calls which are displayed wrong seem to be coming from
!insertmacro INSTALLOPTIONS_WRITE "DatabaseServer.ini" "Field 1" "text" "$(MSG118)"
type calls.

Does the InstallOptions.nsh script not work for Unicode?

Thanks


Does this .ini start with a UTF16 BOM ?


The ini files were UTF-8 encoded; I converted them to UTF-16LE and it works. Thanks.


2.46.2 Released
2.46.2 is actually a fairly major release. Although based on the official NSIS 2.46 release, there have been some extensive improvements.

  • For example, we've added two new commands: GetFontVersion and GetFontVersionLocal. These commands can be used to get the version information from a TTF font file. This command is valuable for those who are distributing and updating fonts with their products.
  • Changed linear look-up of keyword tokens to a hash table which greatly increased NSI compilation speed. Although NSI script compilation is generally not a bottleneck in the build process, you will notice a marked improvement in NSI compilation speed.
  • The log generated by the logging build now outputs a BOM at the beginning of a new log file so that many text editors can recognize the file as UTF-16LE.

Apart from these major changes, there have been some general bug fixes and improvements to translations. The source code now also builds with Microsoft Visual Studio 2010's C++ compiler, as well as previous versions of VC++.

You can get it from the usual place: www.scratchpaper.com.

Thanks Jim.
Already installed :D
-Pawel


while i'm personally interested in unicode versions of these, please take this as documentation of plugins that aren't yet available ;)

-FontName
-Locate
-TextReplace
-ToolTips (source)

the sources come with each plugin unless mentioned otherwise


Just found out about CallAnsiPlugin, but I can't get it to work with the Locate plugin.

ansi plugin call:
locate::_Open /NOUNLOAD `${_PATH}` `${_OPTIONS}`

callansi call
CallAnsiPlugin::Call "$PLUGINSDIR\Locate" _Open /NOUNLOAD `${_PATH}` `${_OPTIONS}`
CallAnsiPlugin::Call "$PLUGINSDIR\Locate" Open /NOUNLOAD `${_PATH}` `${_OPTIONS}`

both don't work (Invalid command: locate::_Open), what am i doing wrong?


@Yathosho: You need to look at the CallAnsiPlugin documentation on the wiki, you can't just copy and paste a working ansi plugin command!

(Hint: /NOUNLOAD is not valid, read about using the * prefix, also, you need to specify the parameter count)

Even if you get all that stuff correct it might not work since CallAnsiPlugin will not work with every plugin.


Problem with the parameter /CMDHELP (Missing Commands).

When I run makensis.exe /CMDHELP, it only shows this:
MakeNSIS v2.46.2-Unicode - Copyright 1995-2009 Contributors
See the file COPYING for license details.
Credits can be found in the Users Manual.

Abort [message]


Hello. First of all: Thanks for the new release!

But unfortunately I just noticed that there is a major bug in the 2.46.2 release:
The installer won't run under Windows 2000 anymore, while the previous version (2.46.1) worked perfectly fine :cry:
Error message is "Not a valid Win32 application" and installer won't even start up.

I was able to track down the problem with the help of CFF Explorer:
The OperatingSystemVersion field in the PE header is now set to 5.1 rather than 5.0 :(

Has support for Windows 2000 been dropped intentionally (I don't hope so!) or has this happened by mistake?

(And yes, I still have users on Windows 2000. And I just figured out how to make VS2010 binaries work on Win2k)

Important info
In order to make VS2010 compiled binaries work on Windows 2000 (and WinXP RTM!), two simple steps are required:
(1) You need to set "Minimum Required Options" (in the Linker/System options) to 5.0
(2) You need to link your binary against EncoderPointer.lib in order to remove a dependency on a certain export from Kernel32.DLL which did not exist in systems prior to Windows XP with Service Pack 2 (which includes Windows 2000, of course)

Regards,
MuldeR.


Thank you, Lord Mulder. Perhaps, the best thing would be for me to build with Visual Studio 2008 for now. I personally don't need to release any software to Win2K users but I see how this is a problem. I will rebuild and re-release 2.46.2 soon. Thanks for letting me know.


Originally posted by jimpark
Thank you, Lord Mulder. Perhaps, the best thing would be for me to build with Visual Studio 2008 for now. I personally don't need to release any software to Win2K users but I see how this is a problem. I will rebuild and re-release 2.46.2 soon. Thanks for letting me know.
Thank you :up:

(BTW: You know that VS2010 can use the VS2008 toolset? The toolset can be selected in project properties)

I just released a beta version of 2.46.3. I've still compiled it using MSVC 2010 and I did not need the EncoderPointer.lib since I do not link to the standard library in exehead. So NSIS itself still requires Windows XP+ to build installers but the installer it generates should be able to run under Windows 2000. Lord Mulder, can you check this when you have the time? I've also added GetFontName and GetFontNameLocal to round out the font commands.


Originally posted by jimpark
I just released a beta version of 2.46.3.
Thanks!

Originally posted by jimpark
I've still compiled it using MSVC 2010 and I did not need the EncoderPointer.lib since I do not link to the standard library in exehead.
I see.

Originally posted by jimpark
So NSIS itself still requires Windows XP+ to build installers but the installer it generates should be able to run under Windows 2000.
Actually it needs Windows XP with Service Pack 2. The RTM version or Service Pack 1 won't work either.

(That's why you still may consider linking EncoderPointer.lib into makensis.exe)

Originally posted by jimpark
Lord Mulder, can you check this when you have the time? I've also added GetFontName and GetFontNameLocal to round out the font commands.
I just compiled an installer with NSIS v2.46.3 and it ran through on Windows 2000 just fine :)

Thanks for checking so quickly. If, after a week of testing, I don't see any problems, I will release it. I don't think it's a problem to require WinXP SP2 for the developer boxes, do you? I would like to keep moving forward with compiler updates as I move my other projects to the new tools.

On a side note, I'm considering dropping the third digit versioning and just moving forward to 2.47 and onwards. If and when the new NSIS comes out, it is likely to be version 3 anyway. Besides, my code has forked enough that I don't think I can easily port over changes from the trunk. And there have been quite a lot of improvements to my code base since 2.46 came out two years ago. Kichik and I have very different philosophies on how NSIS should move forward. I personally think ANSI is dead. I don't believe in creating multiple versions of the build: big string version, logging version, logging with big string, etc. I don't care about saving a few kilobytes on a setup that is a couple of megabytes big. So starting with the next release, the builds will have the logging ability built in.


If the "new NSIS" you're hoping for is NSIS Unicode, I don't think it'll ever come unless the current unicode branch (your branch) gets merged into the trunk. Right now NSIS development is stalled because of this branching. If you want to continue development, please work on merging, instead of increasing the distance from trunk. You can't shoulder the entire development of NSIS on your own...




Let's be fair MSG. If the merging of my code into the trunk is really what's holding the new NSIS up, then it's done. It was done four years ago when I presented my work. The reality is they are reinventing the wheel. They are redoing my work, their way.

And frankly, in the last four years, there hasn't been much work done on NSIS at all, the biggest piece being the Unicode port I did four years ago. Just check the code changes here. Apart from the copyright year updates, I never had to struggle very much to update my codebase to reflect the trunk's. Usually, the changes per release were just a few lines here and there.

And really what does it matter now anyway? I personally don't know a major project that uses the ANSI version. Do you? All the products I use, if they use NSIS, use my Unicode version. The only motivation I can think of to use the ANSI version is if I needed to support Windows 98/ME customers. But what are these people connecting to the internet with abandoned OSes anyway? They don't even get security updates from Microsoft!

But you might be right about me being able to do it alone. Perhaps I should ask for help from the people who have been offering. But so far, it hasn't been that much of a workload. That may of course change with the coming Windows 8 support.


I'm not a programmer, so I can't comment on why work in your branch is or isn't or should or shouldn't be compatible with work in the trunk. The fact remains that the progress in trunk is stalled because of unicode. It's not guaranteed that progress will continue if the stalling factor is removed, but it's pretty certain that progress will not continue as long as it's stalled...

As for the trunk unicodification, if they're redoing your work their way, then wouldn't it be beneficial to all of us if you would help in redoing it? You have the experience, after all. The alternative is either to wait till trunk unicode is finished (unlikely) or to drop trunk development completely.

The latter would be, in my opinion, a bad move. Your work, while cool and very very usable (kudos), is almost source-closing NSIS because it doesn't use the existing NSIS development infrastructure. Fragmentation destroys human & technical resources.


Well, open source also has a way of moving forward when the product gets neglected, stalled, etc. It gets forked and the original dies. I don't know if this is what's happening but with the dearth of activity (releases), it's a possibility.

If you are asking me to redo my work, then no. It could be that my approach of how it should be done is different from what they want to do. And in that case, they have to take the time to actually implement it. If you are asking me to help them do their approach when I believe the approach I took is better, again, no. There is no incompatibility with my approach since I've done it and it works. And I think the fact that it's taking more than two years to do what I've done four years ago should speak for itself.


The official NSIS needs to support other compilers (and POSIX support), not just VS2005+. We cannot rely on the CRT to provide stdio ANSI/UTF8/UTF16 conversion like the fork does...


The CRT does not provide UTF-8, UTF-16, UTF-32, and ANSI conversions. Hence, the NSIS Unicode does not rely on them. What you may be facing is the fact that wchar_t is defined with different widths on different platforms: in the Linux CRT it is going to be 32 bits, in the Windows CRT 16 bits. If you are trying to do cross-platform building of win32 programs, then you will need to make sure that the CRT you link to is also built for UTF-16LE. Otherwise, all your wide char string functions will not work right because they want wchar_t to be 32 bits. Or you will have to link in ICU, which is a huge library, and then convert everything to the "native" wchar_t.

I say forget building NSIS on other platforms. Why bother? NSIS is only a Windows installer solution. Build it on Windows. As for other compilers on Windows, I'm sure with a little work people can get my NSIS Unicode to build on gcc, et al. Is this what you guys are stuck with? I don't have any expertise in cross-platform building and I don't have much interest (no need yet) so I won't be much of a help there.


Originally posted by jimpark
The CRT does not provide UTF-8, UTF-16, UTF-32, and ANSI conversions. Hence, the NSIS Unicode does not rely on them.
Then why are you passing "ccs=<encoding>" to fopen?

You are right, Anders. I do remember using that MS extension to fopen. It was very convenient at the time. Are you guys adamant about not linking to MS CRT? But you aren't adamant about calling Win32 API, right?

Otherwise, you might have to wrap a new class for Unicode text file and simply pass that object around instead of using CRT IO API on the FILE*. That is a more extensive change. But is that all you are stuck on or are there other issues?


Originally posted by jimpark
The CRT does not provide UTF-8, UTF-16, UTF-32, and ANSI conversions. Hence, the NSIS Unicode does not rely on them. What you may be facing is the fact that wchar_t is defined as different widths for the different platforms. The CRT in Linux is going to be 32-bits. The CRT on Windows is going to be 16-bits. If you are trying to do cross-platform building of win32 programs, then you will need to make sure that the CRT that you link to will also be for UTF-16LE. Otherwise, all your wide char string functions will not work right because they want wchar_t to be 32-bits. Or you will have to link in ICU which is a huge library and then convert everything to the "native" wchar_t.
Actually, on Linux you would not use wchar_t at all, because on Linux all the API functions support Unicode via UTF-8 encoding, so you can use char all the way. On Windows this doesn't work. The ANSI functions in the Win32 API, which are defined with the char type, do not support UTF-8 (and thus do not handle Unicode strings). And the Unicode-enabled functions in the Win32 API are defined with the wchar_t type and require a UTF-16 encoding. Everything could have been so easy if M$ had decided to use UTF-8 in their API!

The general solution for cross-platform applications that build on Windows as well as on Linux is using macros. Instead of char or wchar_t, you use a TCHAR macro. This macro expands to wchar_t on Windows and to char on Linux. Needless to say, all string manipulation functions need to be replaced with macros too when going this route. It requires a lot of code changes.

The other solution is: use char with UTF-8 all the way, even on Windows. With that solution you can use the normal char-based string functions and don't need any macro "magic". However, on the Windows platform, you will need to convert from UTF-8 to UTF-16 before calling certain API functions. For example, on Windows, you can't pass a UTF-8 string to fopen(); you have to write your own fopen_utf8() that internally converts to UTF-16 and then calls _wfopen(). That is what I did with a lot of code that had been ported from Linux to Windows but did not support Unicode yet. The Win32 API provides the needed functions to convert between UTF-8 and UTF-16.


This is the code I use to enable Unicode support on Windows in applications that use char all the way*:
http://pastie.org/private/hftbnzujnzfresihpnpa
(* as is the case with most code ported from Linux or written by people who don't care/know about Unicode support)

All fopen()'s must be replaced by fopen_utf8(). Also the "main" function has to be wrapped like this:
http://pastie.org/private/8fdvykqceqekz002bs9xza

Actually I saw that idea in the LAME code first :)

When I was referring to wchar_t width differences, I was referring to string functions found in <wchar.h>. http://pubs.opengroup.org/onlinepubs...h/wchar.h.html

I do think that _wfopen() not existing in Linux is an oversight, but one that can be easily reimplemented. Just convert your wide strings to UTF-8. Converting from UTF-8 to UTF-16 and vice versa is not hard. You can link to ICU, iconv or (if you have access to Windows API) use WideCharToMultiByte / MultiByteToWideChar using CP_UTF8 as your target. Or you could write your own from the Unicode specs which is also not hard -- I've done it before.

But since we are targeting Windows as the working platform, keep all the strings as UTF-16LE. We aren't really porting a Unix program into Windows, it's really the other way around. Also, if we restrict building of NSIS to Windows, then I don't think you have many problems. I think MinGW needs to link to MS CRT (from what I've heard), so you have all of Microsoft's extensions anyway.

I still don't really understand the motivation of trying to build NSIS on Linux. It's an exercise in cross-platform building and it might be interesting to some. But since NSIS itself needs to run on Windows, I don't really see how anyone would benefit from the exercise.


Originally posted by jimpark
I still don't really understand the motivation of trying to build NSIS on Linux. It's an exercise in cross-platform building and it might be interesting to some. But since NSIS itself needs to run on Windows, I don't really see how anyone would benefit from the exercise.
There are quite a few OpenSource projects that originate from the Linux world and the main developers don't care much about Windows. These people often support Windows ports, as long as it doesn't make too much trouble and as long as it can be done from their usual Linux build environment. So cross-compiling Win32 binaries from Linux is okay, having to set-up and use a Windows machine is not. That's why it may be convenient to have a "native" Linux port of MakeNSIS, even though the final installer will never run under Linux (outside of Wine).

Still I think when compiling the EXE headers on Linux one would have to use a cross-compiler anyway (because you need them to be compiled as Win32 binaries). And that cross-compiler would use 16-Bit wchar_t's, just like any C/C++ compiler on Windows...

Excellent points, Lord Mulder. But is that what we are aiming to provide, a MakeNSIS that runs natively on Linux? I've yet to see it. And if that's what we are aiming to do, I think that's admirable. But if we are just trying to build an NSIS (including MakeNSIS) that only runs on Windows but is built from Linux, who are we doing this work for? Ourselves, the builders of NSIS? But why? I have a Windows box. I need to actually run NSIS to build my setup and test my Windows products.


I agree that being able to cross-compile Win32 binaries of MakeNSIS from Linux isn't very useful. It certainly would be a nice addition, but IMHO it should be "low priority" feature. But then again you would be using the cross-compiler (not the native Linux compiler) to compile MakeNSIS for Win32 from Linux anyway. And again this eliminates most portability issues, I think. So I guess most problems you will be facing will be "MinGW/GCC -vs- MSVC" quirks rather than "Win32 -vs- Linux" issues...

(If something compiles with the MinGW/MSYS compiler on Win32, it probably compiles with the cross-compiler on Linux too)


i think the main point is that makensis is already provided with the ability to cross-compile, so dropping that probably isn't going to be liked by those who do rely on it. yes, it doesn't quite make sense, but it could be part of an automated build system pulling things from a repository and building from there on a linux machine, for example. obviously there was a reason why it was added, so there must have been the demand.

as for the whole this build vs 'official' talk, i was under the impression when wizou was porting things back that it was generally a like for like thing. though that all seems to have stopped / lost any focus which probably isn't helping things. but i can see why people want to maintain some sort of legacy support as tbh, a unicode build doesn't offer that much advantage when you're only providing something for people where ansi will fit - i don't disagree that unicode is useful but there are reasons why some of us do still prefer to use the 'official' builds compared to the unicode ones.

-daz


First of all, if written properly, the very same code can be compiled as ANSI or as Unicode. So if you still want to support an ANSI version while offering Unicode support, this is doable. It's not hard to do for new code, though it takes some work to "upgrade" existing code. But if you have to rewrite your code for Unicode support anyway, making it support Unicode and ANSI at the same time is minimal extra work.

But why would you want an ANSI version of NSIS nowadays? Every Windows system in use today supports Unicode. Even good old Windows 2000 (which hasn't received important security updates for more than a year now!) supported Unicode just fine. I know that there are still a few people using Windows 2000, although nobody can seriously recommend it, but even older Windows versions, like 9x and NT 4.0, are definitely a lost cause. Not worth bothering. Really! At the same time, the lack of Unicode support causes serious problems on non-archaic systems: can't install to directories that contain Unicode characters (and these may exist on every system!), can't access or create files whose names contain Unicode characters, can't display text that contains characters not available in the user's current ANSI codepage (a huge limitation for NSIS' otherwise great translation system). And so on...

Why not make the cut:
Go for NSIS 3.xx to be fully Unicode and keep the 2.xx branch for backward compatibility (for those who really care). It's not unusual at all in the software world to drop support for outdated OSes in newer versions and maintain the old version for backward compatibility for a while.


(BTW: If current NSIS already supports cross-compiling, how will adding Unicode support destroy this?)

http://img18.imageshack.us/img18/885...unicode.th.jpg


Originally posted by jimpark
You are right, Anders. I do remember using that MS extension to fopen. It was very convenient at the time. Are you guys adamant about not linking to MS CRT? But you aren't adamant about calling Win32 API, right?
Well, to maintain POSIX support we try not to use too much Win32 stuff in makensis.exe since it has to be in a #ifdef _WIN32 block with a #else for the POSIX analog.

Using the basic CRT stuff is not a problem (fopen etc), but using compiler extensions can be a problem.

So the only solution I see is some sort of wrapper around FILE* that reads ANSI/UTF8/UTF16 and spits out UTF16 on the other side. (I was hoping to add UTF8 support to the ANSI version but I'm not sure if that is going to happen; I already have the code for .nlf loading, the normal source files are the problem)

Now I am confused :confused:

Are we still talking about cross-compiling MakeNSIS.exe from a Linux system or are you talking about a native Linux port of MakeNSIS?


Originally posted by LoRd_MuldeR
Are we still talking about cross-compiling MakeNSIS.exe from a Linux system or are you talking about a native Linux port of MakeNSIS?
What would be the point of cross-compiling makensis and then having to use wine or whatever to run it? Even if you don't have a cross-compiler you can compile makensis on POSIX but you need to grab the stubs and plugins from somewhere else... (See "G.3 Building on POSIX" in the helpfile)

Originally posted by Anders
What would be the point of cross-compiling makensis and then having to use wine or whatever to run it?
Exactly my point :)

And that's why I was a bit confused about jimpark's statement "MakeNSIS that runs natively on Linux? I've yet to see it".

Still it is perfectly possible to write code that supports Unicode and compiles on POSIX (Linux, MacOS X) as well as on Windows.

Two common approaches have been mentioned here:
http://forums.winamp.com/showpost.ph...&postcount=488

(So while retaining cross-platform compatibility makes things a bit more complicated, it shouldn't be a showstopper for Unicode NSIS)

Sorry, I did not mean to confuse but to really show the lack of utility in having MakeNSIS build on Linux. I think the only use is if someone really wanted to build their own toolchain, including NSIS, on Linux as some sort of a build process as DrO mentioned. I still don't know why anyone would do that. And if someone does do that, does he also cross-build GCC or some other compiler for Windows on Linux and then decide to use that on their Windows builds? The fact is, in order to create a Windows installer, they need to run MakeNSIS on Windows.

Why can't we just forget catering to these hypothetical people? It's not worth stalling the project over. They are getting a Windows setup making toolchain that needs to be built on Windows. They will still need to have a Windows box as part of their resources to actually run MakeNSIS and make their installer. And if a few of these hypothetical people exist, do they matter so much to stall the whole development?

Again, I would not be making this argument if MakeNSIS did run natively on Linux, making NSIS a cross-platform setup builder. But that is not the case.


Originally posted by jimpark
Again, I would not be making this argument if MakeNSIS did run natively on Linux, making NSIS a cross-platform setup builder. But that is not the case.
If I understand Anders correctly, then MakeNSIS already can be built as a native Linux application (only the EXE headers can't, for obvious reasons) and so it does run natively on Linux. And that, indeed, makes it possible to build Windows applications on Linux (using the cross-compiler) and then package them in an NSIS installer with MakeNSIS for distribution. No Windows machine required.

But once again: That is not the reason that prevents MakeNSIS from supporting Unicode. It is possible to write code that builds natively on Windows, builds natively on Linux and supports Unicode on both platforms. See the link in my previous post, for example.

Originally posted by jimpark
And if a few of these hypothetical people exist, do they matter so much to stall the whole development?
It is in the Debian repository IIRC.

Also, we got a patch from the Fedora/RedHat downstream this month so they are not hypothetical.

Even if we dropped POSIX support, VC6, VS2003 and MinGW would still need fixing... (I know you don't care about VC6/VS2k3 but fixing MinGW would probably also fix the older VC's)

Borland C++ and Open Watcom C/C++ don't have official support, and I don't know how long it has been since anyone tried those, but they might also have similar issues...

The point is, POSIX is not stalling anything, the usage of MS extensions is the problem.

To help me understand: they can't build exehead on Linux, or do they do it using a cross-compiler and somehow with stubbed libs? If they need to build their setups on Linux, can they do it under Wine?


Anders, it's not just MS extensions but Win32 API calls as well. Has anyone looked into using iconv or ICU for Unicode support? The standard runtime does not provide enough support to do what we want.


Originally posted by jimpark
If they need to build their setups on Linux, can they do it under Wine?
They could. But that's probably not what they want :D :rolleyes:

(Also Wine is limited to x86 systems by the way it works. Linux also runs on many other architectures)

http://img6.imageshack.us/img6/6006/nsiswine.th.jpg

Originally posted by jimpark
Anders, it's not just MS extensions but Win32 API calls as well. Has anyone looked into using iconv or ICU for Unicode support? The standard runtime does not provide enough support to do what we want.
What exactly is missing?

makensis has been fully portable to POSIX since version 2.01. The stubs are either cross-compiled with MinGW or copied from the Windows build. Since I added this feature, I've seen it ported to numerous platforms, adopted by some Linux distributions, ported to Darwin, and pop up on one of the *BSD trees. I received help requests to get it working on AIX and other non-mainstream platforms. The Debian and Redhat ports in particular saw caring maintenance with patches, fixes and even upstream patches.

I never stopped to ask why. People just wanted it. I don't have statistics, so I can't give out exact numbers. But they exist. I think the strongest selling point is for projects mainly aimed at Linux. By giving them the option to build everything on Linux, we make the Windows porter's job easier. They don't have to maintain their own systems to build the application and can streamline the Windows port into the main project build cycle.

In the past, MinGW was also the only decent free solution for building on Windows. Visual Studio Express didn't exist or wasn't good enough.

As for Win32 API being used in makensis, any of them that still exist in the code are implemented for non-Win32 platforms. I think the LZMA compression module still has some threading functions implemented using pthreads.

--

All this is just technical details. Regarding the future, I do agree Unicode is required.


Thanks Kichik for the info. As to Lord Mulder's question, the standard library does not have a way to convert from one form of Unicode encoding to another, not to mention the normalization of Unicode. For example, Windows wants UTF-16 little endian and likes precomposed characters. Linux's wide chars are UTF-32 little endian if running Intel but possibly big endian on other processors. Mac OS X uses UTF-8 fully decomposed characters. So now you see the extent of the problem with supporting Unicode and multiple platforms. If we want to use the wchar library in the C runtime, we have to encode the Unicode strings to what they want to see in that platform. If the strings are going to be stored in a binary format that will be used on Windows, such as the strings in the setup, they need to be encoded to UTF-16LE.

The new C++11 standard provides more Unicode support, including expressing Unicode string literals as 8 bit, 16 bit or 32 bit, but we would still need to write some code for byte ordering. And using the new C++ standard also seems to be out of the question if we need to support legacy compilers.

No wonder the project is stalled. It's very difficult to move forward with so many restrictions. If this is what needs to be done, then keep the strings in the preferred encoding of each platform so that all the wchar string functions available since the C++03 standard can be used during the NSI script processing. Then, when saving the strings, save them as UTF-16LE so that the exehead doesn't have to convert them. So the string saved to disk is in the native encoding of Windows, but the string in memory is what's expected by the wchar functions.

The other option is to stick to a single encoding even in memory, but that would mean not being able to use the wchar functions in the C runtime and hence having to reimplement them on platforms where the chosen Unicode encoding isn't what the platform CRT likes.


(Call me an uninformed smartypants, but I see Jim say that the problem is very complicated, Mulder says that it's already been solved, and Anders states that the real problem lies elsewhere entirely (namely at ANSI .nsi reading). So while POSIX makensis has been cleared up, I'm wondering if there's still some confusion on what exactly needs to be figured out / agreed upon?)


Originally posted by jimpark
Thanks Kichik for the info. As to Lord Mulder's question, the standard library does not have a way to convert from one form of Unicode encoding to another, not to mention the normalization of Unicode. For example, Windows wants UTF-16 little endian and likes precomposed characters. Linux's wide chars are UTF-32 little endian if running Intel but possibly big endian on other processors. Mac OS X uses UTF-8 fully decomposed characters. So now you see the extent of the problem with supporting Unicode and multiple platforms. If we want to use the wchar library in the C runtime, we have to encode the Unicode strings to what they want to see in that platform. If the strings are going to be stored in a binary format that will be used on Windows, such as the strings in the setup, they need to be encoded to UTF-16LE.
I see that converting between different Unicode encodings might be a problem. On Windows there is a Win32 API function to convert UTF-16 to UTF-8 or some ANSI Codepage (and vice versa). I'm not sure if there is an equivalent on POSIX/Unix systems.

But: I still don't understand why these conversions are needed at all :confused:

On Linux (and other POSIX systems too?) all the API/CRT functions use UTF-8. The NSIS source code files (.nsi) should be encoded in UTF-8 too - Unicode text files almost always are UTF-8 encoded. So on these systems we should be able to use the 'char' type with UTF-8 encoding all the way. And, as UTF-8 is a sequence of individual bytes, there should be no "endianness" problems either. No need to use the 'wchar_t' type with UTF-16 or UTF-32. And no conversions needed anywhere.

Now, on Windows, we can either stick with the 'char' type and use UTF-8, or we can use 'wchar_t' with UTF-16. In the former case we need minimal code change compared to the 'POSIX version', but before each Win32 API call we need to convert the arguments from UTF-8 to UTF-16 (not really a problem, because the Win32 API also has the required conversion functions); it basically needs a simple UTF-8 wrapper function around each Win32 API function (at least around those that deal with strings). In the latter case we would have to replace every 'char' or 'wchar_t' with a TCHAR macro that expands to 'char' on Linux/POSIX and to 'wchar_t' on Windows. Some string manipulation functions would have to be replaced by macros too. But generally you can simply put the required macros in a single header file (with a big "#ifdef _WIN32 .... #else .... #endif" around them to switch between the different OSes) once and include it everywhere...

Now remember that makensis also runs on Windows. In fact, most of the time it does. :) And because all Unicode text must go through UTF-16LE conversion to display correctly, we have this problem. Well, maybe this isn't that bad if we say that makensis is UTF-8 and the Windows GUI shell simply converts UTF-8 to UTF-16LE. The only thing about UTF-8 is that it is so variable in length. MS tries to keep one character per 16 bits, which is why it prefers completely precomposed characters, i.e. one code point that has both base and diacritic characters. But UTF-8 is very variable. Each character could be one, two, three, or four bytes long, or longer in the case of diacritics. And then exehead would have to do the same conversion from UTF-8 to UTF-16. I don't like that the exehead would have to do extra work to deal with all the strings. But this is an option that might then push all the conversion stuff out of the POSIX code and into the Windows code. Anders, this is what you were thinking, right? What did you find?


Originally posted by jimpark
Now remember that makensis also runs on Windows. In fact, most of the time it does. :) And because all Unicode text must go through UTF-16LE conversion to display correctly, we have this problem.
What problem?

On Windows a simple MultiByteToWideChar(CP_UTF8, 0, input, -1, Buffer, BuffSize) will do the conversion.

Originally posted by jimpark
Well, maybe this isn't that bad if we say that makensis is UTF-8 and the Windows GUI shell simply converts UTF-8 to UTF-16LE.
I think you would do the UTF-8 -> UTF-16 conversion yourself before each Win32 API call. The Shell API doesn't handle UTF-8.

Originally posted by jimpark
The only thing about UTF-8 is that it is so variable in length. MS tries to keep one character per 16 bits, which is why it prefers completely precomposed characters, i.e. one code point that has both base and diacritic characters. But UTF-8 is very variable. Each character could be one, two, three, or four bytes long, or longer in the case of diacritics.
Yes, in UTF-8 characters have variable length (in bytes). But that's not a problem. Each byte still "looks" like a normal ANSI character and at the end there always is a NULL terminator. So any function that was designed to handle ASCII characters will handle UTF-8 too. All ASCII characters are even identical between ASCII and UTF-8. Only problem I see with UTF-8 is that we need to allocate some "extra" space (if you want to ensure that n characters fit in the buffer, 4*n bytes are required). But on the other hand: If you allocate n bytes buffer, with UTF-16 you can store a maximum of n/2 characters. With UTF-8 you can store up to n characters, maybe fewer...

BTW: UTF-16 also may use pairs of consecutive 16-Bit words ("surrogate pair"), because 2^16 is not enough for all Unicode characters!

Originally posted by jimpark
And then exehead would have to do the same conversion from UTF-8 to UTF-16. I don't like that the exehead would have to do extra work to deal with all the strings. But this is an option that might then push all the conversion stuff out of the POSIX code and into the Windows code. Anders, this is what you were thinking, right? What did you find?
Now, if we talk about the EXE header again: here we do not need to care about POSIX at all. It will never run anywhere else but on Windows (or Wine). It will be compiled either on Windows or by a cross-compiler. Thus in the EXE header it is safe to write Windows-specific code, i.e. use 'wchar_t' with UTF-16 all the way and call Win32 API functions directly. The only conversion that will be needed is: strings that have been generated by MakeNSIS and are stored in the installer EXE's "data" section need to be converted from UTF-8 to UTF-16 once. But I doubt that this causes too much performance overhead. And again MultiByteToWideChar() will do the job...

Maybe "problem" was a strong word.

Correct me if I'm wrong but here's what I've gathered so far:

  1. makensis should only call standard library (no win32 calls if possible).
  2. if win32/posix is needed, then we will need to do #ifdef for win32 and posix.
  3. makensis should be able to read various Unicode encoded files. NSI, NLF, NSH etc. These files may be UTF-8, UTF-16LE/BE, UTF-32LE/BE and we should ideally support them all.
  4. makensisw (Windows GUI) should be able to read outputs from makensis and display the output text to the user.
  5. exehead will only use win32 and will not link to any external libraries including C runtime to reduce size.
  6. plugins will need to handle Unicode strings.

I think point 3 is why I think we should consider linking to iconv or ICU. Point 4, I think we can deal with by using MultiByteToWideChar since we should only ever see UTF-8 coming out of makensis. And once we have iconv or ICU, I think we should convert the strings to UTF-16LE when storing them in the data section so that exehead will see UTF-16LE and not have to convert. This also means that for point 6, the plugin writers will only have to deal with the native Windows Unicode encoding.

Also, as a side note, I think we should throw out the possibility of building ANSI while doing this work. There really isn't a good argument for keeping ANSI support and it's a simplification that can help everyone conceptually get their heads around the problems.

To #2
If not done already, it may be wise to refactor OS-specific stuff into some "OS Support" class/file (i.e. have a "os_support.h" with the function declarations and have several "os_support_win32.cpp", "os_support_linux.cpp", etc. files with their OS-specific implementations), rather than have a lot of #ifdef's all over the place...

To #3
Why is that? Why not say that all text files (NSI, NLF, NSH) have to be UTF-8 and that's it? If required, any encoding can be converted to UTF-8 easily with a command-line utility like iconv. Anyway, I can't remember seeing any "plain text" Unicode files in the wild with an encoding other than UTF-8. Is this really used? And do we really need to support it?

To #4
Does makensisw need to compile natively under POSIX too? I don't think so. People that use Linux & Co generally know how to use a console! Especially because NSIS is targeted at developers! Anyway, the one and only way that I am aware of to properly output Unicode characters to the Windows console is writing UTF-8 to STDOUT. However, you must also call SetConsoleOutputCP() to enable UTF-8 output. There is no UTF-16 mode for SetConsoleOutputCP()! Last but not least, you must not use printf() and friends, because they screw up UTF-8 strings with their "translation" functions. Use WriteFile() instead and write to the handle you got via GetStdHandle(). This of course means that a GUI application that redirects the console application's output has to treat the text as UTF-8 too.

(BTW: You may think that you can write UTF-16 strings to the Windows console by using wprintf(), but that's not the case. It internally converts to the ANSI codepage and replaces all characters that are not available in the current ANSI codepage with '?' *d'oh*)


Yes, we need #3 because UTF-16LE is the default encoding for anything Unicode on Windows. When you open a resource file in Developer Studio, you will find that it is encoded in UTF-16LE by default. If you save your text file as Unicode in Windows Notepad, it is UTF-16LE by default.

Also, we may need to access data from other Unicode resources like OpenType fonts. By default, for the Windows platform, the encoding of strings in the names section is UTF-16BE. (In fact, everything in an OpenType font file is big endian.) So if we need to get some data from other sources, likely, we will need to convert.


Originally posted by jimpark
Yes, we need #3 because UTF-16LE is the default encoding for anything Unicode on Windows. When you open a resource file in Developer Studio, you will find that it is encoded in UTF-16LE by default. If you save your text file as Unicode in Windows Notepad, it is UTF-16LE by default.
We should think about this. Any halfway decent text/code editor can save as UTF-8. Even Windows Notepad can ;)

And, as said before, it is easy to convert between the different encodings before handing the file to MakeNSIS.

Also, if you don't restrict NSI/NLF/NSH files to a specific encoding, how do you detect a file's actual encoding at runtime?

(I know that you may be able to guess the encoding via BOM character, but the BOM may be missing)
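For what it's worth, sniffing the BOM is only a handful of byte comparisons. A minimal C sketch (my own, purely illustrative); note the order matters, since the UTF-32LE BOM starts with the same two bytes as the UTF-16LE one:

```c
#include <stddef.h>
#include <string.h>

typedef enum { ENC_UNKNOWN, ENC_UTF8, ENC_UTF16LE, ENC_UTF16BE,
               ENC_UTF32LE, ENC_UTF32BE } encoding_t;

/* Guess the encoding from the first bytes of a file. Returns
 * ENC_UNKNOWN when no BOM is present -- which is exactly the case
 * where a convention (or a heuristic) is needed. */
encoding_t detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 4 && !memcmp(buf, "\xFF\xFE\x00\x00", 4)) return ENC_UTF32LE;
    if (len >= 4 && !memcmp(buf, "\x00\x00\xFE\xFF", 4)) return ENC_UTF32BE;
    if (len >= 3 && !memcmp(buf, "\xEF\xBB\xBF", 3))     return ENC_UTF8;
    if (len >= 2 && !memcmp(buf, "\xFF\xFE", 2))         return ENC_UTF16LE;
    if (len >= 2 && !memcmp(buf, "\xFE\xFF", 2))         return ENC_UTF16BE;
    return ENC_UNKNOWN;
}
```

The ENC_UNKNOWN case is the real argument for a fixed convention: a BOM-less file could be ANSI, UTF-8, or anything else, and no amount of sniffing is guaranteed to get it right.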


Originally posted by jimpark
Also, we may need to access data from other Unicode resources like OpenType fonts. By default, for the Windows platform, the encoding of strings in the names section is UTF-16BE. (In fact, everything in an OpenType font file is big endian.) So if we need to get some data from other sources, likely, we will need to convert.
That could be an issue :(

BTW: Why does MakeNSIS have to deal with OpenType font files?

[EDIT]

After a quick Google search I found this:
http://utfcpp.svn.sourceforge.net/vi...?revision=HEAD

Frankly, NSIS doesn't really need to handle OpenType files, but it would be nice to. I've gotten requests to get version information and font name information so that a font can be updated during install. Installing fonts is a perfectly valid thing to do, so I've provided those capabilities in the last release of Unicode NSIS.

And all the existing users of Unicode NSIS probably have their NSI files as UTF-16LE.

Anyway, I have no qualms about linking to new libraries. I don't want to write Unicode conversion code myself, especially if I have to support Unicode normalization. If makensis runs on Mac OS X, for example, the platform decomposes all the Unicode strings, so even if you convert them to UTF-16LE, the decomposed characters may not render correctly on Windows and will probably fail string comparisons against user-entered text, since Windows prefers composed characters. So normalization is also an issue. This is a consequence of Unicode support plus multi-platform support.
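To make the normalization point concrete: the composed (NFC) and decomposed (NFD) spellings of the same visible character are different byte sequences, so a naive comparison fails even though both strings render identically. A tiny C illustration with hard-coded UTF-8 bytes (my own example):

```c
#include <string.h>

/* "é" in NFC: the single code point U+00E9 (2 bytes in UTF-8).
 * "é" in NFD: 'e' followed by combining acute U+0301 (3 bytes). */
static const char nfc[] = "\xC3\xA9";
static const char nfd[] = "e\xCC\x81";

/* Both render identically, yet byte-wise they are unequal -- this is
 * why strings coming from a decomposing platform (e.g. Mac OS X file
 * names) must be normalized before comparing with user input. */
int same_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

A library like ICU provides the actual normalization (NFC/NFD conversion); the point here is only that skipping it silently breaks equality tests.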


Back to another topic for a moment:

Originally posted by jimpark
I just released a beta version of 2.46.3. I've still compiled it using MSVC 2010 and I did not need the EncodePointer.lib since I do not link to the standard library in exehead. So NSIS itself still requires Windows XP+ to build installers, but the installer it generates should be able to run under Windows 2000. Lord Mulder, can you check this when you have the time? I've also added GetFontName and GetFontNameLocal to round out the font commands.
Can you please also re-compile the plug-ins? I noticed the nsExec plug-in fails on Win2k.

By looking at the file dates, it seems the plug-ins included in v2.46.3 are the same as in v2.46.2.

I reverted to the 'nsExec.dll' from your v2.46.1 release and the problem is gone...

[EDIT]

Okay, it seems that 'nsExec.dll' creates a temporary executable that does the actual job.

While I don't understand why that is done, it explains the problem:

The temporary EXE won't run on Windows 2000 because its OperatingSystemVersion field is 5.1.

(...because it has been compiled by the VisualStudio 2010 compiler)
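If the version stamp in the PE header is indeed the culprit, one plausible countermeasure (an assumption on my part, not something confirmed in this thread) is to lower the subsystem version at link time; the VS2010 linker accepts an explicit major.minor version on its /SUBSYSTEM switch:

```
REM Stamp 5.0 (Windows 2000) instead of the VS2010 default 5.1:
link.exe /SUBSYSTEM:WINDOWS,5.0 ...other options...
```

Note that this only fixes the header check; if the binary also imports XP-only CRT functions such as EncodePointer, it will still fail to load on Windows 2000.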

Thanks for the report, LoRd_MuldeR. I thought I had built everything with the correct SUBSYSTEM setting. I will check on that again, and I will clean out the objs. But you will have to wait until Monday for the new build.


I've updated the build and uploaded 2.46.3 Beta 2. Please let me know if you experience any problems.


The next version of VS will drop XP support as well: http://connect.microsoft.com/VisualS...details/690617


Originally posted by jimpark
I've updated the build and uploaded 2.46.3 Beta 2. Please let me know if you experience any problems.
Thanks! I will give it a try, as soon as I have some spare time...

[EDIT] Just a quick note: the 'uninst' stub still has a file date from 2002 [/EDIT]

Originally posted by Anders
The next version of VS will drop XP support as well: http://connect.microsoft.com/VisualS...details/690617
:igor:

Too bad. This would make the next VS useless for most developers for a long time. While the market share of Win7 is growing, XP still has around 40% (and Vista never got any noteworthy market share). Unless, of course, there is a workaround to restore XP compatibility.



That's too bad. Visual Studio 2011 is supposed to support std::atomic and a lot of the std threading library. I was looking forward to that. No XP support would effectively kill it for us too, since we need to support WinXP. Time to write to Microsoft.


Originally posted by LoRd_MuldeR
There was a new Unicode NSIS release recently:
http://code.google.com/p/unsis/downloads/list
Unicode Setup from this location is detected as Adware : :(

http://www.virustotal.com/file-scan/report.html?id=82b3056fbbcf76cc6e177a22f48d0f48aa46e769039495ca2651ac29aa5e8c0b-1317715970

This is Avira response :

Dear Sir or Madam, thank you for your email to Avira's virus lab.
Tracking number: INC00846451.

We received the following archive files:
File ID Filename Size (Byte) Result
26326890 suspect_FALSE.zip 1.69 MB OK

A listing of files contained inside archives alongside their results can be found below:
File ID Filename Size (Byte) Result
26326891 nsis-2.46.3-Unico...up.exe 1.71 MB MALWARE


Please find a detailed report concerning each individual sample below:
Filename Result
nsis-2.46.3-Unico...up.exe MALWARE

The file 'nsis-2.46.3-Unicode-setup.exe' has been determined to be 'MALWARE'. Our analysts named the threat ADWARE/Adware.Gen. This file is detected by a special detection routine from the engine module.
Please note that Avira's proactive heuristic detection module AHeAD detected this threat up front without the latest VDF update as: ADWARE/Adware.Gen.


Originally posted by mrjohn
Unicode Setup from this location is detected as Adware : :(

http://www.virustotal.com/file-scan/report.html?id=82b3056fbbcf76cc6e177a22f48d0f48aa46e769039495ca2651ac29aa5e8c0b-1317715970
I've checked the link and saw that they had listed nsis-2.46.3-Unicode-setup.zip, which is not the name of any file I've uploaded. The MD5 digest it has listed does not match any file I've uploaded either. So I can only conclude that whatever file they've tested is not mine.

I'd write a mail to virus_malware@avira.com in order to clarify that.

In my experience they are quite responsive...


I wrote to avira and they verified that the Unicode NSIS files are showing as being clean.


Can we get a direct link? I can't find the download link on the site.


Originally posted by Zinthose
Can we get a direct link? I can't find the download link on the site.
http://code.google.com/p/unsis/downloads/list

Originally posted by Yathosho
http://code.google.com/p/unsis/downloads/list
.... OOPS.... :stare:

I meant to post this in another topic... CURSE you multi-tabbed browsing!!

it seems obvious, but is that ANSI build fully compatible with the official nsis? just asking, cause i'd prefer to install it over my current installation and not in a separate folder.


Yes, the ANSI build should be a superset of the official NSIS build.


GetVersion.exe is not a valid Win32 application.
Hello,

I compiled following Code with Unicode NSIS 2.46.3:


!define File "program.exe"
OutFile "GetVersion.exe"

Function .onInit
  ## Get file version
  GetDllVersion "${File}" $R0 $R1
  IntOp $R2 $R0 / 0x00010000
  IntOp $R3 $R0 & 0x0000FFFF
  IntOp $R4 $R1 / 0x00010000
  IntOp $R5 $R1 & 0x0000FFFF
  StrCpy $R1 "$R2.$R3.$R4.$R5"

  ## Write it to a !define for use in main script
  FileOpen $R0 "DefineValues.txt" w
  FileWrite $R0 '!define PRODUCT_VERSION "$R1" $\n'
  FileClose $R0

  Abort
FunctionEnd

Section
SectionEnd


I get a GetVersion.exe, but if I try to start it from the Windows Explorer, I get the message

GetVersion.exe is not a valid Win32 application.
If I use NSIS 2.46, it works fine and I get DefineValues.txt with the entry !define PRODUCT_VERSION ....

I know the post
Originally posted by vcoder
This script work well on ANSI version of NSIS and failed on Unicode version: ...
I use Windows 7 64-bit, and other, more complex setup scripts compile fine with Unicode NSIS.
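As an aside, the IntOp arithmetic in the script above just splits each 32-bit version DWORD returned by GetDllVersion into its high and low 16-bit words. The same unpacking in C (purely illustrative, my own sketch):

```c
#include <stdint.h>

/* GetDllVersion returns two DWORDs: the high word of the first is the
 * major version, its low word the minor, and likewise build/revision
 * in the second. Dividing by 0x10000 is a 16-bit shift, and masking
 * with 0xFFFF keeps the low word -- exactly what the IntOp lines do. */
void split_version(uint32_t ms, uint32_t ls,
                   unsigned *major, unsigned *minor,
                   unsigned *build, unsigned *rev)
{
    *major = ms >> 16;      /* IntOp $R2 $R0 / 0x00010000 */
    *minor = ms & 0xFFFF;   /* IntOp $R3 $R0 & 0x0000FFFF */
    *build = ls >> 16;      /* IntOp $R4 $R1 / 0x00010000 */
    *rev   = ls & 0xFFFF;   /* IntOp $R5 $R1 & 0x0000FFFF */
}
```

So a file version of 2.3.30.1 arrives as the two DWORDs 0x00020003 and 0x001E0001.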

I saw this once. Try to include at least one File in your installer. It can be any non-empty dummy file.

@jimpark: Any ideas?


Originally posted by LoRd_MuldeR
I saw this once. Try to include at least one FILE in your installer. Can be any non-empty dummy file.

@jimpark: Any ideas?
Thanks for the answer.

I first tried the following solution:


Section
!tempfile DUMMYFILE
!appendfile "${DUMMYFILE}" "${DUMMYFILE}"
File "${DUMMYFILE}"
!delfile "${DUMMYFILE}"
!undef DUMMYFILE
SectionEnd


It didn't help, even though the dummy file was packed.

But after I tried


Section
File "program.exe"
SectionEnd


it works, although it is not nice.

Thanks for the help!

Interesting. I will look into it.


Hello again!

While I was implementing the solution with File "program.exe", I noticed the command !define /product_version in the NSIS User Manual.

Solutions like

Originally posted by vcoder
OutFile "GetVersion.exe"
with GetDLLVersion or GetDLLVersionLocal are for getting the version of a file on the build machine as a compile-time constant, I think.

With the command !define /product_version it's quite easier.

So I wrote a little NSIS header (Attachment 49224) with the macros GetFileVersionLocal and GetProductVersionLocal.
Now I can get a constant with the version number without making a dummy setup.


!insertmacro GetProductVersionLocal "$%windir%\system32\kernel32.dll" version
!echo "${version_0}.${version_1}.${version_2}.${version_3}"
!echo "${version}"


For a description see the header file.

when trying to compile a script on windows 2003 server, i get this error:

"The procedure entry point EncodePointer could not be located in the dynamic link library KERNEL32.dll"

some seconds later a second message pops up:

"Unable to initialize MakeNSIS. Please verify that makensis.exe is in the same directory as makensisw.exe"

(it is in the same directory)


SP1 installed?

Minimum supported server
Windows Server 2008, Windows Server 2003 with SP1
Stu

Originally posted by Yathosho
when trying to compile a script on windows 2003 server, i get this error:

"The procedure entry point EncodePointer could not be located in the dynamic link library KERNEL32.dll"

some seconds later a second message pops up:

"Unable to initialize MakeNSIS. Please verify that makensis.exe is in the same directory as makensisw.exe"

(it is in the same directory)
Please see my post here:
http://forums.winamp.com/showpost.ph...&postcount=474

In short: Binaries compiled with VS2010 don't run on systems prior to WinXP with SP-2, unless countermeasures are taken.
(Probably not a big deal for MakeNSIS, which runs on developer machine only, but important for the EXE stubs)

Originally posted by Afrow UK
Minimum supported server
Windows Server 2008, Windows Server 2003 with SP1
i wonder where you even found that, such things should be mentioned on the website. anyway, i'm only using win 2003 because i have no legit copy of windows xp. so there should be no troubles when using xp (sp3)?

Originally posted by Yathosho
i wonder where you even found that, such things should be mentioned on the website. anyway, i'm only using win 2003 because i have no legit copy of windows xp. so there should be no troubles when using xp (sp3)?
EncodePointer function's MSDN page.

Stu

Originally posted by Yathosho
i wonder where you even found that, such things should be mentioned on the website. anyway, i'm only using win 2003 because i have no legit copy of windows xp. so there should be no troubles when using xp (sp3)?
This is not a limitation of NSIS or Unicode NSIS in general. It's just a limitation of the Visual C++ 2010 CRT libraries. And, as jimpark switched to VS2010 for his latest builds, these builds will now require Windows XP with SP-2 or later. To make it clear again: this only applies to MakeNSIS, not to the resulting installer EXE. The created installer EXE even runs on Windows 2000...

(Just be sure you really use the latest Unicode NSIS. There was a version that is broken with Win2k!)

Stdin handling is broken:

makensis - < Examples\example1.nsi

@Jim: Thanks a lot for the Unicode version! It made supporting other languages in my installer a lot easier.

I didn't read through all of the previous 13 pages: is there already a date when Unicode will be available in the vanilla NSIS version? I read something about MakeNSIS v2.50...


There is no date, nor is there any progress to speak of concerning the unicodification of NSIS trunk. If it will ever happen, it won't be any time soon.


That's a pity =(


as there's no proper focus on getting things done, different people want it to be done in different ways (i.e. some just want to go all unicode and leave it at that, others want to sort out what is basically a final ansi version, etc.), and there's no consensus on how it should be done, which makes what MSG said pretty reasonable: there is no eta.

-daz


What about branching? NSIS 3 would be unicode and NSIS 2 could still get bug fixes.


Originally posted by DrO
as there's no proper focus on getting things done
that is the biggest issue. and branching won't help - in fact i think that it's gotten us into this mess to begin with, as people are using the unofficial unicode version yet expect support for it, etc. doing another branch, with all of the issues i mentioned, is just going to be even more of a pain for everyone concerned, ignoring the question of who is going to do it.

-daz

Originally posted by DrO
as there's no proper focus on getting things done, different people want it to be done in different ways [...] there's no consensus on how it should be done
even open-source projects need decisions and people who make them. so why not let the community decide which direction nsis will go, or who will be the decision-makers? that only leaves the question of who should be able to vote.

i'm not disagreeing with that point, and things can or should be decided on; it still requires people to do things, and that seems to be the biggest issue with getting things actually put into place and moving.

hell i'd love to help out (like i did many years back) but this pesky thing known as work gets in the way.

no one is disagreeing that a proper unicode version needs to be done. but how that is to be done (which is more about the implementation than what the end user actually sees) is the biggest stopping point, it seems. and to your point, no one is going to be able to do that if there's no proper structure on who is in control of things.


between jim's version, the stuff wizou started, and what anders has been doing, we've got three instances of things being all over the shop without any true focus, not quite doing things the same way, and just overall causing more confusion - look at the number of posts from people who cannot get plug-ins to work because they think this unicode one is official and don't know what is / isn't correct.


it's really just a big cockup when you look at it all overall which is more detrimental to the community in the long run it seems (going on what i've seen from people's posts over the last few years).

-daz


The major problem is that the unicode fork uses MS 2005+ CRT-specific stuff, while official NSIS supports compiling (and building) on VC6, MinGW and POSIX. (The unicode fork is broken in certain places (scroll up to my previous post) since it just relies on the MS code to do all conversion for it.) The current code also mixes WCHAR and wchar_t, which is a major no-no on POSIX; on Windows they are both UTF-16LE, but wchar_t can be any type on other platforms...

Except for the UTF8 langfile support in the ANSI build, all the other stuff I have been doing is generic and should benefit both.
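The WCHAR/wchar_t distinction can be made concrete with a fixed-width typedef. A minimal sketch (my own, not from the NSIS source) of the portable alternative:

```c
#include <stdint.h>
#include <wchar.h>

/* On Windows, wchar_t is 16 bits and holds a UTF-16 code unit; on
 * most POSIX systems it is 32 bits (typically UTF-32). Code that
 * must mean "UTF-16 code unit" on every platform should therefore
 * use a fixed-width type rather than wchar_t. */
typedef uint16_t utf16_unit;  /* always exactly 2 bytes, everywhere */
```

This is essentially why cross-platform Unicode code bases define their own character type instead of assuming wchar_t matches Windows' WCHAR.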


@jimpark:

I am running into a little problem with latest Unicode NSIS (v2.46.4) release:

My code looks like this:

!searchreplace PRODUCT_VERSION_DATE "${LAMEXP_DATE}" "-" "."


LAMEXP_DATE contains something like "2012-03-10"

The expected output in PRODUCT_VERSION_DATE is "2012.03.10" (right?), but I only get "2012." :confused:

Went back to Makensis.exe from the Unicode NSIS v2.46.3 release and the issue is gone.

Can you look into this? Thanks! :)

Unicode Python plugin for Unicode NSIS
Hi!

I patched the Python plugin for NSIS. Now it also supports Unicode NSIS.
Here is more information about that: http://georgik.sinusgear.com/2013/05...-unicode-nsis/

You can find the project on GitHub - nsPythonUnicode

It was tested on Windows XP, Vista, 7, 8, 2003, 2008, 2012.

Happy Python coding with Unicode NSIS. ;-)


Hi all, :)
On the official website there is a new version NSIS 3.0a1 (Released July 14, 2013).
Do you plan to update to version Unicode NSIS? :rolleyes:


3.0 has unicode support already.


Unicode NSIS 2.46.5 fails with packhdr again?

I:\>upx --best --force "C:\DOCUME~1\User\LOCALS~1\Temp\exehead.tmp" 
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2013
UPX 3.09w Markus Oberhumer, Laszlo Molnar & John Reiser Feb 18th 2013

File size Ratio Format Name
-------------------- ------ ----------- -----------
upx: C:\DOCUME~1\User\LOCALS~1\Temp\exehead.tmp: CantPackException: superfluous data between sections

Packed 1 file: 0 ok, 1 error.