Archive: Silent Install - Incomplete Releases


Silent Install - Incomplete Releases
I've been having some issues lately and am wondering if it's happened a lot. We're doing silent installs using a website I wrote that uses PSExec to kick off silent installs on test systems. The silent install automatically kicks off the uninstaller and then rolls into the installer. We have the fileoverwrite flag set to ifnewer. What's happening is that the delete of files is lagging, and when the file /r gets started, some of the files haven't disappeared yet, and thus NSI says "SKIPPING" and then we end up with an incomplete app out there. It didn't happen at all at first except on one machine. I figure that machine was slower than the rest. Lately it's been happening on all sorts of machines. I'm thinking my best resolution is to set fileoverwrite on, but just wondering if this is a typical situation.

Thanks,
Eric <><


Sounds like the process is asynchronous. Make sure you execute the uninstaller with ExecWait and _?=.

See: http://nsis.sourceforge.net/When_I_u...he_uninstaller


Yep, we're doing that. Here's an example of one of our uninstall sections:

<!--Code Snippet-->
IfSilent runSilentUninstall

MessageBox MB_OKCANCEL|MB_ICONEXCLAMATION \
"Application is already installed. $\n$\nClick `OK` to remove the \
previous version or `Cancel` to cancel this upgrade." \
IDOK uninst
Abort

;Run the uninstaller
uninst:
ClearErrors
CopyFiles /SILENT /FILESONLY "$INSTDIR\Uninst.exe" "$TEMP\Uninst.exe"
Delete "$INSTDIR\Uninst.exe"
ExecWait '"$TEMP\Uninst.exe" _?=$INSTDIR'
goto done

runSilentUninstall:

ClearErrors
CopyFiles /SILENT /FILESONLY "$INSTDIR\Uninst.exe" "$TEMP\Uninst.exe"
Delete "$INSTDIR\Uninst.exe"
ExecWait '"$TEMP\Uninst.exe" /S _?=$INSTDIR'
<!--End Code Snippet-->

We've been doing these silent installs for about 6 months. The partial release phenomenon originally only happened on one machine, and not consistently (same installer on other machines worked fine). Now it's been moving through all of the machines. Win2k3 OS. Maybe a patch or something is hosing us up???


Can't help much without any method to reproduce this. Try adding some debug messages to make sure it's really synchronized.

If it's really deleting the files only after the installer starts extracting them and it's not a problem with ExecWait, I'd put my money on some anti-virus update.


I'll see if "SetOverWrite on" solves it. It seems like the OS is telling NSIS that things are there that no longer exist. If that's the case, this should fix it (hokey fix, but still a fix). I'm logging (special log build), so I should be able to determine the order of uninstall/install stuff...


OK, now that I've changed it to overwrite, it's getting an error. The log states that it's creating the directory first:

CreateDirectory: "D:\Program Files\AppName\Public" (1)


but this never gets created. So then when it tries to write the files there, it errors:

File: overwriteflag=0, allowskipfilesflag=2, name="MyFile1.aspx"
File: error creating "D:\Program Files\AppName\Public\MyFile1.aspx"
File: error, user cancel
File: overwriteflag=0, allowskipfilesflag=2, name="MyFile2.aspx"
File: error creating "D:\Program Files\AppName\Public\MyFile2.aspx"
File: error, user cancel


Any clue where I go from here to figure out why the CreateDirectory failed?

Thanks,
Eric

Does "D:\Program Files\AppName" exist?

Stu


No, it did, but I deleted it in the uninstaller. Then when the installer tries to create it, it doesn't seem to fail, but it doesn't create it (unless that's what the (1) means after it tries to create the directory).

This is getting more confusing, because we've got a different installer that's doing this same thing (saying error creating such and such file), but the files exist after the installer gets done with the new timestamp and everything.


(1) means it's also SetOutPath. It doesn't actually log directory creation failures. I've added a bit more logging there for the next version.

You should add a few message boxes in there to figure out what happens when.


What happens when "What"?


What happens at what time point. Which files are extracted at what time? Which files are deleted at what time? What happens first? Does the uninstaller really finish by the time the installer starts? Does the file exists when the installer tries to extract it? Does the folder exist when the installer tries to put a file in it? You can even use IfFileExists for that.


Message boxes won't help because this only happens on silent install. I can see what happens, when, by the log file. The uninstaller doesn't log what it does, but it does log that it finished before the installer starts his work.

Exec: command=""C:\DOCUME~1\efetzer\Local Settings\Temp\Uninst.exe" /S _?=D:\Program Files\MyApp"
Exec: success (""C:\DOCUME~1\efetzer\Local Settings\Temp\Uninst.exe" /S _?=D:\Program Files\MyApp")
Section: "Install"
settings logging to 1
logging set to 1


The weirdest thing about this is that we had no problems for 5+ months (except occasionally on one machine out of about 200) and then, all of a sudden, it started happening everywhere... I'll add a few pushes to the log as to whether the directory exists or not when it tries to create it and such.

Message boxes can show in silent installers when /SD isn't used.

The fact Exec finishes doesn't mean the uninstaller really finished executing.

If the problem really came without any change, it's probably a failing anti-virus.


The only change we've identified is the following patch:

http://www.microsoft.com/technet/sec.../MS07-040.mspx


What exactly do you mean by "a failing anti-virus"? That we would have a virus on all of these machines?


No an anti-virus suite that isn't doing it's job correctly.
A false positive is one of them.

Stu


I don't understand what you mean. How can a anti-virus suite affect a release executable.


Ant-virus applications employ a slew of tactics to try and protect your system. Some are simple such as scanning every file on the system once a day and some invasive such as scanning a file when it's modified or even monitor file deletions and modifications and filter those on the basis of known viral behaviors. Sometimes, the more invasive operations can be flawed, either by design or by implementation. That could cause problem such as those you're describing.


Well explained, thanks! I'll dig into it with my systems dudes.


OK, this is definitely not it. We reproduced it this afternoon with the antivirus completely shut off. Any more clues?


How about a machine with no Anti-Virus installed? Not all AVs give you an option to completely turn them off in the GUI.


I didn't use the GUI, I shut off all of the services (runs as multiple services). I reproduced the error with no processes running for Symantec at all. I'm at the mercy of my organization, and thus, not allowed to uninstall antivirus.


Are you sure it has no drivers?


BTW, I have heard of this patch you were talking about as the cause for other problems. I don't remember exactly where, but it's worth searching to see if the problems are related.


I'm having that patch removed from the server that the problem happens on consistently. We'll see where this puts us... Thanks again Kichik!!!


Re: Silent Install - Incomplete Releases

Originally posted by stonkers
I've been having some issues lately and am wondering if it's happened a lot. We're doing silent installs using a website I wrote that uses PSExec to kick off silent installs on test systems. The silent install automatically kicks off the uninstaller and then rolls into the installer. We have the fileoverwrite flag set to ifnewer. What's happening is that the delete of files is lagging, and when the file /r gets started, some of the files haven't disappeared yet, and thus NSI says "SKIPPING" and then we end up with an incomplete app out there. It didn't happen at all at first except on one machine. I figure that machine was slower than the rest. Lately it's been happening on all sorts of machines. I'm thinking my best resolution is to set fileoverwrite on, but just wondering if this is a typical situation.

Thanks,
Eric <><
Okay Eric, I've had this problem before too. It appears that while the uninstaller is described to work where it doesn't start up another copy of itself when you use the _? parameter, that it still does. All _? does is tell it where to uninstall from.

In order to solve this issue, I've posted the following bit of code I used:


function RunPreviousVersionUninstall

StrCpy $0 "$PROGRAMFILES\MYAPP\uninstall.exe"
${GetParent} "$0" $5
IfFileExists "$0" copyuninstaller loopend

copyuninstaller:
CopyFiles /SILENT "$0" "$TEMP\tempuninstall.exe"

setdetailsprint textonly
Exec '"$TEMP\tempuninstall.exe" /S _?=$5'
StrCpy $2 "Uninstalling Previous Version..."
DetailPrint "$2"
StrCpy $4 ""

Sleep 1000

loopbegin:
StrLen $5 "$4"
StrCmp $5 "12" 0 +2
StrCpy $4 ""
StrCpy $4 "$4."
StrCpy $3 "$2$4"
DetailPrint "$3"
${nsProcess::FindProcess} "tempuninstall.exe" $R2
StrCmp "$R2" "0" loopagain
${nsProcess::FindProcess} "tempuninstall" $R2
StrCmp "$R2" "0" loopagain
goto loopend
loopagain:
sleep 1000
goto loopbegin
loopend:

setdetailsprint both
functionend


As you can see I am waiting for the named process to quit, as the uninstaller starts up another copy of itself, with the same name. If you wait until this process has gone away, then the uninstall will be completed. Everyone here told me that _? made the uninstaller run only once, but that's just not true from my experiences.

When you use _?= another process will never be spawned. That can be shown by the following simple script.

OutFile test.exe
Name test
Section
WriteUninstaller $TEMP\uninst.exe
ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'
MessageBox MB_OK "uninstall completed"
SectionEnd

Section uninstall
MessageBox MB_OK uninstalling...
SectionEnd
If this doesn't work for you, it's one huge bug and you should report it instead of creating such workarounds.

Originally posted by kichik
When you use _?= another process will never be spawned. That can be shown by the following simple script.
OutFile test.exe
Name test
Section
WriteUninstaller $TEMP\uninst.exe
ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'
MessageBox MB_OK "uninstall completed"
SectionEnd

Section uninstall
MessageBox MB_OK uninstalling...
SectionEnd
If this doesn't work for you, it's one huge bug and you should report it instead of creating such workarounds.
Yeah it doesn't work for me, perhaps it is because I am using it silently as well. However in my case I used the time waiting around, to do something so it's not something I considered a huge issue for me. However I did ask around about it, and everyone just re-iterated the help file, which I can read all by myself (thank you,) and until I figured out that _? wasn't working for me as documented, has exactly the same problems Erik is describing.

As I said, this is a major issue. I need every possible detail.


Yeah, I'm still getting the same thing with the patch off. Looks like we've got something going since I've been gone. I'm really happy about that because this is kicking my butt!!!! Let me know kichik if I should do 9 9's workaround. I'm hoping not because I would have to do it in about 400 places. Thanks - Eric


No, you shouldn't use his workaround. And if my script shows both message boxes at the same time for you too, I need every possible detail and preferably direct access to the computer from you too.


When i ran it locally - >test /S, it worked fine. The difference being that when I have these problems, I'm using PSEXEC to kick off the exe's in silent mode from another server. I won't be able to see the popups if I run from PSEXEC, so I'll have to figure out a way to modify the test code to show order without needing human intervention. Can you think of a way to do this kichik?


Try writing to the same file with both instances on both their beginning and end, and see what comes out first.


I don't get what you mean "both their beginning and their end"...


OutFile test.exe
Name test
Section

;Write to file here?

WriteUninstaller $TEMP\uninst.exe
ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'

;Write to file here?

SectionEnd

Section uninstall

;Write to file here?

SectionEnd

OutFile test.exe
Name test
Section

# Write here

WriteUninstaller $TEMP\uninst.exe
ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'

# Write here

SectionEnd

Section uninstall

# Write here

Sleep 2000

# Write here

SectionEnd

This isn't giving me anything like the results that I would expect. The following code:

OutFile test.exe
Name test
;Included file(s)
!include "FileFunc.nsh"

Section

FileOpen $7 "c:\testFile.txt" a
FileWrite $7 "Write1"
FileWrite $7 "$\r$\n"
FileClose $7

WriteUninstaller $TEMP\uninst.exe
ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'

FileOpen $7 "c:\testFile.txt" a
FileWrite $7 "Write2"
FileWrite $7 "$\r$\n"
FileClose $7

SectionEnd

Section uninstall

FileOpen $7 "c:\testFile.txt" a
FileWrite $7 "Write3"
FileWrite $7 "$\r$\n"
FileClose $7

Sleep 2000

FileOpen $7 "c:\testFile.txt" a
FileWrite $7 "Write4"
FileWrite $7 "$\r$\n"
FileClose $7

SectionEnd


Produces a file with the contents:

Write2

4.9.5.2 FileOpen Opens a file named "filename", and sets the handle output variable with the handle. The openmode should be one of "r" (read) "w" (write, all contents of file are destroyed) or "a" (append, meaning opened for both read and write, contents preserved). In all open modes, the file pointer is placed at the beginning of the file. If the file cannot be opened, the handle output is set to empty, and the error flag is set.
Add a FileSeek command after each open: FileSeek $7 0 END

Don

Thanks DEMiller9! With the following code:


OutFile test.exe
Name test
;Included file(s)
!include "FileFunc.nsh"

Section

FileOpen $7 "c:\testFile.txt" a
FileSeek $7 0 END
FileWrite $7 "Write1"
FileWrite $7 "$\r$\n"
FileClose $7

WriteUninstaller $TEMP\uninst.exe
ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'

FileOpen $7 "c:\testFile.txt" a
FileSeek $7 0 END
FileWrite $7 "Write2"
FileWrite $7 "$\r$\n"
FileClose $7

SectionEnd

Section uninstall

FileOpen $7 "c:\testFile.txt" a
FileSeek $7 0 END
FileWrite $7 "Write3"
FileWrite $7 "$\r$\n"
FileClose $7

Sleep 2000

FileOpen $7 "c:\testFile.txt" a
FileSeek $7 0 END
FileWrite $7 "Write4"
FileWrite $7 "$\r$\n"
FileClose $7

SectionEnd


The results are now:


Write1
Write3
Write4
Write2


I believe that shows that the two files aren't being kicked off at the same time, and thus, that the:

ExecWait '"$TEMP\uninst.exe" /S _?=$TEMP'

is working as designed.

But then again, we haven't had the error we usually get in the past week. Could be that the Systems group has done something that we are unaware of to this machine. I'll investigate.

OK, nothing's changed on the server. I'll have to wait until the next time this issue rears it's ugly head and see if the test.exe exhibits different behavior. Other than that, I have no idea where else to go with this other than always checking the deploy log to see whether it has any errors in it, and if so, deploying locally... Thanks for all the help!


OK, so I found the issue again on a different server, ran the test.exe with PSExec and got the same results:

1342

So, I'm thinking that we're barking down the completely wrong path (my dog does that a lot also). One thing that is always the case. It fails creating a folder, and thus, it doesn't write any files to that folder. The only reason I can think of for it not being able to create the folder is because it still thinks the folder is there. So maybe it got deleted in the uninstall, but it's so lightning fast (in silent mode because of the lack of interaction), that the OS lies to NSIS saying that the folder is already there. What if I put a 5 second sleeper at the end of all of my uninstalls to give the OS time to catch up? I'll give it a shot and report back.


Archive: Silent Install - Incomplete Releases


Putting a sleeper is never the right solution. It just hides the problem a bit better. If "lighting fast" suddenly become over 5 seconds because the CPU or HD are really busy, it won't be able to hide it.

Try verifying it's the right tree first by validating the folder exists in the installer. Use IfFileExists.


Originally posted by stonkers

One thing that is always the case. It fails creating a folder, and thus, it doesn't write any files to that folder. The only reason I can think of for it not being able to create the folder is because it still thinks the folder is there. So maybe it got deleted in the uninstall, but it's so lightning fast (in silent mode because of the lack of interaction), that the OS lies to NSIS saying that the folder is already there. What if I put a 5 second sleeper at the end of all of my uninstalls to give the OS time to catch up? I'll give it a shot and report back.
I also found this to be the case, it's some odd thing about the uninstaller/installer interaction.

Starting from a clean build the following were the different states after installs:

Install #1: Full install, completely successful
Install #2: Uninstall previous version, and then attempt an installation. Installation failed, because the install directory could not be created.
Install #3: Full install, (because there is no previous installation,) completely successful.

And round and round it goes. I have a couple ideas, but basically it appeared to me that the issue was related to the current installer having it's installation directory deleted, *after* it already had assumed it was there, and was ready to use.

The solution I used to prove myself more or less right, was to execute a batch file which created the directory, after ruuning the uninstall. It worked perfectly after I did that. As a result, I added a function to our custom plugin to handle this instead.

I don't think I've ever had a problem with the installation directory, only subdirectories that are in the File /r command. I don't specifically tell the installer to create the directories, just let it do it when it copies them in.

There's something that is causing me to switch gears though. When it fails (not really fails, but in the deploy log I see it error on creating a directory), if I run the installer manually it works (even with the /S option). I'm investigating whether PSExec has issues with permissions possibly (seems ludicrous because I'm admin on the machine and it doesn't have problems creating all directories, just one or another).


I'd say it's time for some debugging...

Attached is a set of stubs (for 2.31) that'd simply break into the debugger upon directory creation failure. Use them build an installer that'd be installed using psExec. On the computer where the installer is executed, install windbg and set it as the postmortem debugger. Now execute the installer using psExec and hope the magic psExec does doesn't block windbg from your desktop.

Once windbg is open, at the very least create a crash dump using .dump /ma C:\CreateDirectoryFailed.dmp and post it here.


I can't get past the 7z file. Do I need to change the extension to use it or is there a specific software I have to open it with?


No, it's just a simple 7z file. Try with the latest 7-zip.


OK, where do the stubs go? I'm not too versed in the inner workings of NSIS...


C:\Program Files\NSIS\Stubs.


Nevermind, I found the obvious "stubs" directory.


OK, I've got everything setup except the final step. So, I'm running the installer using PSExec silently from another machine. How am I going to catch the debugger open? I'm not on the machine.


You can setup remote debugging. Set the postmortem debugging to:

windbg -p %ld -e %ld. –server tcp:port=12345
Then, when it seems like psExec has frozen, try connecting to the debugger which should open on the remote computer using:
windbg -remote tcp:server=1.2.3.4,port=12345
Of course you'll have to replace 1.2.3.4 with the real IP.

OK, thanks! I'm impressed with your windows knowledge kichik!!!


OK, I haven't managed to get a dump yet, but I did get a better error when I put the latest NSIS on there. When it crashes creating a folder, it says (err=5):

CreateDirectory: can't create "C:\Program Files\MyFolder\MySubfolder\TheFolder" (err=5)

Does this do anything more for you Kichik? If not, I'll get a dump. I'm thinking what I need to do is wait for it to happen and then re-build that release with the debug stubs and try to release again. The problem with this whole situation is, it's TOTALLY unpredictable when or where it's going to happen.


5 is ERROR_ACCESS_DENIED. Why weren't you be able to get a dump at that point? Right after printing that to the log, it should break into the debugger.


I have to do a "debug build". They were in a hurry to test this build, so I didn't have time to rebuild. Next time I'll be able to do it hopefully. It seems to be happing a WHOLE lot less since upgrading NSIS (only once so far).


OK, this sucks!!! So I got the opportunity and did what I said I was going to. The build failed. So I set the debug stubs up and rebuilt. This time, however, everything worked fine. This caused me to dig a little deeper into why the redeploy's aren't working. Well it turns out that when I redeploy with PSEXEC, it is working. It isn't showing us that it's working because our log isn't allowing itself to be overwritten silently. So I'm going to have to catch it when it happens the first time or I won't catch it.

Is it safe to do create a debug exe everytime for just this one test environment? In other words, as a part of my build process, when I was building the exe for the test environment, have the build copy the stubs in there, run makensis, then copy the working stubs back over for creating all the other exe's...

Of course I also want to address the log issue, but that's another subject...

Again, Thanks!
Eric <><


It's perfectly safe to move over the stubs as long as you replace makensis.exe along with them.


I must be doing this wrong. I never understood having to replace makensis.exe. I only changed out the stubs. I was under the impression that those stubs went with the version you had told me to upgrade with. Is there a step that I missed?


Right, I gave you stubs that match makensis.exe 2.31 built with logging. As long as you switch between the normal logging stubs and the special logging stubs, makensis.exe can stay the same. But if you switch to non-logging stubs or another version of stubs, you'll have to switch makensis.exe.


Cool, I'll set this up so that I will get a dump whenever it chokes. Thanks Kichik!


Not working...
When the folder create fails, the psexec process doesn't hang. I'm guessing that these stubs aren't dumping. Should I put some sort of code in my nsi file for debug purposes to force a dump?


Does it at least crash or does it carry on?


It carry's on just like it did before the stubs were changed. Gives the can't create folder error in the log file.


What's the exact error it prints to the log?


I believe as it creates the folder, it says (err=5), and thereafter cannot create files in that folder that doesn't exist. If this isn't the case, I'll cut, paste, and post when it happens again.


Are you sure the executed installer was created with the stubs I gave you? I see no reason it'd ignore a call to DebugBreak(), even when executed with psexec. Open the installer with notepad or some text editor and look for DebugBreak.


I believe we've fixed it (no problems for 2 weeks). It looks like IIS was sometimes blocking the installer from deleting folders. So instead of doing an "IISRESET" after the installer is done, we now do an "IISRESET /STOP" in .onInit and then an "IISRESET /START" after the installer is done.

BTW - I checked it out and the move of those stub files was failing in NAnt. Probably a typo in folder name on my part, but we won't need it this time. I'll look deeper if we need this again though.

Thanks Kichik!!!


A happy ending at last :)