Archive: Large temporary file with /solid compression


Large temporary file with /solid compression
I have a relatively large installer, 1.3 GB uncompressed. As you can imagine, every little bit of extra compression helps, such as enabling solid compression: the end result with /solid lzma is a sweet 260 MB.

The problem I'm seeing now is that solid compression generates a large temporary file at run time, as large as the whole uncompressed size. (Note: this should not be confused with the temporary file generated at build time; it's a similar issue, though, that manifests at install time.) The end result is that the user can't install if free space is less than 2.5 GB and the Temp folder is on the same partition as the install directory. The extra space required in Temp is also not accounted for in the "required space" calculation, so the user is allowed to continue, only to have the install crash midway.

Is this the normal behaviour of /solid or is it a bug? I see that solid archives can be decompressed by 7-Zip without generating a large temporary file, so it's not a fundamental issue with the algorithm. At the very least, the free space calculation should be updated.

I'm using NSIS 2.45.


Normal behavior. Don't use solid compression for such large installers.


That's a pity; the benefits of solid compression are most visible on large installers. Without /solid the installer comes out at 300 MB, 14% larger.

I'm pretty sure, though, that underestimating the required free space by a factor of two is indeed a bug.
Can you please give more details about why the NSIS design requires this temp file? If feasible, I'd like to dig into the source code and remove it, or at least fix the space computation.


There's an open bug for it in the bug tracker.

It's built this way because NSIS doesn't have a specific order in which it extracts files. It just goes by what the script tells it to. Imagine the following script:

Function install
  File 2.dat
FunctionEnd

Section
  File 1.dat
  Call install
SectionEnd
1.dat needs to be extracted first, but 2.dat will be compressed first. If there were no temporary file, it would have to decompress 2.dat twice. A simpler and probably more common example would be:
File 1.dat
File 2.dat
File 1.dat
And it gets more complicated when you throw in plug-ins and callback functions.
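
To make the ordering problem concrete, here's a toy C simulation of it. This is not the real exehead code (the actual stub, as you've seen, just spools the whole block to the temp file up front), and the names are made up; it only illustrates why data that comes out of a forward-only solid stream "too early" has to be parked somewhere:

#include <stdio.h>

/* Data block order from the example script above: 2.dat is compressed
   first (its File command is seen first), 1.dat second. */
static const char *stream[] = { "2.dat", "1.dat" };
static int cursor = 0;             /* decoder position, forward only        */
static int in_temp[2] = { 0, 0 };  /* members already parked in temp        */

static void extract(int member)    /* member = index inside the solid block */
{
    if (in_temp[member]) {                   /* decoded earlier, reuse it   */
        printf("%s: copied from temp file to target\n", stream[member]);
        return;
    }
    while (cursor < member) {                /* decode everything before it */
        in_temp[cursor] = 1;
        printf("%s: decoded too early, spooled to temp\n", stream[cursor]);
        cursor++;
    }
    printf("%s: decoded straight to target\n", stream[cursor++]);
}

int main(void)
{
    extract(1);   /* Section runs first:   File 1.dat */
    extract(0);   /* then Call install:    File 2.dat */
    return 0;
}

Drop the temp file and the only alternative inside extract() is rewinding the stream, i.e. decompressing everything again.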

As for space requirements, the temporary file goes into the temporary directory which might not be on the same drive where the files are installed. So that adds even more fun to this.
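
As a rough illustration of what a correct space check would have to do (made-up paths, plain Win32, not NSIS code): it has to look at both drives, and count both amounts against the same drive when Temp and the target share one.

#include <windows.h>
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    char temp[MAX_PATH];
    ULARGE_INTEGER freeTemp, freeTarget;
    const char *target = "D:\\MyApp\\";               /* hypothetical install dir   */
    unsigned long long uncompressed = 1300ULL << 20;  /* ~1.3 GB, as in this thread */

    GetTempPathA(MAX_PATH, temp);
    GetDiskFreeSpaceExA(temp,   &freeTemp,   NULL, NULL);
    GetDiskFreeSpaceExA(target, &freeTarget, NULL, NULL);

    /* With /solid, the spool file needs the full uncompressed size in Temp,
       on top of the files written to the target directory. */
    int sameDrive = toupper((unsigned char)temp[0]) == toupper((unsigned char)target[0]);
    unsigned long long needTarget = uncompressed + (sameDrive ? uncompressed : 0);

    if (freeTemp.QuadPart < uncompressed || freeTarget.QuadPart < needTarget)
        puts("not enough free space once the temp spool file is counted");
    return 0;
}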

Let me know if you need more details.

Found the bug:
http://sourceforge.net/tracker/index...49&atid=373085

It's a complex issue, indeed. My suggested solution:

- Installation starts with an empty install dir and an empty temporary folder.

- The installer maintains a map data structure that contains associations of the form {internal path -> extraction path}; this map starts out empty.

- Whenever a file is requested, referred to by its internal path, it is first looked up in the map: if it's already there, the existing copy is moved or copied from the recorded location; if not, decompression continues forward, and every file decoded along the way is extracted to the temporary folder and added to the map.


For your first example, the install will first try to extract File1, but will bump into File2 on the way and extract it into the temporary folder. Finally, when the "install" function is called, File2 will be moved from Temp to its install location.

For your second example, no files will be extracted to Temp. However, when File1 is requested for the second time, its previous extraction location is read from the map, and the file is copied from that location. (This fails if the installer has already deleted File1 before extracting it a second time, but that's quite an extreme case, I would say, rare enough to justify re-seeking through the full installer when it happens.)

As you can see, the algorithm keeps each file in only one place at a time, either Temp or Installdir, so the 2x space problem is fixed. A speed optimization would be to try to create the temporary folder on the same partition that holds Installdir, so the move is instantaneous.
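
Very roughly, here's a small self-contained C sketch of the scheme. The names and structures are invented for illustration and this is not how the stub is actually written, but it shows the single-copy bookkeeping:

#include <stdio.h>
#include <string.h>

#define MAXFILES 4

typedef struct {
    int  known;           /* has this member been decoded yet?             */
    char location[260];   /* current on-disk path (temp or install dir)    */
} map_entry;

static map_entry map[MAXFILES];  /* indexed by position in the solid block */
static int cursor = 0;           /* forward-only decoder position          */

static void extract(int member, const char *target)
{
    if (map[member].known) {
        /* Single existing copy: move it (same drive) or copy it, instead
           of decoding the solid block again.                              */
        printf("move/copy %s -> %s\n", map[member].location, target);
        strcpy(map[member].location, target);
        return;
    }
    while (cursor < member) {           /* members passed on the way there */
        snprintf(map[cursor].location, sizeof map[cursor].location,
                 "%%TEMP%%\\spool_%d.tmp", cursor);
        map[cursor].known = 1;
        printf("decode member %d -> %s\n", cursor, map[cursor].location);
        cursor++;
    }
    printf("decode member %d -> %s\n", cursor, target);
    strcpy(map[member].location, target);
    map[member].known = 1;
    cursor++;
}

int main(void)
{
    extract(1, "C:\\App\\1.dat");       /* Section:        File 1.dat      */
    extract(0, "C:\\App\\2.dat");       /* Call install:   File 2.dat      */
    extract(1, "C:\\App\\1-copy.dat");  /* File 1.dat requested again      */
    return 0;
}

The third call is the second example: the file already sits in the install dir, so it's copied from there rather than decoded again.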

The devil is in the details, though...

You can't trust files to be left unmodified after they're extracted. You'll need to maintain a CRC list as well and verify the CRC of each file before you copy or move it. That too would cause re-seeking, which could be a very expensive process, especially in your case of a 1.3 GB installer.
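
Just to spell out what that check amounts to (a tiny standalone sketch, not anything from the NSIS source): a checksum is recorded at extraction time and re-verified before the single copy is reused, and a mismatch forces the expensive fallback of re-decoding the solid block from the start.

#include <stdio.h>
#include <string.h>

/* Minimal bitwise CRC-32, just for the illustration. */
static unsigned crc32_of(const unsigned char *p, size_t n)
{
    unsigned crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

int main(void)
{
    const char *at_extract_time = "original contents";  /* recorded in the map */
    const char *at_reuse_time   = "original contents";  /* what's on disk now  */

    unsigned stored  = crc32_of((const unsigned char *)at_extract_time, strlen(at_extract_time));
    unsigned current = crc32_of((const unsigned char *)at_reuse_time, strlen(at_reuse_time));

    if (stored == current)
        puts("unchanged: safe to copy/move the existing file");
    else
        puts("modified: re-decode the solid block from the start (expensive)");
    return 0;
}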

Plus, there's also the size issue. We try to avoid memory allocation and too much logic in the stub. But that's for later.


You are of course right about potentially changed files; I figured that out only after posting :)
The memory requirements can't be too large: if each element in the map requires no more than 256 bytes, it will use maybe a few MB for a typical installer.


I was talking about the stub size, but memory could get problematic too for large installers. Allocating lots of small blocks, or using padding to store it all as one big block, would also waste more memory than it should. An installer with 100,000 files could consume as much as 30 MB on top of the compression-related buffers. It's not that bad, but not that nice either.
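
(Back-of-envelope, using the 256-byte figure from the previous post: 100,000 entries × 256 bytes ≈ 25.6 MB, and allocator or padding overhead is what pushes that towards the 30 MB mentioned above.)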


If I just use NSIS to make a simple self-extractor that uses the best compression, is it possible to rebuild NSIS to avoid the large temporary file?

I follow these rules:
don't use any plug-ins
don't use the File command in a function that may be called more than once
use ReserveFile to get the correct order, like this:

ReserveFile 1.dat

Function install
  File 2.dat
FunctionEnd

Section
  File 1.dat
  Call install
SectionEnd

I like the best compression, but I hate the temporary file.

I tried to edit fileform.c, but it seems very hard. Can anyone help?