Archive: LZMA Compression vs. 7zip SFX


I have noticed that I cannot get NSIS LZMA to compress as well as a 7zip SFX. For example, with the exact same files, NSIS with LZMA SOLID gives me an exe of 4.98 MB, while packaging the same files together with the NSIS installer in a 7zip SFX gives me an exe of 4.73 MB.

Though the difference may not seem like much, size is extremely critical in this case. Is there a way to get NSIS to reach the same level of compression when using LZMA?

Thanks,
Robert


Have you tried increasing the dictionary size by using SetCompressorDictSize?

I found this from the 7-zip help file:

...Usually higher Dictionary size gives more compression ratio. But compressing can be slower and it can require more memory.

Memory (RAM) usage for LZMA compressing is about 10 times more than dictionary size. Memory usage for LZMA decompressing is close to value of dictionary size. Memory usage for PPMd compressing and decompressing is almost equal to dictionary size...
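
The dictionary-size effect described in the quoted help text can be reproduced outside NSIS. Below is a minimal sketch using Python's standard lzma module, which is not the NSIS compressor; the helper name compressed_size and the synthetic input are assumptions made only for illustration. The input repeats a block at a distance of 256 KiB, so only a dictionary that reaches back that far can find the match.

```python
import lzma
import os


def compressed_size(data: bytes, dict_size: int) -> int:
    """Return the size of `data` compressed as raw LZMA2 with the given dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters))


# Synthetic input: a 256 KiB random block followed by an exact copy of itself.
# The copy can only be matched if the dictionary reaches back 256 KiB.
block = os.urandom(256 * 1024)
data = block + block

small = compressed_size(data, 64 * 1024)    # dictionary too small to see the repeat
large = compressed_size(data, 1024 * 1024)  # dictionary covers the whole input

print(f"64 KiB dictionary: {small} bytes")
print(f"1 MiB dictionary:  {large} bytes")
```

Real installer payloads rarely behave this dramatically; once the dictionary is larger than the data (or than the longest useful match distance), raising it further should not be expected to help.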

Yes. With the default dictionary size of 8 MB the exe size is 5.0 MB;
with SetCompressorDictSize 16 and above the size is 4.98 MB.

With the following 7zip packaging I get an exe of 4.73 MB
7z a -t7z out.7z <path to files>\*.* <path to NSIS exe>\installer.exe -mx -m0=BCJ2 -m1=LZMA:d24 -m2=LZMA:d19 -m3=LZMA:d19 -mb0:1 -mb0s1:2 -mb0s2:3
upx --best 7zSD.sfx
copy /b 7zSD.sfx+config.txt+out.7z setup.exe

Except for the File /r "<path to files>\*" line in the NSIS-only installer, both installers are the same.


The only explanation I can come up with is the BCJ2 option in your 7-zip command line. The 7z docs indicate that BCJ2 is a method that allows 32-bit x86 executables to be compressed even further, a feature that may not be implemented in NSIS.

I found more info about this at http://en.wikipedia.org/wiki/LZMA

There may be other reasons. Perhaps someone with more programming experience can give a more detailed (and possibly more accurate) explanation.


Another possibility:
when LZMA was implemented in NSIS, the devs took a version of the LZMA SDK and modified it to match NSIS's structure.

I don't know whether it has ever been updated since then.
LZMA itself has been updated more than once, with improvements to speed and compression quality.


nsis's version of lzma hasn't been updated in a while and yes, it was taken from the sdk and then tweaked (quite extensively from memory) to fit into the size constraints expected from zlib and bzip methods

-daz


Thanks for the responses.... I was thinking that it may be due to using the older SDK or not using the BCJ2 option... if the reason could be confirmed I might attempt updating NSIS to the latest SDK or adding the BCJ2 option.


Only the decoder was tweaked to fit the size limits. The encoder itself wasn't touched. The code has indeed not been updated for a while, but I don't recall seeing any compression improvements in the latest versions. BCJ2 is not used in NSIS and can result in better ratios if you compress 32-bit executables.
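
To get a feel for what an x86 branch filter contributes, here is a rough sketch using Python's standard lzma module. Note the assumptions: it uses the plain x86 BCJ filter (lzma.FILTER_X86) from the .xz format, which is a simpler relative of BCJ2 and not the same method, and the names xz_size and compare are made up for this example. It does not measure anything about NSIS itself.

```python
import lzma


def xz_size(data: bytes, use_x86_filter: bool) -> int:
    """Compress `data` as .xz, optionally running the x86 BCJ filter first."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
    if use_x86_filter:
        # BCJ filters must come before the final LZMA2 stage in the chain.
        filters.insert(0, {"id": lzma.FILTER_X86})
    blob = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
    # The filter chain is recorded in the .xz headers, so decoding needs no options.
    assert lzma.decompress(blob) == data
    return len(blob)


def compare(path: str) -> None:
    """Print both sizes for a file, e.g. a 32-bit x86 executable (path is hypothetical)."""
    with open(path, "rb") as f:
        data = f.read()
    print("LZMA2 only:     ", xz_size(data, use_x86_filter=False))
    print("x86 BCJ + LZMA2:", xz_size(data, use_x86_filter=True))
```

Calling compare on one of the executables from the installer should show whether branch filtering matters for that particular payload; on non-executable data the filter usually makes little difference and can even hurt slightly.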


kichik - thanks! It does appear to be due to BCJ2... the size difference is approximately the same. I'll take a look at what implementing BCJ2 in NSIS would entail in the next month or two. If you know of any caveats / issues with implementing this I'd appreciate hearing about them.

Cheers,
Robert


There's one issue with combining the compression decoder and BCJ2's decoder. BCJ2 doesn't work on single bytes; it needs 5 at a time because of the instruction size. Because streams aren't used, this is a bit harder. You can either go over the entire data again after it's extracted, or preserve state in the BCJ2 decoder between calls.
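
The two options above can be sketched with a toy filter. This is not BCJ2 and not NSIS code: it is a deliberately simplified, hypothetical filter that only rewrites the 4-byte operand of x86 CALL (opcode 0xE8) instructions, and every name in it is invented for illustration. filter_whole is the "go over the entire data again" approach; StreamingDecoder is the "preserve state" approach, holding back a tail of up to 4 bytes when an instruction straddles a chunk boundary.

```python
import struct

CALL_OPCODE = 0xE8  # x86 CALL rel32: 1 opcode byte + 4 operand bytes


def _convert(buf: bytearray, base: int, encoding: bool) -> int:
    """Rewrite CALL operands in place; return how many bytes were fully handled.

    `base` is the absolute offset of buf[0] in the whole stream. Scanning stops
    early when a CALL opcode is found but its 4 operand bytes are not all present.
    """
    i = 0
    while i < len(buf):
        if buf[i] != CALL_OPCODE:
            i += 1
            continue
        if i + 5 > len(buf):
            break  # incomplete instruction: wait for more data
        (value,) = struct.unpack_from("<I", buf, i + 1)
        next_ip = base + i + 5  # offset of the instruction after the CALL
        if encoding:
            value = (value + next_ip) & 0xFFFFFFFF  # relative -> absolute
        else:
            value = (value - next_ip) & 0xFFFFFFFF  # absolute -> relative
        struct.pack_into("<I", buf, i + 1, value)
        i += 5
    return i


def filter_whole(data: bytes, encoding: bool) -> bytes:
    """Filter a complete buffer in a single pass."""
    buf = bytearray(data)
    _convert(buf, 0, encoding)
    return bytes(buf)


class StreamingDecoder:
    """Undo filter_whole(..., encoding=True) chunk by chunk, keeping state between calls."""

    def __init__(self) -> None:
        self.pos = 0                # absolute offset of the first pending byte
        self.pending = bytearray()  # tail bytes held back from the last chunk

    def feed(self, chunk: bytes) -> bytes:
        buf = self.pending + chunk
        done = _convert(buf, self.pos, encoding=False)
        self.pos += done
        self.pending = buf[done:]
        return bytes(buf[:done])

    def flush(self) -> bytes:
        # Fewer than 5 bytes remain, so the encoder left them untouched too.
        tail, self.pending = bytes(self.pending), bytearray()
        return tail


def decode_in_chunks(encoded: bytes, chunk_size: int) -> bytes:
    """Feed `encoded` to a StreamingDecoder in pieces of `chunk_size` bytes."""
    decoder = StreamingDecoder()
    out = bytearray()
    for start in range(0, len(encoded), chunk_size):
        out += decoder.feed(encoded[start:start + chunk_size])
    out += decoder.flush()
    return bytes(out)
```

The whole-buffer pass is simpler but needs all the extracted data in memory at once; the stateful decoder works with whatever chunk sizes the decompressor hands over, at the cost of tracking the stream position and a small pending buffer.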


Would it be possible to upgrade NSIS with the latest LZMA codecs?

maybe something for the TODO list :)


It's on the TODO, but it's not that important. I haven't seen anything too exciting in the change log. However, if a patch is submitted... ;)