Archive: "Space required" rounding


It seems that NSIS actually truncates extra digits of the value. For example, if all files require 1.59 MB of disk space, NSIS displays "Space required: 1.5MB".


Does anybody believe that such behaviour is normal?


It doesn't truncate digits, but it does round file sizes to 1024 bytes to simulate the space it'll take on the hard disk. The real block size is unknown at compile time, so this value was selected as a compromise.


I've tested it again: 9 files with total size 1656142 bytes (including 2 zero-length files) plus uninstaller (32325 bytes). NSIS reports "Space required: 1.5MB" :(

BTW, should I manually add uninstaller size to program size?


My mistake, it does round down the number to contain only one digit after the decimal dot. The code in charge is located in the SetSizeText function in Source\exehead\Ui.c.
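To see the truncation on the numbers reported above, here is a minimal sketch. It is not the NSIS source: `format_size_truncated` and the plain "KB/MB/GB" strings are hypothetical stand-ins, and it assumes the 1656142 bytes are counted as roughly 1617 KB. Only the two expressions `kb>>sh` and `((kb*10)>>sh)%10` are taken from this thread.

```c
#include <stdio.h>

/* Hypothetical helper, not the NSIS code: formats a size given in KB with
   one decimal digit, using the same expressions quoted in this thread.
   The digit is truncated, never rounded. */
static void format_size_truncated(unsigned kb, char *out, size_t out_len)
{
    unsigned sh = 0;
    const char *unit = "KB";   /* stand-ins for the LANG_* strings */

    if (kb >= 1024)        { sh = 10; unit = "MB"; }
    if (kb >= 1024 * 1024) { sh = 20; unit = "GB"; }

    snprintf(out, out_len, "%u.%u%s",
             kb >> sh,                 /* whole part */
             ((kb * 10) >> sh) % 10,   /* first decimal digit, truncated */
             unit);
}
```

Calling `format_size_truncated(1617, buf, sizeof buf)` produces "1.5MB" even though 1617 KB is about 1.58 MB, which matches the behaviour described in the report.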


A slightly modified version of your function will round sizes up:


static void NSISCALL SetSizeText(int dlgItem, int prefix, unsigned kb)
{
    char scalestr[32], byte[32];
    unsigned sh=0;
    int scale=LANG_KILO;

    // pick the unit and add just under 0.1 of it, so the truncation below rounds up
    if (kb >= 1024*1024) { sh=20; scale=LANG_GIGA; kb+=(1<<20)/10; }
    else if (kb >= 1024) { sh=10; scale=LANG_MEGA; kb+=(1<<10)/10; }

    wsprintf(
        GetNSISString(g_tmp,prefix)+mystrlen(g_tmp),
        "%u.%u%s%s",
        kb>>sh,             // whole part
        ((kb*10)>>sh)%10,   // first digit after the decimal point
        GetNSISString(scalestr,scale),
        GetNSISString(byte,LANG_BYTE)
    );

    my_SetDialogItemText(m_curwnd,dlgItem,g_tmp);
}

The problem is that free space needs to be rounded down, so this function needs either two versions or an additional parameter:


static void NSISCALL SetSizeText(int dlgItem, int prefix, unsigned kb, BOOL RoundUp)
{
    char scalestr[32], byte[32];
    unsigned sh=0;
    int scale=LANG_KILO;

    if (kb >= 1024*1024) { sh=20; scale=LANG_GIGA; }
    else if (kb >= 1024) { sh=10; scale=LANG_MEGA; }

    // required space: add just under 0.1 of the unit so truncation rounds up
    if (RoundUp) { kb+=(1<<sh)/10; }

    wsprintf(
        GetNSISString(g_tmp,prefix)+mystrlen(g_tmp),
        "%u.%u%s%s",
        kb>>sh,             // whole part
        ((kb*10)>>sh)%10,   // first digit after the decimal point
        GetNSISString(scalestr,scale),
        GetNSISString(byte,LANG_BYTE)
    );

    my_SetDialogItemText(m_curwnd,dlgItem,g_tmp);
}

If you don't want to increase the size of exehead, please at least increase the number of digits after the decimal point: replace "((kb*10)>>sh)%10" with "((kb*100)>>sh)%100". BTW, this construction in its current state fails with sizes > 409.5GB, because kb*10 no longer fits in 32 bits (with two digits after the point it fails after 40.95GB), so consider using a 64-bit number:

(UINT)((((UINT64)kb*100)>>sh)%100)

And you will probably save a few bytes by eliminating the "else" construction:


if (kb >= 1024) { sh=10; scale=LANG_MEGA; }
if (kb >= 1024*1024) { sh=20; scale=LANG_GIGA; }

May I hope that you will make the necessary changes yourself, or should I make them locally in every new version?


I'll take a look after 2.09.


Up ;)


Doctor, people ignore me :(


Doctor, people keep pushing me even though they know I only have the weekends to work on NSIS and those are normally used for other things than work :(

Not to be bitching or anything, I like working on NSIS. But come on... Did I say I'll look at it the second I finish releasing 2.09?


Ok. Sorry for the pressure.


roundUp added to CVS and will be in 2.10. Thanks.

BTW, this construction in its current state fails with sizes>409.5GB (with two digits after point it fails after 40.95GB)
Actually, it's limited at 4TB, because it's counted in kilobytes, not bytes.

Thanks for the changes (really!).

Actually, it's limited at 4TB, because it's counted in kilobytes, not bytes.
It is multiplied by 10 and then divided back, so the limit is 409.5GB.
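The 409.5GB figure can be checked with a small sketch. The helper names `fraction_32` and `fraction_64` are hypothetical; the expressions are the ones quoted earlier in the thread, with fixed-width types from `<stdint.h>` used so that the wrap-around is explicit.

```c
#include <stdint.h>

/* 32-bit version: kb*10 wraps around once kb exceeds 429496729 KB
   (about 409.6 GB), which corrupts the decimal digit. */
static unsigned fraction_32(uint32_t kb, unsigned sh)
{
    return (unsigned)(((uint32_t)(kb * 10u) >> sh) % 10u);
}

/* 64-bit version: the product is widened first, so it cannot wrap. */
static unsigned fraction_64(uint32_t kb, unsigned sh)
{
    return (unsigned)((((uint64_t)kb * 10u) >> sh) % 10u);
}
```

For kb = 429496729 both helpers agree, while for kb = 429496730 the 32-bit product wraps to 4 and the digit comes out as 0 instead of 6.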

P.S. Maybe making the "roundUp" parameter a BOOL instead of an "int" would be more consistent?

You're right. I was actually thinking of the total limit, while you were, quite obviously, talking about the corruption of the digit after the decimal dot.

64-bit integer is quite expensive, so I've trimmed the overflowing part of the number using a mask:

    (((kb & 0x00FFFFFF) * 10) >> sh) % 10, // 0x00FFFFFF mask is used to
                                           // prevent overflow that causes
                                           // bad results

I'm having second thoughts about the rounding up. It doesn't make any sense to round up just the required size and round down the available size. It causes a weird situation where if the available and required sizes are the same, they may not be displayed as such. Even more, the required size displays as bigger than the available size, yet the installation is allowed because they're equal. They should both be rounded using the same method. Why do you think the available size should be rounded down?
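The mismatch described here is easy to reproduce. The sketch below uses a hypothetical helper, `displayed_tenths`, which returns the displayed value in tenths of the chosen unit (15 means "1.5"). It assumes the bias `(1<<sh)/10` from the roundUp proposal earlier in the thread and is not the code that was committed.

```c
/* Hypothetical helper: displayed value in tenths of the chosen unit.
   round_up != 0 applies the bias from the roundUp proposal;
   round_up == 0 truncates, as for the available space. */
static unsigned displayed_tenths(unsigned kb, int round_up)
{
    unsigned sh = 0;

    if (kb >= 1024)        sh = 10;
    if (kb >= 1024 * 1024) sh = 20;

    if (round_up) kb += (1u << sh) / 10;   /* just under 0.1 of a unit */

    return (kb >> sh) * 10 + ((kb * 10) >> sh) % 10;
}
```

With 1617 KB both required and available, `displayed_tenths(1617, 1)` gives 16 ("1.6MB required") while `displayed_tenths(1617, 0)` gives 15 ("1.5MB available"), even though the two sizes are equal.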

I've replaced int with BOOL, but not for consistency. Most of the code in NSIS uses int. However, it does make sense to use BOOL.

The trick with the mask will not work :( For example, 0x1FFFFFF is decimal 33554431, but 0xFFFFFF is decimal 16777215. You will lose the real decimal digits even sooner than without the mask.

64-bit arithmetic in our case is not very expensive, because MUL with a 64-bit result and DIV with a 64-bit dividend are provided by the 32-bit x86 instruction set, and a 64-bit shift can be done with two instructions (SHRD and SHR). Indeed, most compilers (including your MSVC) will generate far from optimal code for 64-bit arithmetic, but if you have no objections, you can use inline assembly:


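unsigned fraction;
//---
__asm
{
    mov eax,kb
    mov ebx,10
    mov ecx,sh
    mul ebx            ; edx:eax = kb*10 (64-bit product)
    shrd eax,edx,cl    ; shift the 64-bit value right by sh: low half...
    shr edx,cl         ; ...then high half
    div ebx            ; edx = ((kb*10)>>sh) % 10
    mov fraction,edx
}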


As for the rounding direction... Overestimating the available space is as dangerous as underestimating the required space ;)

I can suggest another solution. Yes, different rounding is not ideal. The only method that is more or less acceptable in both cases is rounding to nearest, so we can drop this inconvenient "roundUp" parameter and adjust the bias:
kb+=(1<<(sh-1))/10;
or
kb+=(1<<sh)/20;

I think the second variant is more rational.
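A minimal sketch of the second variant, assuming the same unit selection as in the functions above. `nearest_tenths` is a hypothetical helper that returns the displayed value in tenths of the chosen unit. Half of a tenth is 1/20 of the unit, hence `(1<<sh)/20`. One practical difference between the two variants: with sh == 0 the first one would evaluate `1<<(sh-1)`, an out-of-range shift that is undefined in C, while the second simply yields a bias of 0.

```c
/* Hypothetical helper: displayed value in tenths of the chosen unit,
   rounded to nearest by adding half of a tenth before truncating. */
static unsigned nearest_tenths(unsigned kb)
{
    unsigned sh = 0;

    if (kb >= 1024)        sh = 10;
    if (kb >= 1024 * 1024) sh = 20;

    kb += (1u << sh) / 20;   /* 51 KB for MB, 52428 KB for GB, 0 for KB */

    return (kb >> sh) * 10 + ((kb * 10) >> sh) % 10;
}
```

With this bias 1617 KB (about 1.58 MB) is shown as 1.6, 1075 KB (about 1.0498 MB) as 1.0, and 1076 KB (about 1.0508 MB) as 1.1.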


The final version might look like this:

static void NSISCALL SetSizeText(int dlgItem, int prefix, unsigned kb)
{
    char scalestr[32], byte[32];
    unsigned sh=0;
    unsigned fraction;
    int scale=LANG_KILO;

    if (kb >= 1024) { sh=10; scale=LANG_MEGA; }
    if (kb >= 1024*1024) { sh=20; scale=LANG_GIGA; }

    kb+=(1<<sh)/20;

    __asm
    {
        mov eax,kb
        mov ebx,10
        mov ecx,sh
        mul ebx
        shrd eax,edx,cl
        shr edx,cl
        div ebx
        mov fraction,edx
    }

    wsprintf(
        GetNSISString(g_tmp,prefix)+mystrlen(g_tmp),
        "%u.%u%s%s",
        kb>>sh,fraction,
        GetNSISString(scalestr,scale),
        GetNSISString(byte,LANG_BYTE)
    );

    my_SetDialogItemText(m_curwnd,dlgItem,g_tmp);
}


P.S. Shouldn't we display two digits after the point when rounding to nearest?

0x1FFFFFF and 0xFFFFFF may not translate to the same number in decimal, but their modulo by a number that is a power of 2, as long as it's not larger than 0x00800000, is the same. And to get the digit after the decimal point, that's exactly what needs to be done. The bits that affect the calculation are in the mask. The attached example C file shows that calculating this using 64-bit numbers and the mask gives the exact same result.

The code the compiler creates for 64-bit arithmetic is expensive. If you compile the attached example, you'll see that it calls three functions called __allmul, __allshr and __allrem. The code size is over 3 times larger than the mask code.
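The attachment itself is not part of this archive. As a rough sketch of what such a comparison could look like (all function names here are hypothetical, and the sampling step is arbitrary), the masked 32-bit expression can be checked against a 64-bit reference. The reason they agree: for sh=20 the bits above the mask contribute a multiple of 160 after the multiply and shift, which vanishes modulo 10, and for the smaller units kb is below 2^20, so the mask changes nothing.

```c
#include <stdint.h>

/* Masked 32-bit expression as quoted in the thread. */
static unsigned fraction_masked(uint32_t kb, unsigned sh)
{
    return (unsigned)((((kb & 0x00FFFFFFu) * 10u) >> sh) % 10u);
}

/* 64-bit reference computation. */
static unsigned fraction_wide(uint32_t kb, unsigned sh)
{
    return (unsigned)((((uint64_t)kb * 10u) >> sh) % 10u);
}

/* Unit selection as in the SetSizeText versions posted above. */
static unsigned pick_shift(uint32_t kb)
{
    if (kb >= 1024u * 1024u) return 20;
    if (kb >= 1024u)         return 10;
    return 0;
}

/* Samples the whole 32-bit range and counts disagreements. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    uint64_t k;

    for (k = 0; k <= 0xFFFFFFFFull; k += 9973) {
        uint32_t kb = (uint32_t)k;
        unsigned sh = pick_shift(kb);
        if (fraction_masked(kb, sh) != fraction_wide(kb, sh))
            mismatches++;
    }
    return mismatches;
}
```

For the value from the objection above, 0x1FFFFFF, both helpers return 9 with sh=20.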

Inline assembly is out of the question. It will make the code even more complicated, it's not platform and compiler independent, and I'm not sure whether 64-bit MUL and DIV are available only on newer processors (though I haven't checked this one).

Rounding to the nearest is indeed the best solution. I'll take a look at it over the weekend. Two digits after the decimal point is too much in my opinion.


You are right - the mask trick is working. Forget about 64-bit calculations (at least for a few years ;)).

BTW, MUL with a double-width result and DIV with a double-width dividend go back to the 8086/8088 processors; the 32-bit forms (64-bit result and 64-bit dividend) have been available since the 80386.


New uniform rounding uploaded to CVS. Sadly, no mask tricks this time to reduce the size.


Great! NSIS is improving every day ;)

no mask tricks this time to reduce the size
What did you mean? The 0x00FFFFFF mask is still there.

That mask is still there, but I couldn't find any nice tricks for the rounding.


What can be simpler than a simple biasing?


I don't know, I couldn't find it.