Archive: VPatch v1.1 released!


VPatch v1.1 has been released:

http://www.tibed.net/vpatch

What's new:

- Edgewize's runtime is now included by default because of it's low overhead (only 4 Kb!)
- Greatly optimized the Patch Generator. It now loads the entire files into memory, which makes it 5 times faster. If you don't want to do this (because of extremely large files), you can still change the MemoryMode setting in the ini file to revert back to the 1.0 engine.

A GUI is coming soon...

VPatch is currently based on a routine which scans a file for similar blocks.
Ideas on how to generate patches are welcome! (mainly because Clickteam Patch Maker sometimes creates smaller, sometimes bigger patches). I want VPatch to be better than Patch Maker!

Especially EXEcutable files are better with Patch Maker.
If you have any ideas on how to create smaller/better patches, post them here!


It sounds like you may be doing this but I generated a patch maker for OS/2 (have c source) and it didn't look for differences but looked for matches for fragments of the file you wish to end up with. This worked quite well, to speed up lookup I had a number of techniques the main one I can remember off hand (I wrote it years ago) was keeping a table of link lists, each list pointing to matching strings in the original file so if a string began with '0003' it would check all strings identified in the 3rd linked list. I had a lot of different "op codes" to handle short/long as well as near and far strings etc.


Yes you are right, searching for fragments of the target file is better, and VPatch does that.
However, size of these fragments determines the compression - and making fragments too small (currently 256 bytes) will make things quite slow.
I'm not really sure what you mean by link lists, but I'm sure it's something similar to what I use.
Perhaps making fragments smaller will help. Do you know what kind of fragment size you used?
Currently I'm using 256 bytes as standard size because this gives good performance on my PII 233 - but since most people have faster machines nowadays, perhaps I should make it a lot smaller.


My smallest fragment was one byte past the smallest number of codes that I could represent it with, which with the best opcodes was 5 or so bytes!

If I was looking for a fragment that began with "12345678"x then I would take the first 2 bytes "1234"x and look up the table (entry '1234'x) which pointed to a linked list of source fragments which started with '1234'x.

Now a better way might be to pick the first byte and one or more bytes futher into the fragement to do a similar thing, so for example if your were insisting on 256 byte fragments, you might pick the first byte and the middle byte. Of course all this is just to improve performance.
Note that I wouldn't base my design on what a 233 can do, these won't be around much longer (and I had a 200 when I was writing my program)...

While relatively fast to build the patch (given sufficient ram) the actual application of a patch was fast.

With small patches like I was using I found I could do a good job even basing my patch on completely dissimilar/unrelated source files!


The actual minimum for fragment size is 16 (or maybe 12). But I never tried setting it this low - with the 1.0 version it took ages, but perhaps with this faster one it'll work.


With a number of OP codes you can handle "near" and different length fragments (ie < 64, < 128, <256 - ie combine opcode with counter, same for repeat codes) etc and don't forget the fragments available EARLIER in the file being built. With My table lookup the program was pretty fast. It did of course always look for the longest fragment


VPatch scans the entire (source) file for fragments in the target files, and it searches for multiple fragments and then chooses the biggest one (longest fragment).
I did some analysis, and the routine which takes up most time is the scanning routine (I have a block of say, 32 bytes, and I need to find all occurences in a 1 MB memory array). It takes up 90% of the time.

I have multiple OP codes (I always called them headers). 1, 2 and 3: copy from source offset, Size block


1 (1 byte) sourceoffset (long, 4 bytes) size (1 byte)
2 (1 byte) sourceoffset (long, 4 bytes) size (2 bytes)
3 (1 byte) sourceoffset (long, 4 bytes) size (4 bytes)

And I have the same for copy (which are 5-6-7) and consist of ID byte and Size to follow.

Which gives the following header size totals:
1=6
2=7
3=9
5=5
6=6
7=8

I know this wastes a few bits, but this allows for future extension.

What kind of lookup table do you use? I just get the first 4 bytes of the block-to-find, and check every 4 bytes of the target files if they match (increasing by 1 each time). If so, it uses the Delphi command CompareMem() to check if the blocks actually match (CompareMem is an Assembler routine of Delphi).
How could you do this more efficient? (I know... write it in assembler).


This may sound a little complex, but its how my (still unfinished) personal patching program will work.

It generates CRC16s (unsigned short integer checksums) of consecutive non-overlapping 64-byte chunks of data in the original file. These 2-byte CRCs are used as indexes into a hash array of linked lists.

Each linked list item has an offset (the position in the original file where the 64-byte block that had this CRC can be found) and a pointer to the next item.

To scan for fragments, the code generates a CRC16 for the first 64 bytes of the file. It then looks up the offsets and compares them to those found in the hash array; if there is a match, it computes the length of any of the possible matches in the linked list and picks the longest.

If it can't find a match, it advances by one byte and computes a new CRC16. Lather, rinse, repeat.

This is not quite an optimal approach but it very, very good. It is not fast by any means, but it can be speeded up by increasing the block size from 64 to 128 or any other amount. It can be made to perform 'perfect' patches by reducing the block size to 2, but it will then take hours (if not days) to complete even a medium-sized patch.


To Koen van de Sande:

The look up is as described ealier I know where ther first two bytes (and I think I did other "magic") of every string is. The patch process on a P200 took very little time even on 100K or more files (it did however need a lot of RAM to do this - mind you not a lot of RAM compared with what is no typical in PCs).


To Edgewize:

This is a way to do it. VPatch does something similar, but with the actual data of the fragments (and I increase not by 1 byte, but by BlockSize/2).
In theory, BlockSize=2 gives the smallest patches, but you have to watch the overhead of your opcodes. VPatch's biggest opcode is 9 bytes long, so BlockSize=8 would be the bare minimum (because 8/9 are for large blocks).

General:

What kind of file sizes should the patch generator be able to handle? VPatch can handle files up to RAM size (or 2GB if you use the old MemoryMode=0 thingy).

To dbareis:

I think these lookup tables are not suitable for files larger than 1 MB (because for every 2 bytes you need to store the offset in the look-up table->1000000*4=4MB of memory for the lookup table of a 1 MB file).
Wait a minute, 4 MB isn't so bad... Perhaps I think about lookup tables :)


hi koen,

could you add a switch so that GenPat.exe/LNK.exe closes after the patchfile creation, or make it close by default and add a /PAUSE switch like nsis has?

at the moment WaitForSingleObject doesn't work correctly and i can't get the console outputs in the GUI i wrote ;)

cu yzo


Yes, such an option is needed. It's just that in Delphi, the console automatically closes after program execution finishes, so I added the press key thing. I'll make sure it's deactivated by default in the next version.

On what kind of GUI are you working? For NSIS or VPatch or both?


hi,

Originally posted by Koen van de Sande
Yes, such an option is needed. It's just that in Delphi, the console automatically closes after program execution finishes, so I added the press key thing. I'll make sure it's deactivated by default in the next version.
so you could do it like this before finishing:

if ParamCount > 0 then
for p:=1 to ParamCount do
if ParamStr[p] = UpperCase('/PAUSE')
then
begin
Writeln('Press ENTER to Exit');
Readln;
Break;
end;

Originally posted by Koen van de Sande
On what kind of GUI are you working? For NSIS or VPatch or both?
well, powerpimpit for pimp/nsis is almost done and today i found your vpatch and thought i make a quick 'n dirty gui for it ;)

cu yzo

I just found in the help there also is a command called FindCmdLineStr which does the same.

So here's a new (and faster again) version:

VPatch v1.11
============

What's new:

- Command line options added (such as /WAIT, /DEBUG, and all the ini settings can now be changed on the command line)

- Optimizations to the patch generator (a bit faster)

- Added example of usage in NSIS in file Test.nsi

- Changed the default settings another time

- Increased maximum number of blocks to 10,000 from 1,000.

Which leads me to another question: What filesize do you intend to use with NSIS? 100 KB? 1 MB? 10 MB?
What should be the normal filesize (aka the size for which everything is optimized)?

Has anyone got suggestions for a 'standard' set of patch files, in order to compare performance on different machines?


When can a GUI be expected?


yazno is working on something. And I just don't have time at the moment - until the end of Mai I don't have the time to create a GUI.

So I guess you'll have to wait.


hi,

i am close to finish it, so stay tuned ;)

cu yzo


Mmm, I didn't include a link with the v1.11 release. So now's the time to test out my new download start script:

http://www.tibed.net/download?id=vpatch

This link should automatically redirect you to the latest version. I like it myself :)


Didn't know
yazno, I didn't know you'd already released a GUI 2 days ago!
Cool, I'm downloading it now... stay tuned.


hi,

ups sorry i forgot to tell you :)

well it's just basic funtionality and coded straight away. not that beauty and undocumented.
if you want i send you the sources for further development.

cu yzo



Sure, then I have something to start with :)
My e-mail is koen@tibed.net
It's nice (though there is a small thing with the Options, the Maximum setting is for large files only, but hey, it didn't say that in the ini :)