Archive: Copying single multibyte characters with StrCpy


I have this function that takes a directory path and tries to figure out the topmost directory that has to be created.

For example, if the user wants to install the app to:
C:\Program Files\ABC\123

and ABC doesn't exist in Program Files, the function will return C:\Program Files\ABC. That makes it useful for deleting, on uninstall, all the directories that you've created.

The function looks like:

; FindTopLevelDirectory
;
; input: top of stack, the directory path (e.g. $INSTDIR)
; output: top of stack, the top-most parent of that path that does not
; currently exist (e.g. C:\Program Files\ABC)
; uses $R2-$R5 as scratch registers without saving them.
;
; Usage:
; Push $INSTDIR
; Call FindTopLevelDirectory
; Pop $R0
; ; at this point $R0 will equal "C:\Program Files\NonexistentDirectory"
Function FindTopLevelDirectory
; $R2 directory path to installation
; $R3 counter
; $R4 current character
; $R5 temp string
Pop $R2 ; the original string
StrCpy $R3 "-1"
StrCpy $R5 ""
Loop:
IntOp $R3 $R3 + 1
StrCpy $R4 $R2 1 $R3
StrCmp $R4 "" ExitLoop
StrCmp $R4 "\" Directory
StrCpy $R5 $R5$R4
Goto Loop
Directory:
IfFileExists "$R5\*.*" "" ExitLoop ; stop at the first parent that does not exist
StrCpy $R5 "$R5\"
Goto Loop
ExitLoop:
; note: nothing is pushed when the whole path already exists
IfFileExists "$R5\*.*" +2
Push $R5
FunctionEnd

The problem is when the path has Japanese characters, the statement

StrCpy $R4 $R2 1 $R3 

returns an invalid character, which is expected, because it's a multibyte character. I can sometimes detect this by doing the following instead:


StrCpy $R4 $R2 1 $R3
StrCmp $R4 "" "" Check
; try 2 bytes in case it's a multibyte folder.
StrCpy $R4 $R2 2 $R3
StrCmp $R4 "" ExitLoop
Check:
StrCmp $R4 "\" Directory

But sometimes copying 1 character returns a valid character, even though it's part of a multibyte character, and I get what is commonly known as mojibake.

Does anyone have any ideas of a better way to copy one character at a time that handles multibyte characters? Or is there a better way to iterate over the parent directories of a path?

I'm using NSIS 2.01b if that helps.

Use GetParent to get the C:\Program Files\ABC bit, then use GetFileName to get hold of ABC.

All functions are on the NSIS Archive

-Stu


That has the same problem.


StrCpy $R2 $R0 1 $R1 


Have you tried getting the latest CVS? Maybe new support for these characters has been added.
I know that it will be a lot of work converting to the new NSIS2b4 format, but it's worth it.

-Stu


The latest CVS version won't help with this. But it does contain an example of what you're trying to do (folder stuff, not MBCS). You can find this example online, here.

To get one "real" character you should use CharNext with System.dll.

BTW, why are you comparing it to a quoted space?


With the GetParent and GetFileName functions, just blank out the lines that compare the variable to "" that come below the single character separation.
Then it won't matter whether the character is invalid ("") or not.

(All you want to look for is "\", so there is no need to check for "", which is usually the output you get at the end of a string)

-Stu


I'm not comparing anything to a quoted space. It's an empty string. "" :)


So if I blank those lines, how does it know if it's reached the end of the string?


Haha, I'm going blind :)

Anyway, the only way you'd know what you're up against is by using CharNext or IsDBCSLeadByte. But you don't need either, just use the script I have attached, it's much more efficient.


Right, I see why that solution would work. There's a problem with it however, if the user had created an empty directory previously, then installed the application into that directory, the uninstall would delete the directory the user created beforehand even though it wasn't created by the application's installer.

This may or may not be the wish of the user. The safer route is to delete only the directories that the installer creates.


Originally posted by tderouin
So if I blank those lines, how does it know if it's reached the end of the string?
As long as the string has "\" in it then it won't need to go to the end of the string :p

-Stu

$INSTDIR isn't terminated with a \ character.


?
If $INSTDIR is e.g. C:\quake2\dday, then it will see the "\" before "dday".

$INSTDIR is a run-time variable, so changes to whatever the user has entered.

-Stu


Right.
What about this example: C:\Program Files\ABC (C:\Program Files exists, ABC does not)

It checks to C:\Program Files\ finds it exists.
Keeps copying over characters. Copies A, then B, then C, then copies "" not "\" and just loops infinitely because we're not checking for the end of a string that's not terminated by "\".

Doesn't work, does it?


I don't understand why you think it won't work.
It checks for "\", then when it is found (it will be) it gets straight out of the loop, gets ABC from the string, then exits the function.
There is no way it will loop unless your string has no "\" in it.

-Stu


I know it won't work because I tested it. Take a look at the loop:


Loop:
IntOp $R3 $R3 + 1
StrCpy $R4 $R2 1 $R3
StrCmp $R4 "" ExitLoop
StrCmp $R4 "\" Directory
StrCpy $R5 $R5$R4
Goto Loop

It loops copying over a character at a time. When it has copied over the final C, it has nothing left to copy, so $R4 is "" on the next iteration. Since the string is not terminated by "\", it continues to loop, and $R4 continues to be "". Taking that statement out will cause it to loop infinitely if the string is not terminated by a backslash, guaranteed.

I am wondering now why you need the function...
Because, the $INSTDIR is created automatically anyway.

If $INSTDIR is C:\progra~1\abc\123 then subdirs abc and 123 are created anyway.

-Stu


Code is wrong.


StrCpy $R3 0
Loop:
IntOp $R3 $R3 - 1
StrCpy $R4 $R2 1 $R3
StrCmp $R4 "" ExitLoop
StrCmp $R4 "\" Directory
StrCpy $R5 $R5$R4
Goto Loop


-Stu

Right. But if you read my previous posts you'll see I'm worried about the uninstall process.

RMDIR $INSTDIR

only removes C:\progra~1\abc\123 and not C:\progra~1\abc.


There's a problem with what you want too...
What happens if the $INSTDIR is just C:\progra~1\abc?
abc will be removed, then you will remove the next directory down which is C:\progra~1!!!

-Stu


Are you that familiar with the NSIS code?

RMDIR removes the directory only if it's empty.

RMDIR /R removes the directory recursively.

In the previous code C:\Program Files\abc would be identified as the top level directory and stored in the registry. That would be the topmost directory that would be removed, not C:\Program Files. I'm not sure where you got the idea that the code would attempt to remove that directory.


Sorry, missed that.

Ok. I will write a small function for this because I think that it would be useful.

-Stu


There is a very simple solution to avoid comparing to "" and it's to simply append a back-slash to the end of the string.
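
A minimal sketch of that idea, assuming a helper named AppendBackslash (a hypothetical name, not a function from the NSIS Archive). It only appends the backslash when the path does not already end with one. The last-character test is itself byte-based, so on a multibyte code page it may be subject to the same caveat discussed earlier in this thread.

```nsis
; Hypothetical helper: returns the input path with exactly one trailing "\".
; input:  top of stack (path)
; output: top of stack (path ending in "\")
Function AppendBackslash
  Exch $R0              ; $R0 = input path, old $R0 saved on the stack
  Push $R1
  StrCpy $R1 $R0 1 -1   ; $R1 = last character of the path
  StrCmp $R1 "\" +2     ; already terminated: skip the append
  StrCpy $R0 "$R0\"     ; append the terminating backslash
  Pop $R1
  Exch $R0              ; result on top of stack, $R0 restored
FunctionEnd
```

Usage would look like `Push $INSTDIR`, `Call AppendBackslash`, `Pop $R2` before entering the character loop, so the loop always meets a "\" before it runs off the end of the string.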

As for deleting empty directories the user has created, unless you want to create the directories manually at install time and save in a log each directory created, that's what you've got.
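
A rough sketch of what such logging could look like, assuming the registry is used as the log. The macro name CreateLoggedDir and the value names CreatedDir1 and CreatedDir2 are made up for illustration, and REGISTRY_LOC is assumed to be a define provided by the script. This is a fragment under those assumptions, not a complete installer.

```nsis
; Sketch only. CreateLoggedDir and the CreatedDir1/CreatedDir2 value names
; are hypothetical; REGISTRY_LOC is assumed to be defined by the script.
!macro CreateLoggedDir DIR VALUE
  IfFileExists "${DIR}\*.*" +3      ; already exists: skip the next two instructions
  CreateDirectory "${DIR}"
  WriteRegStr HKCU "${REGISTRY_LOC}" "${VALUE}" "${DIR}"
!macroend

Section "Install"
  ; create parents first, so each level is logged separately
  !insertmacro CreateLoggedDir "$PROGRAMFILES\ABC" "CreatedDir1"
  !insertmacro CreateLoggedDir "$PROGRAMFILES\ABC\123" "CreatedDir2"
SectionEnd

Section "Uninstall"
  ; remove children first; RMDir without /r only removes empty directories
  ReadRegStr $R0 HKCU "${REGISTRY_LOC}" "CreatedDir2"
  StrCmp $R0 "" +2                  ; nothing logged: skip RMDir
  RMDir "$R0"
  ReadRegStr $R0 HKCU "${REGISTRY_LOC}" "CreatedDir1"
  StrCmp $R0 "" +2
  RMDir "$R0"
SectionEnd
```

Because a directory is only logged when the installer had to create it, a directory the user made beforehand is never recorded and so never removed.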


Try this out.

Usage:
Push 2 ;check 2 last dirs
Push $INSTDIR
Call RemoveDirs


Function RemoveDirs
Exch $R0 ;input string
Exch
Exch $R1 ;maximum number of dirs to check for
Push $R2
Push $R3
Push $R4
Push $R5
IfFileExists "$R0\*.*" 0 +2
RMDir "$R0"
StrCpy $R5 0
top:
StrCpy $R2 0
StrLen $R4 $R0
loop:
IntOp $R2 $R2 + 1
StrCpy $R3 $R0 1 -$R2
StrCmp $R2 $R4 exit
StrCmp $R3 "\" 0 loop
StrCpy $R0 $R0 -$R2
IfFileExists "$R0\*.*" 0 +2
RMDir "$R0"
IntOp $R5 $R5 + 1
StrCmp $R5 $R1 exit
Goto top
exit:
Pop $R5
Pop $R4
Pop $R3
Pop $R2
Pop $R1
Pop $R0
FunctionEnd


-Stu

Just tested it, and changed 1 thing - works fine.
Re-copy the script.
I will put this on the archive.

-Stu


How are you calling it?


http://nsis.sourceforge.net/archive/...ances=0,11,211

-Stu


That looks good. I'll give it a try.

However, how do you know how many directories are created? Would it involve iterating over $INSTDIR in SecCopyUI and creating each directory piecemeal?

Then we'd probably have to copy a character at a time, right? :) Which takes us back to the original problem. I have yet to try out CharNext, I'll post my findings when I am finished.

Thanks!


Look at my script.
It bypasses the problem that we had before.

It compares the string length with the chop number (the negative offset).
If the string length and the chop number are equal, the scan has covered the whole string.
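
As a small worked trace of the two StrCpy forms the script relies on (a negative start offset counts from the end of the string, and a negative length chops characters off the end), here is a sketch with a made-up path and a hypothetical function name:

```nsis
; Sketch: how the scan walks back from the end of a path.
Function NegativeOffsetDemo   ; hypothetical name, for illustration only
  StrCpy $R0 "C:\abc\123"
  StrLen $R4 $R0          ; $R4 = 10
  StrCpy $R3 $R0 1 -4     ; $R3 = "\"      (one character, 4 from the end)
  StrCpy $R0 $R0 -4       ; $R0 = "C:\abc" (last 4 characters chopped off)
FunctionEnd
```

Once the chop number reaches the value from StrLen, there is nothing left to scan, which is the exit condition used instead of comparing against "".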

If you want to find out how many subdirectories there are, use this:


Function SubdirsAmount
Exch $R0 ;input string
Push $R1
Push $R2
Push $R3
Push $R4
StrCpy $R1 0
StrCpy $R4 0 ; the backslash counter must start at zero
StrLen $R2 $R0
loop:
IntOp $R1 $R1 + 1
StrCpy $R3 $R0 1 -$R1
StrCmp $R1 $R2 exit
StrCmp $R3 "\" 0 loop
IntOp $R4 $R4 + 1
Goto loop
exit:
StrCpy $R0 $R4
Pop $R4
Pop $R3
Pop $R2
Pop $R1
Exch $R0 ;output
FunctionEnd


-Stu

So your code should look like this:


Push $INSTDIR
Call SubdirsAmount
Pop $R0

Push $R0
Push $INSTDIR
Call UninstallDirs


-Stu

tderouin, do the MBCS characters come right if you copy byte by byte? Afrow UK's code should help you test that.


I'm just getting around to testing this now. SubdirsAmount doesn't actually record how many directories are going to be created. It should probably be changed to check whether each directory exists: if it doesn't exist, increment the number of subdirs; if it does exist, don't increment but continue copying.
I'll play with it and post new code if I find a better way.

Ex:
SubdirsAmount C:\Documents and Settings\user\Start Menu\Programs\Product

returns 6

but only one dir is being created, so in the uninstall log you see:

rmdir C:\Documents and Settings\user\Start Menu\Programs\Product
rmdir C:\Documents and Settings\user\Start Menu\Programs
rmdir C:\Documents and Settings\user\Start Menu
rmdir C:\Documents and Settings\user
rmdir C:\Documents and Settings
rmdir C:

Which wouldn't look too pleasant to the user.


I found this was a good way of finding the number of subdirs that had to be created:


Function SubdirsAmount
Exch $R0 ;input string
Push $R1
Push $R2
Push $R3
Push $R4
Push $R5
StrCpy $R1 0
StrCpy $R4 0 ; number of directories that do not exist yet
StrLen $R2 $R0
loop:
IntOp $R1 $R1 + 1
StrCpy $R5 $R0 -$R1 ; path with the last $R1 characters chopped off
StrCpy $R3 $R0 1 -$R1
StrCmp $R1 $R2 exit
StrCmp $R3 "\" 0 loop
IfFileExists "$R5\*.*" loop ; don't increment if the directory exists
IntOp $R4 $R4 + 1
Goto loop
exit:
StrCpy $R0 $R4
Pop $R5
Pop $R4
Pop $R3
Pop $R2
Pop $R1
Exch $R0 ;output
FunctionEnd

So when installing I would do:

Push "$INSTDIR"
Call SubdirsAmount
Pop $R0
WriteRegStr HKCU "${REGISTRY_LOC}" "UninstallTopLevelFolder" "$R0"

When uninstalling:

; remove folders
ReadRegStr $R0 HKCU "${REGISTRY_LOC}" "UninstallTopLevelFolder"
Push $R0
Push "$INSTDIR"
Call un.UninstallDirs

What about the MBCS characters? Do they come out right?


Yes, this seems to work for multibyte characters.


I have updated the GetParent function, so it should support MBCS now. Can you please test this one? Thanks :)

Function GetParent
Exch $R0 ; old $R0 is on top of stack
Push $R1
Push $R2
Push $R3
StrCpy $R1 0 ; offset from the end of the string, starts at zero
StrLen $R3 $R0
loop:
IntOp $R1 $R1 - 1
IntCmp $R1 -$R3 exit exit
StrCpy $R2 $R0 1 $R1
StrCmp $R2 "\" exit
Goto loop
exit:
StrCpy $R0 $R0 $R1
Pop $R3
Pop $R2
Pop $R1
Exch $R0 ; put $R0 on top of stack, restore $R0 to original value
FunctionEnd