It just occurred to me that when I have code like this:

type
  TFooBar = class
    s: string;
  end;

procedure Main;
var
  x, y: TFooBar;
begin
  x := TFooBar.Create;
  y := TFooBar.Create;

  x.s := 'True';
  y.s := 'True';

  Assert(Pointer(x.s) = Pointer(y.s));
end;

begin
  Main;
end.

The assertion fails because whenever you assign a string literal to a string variable, the RTL checks whether the source is a const (refcount = -1) and, if so, calls _NewUnicodeString. I guess this has already been discussed somewhere else, but I cannot find anything regarding my question:

I would have guessed that even for a const it could simply assign that reference to s and leave the refcount untouched. That should work fine with the copy-on-write (COW) mechanics.

But as it stands, this means that every time code runs through such a const string assignment, a new string gets allocated, no?
In my case I am using the const string 'True' for nullables in Spring to set the "HasValue" flag (otherwise the field is empty). However, using a const there creates a new string every time. So if you have 10 nullable values, you have 10 string instances with the content "True". The only solution I have found so far is to turn the const into a variable (this works here because it's private) and assign 'True' to it in the initialization section (or, alternatively, in the class constructor). That way only one string instance with the content 'True' exists for all the nullables.
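A minimal sketch of that workaround, to make the idea concrete. The unit name NullableFlag, the variable SharedTrue and the function HasValueMarker are made up for illustration and are not taken from Spring:

```pascal
unit NullableFlag; // hypothetical unit name, for illustration only

interface

// Hands out the single shared 'True' string instance.
function HasValueMarker: string;

implementation

var
  // Declared in the implementation section, so no other unit can overwrite it.
  SharedTrue: string;

function HasValueMarker: string;
begin
  // The source is an ordinary heap string (refcount > 0), so this assignment
  // only shares the reference instead of allocating a new string.
  Result := SharedTrue;
end;

initialization
  // The literal is copied to the heap once, here, and never again.
  SharedTrue := 'True';

end.
```

Every caller that stores the result of HasValueMarker then shares one buffer, at the price of the value being an ordinary variable rather than a compiler-protected const.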

Any other solution I am missing?

Edit: I found this SO question:
https://stackoverflow.com/questions/12837129/string-const-why-different-implementation-for-local-and-result so I changed the original example to use objects and non-global variables.

In my case I know that the module the literal comes from will not be unloaded before anyone that is using it. That makes me wonder whether we need some kind of const string with a starting refcount of 1. Since it's a const you could not modify it (which, with a refcount of 1, would modify the const in place), but when assigning it somewhere the same reference would be used and no new strings would be produced. I guess I am missing some corner cases that make this impossible -.- So I will probably use the hidden string variable to get the same result.
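For anyone who wants to look at the refcounts involved, here is a small console sketch. It assumes a Unicode Delphi version where System.StringRefCount is available; the program name RefCountDemo and the variable SharedTrue are made up for illustration. No output is listed because it may vary between compiler and RTL versions.

```pascal
program RefCountDemo; // hypothetical program name

{$APPTYPE CONSOLE}

type
  TFooBar = class
    s: string;
  end;

var
  SharedTrue: string; // stands in for the hidden string variable
  x, y: TFooBar;
begin
  SharedTrue := 'True';
  x := TFooBar.Create;
  y := TFooBar.Create;
  try
    // Literal assigned directly to a field, as in the example at the top.
    x.s := 'True';
    Writeln('field assigned from literal: ', StringRefCount(x.s));

    // Both fields assigned from the same variable.
    x.s := SharedTrue;
    y.s := SharedTrue;
    Writeln('shared variable:             ', StringRefCount(SharedTrue));
    Writeln('fields share one buffer:     ', Pointer(x.s) = Pointer(y.s));
  finally
    y.Free;
    x.Free;
  end;
end.
```

If the RTL behaves as described above, the field assigned from the literal should report a refcount of its own rather than -1, while the two fields assigned from SharedTrue should point at the same buffer.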

Comments

  1. type
      TFooBar = class
      private
        class constructor Create;
      public
        class var sTrue: string;
      var
        s: string;
      end;

    class constructor TFooBar.Create;
    begin
      sTrue := 'True';
    end;

    procedure Main;
    var
      x, y: TFooBar;
    begin
      x := TFooBar.Create;
      y := TFooBar.Create;

      x.s := TFooBar.sTrue;
      y.s := TFooBar.sTrue;

      Assert(Pointer(x.s) = Pointer(y.s));
    end;

    begin
      Main;
    end.

  2. Alexander Brazda Well, thanks for proposing what I already mentioned in my post...

  3. I think it is an implementation flaw; it should work with consts without copying, because of COW. We had a similar problem and fixed it by using pointers to the const strings, but it has a smell...

  4. Thomas Mueller I know that post, what exactly is your point? That I could use an interface instead of a string? If the comments had not been shredded while migrating to the new community page, you could see that Barry Kelly suggested the other way around at the time. Also, if you paid attention, the problem is not specific to the Spring nullable implementation but applies to every place where you assign a string literal to something, creating duplicate strings all over the place for no reason.

    Yes, the string interning technique was also discussed by Eric Grange on his blog a while ago, but that applies more to scenarios where you are parsing code and producing identical strings via copy or concatenation.

    Friedrich Westermann A solution for an environment where the string stays internal and is never passed out or assigned somewhere else, but it falls apart as soon as you assign that refcount -1 string to some other string, because then it makes a copy again. :(

    I wish there was some "i don't unload modules so don't make new strings" option somewhere in the RTL. Especially since usually Delphi applications are monolithic.

  5. I think the comments are still there, but malformed by the CSS. Comments are only visible when logged on, so I've extracted the text of the post and comments to gist.github.com - post.txt

  6. It is commented in the RTL source code: if the const is defined in a library or package, and the library is unloaded, accessing the text buffer from outside the library will raise an access violation. So a copy is made. This is why, in mORMot, I always define a var and not a const, and fill the var value at runtime.

  7. Jeroen Wiert Pluimers Ah, it was Hallvard - thanks.

    A. Bouchez Yes, I saw that from your SO answer. And as commented there this is the safe way in all cases even if it is not needed (monolithic exe or no dynamic unloading of modules).

    So what we would need is basically a way to declare a string const with refcount = 1, no? Sure, a var works, but it is not write-protected.

  8. Let's imagine some new syntax (just for the sake of the example):

    const var foo = 'True';

    This creates the constant string 'True' with its -1 refcount in the code segment and then, at the start of the application, assigns that const to the foo var. The compiler protects this variable as much as it does a regular const. Yes, if you apply some pointer magic hacking you can write to it, but that would not be new.

    This would be kinda similar to other languages that have the concept of write-once variables that get their value when declared (of course this only works well when you can declare variables inline; the way Delphi/Pascal does it would not work so great).

  9. Attila Kovacs I was referring to the concept of write-once variables, not how that is solved on the binary level. I leave that to people who studied compiler engineering. ;)

    Anyhow, it's the compiler that needs to protect the variable, not the fact that it lives in read-only memory at runtime. I think we have that already with typed consts, which technically are just variables protected by the compiler (you can easily write to those with some pointer hacking).

  10. Can you check with XE2 or below? Does it clone the literals as well?
    XE2 has a habit of destroying constants of the "array of string literals" type in some scenarios. I did not bother to compare with later RTLs, though.
    Enforcing the cloning of the string might be one fix for it.

