Delphi Developers Archive

- December 23, 2015

Just one question. When TWriter.WriteString chooses the vaString prefix, is it basically saving the string as a ShortString? If so, there is a potentially massive optimization in TReader.ReadString for the vaString case: bypass TEncoding and read the bytes directly into a ShortString.

Comments

Uwe Schuster, December 23, 2015 at 11:24 AM
If you want to patch it for non-NEXTGEN compilers in your local copy, fine: read it into an AnsiString and see how much impact this micro-optimization has. If you are thinking about a QP report, it is unlikely the overlords will ever act on it.

David Berneda, December 23, 2015 at 11:42 AM
I've done a test without patching: check that TReader.NextValue = vaString, then read directly into a ShortString. I'm writing arrays of a great many strings, and with this trick one test that takes 21 seconds drops to approximately 9.
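A minimal sketch of what such a user-code trick might look like, assuming a desktop (non-NEXTGEN) compiler where ShortString is available. The name ReadStringFast is hypothetical and not part of the RTL; only TReader.NextValue, ReadValue, Read and ReadString are existing calls. The sketch also assumes the stored bytes are plain ASCII, which later comments in this thread show is not guaranteed.

```delphi
program VaStringDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper illustrating the trick; not an RTL routine.
function ReadStringFast(Reader: TReader): string;
var
  S: ShortString;
begin
  if Reader.NextValue = vaString then
  begin
    Reader.ReadValue;                 // consume the vaString type byte
    Reader.Read(S[0], 1);             // the length byte lands in S[0]
    if S[0] <> #0 then
      Reader.Read(S[1], Ord(S[0]));   // then the character bytes
    Result := string(S);              // plain ANSI conversion, no TEncoding call
  end
  else
    Result := Reader.ReadString;      // any other prefix: stock code path
end;

var
  Stream: TMemoryStream;
  Writer: TWriter;
  Reader: TReader;
begin
  Stream := TMemoryStream.Create;
  try
    Writer := TWriter.Create(Stream, 4096);
    try
      Writer.WriteString('hello');
    finally
      Writer.Free;                    // freeing the writer flushes its buffer
    end;
    Stream.Position := 0;
    Reader := TReader.Create(Stream, 4096);
    try
      Writeln(ReadStringFast(Reader));
    finally
      Reader.Free;
    end;
  finally
    Stream.Free;
  end;
end.
```

Because the helper falls back to TReader.ReadString for every prefix other than vaString, it should round-trip the test string regardless of which prefix the writer picked.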

David Berneda, December 23, 2015 at 11:43 AM
So as long as there is a way to do it in user code, I'm fine with it. My remaining doubt is whether it's really safe to assume the content is always a plain ShortString.

Uwe Schuster, December 23, 2015 at 12:09 PM
vaString is one length byte followed by an 8-bit string of that many bytes. That layout is the same as for the ShortString type. Did you check whether there is a performance difference between using ShortString and AnsiString? If there is no significant difference, the same code can be used for vaString and vaLString, except for reading the length at the beginning.
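The shared code path suggested here could be sketched roughly as follows. This is an illustration under assumptions, not the RTL implementation: the unit name RawStringReader and the function ReadRawAnsi are hypothetical, the 4-byte length for vaLString is assumed to be a native Integer as written by TWriter, and AnsiString is assumed to be available (desktop compilers).

```delphi
unit RawStringReader;

interface

uses
  System.Classes;

// Hypothetical helper: reads a vaString or vaLString value as raw 8-bit data.
function ReadRawAnsi(Reader: TReader): AnsiString;

implementation

function ReadRawAnsi(Reader: TReader): AnsiString;
var
  ShortLen: Byte;
  Len: Integer;
begin
  case Reader.ReadValue of                       // consumes the type prefix byte
    vaString:
      begin
        Reader.Read(ShortLen, SizeOf(ShortLen)); // 1-byte length
        Len := ShortLen;
      end;
    vaLString:
      Reader.Read(Len, SizeOf(Len));             // 4-byte length (assumed Integer)
  else
    raise EReadError.Create('vaString or vaLString expected');
  end;
  // From here on both cases are identical: Len raw bytes follow.
  SetLength(Result, Len);
  if Len > 0 then
    Reader.Read(Result[1], Len);
end;

end.
```

Only the length read differs between the two prefixes; the body copy is the same, which is the point of the comment above. No conversion to a Unicode string happens here, so the caller still has to decide how to interpret the bytes.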

David Berneda, December 23, 2015 at 12:14 PM
Perfect, I'll test this, hopefully tomorrow, and post a test/bench project.

David Berneda, December 24, 2015 at 1:53 AM
Test project: https://drive.google.com/open?id=0BymV3q6di65nTm5UQ0JYUllwcWc

Very similar results with 32-bit and 64-bit. TReader.ReadString = 2.1 seconds; with the trick = 0.7 seconds.

Uwe Schuster, December 24, 2015 at 3:00 AM
I do see a similar improvement: 6.6 seconds vs. 1.9 seconds. With AnsiString it is 2.2 seconds. IDE Fix Pack does not touch TReader.ReadStr; I'll check whether there is a noticeable improvement there, and if so, it should find its way into IDE Fix Pack 6.

David Berneda, December 24, 2015 at 3:33 AM
Perfect! Thanks for testing!

Uwe Schuster, December 28, 2015 at 6:56 AM
Unfortunately, the UTF-8 decoding in ReadStr is necessary. I had property names with non-ASCII characters in mind, but mixed that up with vaIdent / TReader.ReadIdent, which has UTF-8 decoding. I have done some tests and found that the UTF-8 decoding is necessary in ReadStr. However, even with a UTF-8 check ("for I := 1 to L do if Ord(S[I]) > 127 then...") ObjectBinaryToText is about 10 percent faster with ShortString, and there is no significant difference between the versions with and without the check. Results for ObjectBinaryToText over about 950 DFM files in binary format: original 2100 ms, ShortString 1901 ms, ShortString with UTF-8 check 1920 ms.
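The check quoted above might be fleshed out along these lines. This is a sketch under assumptions rather than the code that was actually benchmarked: the unit name ShortStrUtf8 and the function ShortToUnicode are hypothetical, a desktop compiler with 1-based strings is assumed, and the fast path is written for clarity, not tuned.

```delphi
unit ShortStrUtf8;

interface

uses
  System.SysUtils;

// Hypothetical helper: converts a length-prefixed 8-bit string to a
// UnicodeString, decoding as UTF-8 only when a non-ASCII byte is present.
function ShortToUnicode(const S: ShortString): string;

implementation

function ShortToUnicode(const S: ShortString): string;
var
  I, L: Integer;
  Bytes: TBytes;
begin
  L := Length(S);
  for I := 1 to L do
    if Ord(S[I]) > 127 then
    begin
      // Slow path: at least one non-ASCII byte, so do a real UTF-8 decode.
      SetLength(Bytes, L);
      Move(S[1], Bytes[0], L);
      Exit(TEncoding.UTF8.GetString(Bytes));
    end;
  // Fast path: pure ASCII, every byte maps to the same UTF-16 code unit.
  SetLength(Result, L);
  for I := 1 to L do
    Result[I] := Char(Ord(S[I]));
end;

end.
```

The scan costs one pass over at most 255 bytes, which is consistent with the small gap between the 1901 ms and 1920 ms figures reported above.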