I needed about 4 hours to find this bug in Delphi XE5 (it may be there since D2009).
I needed about 4 hours to find this bug in Delphi XE5 (it may be there since D2009).
Actually, I can understand why Delphi behaves like that but to my mind it's still a very dangerous bug that is also very hard to find.
The problem occurs with non-packed records that have unused space inside. The default hash function in Delphi (e.g. that is used by TDictionary) may return 2 different hashes for the same record because it uses the bit represantation of the records which can be different for equal records (due to garbage in the unused record section).
More importantly, the garbage may be caused by Delphi itself because records are not initialized.
I definitely think it's a bug and I'll post it to QC. But before I do so I posted it here so you are aware of it as well. I am also a little bit worried if Embarcadero fixes it because it doesn't have a straightforward solution. A workaround is to use packed record.
+ What do you think about it?
Showcase:
program DictBug;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, System.Generics.Collections;
type
TMyRec = record //only 12 bytes of 16 are used! -> 4 bytes unused!!!
A: Int64;
B: Integer;
end;
TDictOfMyRec = TDictionary;
procedure _Add(aDict: TDictOfMyRec; A: NativeUInt; B: Integer; aInitGarbage: Boolean);
var
xRec: TMyRec;
begin
if aInitGarbage then
FillChar(xRec, SizeOf(xRec), $FF)//initialize: simulate garbage
else
FillChar(xRec, SizeOf(xRec), 0);//initialize: simulate empty data
xRec.A := A;
xRec.B := B;
aDict.Add(xRec, True);
end;
var
xDict: TDictOfMyRec;
xIter: TMyRec;
begin
xDict := TDictOfMyRec.Create;
try
_Add(xDict, 1, 0, False);
_Add(xDict, 1, 0, True);
//The second record will be added as well!!!
//This is because the bit representations of the records are not the same
//and thus the has function (HashLittle in System.Generics.Defaults)
//return s different hashes!!!
//Workaround: use PACKED RECORD!!!
for xIter in xDict.Keys do
Writeln(Format('%d:%d', [xIter.A, xIter.B]));
finally
xDict.Free;
end;
Readln;
end.
Actually, I can understand why Delphi behaves like that but to my mind it's still a very dangerous bug that is also very hard to find.
The problem occurs with non-packed records that have unused space inside. The default hash function in Delphi (e.g. that is used by TDictionary) may return 2 different hashes for the same record because it uses the bit represantation of the records which can be different for equal records (due to garbage in the unused record section).
More importantly, the garbage may be caused by Delphi itself because records are not initialized.
I definitely think it's a bug and I'll post it to QC. But before I do so I posted it here so you are aware of it as well. I am also a little bit worried if Embarcadero fixes it because it doesn't have a straightforward solution. A workaround is to use packed record.
+ What do you think about it?
Showcase:
program DictBug;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, System.Generics.Collections;
type
TMyRec = record //only 12 bytes of 16 are used! -> 4 bytes unused!!!
A: Int64;
B: Integer;
end;
TDictOfMyRec = TDictionary
procedure _Add(aDict: TDictOfMyRec; A: NativeUInt; B: Integer; aInitGarbage: Boolean);
var
xRec: TMyRec;
begin
if aInitGarbage then
FillChar(xRec, SizeOf(xRec), $FF)//initialize: simulate garbage
else
FillChar(xRec, SizeOf(xRec), 0);//initialize: simulate empty data
xRec.A := A;
xRec.B := B;
aDict.Add(xRec, True);
end;
var
xDict: TDictOfMyRec;
xIter: TMyRec;
begin
xDict := TDictOfMyRec.Create;
try
_Add(xDict, 1, 0, False);
_Add(xDict, 1, 0, True);
//The second record will be added as well!!!
//This is because the bit representations of the records are not the same
//and thus the has function (HashLittle in System.Generics.Defaults)
//return s different hashes!!!
//Workaround: use PACKED RECORD!!!
for xIter in xDict.Keys do
Writeln(Format('%d:%d', [xIter.A, xIter.B]));
finally
xDict.Free;
end;
Readln;
end.
It looks like you have to create your own IEqualityComparer that takes care about the actual variables in the record.
ReplyDeleteOk, I was wrong about missing typeinfo for fields, XE already has that. So I came up with 2 different solutions: a) copy the record field by field omit the gaps. Then use the hash algo or the binary compare. b) just zero the gaps
ReplyDeleteSecond solution does not involve allocating memory but writes on the original (because arguments are passed by const and thus by ref if bigger than a pointer which is the case for those that have gaps).
I implemented that into a TRecordEqualityComparer class. But it also should possible to patch the _LookupVtableInfo routine to return this comparer when you have a record with gaps.
procedure ZeroJunk(const Value; TypeInfo: Pointer);
var
typeData: PTypeData;
p: PByte;
i: Integer;
n: Integer;
f: PManagedField;
offset: Integer;
size: Integer;
begin
typeData := GetTypeData(TypeInfo);
p :=@typeData.ManagedFldCount;
i := typeData.ManagedFldCount;
Inc(p, 4); // skip ManagedFldCount
Inc(p, SizeOf(TManagedField) * i); // skip ManagedFields
i := p^;
Inc(p); // skip NumOps
Inc(p, SizeOf(Pointer) * i); // skip RecOps
n := PInteger(p)^; // read RecFldCnt
Inc(p, 4); // skip RecFldCnt
offset := 0;
for i := 0 to n - 1 do
begin
f :=@PRecordTypeField(p).Field;
size := f.FldOffset - offset;
if size > 0 then
FillChar(PByte(@Value)[offset], size, 0);
offset := f.FldOffset + GetTypeSize(f.TypeRef^);
P :=@PRecordTypeField(P).Name;
Inc(P, P^ + 1); // skip Name
Inc(P, PWord(P)^); // skip AttrData
end;
size := GetTypeSize(TypeInfo) - offset;
if size > 0 then
FillChar(PByte(@Value)[offset], size, 0);
end;