I needed about 4 hours to find this bug in Delphi XE5 (it may be there since D2009).

I needed about 4 hours to find this bug in Delphi XE5 (it may be there since D2009).
Actually, I can understand why Delphi behaves like that but to my mind it's still a very dangerous bug that is also very hard to find.

The problem occurs with non-packed records that have unused space inside. The default hash function in Delphi (e.g. that is used by TDictionary) may return 2 different hashes for the same record because it uses the bit represantation of the records which can be different for equal records (due to garbage in the unused record section).
More importantly, the garbage may be caused by Delphi itself because records are not initialized.

I definitely think it's a bug and I'll post it to QC. But before I do so I posted it here so you are aware of it as well. I am also a little bit worried if Embarcadero fixes it because it doesn't have a straightforward solution. A workaround is to use packed record.
+ What do you think about it?

Showcase:


program DictBug;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, System.Generics.Collections;

type
  TMyRec = record //only 12 bytes of 16 are used! -> 4 bytes unused!!!
    A: Int64;
    B: Integer;
  end;

  TDictOfMyRec = TDictionary;

  procedure _Add(aDict: TDictOfMyRec; A: NativeUInt; B: Integer; aInitGarbage: Boolean);
  var
    xRec: TMyRec;
  begin
    if aInitGarbage then
      FillChar(xRec, SizeOf(xRec), $FF)//initialize: simulate garbage
    else
      FillChar(xRec, SizeOf(xRec), 0);//initialize: simulate empty data

    xRec.A := A;
    xRec.B := B;
    aDict.Add(xRec, True);
  end;

var
  xDict: TDictOfMyRec;
  xIter: TMyRec;
begin
  xDict := TDictOfMyRec.Create;
  try
    _Add(xDict, 1, 0, False);
    _Add(xDict, 1, 0, True);
    //The second record will be added as well!!!
    //This is because the bit representations of the records are not the same
    //and thus the has function (HashLittle in System.Generics.Defaults)
    //return s different hashes!!!
    //Workaround: use PACKED RECORD!!!

    for xIter in xDict.Keys do
      Writeln(Format('%d:%d', [xIter.A, xIter.B]));
  finally
    xDict.Free;
  end;

  Readln;
end.

Comments

  1. It looks like you have to create your own IEqualityComparer that takes care about the actual variables in the record.

    ReplyDelete
  2. Ok, I was wrong about missing typeinfo for fields, XE already has that. So I came up with 2 different solutions: a) copy the record field by field omit the gaps. Then use the hash algo or the binary compare. b) just zero the gaps

    Second solution does not involve allocating memory but writes on the original (because arguments are passed by const and thus by ref if bigger than a pointer which is the case for those that have gaps).

    I implemented that into a TRecordEqualityComparer class. But it also should possible to patch the _LookupVtableInfo routine to return this comparer when you have a record with gaps.

    procedure ZeroJunk(const Value; TypeInfo: Pointer);
    var
      typeData: PTypeData;
      p: PByte;
      i: Integer;
      n: Integer;
      f: PManagedField;

      offset: Integer;
      size: Integer;
    begin
      typeData := GetTypeData(TypeInfo);

      p :=@typeData.ManagedFldCount;
      i := typeData.ManagedFldCount;
      Inc(p, 4);                          // skip ManagedFldCount
      Inc(p, SizeOf(TManagedField) * i);  // skip ManagedFields
      i := p^;
      Inc(p);                             // skip NumOps
      Inc(p, SizeOf(Pointer) * i);        // skip RecOps
      n := PInteger(p)^;                  // read RecFldCnt
      Inc(p, 4);                          // skip RecFldCnt

      offset := 0;
      for i := 0 to n - 1 do
      begin
        f :=@PRecordTypeField(p).Field;
        size := f.FldOffset - offset;
        if size > 0 then
          FillChar(PByte(@Value)[offset], size, 0);
        offset := f.FldOffset + GetTypeSize(f.TypeRef^);
        P :=@PRecordTypeField(P).Name;
        Inc(P, P^ + 1);                   // skip Name
        Inc(P, PWord(P)^);                // skip AttrData
      end;
      size := GetTypeSize(TypeInfo) - offset;
      if size > 0 then
        FillChar(PByte(@Value)[offset], size, 0);
    end;

    ReplyDelete

Post a Comment