Hy All! :)

Hy All! :)

Is  the Capacity still Integer in TDictionary.Create in #10Seattle?

Comments

  1. There is no Capacity property on TDictionary but the Count property is Integer, as is the HashCode. Hopefully that answers your question...

    ReplyDelete
  2. Nicholas Ring He did not ask for a property but for the capacity parameter of the constructor.

    Nándor Kiss Why don't you check for yourself? http://docwiki.embarcadero.com/Libraries/Seattle/en/System.Generics.Collections.TDictionary.Create

    ReplyDelete
  3. Stefan Glienke My bad - misread the question :( Sorry Nándor Kiss

    ReplyDelete
  4. Stefan Glienke 
    Thanks, I forget it... :D

    Nicholas Ring 
    No problem :)

    btw:
    It doesn't make sense to me to use Capacity as an Integer in x64.
    It can be grow and grow until... maybe until High(UInt64)? :)

    ReplyDelete
  5. How often are you going to have more than two billion items in a dictionary, though? I suspect by the time you approach that, you will have been forced to write your own container for a number of other reasons too.

    On the other hand, I do wonder if NativeInt would be more natural. On the other other hand, keeping it Integer results in the same compiled behaviour for 32 and 64-bit.

    ReplyDelete
  6. David Millington 
    It's an experimantal project, storing values of this type:

    , Cardinal>

    It's like kind of a lookup table, but now I know my (Delphi's) limits :)

    ReplyDelete
  7. Nándor Kiss You'd probably run into similar limits in C++ or C# too... How many items do you have? (Are you using Cardinal in order to have up to 4 billion?)

    ReplyDelete
  8. David Millington Not in C++, they have sane types defined for these situations (ie size_t).

    ReplyDelete
  9. A 4+1 byte value pair ... I guess all possible combinations should fit in a dictionary (Integer = 8 Bytes)

    ReplyDelete
  10. Oliver Münzberg IIRC Integer in Delphi is always 4 Byte(32Bit) and NativeInt depends on Platform

    ReplyDelete
  11. Alexander Benikowski Yeah, I got lost in data types :o)

    ReplyDelete
  12. Asbjørn Heid Wouldn't the equivalent in Delphi be NativeInt, NativeUInt, NativePtr?

    size_t is sane, yes, but is it really necessary to have a dictionary that supports more than 2 billion items? No-one's directly answered that question yet.

    I'd understand if it was a 16-bit to 32-bit upgrade. But 32 to 64 doesn't bring the same advantages here.

    ReplyDelete
  13. David Millington Ignoring pedantic differences, yes, using NativeInt/NativeUInt would work. The main difference I was thinking of is that in C++ they actually use size_t in the libraries etc...

    As for "is it really necessary"... Some applications have a natural tendency to be pushed in terms of content. On the C++ renderer I worked on before I got back to Delphi, we had regular complains from users who ran into issues when loading highly detailed scenes taking multiple GB, or trying to render 32000x32000 pixel images etc. They had >=16GB RAM and naturally expected to be able to use all of it.

    We used many standard library types, for example during processing of the resulting image several steps would use a std::vector (C++'s version of TList) to hold some value for each pixel in the image. We never had any issues with those, they just worked. The issues were mostly where we assumed we could hold a copy of some datastructure in memory, or similar stuff in our own code.

    edit: when I say regular I don't mean frequent. But every few months someone would push our renderer past a different edge.

    ReplyDelete
  14. Asbjørn Heid I understand the issues. Until last year I used to work on a resource-stressing C++ app too (sometimes multi-gigabyte data sets. One was recorded at 2GB / minute, I think. Not always easy in what was then a 32-bit process.)

    I guess I just think a Delphi dictionary will be struggling by the time it approaches 2 billion items - its hashing, its lookup, everything. So by the time you get anywhere near that, you'd want a different data structure.

    ReplyDelete
  15. David Millington I'm not so sure that is the case. This program seems to scale reasonably well. Obviously it needs to be run in a 64 bit process.

    {$APPTYPE CONSOLE}

    uses
      System.SysUtils,
      System.Diagnostics,
      System.Generics.Collections;

    const
      N = 10000000;

    function RandomKey: Int64;
    begin
      Int64Rec(Result).Lo := Random(high(Integer));
      Int64Rec(Result).Hi := Random(high(Integer));
    end;

    procedure Main;
    var
      Dict: TDictionary;
    begin
      Dict := TDictionary.Create(N);
      while Dict.Count < N do begin
        Dict.AddOrSetValue(RandomKey, 0);
      end;
    end;

    var
      Stopwatch: TStopwatch;

    begin
      try
        RandSeed := 666;
        Stopwatch := TStopwatch.StartNew;
        Main;
        Writeln(Stopwatch.ElapsedMilliseconds);
      except
        on E: Exception do
          Writeln(E.ClassName, ': ', E.Message);
      end;
      Readln;
    end.

    ReplyDelete
  16. David Millington I don't see why TDictionary would struggle significantly on 64bit (had it used NativeInts as appropriate) with >2 billion items.

    The biggest issue would be growing to >2 billion items, which involve a reallocation and rehashing. But if you know how many items you will be storing in the dictionary and create it with the appropriate capacity, then that is not an issue. And for many "lookup-table" use cases you do know the required capacity.

    ReplyDelete
  17. David Heffernan Good test (although it's only 10 million - needs an extra digit) and I think shows I'm probably wrong :)
    Asbjørn Heid Yep. Good point. I don't know how its hash table allocates and what amount of collisions etc might occur at that size, do you?

    There is an easy solution to this. Let's file a QP asking the containers be changed to use NativeInt in 10.1.

    ReplyDelete
  18. David Millington The TDictionary is implemented using open addressing with linear probing, all elements are stored in a TArray<> effectively. Collision probability would not be affected as long as the array can grow to keep the load factor below 0.75.

    The biggest hurdle to make it "64bit ready" is that IEqualityComparer will have to be changed to use NativeInt, and that will likely break a lot of code, both RTL and and user code. IIRC this is Delphi.Net legacy, as .Net also has piss-poor 64bit support.

    ReplyDelete

Post a Comment