What it the fastest way to get the number of distinct elements in a TList

What it the fastest way to get the number of distinct elements in a TList

Have a look at this small example (List should stay a TList):

List : TList ;
StringList: TStringlist;

TScannedItem = class
public
PaletBarCode: String;
CartonBarCode : String
end;

List := TList.Create;
List.Items.add(TScannedItem.Create('PALET1','4555');
List.Items.add(TScannedItem.Create('PALET2','4555645');
List.Items.add(TScannedItem.Create('PALET1','44645');
List.Items.add(TScannedItem.Create('PALET3','449');
List.Items.add(TScannedItem.Create('PALET3','48645');

Should I iterate throught my list like this :

Stringlist := TStringList.create;

for ScannedItem in List do
if Stringlist.IndexOf(ScannedItem.PaletBarCode) = -1 then
Stringlist.Add(ScannedItem.PaletBarCode);

Showmessage('# of palets '+ StringList.count.ToString); (3 given the above example)

or is there a better way to achieve this ?

Comments

  1. These two approaches should be faster:
    - Just add to the TStringList, then call Sort, and then iterate to count the number of times item N is different from item N+1
    - Just add to a hash list which supports a replace mode, and then check its Count property

    Note: your code is O(n2), my 1st solution is O(n.log n), the second is O(n), but with a higher constant time (hashcode computation).

    Which will win depends on how many items you have, how long they really are, and how many duplicates.

    ReplyDelete
  2. Thank you Eric. My list will remains short (a list of scanned carton with an associated palet). I guess I'll choose your first approach.

    ReplyDelete
  3. In Spring4D I would write this function and use it (yes it might not be as fast as special case source code but reusable for all various cases)

    type
    TEnumerable = class(Spring.Collections.TEnumerable)
    class function DistinctBy(const source: IEnumerable;
    const keySelector: TFunc): IEnumerable; static;
    end;

    { TEnumerable }

    class function TEnumerable.DistinctBy(
    const source: IEnumerable;
    const keySelector: TFunc): IEnumerable;
    var
    known: ISet;
    begin
    known := TCollections.CreateSet;
    Result := source.Where(
    function(const element: TSource): Boolean
    begin
    Result := known.Add(keySelector(element));
    end);
    end;

    ReplyDelete
  4. I'd use a TDictionary. It may only be worthwhile for large-ish data sets. Here's a pseudo code example:

    var
    d: TDictionary;
    begin

    d := TDictionary.Create(List.Count);

    for ScannedItem in List do
    if not d.ContainsKey(ScannedItem.PaletBarCode) then
    d.Add(ScannedItem.PaletBarCode, ScannedItem.PaletBarCode);

    result := d.count;

    d.free;
    end;

    ReplyDelete

Post a Comment