What it the fastest way to get the number of distinct elements in a TList
What it the fastest way to get the number of distinct elements in a TList
Have a look at this small example (List should stay a TList):
List : TList ;
StringList: TStringlist;
TScannedItem = class
public
PaletBarCode: String;
CartonBarCode : String
end;
List := TList.Create;
List.Items.add(TScannedItem.Create('PALET1','4555');
List.Items.add(TScannedItem.Create('PALET2','4555645');
List.Items.add(TScannedItem.Create('PALET1','44645');
List.Items.add(TScannedItem.Create('PALET3','449');
List.Items.add(TScannedItem.Create('PALET3','48645');
Should I iterate throught my list like this :
Stringlist := TStringList.create;
for ScannedItem in List do
if Stringlist.IndexOf(ScannedItem.PaletBarCode) = -1 then
Stringlist.Add(ScannedItem.PaletBarCode);
Showmessage('# of palets '+ StringList.count.ToString); (3 given the above example)
or is there a better way to achieve this ?
Have a look at this small example (List should stay a TList):
List : TList
StringList: TStringlist;
TScannedItem = class
public
PaletBarCode: String;
CartonBarCode : String
end;
List := TList
List.Items.add(TScannedItem.Create('PALET1','4555');
List.Items.add(TScannedItem.Create('PALET2','4555645');
List.Items.add(TScannedItem.Create('PALET1','44645');
List.Items.add(TScannedItem.Create('PALET3','449');
List.Items.add(TScannedItem.Create('PALET3','48645');
Should I iterate throught my list like this :
Stringlist := TStringList.create;
for ScannedItem in List do
if Stringlist.IndexOf(ScannedItem.PaletBarCode) = -1 then
Stringlist.Add(ScannedItem.PaletBarCode);
Showmessage('# of palets '+ StringList.count.ToString); (3 given the above example)
or is there a better way to achieve this ?
These two approaches should be faster:
ReplyDelete- Just add to the TStringList, then call Sort, and then iterate to count the number of times item N is different from item N+1
- Just add to a hash list which supports a replace mode, and then check its Count property
Note: your code is O(n2), my 1st solution is O(n.log n), the second is O(n), but with a higher constant time (hashcode computation).
Which will win depends on how many items you have, how long they really are, and how many duplicates.
Thank you Eric. My list will remains short (a list of scanned carton with an associated palet). I guess I'll choose your first approach.
ReplyDeleteIn Spring4D I would write this function and use it (yes it might not be as fast as special case source code but reusable for all various cases)
ReplyDeletetype
TEnumerable = class(Spring.Collections.TEnumerable)
class function DistinctBy(const source: IEnumerable;
const keySelector: TFunc): IEnumerable; static;
end;
{ TEnumerable }
class function TEnumerable.DistinctBy(
const source: IEnumerable;
const keySelector: TFunc): IEnumerable;
var
known: ISet;
begin
known := TCollections.CreateSet;
Result := source.Where(
function(const element: TSource): Boolean
begin
Result := known.Add(keySelector(element));
end);
end;
I'd use a TDictionary. It may only be worthwhile for large-ish data sets. Here's a pseudo code example:
ReplyDeletevar
d: TDictionary;
begin
d := TDictionary.Create(List.Count);
for ScannedItem in List do
if not d.ContainsKey(ScannedItem.PaletBarCode) then
d.Add(ScannedItem.PaletBarCode, ScannedItem.PaletBarCode);
result := d.count;
d.free;
end;