Do you need a generic set? Take a look at this blog post: https://blog.grijjy.com/2017/01/05/expand-your-collections-collection-part-1-a-generic-set/
Do you need a generic set? Take a look at this blog post: https://blog.grijjy.com/2017/01/05/expand-your-collections-collection-part-1-a-generic-set/
https://blog.grijjy.com/2017/01/05/expand-your-collections-collection-part-1-a-generic-set/
https://blog.grijjy.com/2017/01/05/expand-your-collections-collection-part-1-a-generic-set/
Raising an exception when an item is already in the set is not very useful tbh. Let me explain why I think so.
ReplyDeleteUsually when using a set you combine the check if an item is already in the set and the add operation. That is why the set in Spring4D for example as well as the HashSet class in .NET have the Add return a Boolean that determines if the item was added or not because it was already in the set.
That means that you don't have to do two lookups when checking if an item is not in the set and then adding it which also enables nicer (imo) code such as:
if hashset.Add(item) then
That's a good point. My initial ideas was to keep the API similar to that of a TDictionary, where Add also raises an exception if the key already exists. (That's why there is also an AddOrSet method that does not raise exceptions). But the typical use cases for sets may be different than those for dictionaries. So the Boolean function approach may be better.
ReplyDeleteWe may change the API after reviewing the effects on other code that uses TgoSet.
Thanks for your thoughts on this
It looks like a thin wrapper around a hash table. Would that hash table be better as a separate class? As well as being reusable - you can build lots of other structures using a hash table - it would be an interesting writeup on its own. Eg, I noticed you're using linear probing but not explaining why linear rather than something else.
ReplyDeleteDoes the set allow intersection and other operations against another set?
This class is more like a hash set you may find in other languages. We created it because we had a need for an unordered set of items in our software, with fast lookup performance but without the value-overhead of a dictionary.
ReplyDeleteI tried some other approaches to collision handling besides linear probing, such as double hashing, bucket chaining and cuckoo hashing, but found that they used more memory, were less cache-friendly or were slower than simple linear probing. Maybe I'll elaborate on that in a future post (together with building a hash table).
This set does not allow intersections, unions and some other operations. If you need those, feel free to use this set as a base, or use (some of) the source code to build your own. We are not library developers/sellers and we mostly write posts to share issues we ran into developing our own software.
"My initial ideas was to keep the API similar to that of a TDictionary"
ReplyDeleteTDictionary has the worst possible dictionary-like interface I can think of, so best to only learn from its mistakes.
Asbjørn Heid What is a better interface (and, er, is it in QP?)
ReplyDeleteDavid Millington I strongly dislike it raising an exception on inserting an existing key and removing a non-existing key.
ReplyDeleteThe array access property should also behave like AddOrSet(), not raise an exception on a non-existing key. If trying to read a non-existing key it should return Default(TValue).
I also dislike that it's not an interface (for lifetime management).
Adding the above to QP is futile for obvious reasons.
Asbjørn Heid You might like the sorted dictionary class in an upcoming release of Spring4D then. Array access read returns Default() if the key isn't present, and on write overwrites if it is present. I based it on std::map, one of my favourite classes. It's also an interface. I can't remember the exception behaviour when removing a non-existing key.
ReplyDeleteI wrote it, but Stefan Glienke tells me he's almost completely rewritten it since then (sad face) (or Stefan, was that only the red-black tree backing it?)
Stefan Glienke "Usually when using a set you combine the check if an item is already in the set and the add operation."
ReplyDeleteWhy does it matter if it's already in the set? I'm having a hard time imagining when I'd want to know an item was already in the set when adding. Either way, it's in there now.
Joseph Mitzen It's common to want to know this when using the set to check for uniqueness.
ReplyDeleteAsbjørn Heid Exactly.
ReplyDeleteDavid Millington It was mostly the tree.
David Millington Yeah I'm looking forward to checking that out.
ReplyDeleteNo excuse for TDictionary though :P