Using strings as keys for std::hash_map
-
So VC7 includes hash_map support, but it doesn't appear to support strings as keys without you specifying your own hashing function. This means I have to do something like:

    typedef std::string sstring;

    struct string_hash_compare : public std::hash_compare<sstring>
    {
        size_t operator()(const sstring& Key) const
        {
            // hash Key and return it
        }

        bool operator()(const sstring& _Key1, const sstring& _Key2) const
        {
            return _Key1 < _Key2;
        }
    };

    std::hash_map<sstring, /* mapped type */, string_hash_compare> MyHashMap;

I suppose this isn't *that* bad, but I am surprised they didn't add a size_t operator overload to std::string that hashed the string so this wouldn't be necessary. They could at least have provided a sensible string hashing function. I believe STLPort does this the right way. Someone please tell me I overlooked something!

Chris Hafey
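For readers who want something concrete in place of the "hash Key and return it" comment, here is a minimal sketch. It is not the VC7 or STLPort implementation: FNV-1a is simply one well-known general-purpose hash swapped in for illustration, the struct is standalone rather than derived from std::hash_compare so that it builds with any compiler, and the bucket_size and min_buckets values are assumptions modelled on the Dinkumware traits interface.

```cpp
#include <cassert>   // only for assert-based checks when trying this out
#include <cstddef>
#include <string>

// Traits class in the shape described in the post: one operator() hashes a
// key, the other orders two keys.
struct string_hash_compare {
    // Bucket parameters. The names follow the Dinkumware interface; the
    // values are assumptions chosen for illustration.
    enum { bucket_size = 4, min_buckets = 8 };

    // Hash: 32-bit FNV-1a over the bytes of the key.
    std::size_t operator()(const std::string& key) const {
        unsigned long h = 2166136261UL;              // FNV offset basis
        for (std::string::size_type i = 0; i < key.size(); ++i) {
            h ^= static_cast<unsigned char>(key[i]);
            h = (h * 16777619UL) & 0xFFFFFFFFUL;     // FNV prime, kept to 32 bits
        }
        return static_cast<std::size_t>(h);
    }

    // Ordering: plain lexicographic comparison, as in the post.
    bool operator()(const std::string& a, const std::string& b) const {
        return a < b;
    }
};
```

With the VC7 headers, a struct like this would presumably be passed as the third template argument of std::hash_map, as in the declaration above.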
-
What would a plain string return when you want it converted to a size_t? Its size, the moon phase, or just plain old 42 or 4711? Of course you have to provide your own hash function; no one but you knows what you want to hash on! If you just want a map with a string key, use std::map.
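To illustrate the std::map suggestion, here is a small sketch (the function name count_words is made up for this example). std::map orders its keys with std::less<std::string>, so it works with string keys without any hashing function, at the cost of logarithmic rather than expected constant-time lookup.

```cpp
#include <cassert>   // only for assert-based checks when trying this out
#include <map>
#include <sstream>
#include <string>

// Counts whitespace-separated words in a piece of text.
// No hash function is involved: keys are kept in sorted order.
std::map<std::string, int> count_words(const std::string& text) {
    std::map<std::string, int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word)
        ++counts[word];   // operator[] inserts a zero count on first sight
    return counts;
}
```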
-
I think my post was clear that I would have liked hash_map and string to work seamlessly together. I admit that returning the hash through a size_t conversion is not the right thing. A better idea would be to add a hash() method to std::string and have hash_map call hash() on whatever key is being used. std::map has nothing to do with this discussion; it already works seamlessly with strings.

Microsoft clearly could have put more effort into this, especially since >90% of hash tables use strings as keys. In fact, I have personally never used a non-string key in a hash.

Chris Hafey

PS - If you look at the header file for hash_map, you can see some commented-out code where someone tried to do a "best guess" hash of the key. At least they put some effort into it; unfortunately, not enough.
-
Chris Hafey wrote:
"and add a hash() method to std::string"

Here we go again. You don't think this issue has been beaten to death by C++ experts? :-) Exactly WHAT would your hash algorithm be for a string? Not everyone wants the same one, and you had better watch your back before suggesting virtual member functions in std::basic_string just so people can override them with their personal notion of what a hash value should be. :->

"Microsoft clearly could have put more effort into this, especially since >90% of hash tables use strings as keys."

Microsoft has basically NOTHING to do with this. The standard C++ (with emphasis on C++) library provided with MSVC is the Dinkumware library. Microsoft couldn't create such a work of art even if they wanted to. But since you have now stated that more than 90% of hash table uses have a string as the key, I urge you (nay, I challenge you) to prove this. I'd say you're speaking out of the south end of a north going 'ru.

"In fact, I have personally never used a non string key in a hash."

And you think this somehow makes you an authority on using C++ here, with statements like "Well I have surely never seen anything else"? Get real! You are in a forum where many use MFC and actually think it's good!!! I urge you to read comp.lang.c++.moderated, and please read it for a while before posting your ideas, so that you do not make a fool of yourself.
-
I am sure this issue has been discussed in depth; in fact, I bet it is exactly why hash_map didn't make it into the original standard. I am well aware that different hash algorithms have different characteristics. I am also well aware that some are considered very good for general-purpose use. Just because no one algorithm meets everyone's needs doesn't mean one shouldn't be selected as the default for strings. Besides, if someone needs a different algorithm, it is just a matter of specifying it as the third parameter.

Challenging me to prove that 90% of hash tables use strings makes you look like a fool. Can you honestly say that this is not a reasonable estimate? I never claimed to be an authority on C++, but I am a user with a very strong background in C++. This whole thread is about usability, something you seem to feel takes a backseat to design purity. Don't get me wrong, I am all for design purity, but sometimes a bit more usability for the majority is worth the expense of some purity.

Finally, I urge you to think about how you responded to this thread. Your attitude was uncalled for and did nothing to help the discussion. If you think acting like an asshole earns you respect, you have a lot to learn.
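The "default for strings, override through a template parameter" idea argued for above can be sketched as follows. Everything here is hypothetical and for illustration only: default_hash, length_hash and tiny_hash_map are made-up names, not part of any vendor's hash_map, and the djb2-style string hash is just one commonly used general-purpose function.

```cpp
#include <cassert>   // only for assert-based checks when trying this out
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical default hasher: declared for every key type, defined only
// where a sensible default exists.
template <class Key> struct default_hash;

template <> struct default_hash<std::string> {
    std::size_t operator()(const std::string& s) const {
        std::size_t h = 5381;                        // djb2-style hash
        for (std::size_t i = 0; i < s.size(); ++i)
            h = h * 33 + static_cast<unsigned char>(s[i]);
        return h;
    }
};

// A caller-supplied alternative (deliberately poor: hashes only the length).
struct length_hash {
    std::size_t operator()(const std::string& s) const { return s.size(); }
};

// Minimal chained hash table; the third parameter selects the hasher.
template <class Key, class Value, class Hash = default_hash<Key> >
class tiny_hash_map {
    typedef std::list<std::pair<Key, Value> > bucket;
    std::vector<bucket> buckets_;

public:
    explicit tiny_hash_map(std::size_t buckets = 16)
        : buckets_(buckets ? buckets : 1) {}         // never zero buckets

    // Returns the value for key, inserting a default-constructed one if absent.
    Value& operator[](const Key& key) {
        bucket& b = buckets_[Hash()(key) % buckets_.size()];
        for (typename bucket::iterator it = b.begin(); it != b.end(); ++it)
            if (it->first == key) return it->second;
        b.push_back(std::make_pair(key, Value()));
        return b.back().second;
    }

    // Total number of stored keys.
    std::size_t size() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < buckets_.size(); ++i) n += buckets_[i].size();
        return n;
    }
};
```

With this shape, tiny_hash_map<std::string, int> picks up the string default without further work, while tiny_hash_map<std::string, int, length_hash> swaps in a different function. Key types without a default_hash specialisation fail to compile until a hasher is supplied, which is one way to avoid guessing a hash for arbitrary types.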
-
Chris Hafey wrote:
"Just because there is no one algorithm that meets everyone's needs doesn't mean there shouldn't be one selected as the default for strings."

True. But it also doesn't mean that just any algorithm should be selected. It would in a sense be like providing some "black box" less<T> for any type imaginable. Not a pretty sight, especially since less applied to a string's hash value is almost certainly not the same as its case-sensitive string comparison, strcmp().

"Besides, if someone has a need to use a different algorithm, it is just a matter of specifying it as the third parameter."

A third parameter to ... what? I'm sure you are aware that predicates are by convention types, and they are given as template arguments at compile time to the class template using them. I've got a "prototype" hash_map implementation from Dinkumware here (the revision before the VC7 release, I believe). It has four template parameters. There is also the problem that many implementations (of whatever standard template it might be) are given (or have taken) the freedom to append (default) template parameters to the template parameter list. I understand the point that for usability there should possibly be a default string-hashing std function, but I also see that this is such a controversial subject that the standard is perhaps better off not providing a "demo" implementation (remember that many of the std:: techniques are really "demonstrations" of how to do stuff).

"Challenging me to prove that 90% of hash tables use strings makes you look like a fool."

It does? :confused: You are the one who stated "especially since >90% of hash tables use strings as keys". Are you really surprised that someone challenged that statement and asked you to back it up? Either you have proof and your statement holds, or you don't have proof. It might be that I'm a fool sometimes, but I leave it up to you to figure out who can and who can't back their statements up in this case...

"Can you honestly say that this is not a reasonable estimation?"

Yes. I could also say No and it would possibly be equally true. I just don't know, and neither do you; that's my point. From my experience it's not true. From your experience it apparently is true, but without some figures to back it up I'd still say "south end of north going 'ru".

"Finally I urge you to think about how you responded to this thread."

You're right. I was having