Tutorial: Hash codes for immutable types



Question:

Are there any considerations for immutable types regarding hash codes?

Should I generate it once, in the constructor?

How would you make it clear that the hash code is fixed? Should I? If so, is it better to use a property called HashCode instead of the GetHashCode method? Would there be any drawback to it? (Considering both would work, but the property would be recommended.)


Solution:1

Are there any considerations for immutable types regarding hash codes?

Immutable types are the easiest types to hash correctly; most hash code bugs happen when hashing mutable data. The most important thing is that hashing and equality agree; if two instances compare as equal, they should have the same hash code. (The reverse is not necessarily true; two instances that have the same hash need not be equal.)
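
To make the "hashing and equality agree" rule concrete, here is a minimal sketch. The `Point` type and the multiplier 397 are assumptions chosen for illustration, not anything from the question. Both `Equals` and `GetHashCode` read exactly the same fields, so two equal instances cannot end up with different hash codes.

```csharp
using System;

// Hypothetical immutable type, used only for illustration.
public sealed class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Equality compares exactly the fields that feed the hash.
    public bool Equals(Point other) => other != null && X == other.X && Y == other.Y;

    public override bool Equals(object obj) => Equals(obj as Point);

    // Computed on the fly; unchecked lets the arithmetic wrap instead of throwing.
    public override int GetHashCode()
    {
        unchecked
        {
            return (X * 397) ^ Y;
        }
    }
}
```

Two distinct points may still collide on the same hash code; that is allowed, it only costs some lookup speed.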

Should I generate it once, in the constructor?

That's a performance optimizing technique; by doing so, you trade increased consumption of space (for the storage of the computed value) for a possible decrease in time. I never make performance optimizations unless they are driven by realistic, customer-focused performance tests that carefully measure the performance of both options against documented goals. You should do this if your carefully-designed experiments indicate that (1) failure to do so causes you to miss your goal, and (2) doing so causes you to meet your goal.

How would you make it clear that the hash code is fixed?

I don't understand the question. A changing hash code is the exception, not the rule. Hash codes are always supposed to be unchanging. If the hash code of an object changes then the object can get "lost" in a hash table, so everyone should assume that hash codes remain stable.
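
For readers who have not seen an object get "lost", the following sketch shows the failure with a deliberately mutable key. `MutableKey` and `LostKeyDemo` are hypothetical names invented for this example. With the standard `HashSet<T>`, the lookup after the mutation is expected to fail even though the very same object is still stored in the set.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key, shown only to illustrate the failure mode.
public sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && Id == other.Id;

    // The hash depends on mutable state, which is the root of the problem.
    public override int GetHashCode() => Id;
}

public static class LostKeyDemo
{
    // Returns whether the set can still find the key after its state changed.
    public static bool StillFoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Id = 2; // the hash code changes while the key sits in the set

        return set.Contains(key);
    }
}
```

An immutable type rules this out by construction, because there is no setter to call after insertion.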

is it better to use a property called HashCode, instead of GetHashCode method?

What consumer of your object is going to say "well, I could call GetHashCode(), a method guaranteed to be on all objects, but instead I'm going to call this HashCode getter that does exactly the same thing"? Do you have such a consumer in mind?

If you don't have any consumers of functionality, then don't provide the functionality.


Solution:2

I wouldn't normally generate it in the constructor, but I'd also want to know more about the expected usage before deciding whether to cache it or not.

Are you expecting a small number of instances, which get hashed an awful lot and which take a long time to calculate the hash? If so, caching may be appropriate. If you're expecting a large number of potentially "throw-away" instances, I wouldn't bother caching.

Interestingly, .NET and Java made different choices for String in this respect - Java caches the hash, .NET doesn't. Given that many string instances are never hashed, and those which are hashed are often only hashed once (e.g. on insertion into the hash table) I think I favour .NET's decision here.

Basically you're trading memory + complexity against speed. As Michael says, test before making your code more complex. Of course in some cases (e.g. for a class library) you can't accurately predict the real-world usage, but in many situations you'll have a pretty good idea.

You certainly don't need a separate property though. Hash codes should always stay the same unless someone changes the state of the object - and if your type is immutable, you're already prohibiting that, therefore a user shouldn't expect any changes. Just override GetHashCode().


Solution:3

I would generate the hash code once, the first time GetHashCode is called, then cache it for later calls. This avoids computing it in the constructor when it may never be needed.

If you don't expect to call GetHashCode very many times for each value object, you may not need to cache the value at all.
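
A rough sketch of the compute-on-first-call approach, assuming a hypothetical `Document` type whose hash is costly enough to be worth caching. It uses 0 as a "not computed yet" marker, which is one common convention rather than the only option.

```csharp
using System;

// Hypothetical immutable type; the names are illustrative only.
public sealed class Document
{
    private readonly string _text;
    private int _hash; // 0 means "not computed yet"

    public Document(string text)
    {
        _text = text ?? throw new ArgumentNullException(nameof(text));
    }

    public override bool Equals(object obj) =>
        obj is Document other && string.Equals(_text, other._text, StringComparison.Ordinal);

    public override int GetHashCode()
    {
        int hash = _hash;
        if (hash == 0)
        {
            // Racing threads would compute the same value, so no lock is used here.
            hash = _text.GetHashCode();
            // If the real hash happens to be 0, it is simply recomputed on each call.
            _hash = hash;
        }
        return hash;
    }
}
```

The cache costs four bytes per instance plus a branch on every call, so it only pays off if the same instance is hashed repeatedly.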


Solution:4

Well, you've got to have a GetHashCode() overridden method, as that's how consumers are going to retrieve your hashcode. Most hashcodes are fairly simple arithmetic operations, that will execute quickly. Do you have a reason to believe that caching the results (which has a memory cost) will give you a noticeable performance improvement?

Start simple - generate the hashcode on the fly. If you think you'll see performance improvements caching it, test first.

Regulations require me to refer you to the "premature optimization is the root of all evil" quote at this point.


Solution:5

I know from my personal experience that developers are really good at misjudging performance issues.

So it is recommended to keep everything as simple as possible and calculate the hash code on the fly in GetHashCode().


Solution:6

Why do you need to make sure that the hashcode is fixed? The semantics of a hashcode are that it will always be the same value for any given state of an object. Since your objects are immutable, this is a given. How you choose to implement GetHashCode is up to you.

Having it be a private field that is returned is one choice - it's small, easy, and fast.
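
As a sketch of the private-field option, the hypothetical `FullName` type below computes its hash once in the constructor and returns the stored value afterwards. The type, its members and the multiplier 397 are assumptions made for the example.

```csharp
using System;

// Hypothetical immutable type; the hash is computed once in the constructor.
public sealed class FullName
{
    private readonly int _hash;

    public string First { get; }
    public string Last { get; }

    public FullName(string first, string last)
    {
        First = first ?? throw new ArgumentNullException(nameof(first));
        Last = last ?? throw new ArgumentNullException(nameof(last));

        // unchecked lets the multiplication wrap instead of throwing on overflow.
        unchecked
        {
            _hash = (First.GetHashCode() * 397) ^ Last.GetHashCode();
        }
    }

    public override bool Equals(object obj) =>
        obj is FullName other && First == other.First && Last == other.Last;

    // Just returns the value stored at construction time.
    public override int GetHashCode() => _hash;
}
```

Every instance pays for the computation and the extra field even if it is never hashed, which is the cost the other answers suggest measuring first.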


Solution:7

In general, computing the HashCode should be fast. So caching should not be much of an optimization, and not worth the trouble.

If profiling really shows that GetHashCode takes a significant amount of time, then maybe you should cache it as a fix.

But I wouldn't consider it part of the normal practice.

