###

Question:

The JDK documentation for `java.lang.String.hashCode()`

famously says:

The hash code for a String object is computed as

`s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]`

using

`int`

arithmetic, where`s[i]`

is the *`i`

*th character of the string,`n`

is the length of the string, and`^`

indicates exponentiation.

The standard implementation of this expression is:

`int hash = 0; for (int i = 0; i < length; i++) { hash = 31*hash + value[i]; } return hash; `

Looking at this makes me feel like I was sleeping through my algorithms course. How does that mathematical expression translate into the code above?

###

Solution:1

I'm not sure if you missed where it says "^ indicates exponentiation" (not xor) in that documentation.

Each time through the loop, the previous value of hash is multipled by 31 *again* before being added to the next element of `value`

.

One could prove these things are equal by induction, but I think an example might be more clear:

Say we're dealing with a 4-char string. Let's unroll the loop:

`hash = 0; hash = 31 * hash + value[0]; hash = 31 * hash + value[1]; hash = 31 * hash + value[2]; hash = 31 * hash + value[3]; `

Now combine these into one statement by substituting each value of hash into the following statement:

`hash = 31 * (31 * (31 * (31 * 0 + value[0]) + value[1]) + value[2]) + value[3]; `

31 * 0 is 0, so simplify:

`hash = 31 * (31 * (31 * value[0] + value[1]) + value[2]) + value[3]; `

Now multiply the two inner terms by that second 31:

`hash = 31 * (31 * 31 * value[0] + 31 * value[1] + value[2]) + value[3]; `

Now multiply the three inner terms by that first 31:

`hash = 31 * 31 * 31 * value[0] + 31 * 31 * value[1] + 31 * value[2] + value[3]; `

and convert to exponents (not really Java anymore):

`hash = 31^3 * value[0] + 31^2 * value[1] + 31^1 * value[2] + value[3]; `

###

Solution:2

unroll the loop. Then you get:

`int hash = 0; hash = 31*hash + value[0]; hash = 31*hash + value[1]; hash = 31*hash + value[2]; hash = 31*hash + value[3]; ... return hash; `

Now you can do some mathematical manipulation, plug in 0 for the initial hash value:

`hash = 31*(31*(31*(31*0 + value[0]) + value[1]) + value[2]) + value[3])... `

Simplify it some more:

`hash = 31^3*value[0] + 31^2*value[1] + 31^1*value[2] + 31^0*value[3]... `

And that is essentially the original algorithm given.

###

Solution:3

Proof by induction:

`T1(s) = 0 if |s| == 0, else s[|s|-1] + 31*T(s[0..|s|-1]) T2(s) = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] P(n) = for all strings s s.t. |s| = n, T1(s) = T2(s) Let s be an arbitrary string, and n=|s| Base case: n = 0 0 (additive identity, T2(s)) = 0 (T1(s)) P(0) Suppose n > 0 T1(s) = s[n-1] + 31*T1(s[0:n-1]) T2(s) = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] = s[n-1] + 31*(s[0]*31^(n-2) + s[1]*31^(n-3) + ... + s[n-2]) = s[n-1] + 31*T2(s[0:n-1]) By the induction hypothesis, (P(n-1)), T1(s[0:n-1]) = T2(s[0:n-1]) so s[n-1] + 31*T1(s[0..n-1]) = s[n-1] + T2(s[0:n-1]) P(n) `

I think I have it, and a proof was requested.

###

Solution:4

Take a look at the first few iterations and you'll see the pattern start to emerge:

hash_{0}= 0 + s_{0}= s_{0}hash_{1}= 31(hash_{0}) + s_{1}= 31(s_{0}) + s_{1}hash_{2}= 31(hash_{1}) + s_{2}= 31(31(s_{0}) + s_{1}) + s_{2}= 31^{2}(s_{0}) + 31(s_{1}) + s_{2}...

###

Solution:5

Isn't it useless at all to count the hashcode of the String out **of all characters**? Imagine filenames or classnames with their full path put into HashSet. Or someone who uses HashSets of String documents instead of Lists because "HashSet always beats Lists".

I would do something like:

`int off = offset; char val[] = value; int len = count; int step = len <= 10 ? 1 : len / 10; for (int i = 0; i < len; i+=step) { h = 31*h + val[off+i]; } hash = h `

At the end hashcode is nothing more than a hint.

**Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com**

EmoticonEmoticon