Correct implementation of the equals and hashCode methods in Java
You have implemented the equals method for your class? Great! But you also have to implement the hashCode method. Let's understand why and how to implement it correctly.
Key points:
- In Java, equal objects should have the same hash code. Therefore, if the equals method is overridden, the hashCode method must be overridden to match.
- When implementing hashCode, use the same fields that are used in equals (or a subset of them).
- Hash codes are a performance optimization, so you should not put too much effort into hashing unless performance analysis indicates improvements are needed.
equals and hashCode
Although the equals method makes sense from a general perspective, the hashCode method is more technical: it exists mainly to improve performance.
Most data structures use the equals method to check whether they contain an element. For example:
<code class="language-java">List<String> list = Arrays.asList("a", "b", "c");
boolean contains = list.contains("b");</code>
The variable contains is true because the list holds an element that is equal to "b".
However, comparing each element with the instance passed to the contains method is inefficient, and a whole class of data structures uses a more efficient approach. Instead of comparing the requested instance with each element they contain, they use a shortcut that reduces the number of potentially equal instances, and then compare only those.
This shortcut is the hash code, which can be regarded as an object's equality reduced to an integer value. Instances with the same hash code are not necessarily equal, but equal instances have the same hash code. (Or should have the same hash code, which we will discuss later.) Such data structures are usually named after this technique and carry "Hash" in their name, with HashMap being the most famous representative.
They usually work as follows: each element is stored in a bucket that is determined by its hash code. When an instance is passed to the contains method, the bucket is calculated using its hash code, and only the elements in that bucket are compared with the instance. In this way, implementing the contains method may require only very few, ideally no, equals comparisons.
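The bucket lookup described above can be sketched as a toy hash set. This is only an illustration under simplifying assumptions, not how HashMap or HashSet are actually implemented: the class name TinyHashSet, the fixed bucket count, the equalsCalls counter and the lack of null support are all made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash set (hypothetical, for illustration only): a fixed list of buckets.
class TinyHashSet<E> {
    private final List<List<E>> buckets = new ArrayList<>();
    int equalsCalls = 0; // counts the equals comparisons made during lookups

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The hash code selects the bucket; floorMod keeps the index non-negative.
    private List<E> bucketFor(Object o) {
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    void add(E element) {
        if (!contains(element)) {
            bucketFor(element).add(element);
        }
    }

    // Only the elements in the selected bucket are compared with equals.
    boolean contains(Object o) {
        for (E candidate : bucketFor(o)) {
            equalsCalls++;
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }
}
```

With 16 buckets, the strings "a", "b" and "c" land in different buckets, so looking up "b" needs a single equals comparison and looking up an absent element such as "z" needs none.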
Like the equals method, the hashCode method is also defined in the Object class.
Thoughts on hashing
If the hashCode method is used as a shortcut to determine equality, then there is only one thing we should really care about: equal objects should have the same hash code.
This is also why, if we override the equals method, we have to create a matching hashCode implementation! Otherwise, objects that are equal according to our implementation may not have the same hash code, because they use the implementation of the Object class.
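A minimal sketch of the problem, assuming a hypothetical class NameWithoutHash that overrides equals but inherits hashCode from Object:

```java
// Hypothetical class that overrides equals but NOT hashCode.
class NameWithoutHash {
    private final String value;

    NameWithoutHash(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NameWithoutHash)) return false;
        return value.equals(((NameWithoutHash) o).value);
    }
    // hashCode is inherited from Object, so it is based on identity, not on value.
}

class MissingHashCodeDemo {
    public static void main(String[] args) {
        NameWithoutHash a = new NameWithoutHash("Jane");
        NameWithoutHash b = new NameWithoutHash("Jane");
        System.out.println(a.equals(b)); // true
        // The two identity-based hash codes are unrelated and will almost always differ.
        System.out.println(a.hashCode() == b.hashCode());
    }
}
```

Because the two hash codes are unrelated, a hash-based collection that holds a would usually look for b in the wrong place and report that it is absent.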
The hashCode contract
Quoting the source code:
The general contract of the hashCode method is:
- Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
- If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
- It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
The first point mirrors the consistency property of the equals method, and the second point is the requirement we derived above. The third point states an important detail that we will discuss later.
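The three points can be spot-checked for concrete objects with a small helper. This is a sketch only; the class HashCodeContract and its method names are hypothetical, and passing such checks for a few objects does not prove that an implementation is correct.

```java
// Hypothetical helper that spot-checks the hashCode contract for given objects.
class HashCodeContract {
    // First point: repeated calls on an unmodified object return the same value.
    static boolean isConsistent(Object o) {
        return o.hashCode() == o.hashCode();
    }

    // Second and third point: equal objects must share a hash code,
    // while unequal objects may have any hash codes, including identical ones.
    static boolean respectsEquality(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }
}
```

The strings "Aa" and "BB" illustrate the third point: they are not equal, yet both have the hash code 2112, which the contract permits.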
Implementing hashCode
A very simple Person.hashCode implementation is as follows:
<code class="language-java">@Override
public int hashCode() {
    // firstName and lastName are assumed field names; use the fields compared in equals
    return Objects.hash(firstName, lastName);
}</code>
A person's hash code is computed by calculating the hash codes of the relevant fields and combining them. Both steps are left to the utility function Objects.hash.
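For context, a complete class along these lines might look as follows. The fields firstName and lastName are assumptions made for this sketch; the point is only that equals and hashCode use the same fields.

```java
import java.util.Objects;

// Sketch of a Person class; the fields firstName and lastName are assumptions.
final class Person {
    private final String firstName;
    private final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return Objects.equals(firstName, other.firstName)
                && Objects.equals(lastName, other.lastName);
    }

    @Override
    public int hashCode() {
        // Objects.hash computes each field's hash code and combines them.
        return Objects.hash(firstName, lastName);
    }
}
```

Two Person instances created with the same names are equal and have the same hash code, so a HashSet that contains one of them also reports that it contains the other.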
But which fields are relevant? The requirements help answer this question: if equal objects must have the same hash code, the hash code calculation should not include any field that is not used for equality checks. (Otherwise, two objects that differ only in those fields would be equal but have different hash codes.)
Therefore, the set of fields used for hashing should be a subset of the set of fields used for equality. By default, both will use the same fields, but there are some details to consider.
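As a sketch of the subset rule, consider a hypothetical class Book whose equality check uses two fields while its hash code uses only one of them. The class and its fields are invented for this example.

```java
import java.util.Objects;

// Hypothetical class: equals compares title and edition, hashCode uses only title.
final class Book {
    private final String title;
    private final int edition;

    Book(String title, int edition) {
        this.title = title;
        this.edition = edition;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        Book other = (Book) o;
        return edition == other.edition && Objects.equals(title, other.title);
    }

    @Override
    public int hashCode() {
        // A subset of the equality fields: equal books still share a hash code.
        return Objects.hashCode(title);
    }
}
```

This is valid because equal books necessarily have equal titles. The price is that different editions of the same title collide, which affects performance but not correctness.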
Consistency
First of all, there is the consistency requirement. It should be interpreted quite strictly. While it allows the hash code to change when some fields change (which is often unavoidable for mutable classes), hash-based data structures are not prepared for this scenario.
As we saw above, the hash code is used to determine the bucket of the element. However, if the hash-related fields change, the hash is not recalculated and the internal array is not updated.
This means that subsequent queries with an equal object, or even with the very same instance, will fail! The data structure computes the current hash code (which differs from the one used to store the instance) and looks in the wrong bucket.
Conclusion: It is best not to use mutable fields for hash code calculation!
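The failure can be reproduced with a small sketch. The class MutableId and its public mutable field are hypothetical and chosen only to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable class whose hash code depends on a mutable field.
class MutableId {
    int id;

    MutableId(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableId && ((MutableId) o).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class MutationDemo {
    // Returns whether the set still finds the very same instance after mutation.
    static boolean foundAfterMutation() {
        Set<MutableId> set = new HashSet<>();
        MutableId element = new MutableId(1);
        set.add(element);
        element.id = 2; // changes the hash code while the element is stored
        return set.contains(element);
    }
}
```

Here foundAfterMutation returns false: the element is still in the set, but the lookup uses the new hash code and therefore misses it.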
Performance
The hash code may be calculated about as often as the equals method is called. This may well happen in performance-critical parts of the code, so it makes sense to consider performance. And unlike the equals method, there is more room for optimization here.
If performance is critical, using Objects.hash may also not be the best choice, as it requires creating an array for its varargs parameter.
Collisions
Going all in on performance, how about this implementation?
<code class="language-java">@Override
public int hashCode() {
    return 0;
}</code>
It is certainly fast. And equal objects will have the same hash code, so we are fine in that regard too. As a bonus, no mutable fields are involved!
But remember what we said about buckets earlier? This way all instances end up in the same bucket! This usually results in a linked list holding all elements, which is very bad for performance. For example, each contains call triggers a linear scan of the linked list.
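The effect on bucket usage can be estimated with a small sketch. The class BucketSpread is hypothetical, and selecting a bucket with floorMod is a simplification of what real hash tables do.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntUnaryOperator;

// Hypothetical helper that measures how well a hash function spreads elements.
class BucketSpread {
    // Counts how many of bucketCount buckets are used when the ids 0..99
    // are hashed with the given function.
    static int usedBuckets(IntUnaryOperator hash, int bucketCount) {
        Set<Integer> used = new HashSet<>();
        for (int id = 0; id < 100; id++) {
            used.add(Math.floorMod(hash.applyAsInt(id), bucketCount));
        }
        return used.size();
    }
}
```

With 16 buckets, a constant hash such as id -> 0 uses a single bucket for all 100 elements, whereas id -> Integer.hashCode(id) uses all 16.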
Therefore, we want as few items as possible in the same bucket! An algorithm that returns very different hash codes even for very similar objects is a good start. How to achieve this depends in part on the selected fields. The more details we include in the calculation, the more likely the hash codes are to differ. Note that this is the exact opposite of our thoughts on performance. So, interestingly, using too many or too few fields can both lead to poor performance.
Another part of preventing collisions is the algorithm used to actually calculate the hash.
Calculating hashes
The easiest way to calculate a field's hash code is to call the hashCode method on it. The results can then be combined manually. A common algorithm is to start with an arbitrary number, then repeatedly multiply it by another number (usually a small prime) and add the hash code of a field:
<code class="language-java">int prime = 31;
int result = 1;
// firstName and lastName are assumed field names
result = prime * result + ((firstName == null) ? 0 : firstName.hashCode());
result = prime * result + ((lastName == null) ? 0 : lastName.hashCode());
return result;</code>
This may cause an overflow, but overflow does not raise an exception in Java, so this is not a problem.
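As a self-contained sketch of this multiply-and-add scheme, the hypothetical helper below combines two values with the start value 1 and the prime 31. With these particular constants the result matches what Objects.hash returns for the same two arguments, because Objects.hash is specified in terms of Arrays.hashCode, which uses the same scheme.

```java
import java.util.Objects;

// Hypothetical helper that combines two field hash codes manually.
class ManualHash {
    static int combine(Object first, Object second) {
        int prime = 31;
        int result = 1;
        result = prime * result + Objects.hashCode(first);  // Objects.hashCode yields 0 for null
        result = prime * result + Objects.hashCode(second);
        return result; // int arithmetic silently wraps around on overflow
    }
}
```

Unlike Objects.hash, this version does not allocate a varargs array, which is the kind of saving that only matters if performance analysis points at hashCode.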
Note that even excellent hashing algorithms can lead to unusually frequent collisions if the input data has a specific pattern. As a simple example, suppose we calculate the hash code of a point by adding its x and y coordinates. This sounds reasonable until we realize that we often deal with points on the straight line f(x) = -x, which means that x + y == 0 for all of these points. Collisions, lots of them!
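The point example can be checked numerically. The sketch below uses a hypothetical class PointHashes that compares the additive scheme with a multiply-and-add variant for the points (x, -x) with x from -50 to 50; the range is an arbitrary choice for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntBinaryOperator;

// Hypothetical comparison of two ways to hash a point (x, y).
class PointHashes {
    // Additive scheme from the text: every point with y == -x hashes to 0.
    static int additive(int x, int y) {
        return x + y;
    }

    // Multiply-and-add variant using the prime 31.
    static int multiplied(int x, int y) {
        return 31 * x + y;
    }

    // Counts the distinct hash codes of the points (x, -x) for x in -50..50.
    static int distinctOnDiagonal(IntBinaryOperator hash) {
        Set<Integer> codes = new HashSet<>();
        for (int x = -50; x <= 50; x++) {
            codes.add(hash.applyAsInt(x, -x));
        }
        return codes.size();
    }
}
```

For these 101 points the additive scheme produces a single hash code, while the multiply-and-add variant produces 101 different ones.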
But again: use common algorithms and don't worry unless performance analysis shows problems.
Summary
We have seen that calculating hash codes is like compressing equality into an integer value: equal objects must have the same hash code, and for performance reasons it is best if as few unequal objects as possible share the same hash code.
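This means that if the equals method is overridden, the hashCode method must always be overridden too.
When implementing the hashCode method:
- Use the same fields that are used in equals (or a subset of them).
- Preferably do not include mutable fields.
- Use a common algorithm unless patterns in the input data work against it.
Remember that the hashCode method is about performance, so don't waste too much energy on it unless performance analysis shows it is necessary.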
Frequently asked questions (FAQ) about the hashCode method
What is the purpose of the hashCode() method in Java?
The hashCode() method in Java returns an integer value for an object. It is mainly used by hash-based collections (such as HashMap, HashSet and Hashtable) to store and retrieve objects more efficiently. The hashCode() method works in conjunction with the equals() method: the hash code narrows the search down to a bucket, and equals() then identifies the matching object. This helps to retrieve data quickly, especially in large collections, thereby improving the performance of Java applications.
How does the hashCode() method work in Java?
The hashCode() method returns an integer that hash-based collections use to select the bucket in which an object is stored. The default implementation in the Object class returns distinct integers for distinct objects as far as is reasonably practical, while classes that override the method usually compute the value from their fields. However, it is important to note that two different objects may have the same hash code, which is called a hash collision.
What is the contract between the equals() and hashCode() methods in Java?
The contract between the equals() and hashCode() methods in Java is a set of rules that governs their interaction. It states that if two objects are equal according to the equals() method, then calling the hashCode() method on each of the two objects must produce the same integer result. This ensures consistency and correctness when storing and retrieving objects in a hash-based collection.
How do I override the hashCode() method in Java?
Overriding the hashCode() method means providing your own implementation that returns the same integer for equal objects and, ideally, different integers for unequal ones. This can be achieved by combining the hash codes of the object's instance variables using a prime multiplier. Prime multipliers help to distribute the hash codes evenly, thereby reducing the likelihood of hash collisions.
What is a hash collision, and how can it be reduced?
A hash collision means that the hashCode() method returns the same integer for two different objects. Java's hash-based collections handle collisions correctly by comparing the candidates with equals(), but frequent collisions slow down lookups. To reduce collisions, include enough distinguishing fields in the calculation and combine their hash codes with a common algorithm that uses a prime multiplier.
Why should the hashCode() method be overridden?
The hashCode() method must be overridden whenever equals() is overridden, so that equal objects have the same hash code and hash-based collections work correctly. A well-distributed implementation can also improve the performance of Java applications, especially when dealing with large collections, because it reduces hash collisions and allows faster data retrieval.
Can two unequal objects have the same hashCode?
Yes, in Java, two unequal objects can have the same hash code. This is called a hash collision. A hashing algorithm that distributes hash codes well reduces the likelihood of this happening, but cannot rule it out.
What happens if I don't override the hashCode() method?
If you don't override the hashCode() method, Java uses the default implementation from the Object class, which is based on object identity. If the class overrides equals(), equal objects may then have different hash codes, so lookups in hash-based collections can fail.
How does the hashCode() method improve the performance of Java applications?
The hashCode() method improves performance by letting hash-based collections narrow a search down to a single bucket. Data can be retrieved faster because only the elements in that bucket are compared with the requested object, without searching the entire collection.
Can I use the hashCode() method in a non-hash-based collection?
Although the hashCode() method is mainly used by hash-based collections, objects that implement it can also be stored in non-hash-based collections. However, there is little benefit, because non-hash-based collections do not rely on hash codes for data storage and retrieval.
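The difference can be observed with a key that counts how often its hashCode method is called. The class CountingKey and its static counter are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical key that counts how often its hashCode method is called.
class CountingKey {
    static int hashCodeCalls = 0;
    private final String name;

    CountingKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CountingKey && ((CountingKey) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        hashCodeCalls++;
        return name.hashCode();
    }
}

class CountingDemo {
    // Number of hashCode calls made by a single ArrayList.contains lookup.
    static int callsForListLookup() {
        List<CountingKey> list = new ArrayList<>();
        list.add(new CountingKey("a"));
        CountingKey.hashCodeCalls = 0;
        list.contains(new CountingKey("a")); // uses equals only
        return CountingKey.hashCodeCalls;
    }

    // Number of hashCode calls made by a single HashSet.contains lookup.
    static int callsForSetLookup() {
        Set<CountingKey> set = new HashSet<>();
        set.add(new CountingKey("a"));
        CountingKey.hashCodeCalls = 0;
        set.contains(new CountingKey("a")); // hashes the argument to find the bucket
        return CountingKey.hashCodeCalls;
    }
}
```

The list lookup never calls hashCode, while the set lookup calls it at least once.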