..would smell as sweet, wrote Shakespeare. An Integer in any other form is not the same, is the programmer's dictum. An integer (int) takes up four bytes in Java, whereas the same number represented as a string takes up a variable number of bytes depending on how many digits it has. So when dealing with data structures, whether on disk or in memory, we consciously use the right datatypes. It is also a requirement if you are computing with the value, since you avoid the conversions from string to the datatype and vice versa. But I ran into an interesting behavior with integers and HashMaps in Java. If you use an Integer as a key, the performance of the Map is better than when using the same integer represented as a String!
I also compared the performance with Trove, the High Performance Collections for Java library. Surprisingly, HashMap<Integer, Integer> beats even Trove on puts and roughly matches it on gets. You will surely think twice before using a String key any more! 🙂 (The numbers below are for a million operations, so don't bother if you are dealing with smaller collections. I used JDK16 for these tests.)
String hashing Put time: 358
String hashing Get time: 1404

Integer hashing Put time: 171
Integer hashing Get time: 57

THashMap Integer hashing Put time: 424
THashMap Integer hashing Get time: 72

TIntIntHashMap Integer hashing Put time: 262
TIntIntHashMap Integer hashing Get time: 55
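A comparison like the one above can be sketched roughly as follows. This is not the test program used for the numbers in this post; the class and method names (KeyTypeBenchmark, timeIntegerPuts and so on) are made up for illustration, and it only covers java.util.HashMap so that it runs without Trove on the classpath. It also assumes the String keys are built before the clock starts, so the conversion cost is not counted.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; hypothetical names, not the original test program.
class KeyTypeBenchmark {

    // Accumulates looked-up values so the JIT cannot discard the get() loops.
    static long sink = 0;

    // Times n puts with Integer keys; returns elapsed milliseconds.
    static long timeIntegerPuts(Map<Integer, Integer> map, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Times n gets with Integer keys; returns elapsed milliseconds.
    static long timeIntegerGets(Map<Integer, Integer> map, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            sink += map.get(i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Builds the String form of 0..n-1 up front (assumption: conversion is not timed).
    static String[] stringKeys(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = String.valueOf(i);
        }
        return keys;
    }

    // Times one put per pre-built String key; returns elapsed milliseconds.
    static long timeStringPuts(Map<String, Integer> map, String[] keys) {
        long start = System.nanoTime();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Times one get per pre-built String key; returns elapsed milliseconds.
    static long timeStringGets(Map<String, Integer> map, String[] keys) {
        long start = System.nanoTime();
        for (String key : keys) {
            sink += map.get(key);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 1_000_000;

        Map<String, Integer> byString = new HashMap<>();
        String[] keys = stringKeys(n);
        System.out.println("String Put time (ms): " + timeStringPuts(byString, keys));
        System.out.println("String Get time (ms): " + timeStringGets(byString, keys));

        Map<Integer, Integer> byInteger = new HashMap<>();
        System.out.println("Integer Put time (ms): " + timeIntegerPuts(byInteger, n));
        System.out.println("Integer Get time (ms): " + timeIntegerGets(byInteger, n));
    }
}
```

A single pass like this has no JIT warm-up and is sensitive to garbage collection, so the absolute numbers will vary from run to run and between JDK versions. For anything you intend to rely on, a proper harness such as JMH is the safer route.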
But when you are dealing with large collections and performance matters, you really need to check out the Trove library. Over and above the performance of the essential methods, it has some convenience methods which save CPU time. Two such methods that I would like to point out are:
Integer putIfAbsent(Integer, Integer)
adjustValue(int key, int amount) // available only in the TIntIntHashMap implementation
If you have worked with Maps you will immediately recognize the utility of these two methods. The first saves a containsKey() call, and the second helps with frequency maps, saving a get() call in the process.
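To make the saving concrete, here is a minimal sketch of both patterns. It does not use Trove; it assumes Java 8 or later and uses java.util.Map's own putIfAbsent and merge as stand-ins, with merge playing the role of the frequency-map increment. The class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; hypothetical names, java.util stand-ins for the Trove methods.
class MapConvenienceSketch {

    // Stores defaultValue only when the key is missing and returns the value
    // that is in the map afterwards. One call instead of containsKey() + put().
    static int registerDefault(Map<Integer, Integer> map, int key, int defaultValue) {
        Integer previous = map.putIfAbsent(key, defaultValue);
        return previous == null ? defaultValue : previous;
    }

    // Builds a frequency map: one merge() per id instead of get() followed by put().
    static Map<Integer, Integer> countFrequencies(int[] ids) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int id : ids) {
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        System.out.println(registerDefault(map, 7, 100)); // key was absent, 100 is stored
        System.out.println(registerDefault(map, 7, 200)); // key present, 100 is kept
        System.out.println(countFrequencies(new int[] {1, 2, 2, 3, 3, 3}));
    }
}
```

Note that merge() inserts the key when it is absent. Whether Trove's adjustValue does the same for a missing key is something to confirm in the Trove documentation before swapping one for the other.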
If you are like me, dealing with integer keys that originated in a database (either an Oracle sequence or a SQL Server identity), HashMap<Integer, *> would be an ideal choice. As always, do not forget to initialize your Maps and Lists with an appropriate initial capacity.
ArrayList(int initialCapacity)
Constructs an empty list with the specified initial capacity.

HashMap(int initialCapacity)
Constructs an empty HashMap with the specified initial capacity and the default load factor (0.75).

HashMap(int initialCapacity, float loadFactor)
Constructs an empty HashMap with the specified initial capacity and load factor.
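One thing worth keeping in mind with the HashMap constructors is that the initial capacity is not the number of entries the map can hold before it resizes; with the default load factor of 0.75, the map grows once it is about three-quarters full. A rough sketch of sizing for an expected entry count follows. The helper name hashMapCapacityFor is made up for this example, and the formula simply divides by the default load factor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; hypothetical helper names.
class CapacitySketch {

    // Initial capacity that lets a HashMap with the default 0.75 load factor
    // hold expectedEntries without resizing along the way.
    static int hashMapCapacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    // A HashMap pre-sized for the expected number of database ids.
    static Map<Integer, String> presizedMap(int expectedEntries) {
        return new HashMap<>(hashMapCapacityFor(expectedEntries));
    }

    // An ArrayList has no load factor, so the expected size is used directly.
    static List<Integer> presizedList(int expectedEntries) {
        return new ArrayList<>(expectedEntries);
    }

    public static void main(String[] args) {
        int expected = 1_000_000;
        Map<Integer, String> names = presizedMap(expected);
        List<Integer> ids = presizedList(expected);
        for (int i = 0; i < expected; i++) {
            ids.add(i);
            names.put(i, "row-" + i);
        }
        System.out.println(names.size() + " entries, " + ids.size() + " ids");
    }
}
```

Over-allocating costs memory, so this is a trade-off: it is worth doing when you have a reasonable estimate of the final size, such as a row count from the database.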
The test program I used for this is available for download here.