A few weeks ago, we stumbled upon an interesting problem during a code refactoring. After changing the data type of an attribute from List<Long> to List<Double>, we noticed severe performance issues: our application took minutes to process data it had previously handled in seconds. This seemed strange, so instead of simply reverting the change, we decided to investigate the cause.
After some analysis, we found that the version using Double values required more heap space and triggered an enormous number of garbage collections, which of course hurt the application's performance. But according to the Java Language Specification, the primitive types long and double are both 64 bits wide. So why would the version using Double consume more memory?
The devil is in the details. A List in Java cannot hold primitive types, only their wrapper classes. But even for the boxed types Long and Double, the only field in each class is the wrapped primitive, so a Double object should consume the same amount of memory as a Long object. While this is all true, the problem in our use case was that our application created far more objects when using Double than when using Long. The reason, as we later found out, was the LongCache.
All the integral wrapper classes in Java (Byte, Short, Integer, Long, and also Character) have a built-in cache for frequently used values. The cache is initialized statically, so to avoid the overhead of creating cache entries for values that are probably never used, only the values between -128 and +127 are cached (which covers every possible Byte value; Character, having no negative values, caches 0 to 127). This range is hard-coded, except for Integer, where the upper bound can be raised using the -XX:AutoBoxCacheMax JVM option. A cached object is returned both by valueOf() and by auto-boxing, but not by the new operator. This can be tested quickly with the following code snippet:
Long long1 = 0L;
Long long2 = (long) 0;
Long long3 = Long.valueOf(0);
Long long4 = new Long(0); // bypasses the cache (constructor deprecated since Java 9)
System.out.println(long1 == long2); // true
System.out.println(long1 == long3); // true
System.out.println(long1 == long4); // false
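To see where the cached range ends, the same identity check can be run at the boundaries. The following is a minimal sketch; the class name LongCacheBoundary and the helper sameInstance are made up for illustration. Inside the range -128 to 127 the result is guaranteed by the Long.valueOf() documentation. Outside it, current OpenJDK builds typically return a fresh object on each call, but the specification does not forbid additional caching, so the output for -129 and 128 may differ on other JVMs.

```java
public class LongCacheBoundary {

    // Hypothetical helper: returns true if two independent valueOf() calls
    // hand back the very same object (identity, not equality).
    static boolean sameInstance(long value) {
        Long a = Long.valueOf(value);
        Long b = Long.valueOf(value);
        return a == b; // reference comparison on purpose
    }

    public static void main(String[] args) {
        // Probe just outside and just inside the documented cache range.
        for (long v : new long[] {-129L, -128L, 0L, 127L, 128L}) {
            System.out.println(v + " -> same instance: " + sameInstance(v));
        }
    }
}
```

In application code, boxed values should of course be compared with equals(); the reference comparison is used here only to make the cache visible.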
No such cache exists for Float or Double values. Our application therefore performed much worse with Double than with Long: our data contained many numbers in the cached range, so no new objects were created for those numbers when using Long. Of course, you will only notice such effects when dealing with millions of numbers. In that case it is definitely a good idea to use integer types instead of floating-point types, if possible.
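The effect on the number of allocated objects can be illustrated with a rough sketch that boxes the same small values into a List<Long> and a List<Double> and then counts distinct object identities. This is not our original application code; the class and method names are invented for this example, and the values 0 to 99 are an assumption chosen to fall inside the cached range. The Long count is determined by the cache, while the Double count depends on the JVM (on current OpenJDK builds, each boxed Double is typically a separate object).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class BoxedInstanceCount {

    // Counts distinct object identities in the list, ignoring equals().
    static int distinctInstances(List<?> boxed) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        seen.addAll(boxed);
        return seen.size();
    }

    // Boxes `count` values in the range 0..99 as Long (auto-boxing uses the cache).
    static List<Long> boxLongs(int count) {
        List<Long> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add((long) (i % 100));
        }
        return result;
    }

    // Boxes the same values as Double (no cache is documented for Double).
    static List<Double> boxDoubles(int count) {
        List<Double> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add((double) (i % 100));
        }
        return result;
    }

    public static void main(String[] args) {
        int count = 1_000_000;
        System.out.println("Long instances:   " + distinctInstances(boxLongs(count)));
        System.out.println("Double instances: " + distinctInstances(boxDoubles(count)));
    }
}
```

With a million elements, the Long list references only 100 shared objects, whereas the Double list may reference up to a million separate ones, which is the kind of difference that shows up as extra heap usage and garbage collection.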