Map usage optimization

You know the situation. You have a data structure that is a Map and the key is a simple object and the value is some complex object (lets say an ArrayList). You can’t just get() the ArrayList and start using it because it might not be there. First you need to check to see if it exists and then use it.

For years, I was doing this like so:

if (!map.containsKey(key)) {
    map.put(key, new ArrayList<String>());
}
map.get(key).addAll(someList);

It’s pretty basic. If the map doesn’t contain the key, put in the new ArrayList then use it. If the Map already contains the key, it just uses it.

A week or so ago I was writing a chunk of code just like this and stopped and thought: Is this really the best way to do this?

Lets break it down:

First Run:

  • 1 method call to containsKey
  • key is hashed
  • 1 object creation of ArrayList
  • 1 method call to put
  • key is hashed
  • 1 method call to containsKey (done by the put)
  • 1 method call to get
  • key is hashed
  • one method call to addAll

Second Run:

  • 1 method call to containsKey
  • key is hashed
  • 1 method call to get
  • key is hashed
  • one method call to addAll

Wow, that’s actually a good bit of work (not even including the Map’s underlying implementation when hash’s collide and it does things like traverse a LinkedList in the underlying array or has to allocate more space).

So, what would be a better way? Well, I decided that I would try a small experiment. I’d post to StackOverflow. I’ve used the site in the past (it’s very helpful for Android development) and I was curious what they would make of my question.

The answer, provided by Kublai Khan was quite simple and a bit of a “oh yeah, duh.” to me.

The solution:

List<String> existingList = map.get(key);
if (existingList == null){
   existingList = new ArrayList<String>();
   map.put(key, existingList);
}
existingList.addAll(someList);

Lets break it down:

First Run:

  • 1 method call to get
  • key is hashed
  • existingList is allocated
  • 1 null check (faster than a method call)
  • 1 object creation of ArrayList
  • 1 method call to put
  • key is hashed
  • 1 method call to containsKey (done by the put)
  • 1 method call to addAll

Second Run:

  • 1 method call to get
  • key is hashed
  • 1 method call to addAll

Notice that the first time in we are doing about the same work, but from them on there is significantly less work. Now, imagine if (like in my code) this happens a few thousand times each time the program is run. This little change can really add up time wise.

Where was my DOH! moment? Well, what was suggested leverages that since Java treats everything as a reference, since we placed in the object in the map, we can still work with it even though we didn’t go and get it again from the map. We have the reference from when we created it so we can just use that.

About sseaman

Connect with me on Google+
This entry was posted in Java, Programming and tagged , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published.