ProAndroidDev

The latest posts from Android Professionals and Google Developer Experts.

Follow publication

All you need to know about ArrayMap & SparseArray

--

Android SDK offers a rich variety of useful utilities that we successfully apply in development. However, there are other collections such as ArrayMap, SparseArray, LruCache, ArraySet, CircularArray.

In this article, I will explain the internals of the ArrayMap and SparseArray, which are analogous to JDK HashMap.

You will learn

  • How ArrayMap and SparseArray work under the hood
  • About efficiency in memory and speed
  • How and when you should use them

One more map implementation. Why ?

Unlike big server Java, in Android we are significantly limited by Memory.

While server environments may have a lot of RAM, an average Android smartphone (as of 2020) has 2 to 6 GB of RAM to serve both the system, and running apps.

That is why the issue of optimization inevitably comes forward.

HashMap is an array with nodes that is re-created and enlarged when loadFactor is reached.

Each node is represented by a regular singly linked list, which, if there are collisions, reduces the read speed from O(1) to O (n).

(When a certain number of collisions is reached, the node will be represented as a TreeNode, which will change the read speed from O(n) to log(n).)

The absence of collisions lets us win in terms of speed. However, we’re not that lucky with memory consumption.

ArrayMap from AndroidSDK provides the same functionality as HashMap and consumes much less memory.

But how?

ArrayMap: Under the hood

Instead of a single large array with buckets as in HashMap, ArrayMap contains two small ones:

  • mHashes — stores the hash of keys in sorted order
  • mArray — stores keys and their values

They are linked — not obviously though by their positions:

By default, the size of mHashes is 4, and mArray is 8.

Now let’s take a closer look at the principles of ArrayMap work based on the examples. The understanding may not come to you at once. It’s fine. I add diagrams to each example — I hope they’ll help.

I create a user and put it in ArrayMap.

To show you all cases, I’ll use an object as the key whose hashCode is guaranteed to cause collisions:

Put user

My first put, or «If nothing is in here, nothing to look for»

In the ArrayMap::put method, the first thing we need to do is calculate the hash.

In HashMap, hashCode goes through an additional hashing procedure. Unlike that, in ArrayMap, we only have the usual key.hashCode()

Then the ArrayMap::indexOf() method is called, which returns the index (positive if there is a hash, and negative if there is no hash). The index is used for calculating the following parameters:

  • Position of hash in mHashes
  • Position of Key in mArray
  • Position of Value in mArray

In the case of put, this method also helps determine the type of operation.

Negative values are always converted to a positive integer using bitwise operations. (Otherwise, we would have to use an additional variable, which would entail extra memory allocation.)
You may read more about bit operations here.

Since we don’t have any keys, the default value is returned:

In our case, -1 will be returned. We convert it back to the positive index and then write the key and value to mArray.

Visually, it looks like this:

Put value to not empty ArrayMap

Now let’s add another user and see what happens.

We need to figure out which operation to perform: replace if this hashCode is already present in mHashes and the value keys match, or put if at least one condition is not met.
To speed up the search, the binary search is used.

In ArrayMap, it is performed as a separate method, and returns the position for the passed hash in mHashes.
(Negative position if the key is not found, and positive if found.)

Then, same as in the first example, the result is written.

Note that we've got the position that is already occupied.
The thing is that the hashCode value of Key(“Diane Nguyen”) (12) is less than the hashCode value of Key(“Bojack Horseman”) (15), and the sort order is very important for binary search.

Okay, but what about collisions?

Unfortunately, we cannot completely eliminate collisions, since, due to the 32-bit restrictions, it is not possible to generate a truly equivalent hashCode.

Let’s add a user whose key hashCode matches the one of the existing user:

As a result, in the indexOf() method, we get a position that refers to Key(Dyane Nguyen) in mArray.

It is obvious that the passed key is not equivalent to the found one, but this does not necessarily mean that the passed key is missing at all. (This happens when collisions already exist.)

To make sure that the operation type is chosen accurately and correctly, let’s do the linear search for the key in mValues.

It looks like this under the hood:

  • If the keys are not equal, we try to find the key after the calculated index.
  • If not found, we try to find the key before the calculated index.
  • If there is no result, we return a negative index of the position.

In our case, the result will not be found and a put operation will be performed. This is how it looks:

Array size increasing/decreasing and complexity

In HashMap, to save memory, loadFactor is used. There is no such thing in ArrayMap, as we only re-create mHashes and mArray when it is really necessary, i.e. when the limit is reached.
We copy the old values using the native method System.arrayCopy().

The new size is calculated using the following formula:

mHashes, when it is removed from ArrayMap, is also re-created and reduced, but only on condition that mSize < mHashes.length /3

If we are sure that the keys will have a fixed size and want to fill in the values as soon as the keys are created, it makes sense to specify the size of the ArrayMap in the constructor.

If we only need to do this in certain cases, the ArrayMap::ensureCapacity method will do.

If we talk about speed, then due to binary search, read/write operations will be represented as log(On) and O(n) if there are collisions.

HashMap vs ArrayMap: Memory Efficiency

Let’s move on to the most important thing — the numbers.

Conditions:

  • Key: Int
  • Value: data class User(val id: Int, val name: String)
  • Size: 1000
  • Stuff: Memory Profiler from Android Studio (Retained Size)
  • Device: Vsmart Live, Android 9.0

In this case ArrayMap wins by 26.3% in memory.

And this is not the final result. We can do better actually.

SparseArray

SparseArray is also a standard collection in the Android SDK. It works in almost the same way, with a few exceptions:

  • Primitive Int serves as the key
  • Available implementations: SparseArray<T>, SparseIntArray, SparseLongArray, SparseBooleanArray
  • No hashing Keys are directly linked with the value
  • If we use the id from User as the key, we’d get the following picture:

We can avoid boxing/unboxing operations when we use primitive data types as keys. This again significantly reduces memory consumption and allocation of new objects — each int (32 bits) passed as a key will be converted to Integer (128 bits).

Moreover, due to the lack of hashing, we do not have to store keys together with values, since it is assumed that there are no collisions in SparseArray.

Final metrics

Same conditions:

  • Key: Int
  • Value: data class User(val id: Int, val name: String)
  • Size: 1000
  • Stuff: Memory Profiler from Android Studio (Retained Size)
  • Device: Vsmart Live, Android 9.0

Result: Average of 100 read operations (random position on each, from 0 to 1000)

Result: Average of 100 read operations (random position on each, from 0 to 1000)

Conclusion

  1. Use ArrayMap instead of HashMap
  2. Use SparseArray if you use int as the key
  3. Use SparseIntArray / SparseLongArray / SparseBooleanArray if you use int as the key and Int/Long/Boolean as the values
  4. If possible, set a fixed size in the constructor or the ensureCapacity method

Surely if you have more than 1,000 or 2,000 elements, ArrayMap/SparseArray won’t be as efficient in terms of speed as HashMap. But it’s uncommon for modern mobile development.

Just remember that this is always a trade-off.

By sacrificing memory, we increase performance, and vice versa.

In our case, we can save 25–30% of memory.

This is why ArrayMap and SparseArray are used in many internal components of the Android SDK: Bundle, RecyclerView, MediaPlayer, VectorDrawableCompat, FragmentStateAdapter, etc.

--

--

Published in ProAndroidDev

The latest posts from Android Professionals and Google Developer Experts.

Written by Alexey Bykov

Android GDE, Senior Software Engineer at Reddit.

Responses (5)