All you need to know about ArrayMap & SparseArray
Android SDK offers a rich variety of useful utilities that we successfully apply in development. However, there are other collections such as ArrayMap
, SparseArray
, LruCache
, ArraySet
, CircularArray
.
In this article, I will explain the internals of the ArrayMap
and SparseArray
, which are analogous to JDK HashMap
.
You will learn
- How
ArrayMap
andSparseArray
work under the hood - About efficiency in memory and speed
- How and when you should use them
One more map implementation. Why ?
Unlike big server Java, in Android we are significantly limited by Memory.
While server environments may have a lot of RAM, an average Android smartphone (as of 2020) has 2 to 6 GB of RAM to serve both the system, and running apps.
That is why the issue of optimization inevitably comes forward.
HashMap
is an array with nodes that is re-created and enlarged when loadFactor
is reached.
Each node is represented by a regular singly linked list, which, if there are collisions, reduces the read speed from O(1)
to O (n)
.
(When a certain number of collisions is reached, the node will be represented as a TreeNode, which will change the read speed from O(n)
to log(n)
.)
The absence of collisions lets us win in terms of speed. However, we’re not that lucky with memory consumption.
ArrayMap
from AndroidSDK provides the same functionality as HashMap
and consumes much less memory.
But how?
ArrayMap: Under the hood
Instead of a single large array with buckets as in HashMap, ArrayMap contains two small ones:
- mHashes — stores the hash of keys in sorted order
- mArray — stores keys and their values
They are linked — not obviously though by their positions:
By default, the size of mHashes
is 4, and mArray
is 8.
Now let’s take a closer look at the principles of ArrayMap
work based on the examples. The understanding may not come to you at once. It’s fine. I add diagrams to each example — I hope they’ll help.
I create a user and put it in ArrayMap
.
To show you all cases, I’ll use an object as the key whose hashCode
is guaranteed to cause collisions:
My first put, or «If nothing is in here, nothing to look for»
In the ArrayMap::put
method, the first thing we need to do is calculate the hash
.
In HashMap
, hashCode
goes through an additional hashing procedure. Unlike that, in ArrayMap
, we only have the usual key.hashCode()
Then the ArrayMap::indexOf()
method is called, which returns the index (positive if there is a hash, and negative if there is no hash). The index is used for calculating the following parameters:
- Position of hash in mHashes
- Position of Key in mArray
- Position of Value in mArray
In the case of put
, this method also helps determine the type of operation.
Negative values are always converted to a positive integer using bitwise operations. (Otherwise, we would have to use an additional variable, which would entail extra memory allocation.)
You may read more about bit operations here.
Since we don’t have any keys, the default value is returned:
In our case, -1
will be returned. We convert it back to the positive index and then write the key and value to mArray
.
Visually, it looks like this:
Put value to not empty ArrayMap
Now let’s add another user and see what happens.
We need to figure out which operation to perform: replace if this hashCode is already present in mHashes and the value keys match, or put if at least one condition is not met.
To speed up the search, the binary search is used.
In ArrayMap, it is performed as a separate method, and returns the position for the passed hash in mHashes.
(Negative position if the key is not found, and positive if found.)
Then, same as in the first example, the result is written.
Note that we've got the position that is already occupied.
The thing is that the hashCode
value of Key(“Diane Nguyen”) (12)
is less than the hashCode
value of Key(“Bojack Horseman”) (15)
, and the sort order is very important for binary search.
Okay, but what about collisions?
Unfortunately, we cannot completely eliminate collisions, since, due to the 32-bit restrictions, it is not possible to generate a truly equivalent hashCode
.
Let’s add a user whose key hashCode
matches the one of the existing user:
As a result, in the indexOf()
method, we get a position that refers to Key(Dyane Nguyen)
in mArray.
It is obvious that the passed key is not equivalent to the found one, but this does not necessarily mean that the passed key is missing at all. (This happens when collisions already exist.)
To make sure that the operation type is chosen accurately and correctly, let’s do the linear search for the key in mValues.
It looks like this under the hood:
- If the keys are not equal, we try to find the key after the calculated index.
- If not found, we try to find the key before the calculated index.
- If there is no result, we return a negative index of the position.
In our case, the result will not be found and a put operation will be performed. This is how it looks:
Array size increasing/decreasing and complexity
In HashMap
, to save memory, loadFactor
is used. There is no such thing in ArrayMap
, as we only re-create mHashes
and mArray
when it is really necessary, i.e. when the limit is reached.
We copy the old values using the native method System.arrayCopy().
The new size is calculated using the following formula:
mHashes
, when it is removed from ArrayMap
, is also re-created and reduced, but only on condition that mSize < mHashes.length /3
If we are sure that the keys will have a fixed size and want to fill in the values as soon as the keys are created, it makes sense to specify the size of the ArrayMap
in the constructor.
If we only need to do this in certain cases, the ArrayMap::ensureCapacity
method will do.
If we talk about speed, then due to binary search, read/write operations will be represented as log(On)
and O(n)
if there are collisions.
HashMap vs ArrayMap: Memory Efficiency
Let’s move on to the most important thing — the numbers.
Conditions:
- Key: Int
- Value:
data class User(val id: Int, val name: String)
- Size: 1000
- Stuff: Memory Profiler from Android Studio (Retained Size)
- Device: Vsmart Live, Android 9.0
In this case ArrayMap wins by 26.3% in memory.
And this is not the final result. We can do better actually.
SparseArray
SparseArray is also a standard collection in the Android SDK. It works in almost the same way, with a few exceptions:
- Primitive Int serves as the key
- Available implementations:
SparseArray<T>
,SparseIntArray
,SparseLongArray
,SparseBooleanArray
- No hashing
Keys
are directly linked with the value - If we use the
id
fromUser
as the key, we’d get the following picture:
We can avoid boxing/unboxing
operations when we use primitive data types as keys. This again significantly reduces memory consumption and allocation of new objects — each int (32 bits) passed as a key will be converted to Integer (128 bits).
Moreover, due to the lack of hashing, we do not have to store keys together with values, since it is assumed that there are no collisions in SparseArray
.
Final metrics
Same conditions:
- Key: Int
- Value:
data class User(val id: Int, val name: String)
- Size: 1000
- Stuff: Memory Profiler from Android Studio (Retained Size)
- Device: Vsmart Live, Android 9.0
Result: Average of 100 read operations (random position on each, from 0 to 1000)
Conclusion
- Use
ArrayMap
instead ofHashMap
- Use
SparseArray
if you use int as the key - Use
SparseIntArray
/SparseLongArray
/SparseBooleanArray
if you use int as the key andInt/Long/Boolean
as the values - If possible, set a fixed size in the constructor or the
ensureCapacity
method
Surely if you have more than 1,000 or 2,000 elements, ArrayMap/SparseArray
won’t be as efficient in terms of speed as HashMap
. But it’s uncommon for modern mobile development.
Just remember that this is always a trade-off.
By sacrificing memory, we increase performance, and vice versa.
In our case, we can save 25–30% of memory.
This is why ArrayMap
and SparseArray
are used in many internal components of the Android SDK: Bundle
, RecyclerView
, MediaPlayer
, VectorDrawableCompat
, FragmentStateAdapter
, etc.