Kotlin extension functions: more than sugar
Kotlin’s extension functions have been around for a while, but sometimes it’s helpful to take a step back and look more deeply at core features of our favorite languages.
Extension functions are often described as being similar to static “Utils” classes, a common staple of many Java codebases. Under the hood this is exactly how they work. Imagine writing a Java static method StringUtils.emojify(string: String). In Kotlin, an equivalent extension function could be written as String.emojify().
To use them we would write:
StringUtils.emojify(":meow-party:")
Or
":meow-party:".emojify()
And get the same result either way: the :meow-party: emoji.
Silly example aside, what’s happening is that the compiler is rewriting our extension function to a static function that also takes the instance being operated on as a parameter. Inside the function body, this refers to that parameter. For the curious, the decompiled code looks a bit like the following:
static final String emojify(String $this)
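Here is a minimal sketch of that equivalence. The lookup table and the name emojifyStatic are assumptions invented for illustration, since the real emojify() implementation isn’t shown here.

```kotlin
// Hypothetical lookup table, made up for this sketch.
private val emojiByShortcode = mapOf(":meow-party:" to "😺🎉")

// Extension function: `this` is the String the function is called on.
fun String.emojify(): String = emojiByShortcode[this] ?: this

// Roughly the shape the compiler produces: an ordinary function
// that takes the receiver as an explicit parameter.
fun emojifyStatic(receiver: String): String = emojiByShortcode[receiver] ?: receiver
```

Both functions do the same work. Only the call site differs: ":meow-party:".emojify() versus emojifyStatic(":meow-party:").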
All this is pretty neat and it already speaks to some really nice value provided by extension functions. Yet this is often when some folks (fairly) begin to question if extension functions are really worth it. It seems like this change is just cosmetic. It looks like all we’ve done is rearrange the order that the code reads. I’ve certainly heard a number of developers describe extension functions as “syntactic sugar”. While this is a fair question to ask, extension functions are more valuable than they seem.
Improving the readability of code is valuable in itself, but extension functions provide more benefit than that. Depending on how they are used, extension functions aid in encapsulation and information hiding. They improve expressiveness and give us private, context-specific perspectives on types.
Expressiveness
To make these benefits clear let’s look at some examples.
Say we have a Person model and we’re trying to calculate the discount they should get for some product or service.
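A plausible shape for that model is sketched below. The exact fields are an assumption: all we know from the rest of the article is that age and income are readable and that name is private.

```kotlin
// Assumed model: age and income are public, name is private.
data class Person(
    private val name: String,
    val age: Int,
    val income: Int
)
```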
To do that, we have a calculateDiscount(person) function.
To calculate the discount, let’s imagine we need to give different discounts to people over 65 and people under 13, and no discount to those who make over $1 million. One approach we could take is to write all that logic in the calculateDiscount() function.
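That first approach might look something like this. The discount percentages and the order of the branches are made-up values for illustration, and Person is repeated so the snippet stands alone.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int)

// All the classification logic lives inline in the when expression.
// Percentages are hypothetical; the income check comes first so that
// it wins over the age-based discounts.
fun calculateDiscount(person: Person): Int = when {
    person.income > 1_000_000 -> 0
    person.age > 65 -> 15
    person.age < 13 -> 10
    else -> 0
}
```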
This is fine, but it’s not particularly great for readability. As our business logic grows, you can probably imagine this function becoming harder and harder to read. We’re also doing more than one thing here (a violation of the Single Responsibility Principle). The right side of the when expression is mapping discount amounts to certain classifications of Person. The left side of the when expression is evaluating if the Person instance meets those classifications.
It would be nicer if we could extract each of the boolean expressions on the left. Each of these expressions, such as person.age > 65, is a piece of logic by itself. We could express it better by giving it a name, and we could test it very easily if it weren’t evaluated inline.
One option would be to move these expressions into the Person class as functions.
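A sketch of that option follows. The names over65() and isRich() come from later in the article; under13() is an assumed name chosen to match them.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int) {
    // Each boolean expression now has a name and can be tested on its own.
    fun over65(): Boolean = age > 65
    fun under13(): Boolean = age < 13
    fun isRich(): Boolean = income > 1_000_000
}
```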
We’ll call these our “classification functions” for the rest of the article since they are classifying aspects of a person.
This brings a nice improvement in the readability of the when expression in calculateDiscount(...).
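With the classification functions on Person, the when expression could read roughly as follows. As before, the percentages and under13() are assumptions for the sake of the sketch.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int) {
    fun over65(): Boolean = age > 65
    fun under13(): Boolean = age < 13
    fun isRich(): Boolean = income > 1_000_000
}

// The when expression now only maps classifications to discounts.
fun calculateDiscount(person: Person): Int = when {
    person.isRich() -> 0
    person.over65() -> 15
    person.under13() -> 10
    else -> 0
}
```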
This is far more readable, and our business logic becomes much clearer. Now all we’re doing in the when expression is mapping the discount amounts to the classifications. Additionally, it would be pretty easy to write a unit test for each function.
The problem here is that we’ve polluted our Person model with context-specific business logic. If you imagine we have another usage of the Person class, say for applying coupons, it may not need to know the age of the Person. Or worse, it may have different rules like “under 10 years old” or “over 70”. If we add more rules to our Person model, we’re quickly building up an unwieldy amount of logic from all over our application. We’ve also exposed the Person instance’s private property name to our classification functions. But never fear, extension functions are here.
Context-specific perspectives
In the intro of the article I mentioned context-specific perspectives. The idea here is that code that uses a Person model can have its own understanding of a Person. A DiscountCalculator can see a Person model differently than a CouponEligibility class. To better understand this, let’s add some private extension functions to the file where the DiscountCalculator class is defined.
Note: It’s important we don’t add them to DiscountCalculator itself, and we’ll see why in the next section.
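A sketch of what that file might contain is below. Person would normally live in its own file and is only repeated here so the snippet stands alone; under13() and the percentages remain assumptions.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int)

// File-private extension functions: declared at the top level,
// outside DiscountCalculator, and visible only within this file.
private fun Person.over65(): Boolean = age > 65
private fun Person.under13(): Boolean = age < 13
private fun Person.isRich(): Boolean = income > 1_000_000

class DiscountCalculator {
    fun calculateDiscount(person: Person): Int = when {
        person.isRich() -> 0
        person.over65() -> 15
        person.under13() -> 10
        else -> 0
    }
}
```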
Once we’ve done this, our code for calculateDiscount() looks exactly the same as it did when we had these functions in the Person class.
And now our Person class returns to its original code that just defines the data.
Now other classes like our CouponEligibility class don’t know about these extension functions. CouponEligibility can even have a different meaning for isRich(). This is especially beneficial. We’ve kept the two classes from becoming coupled to each other through shared calls to functions on the Person class.
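For instance, the file that defines CouponEligibility could carry its own private isRich(). The threshold and the isEligible() function below are hypothetical, invented only to show the idea.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int)

// Imagine this lives in the CouponEligibility file. It has the same name
// as the DiscountCalculator version but a different, made-up threshold.
private fun Person.isRich(): Boolean = income > 250_000

class CouponEligibility {
    // Hypothetical rule: coupons only go to people this file doesn't consider rich.
    fun isEligible(person: Person): Boolean = !person.isRich()
}
```

Because each isRich() is private to its own file, neither class can see or depend on the other’s definition.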
Information hiding, encapsulation, and cohesion
At this point you might be wondering why we couldn’t do all this simply with private functions in our DiscountCalculator and CouponEligibility classes. The answer is we could, but it has a few drawbacks.
We could write our classification functions as private functions inside DiscountCalculator, each taking the Person as a parameter, and change calculateDiscount() to call them that way.
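A sketch of that variant, with the classification functions as private members and calculateDiscount() calling them. under13() and the percentages are the same assumptions as before.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int)

class DiscountCalculator {
    // Private member functions instead of extension functions.
    private fun over65(person: Person): Boolean = person.age > 65
    private fun under13(person: Person): Boolean = person.age < 13
    private fun isRich(person: Person): Boolean = person.income > 1_000_000

    fun calculateDiscount(person: Person): Int = when {
        isRich(person) -> 0
        over65(person) -> 15
        under13(person) -> 10
        else -> 0
    }
}
```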
I personally find this a bit less fluent and readable. But, if I am being honest, this is a pretty minor drawback that is definitely subjective. over65(person) is just about as easy to read as person.over65().
However, we’ve caused another, more serious drawback. By making these private member functions of DiscountCalculator, we’ve exposed all the internal state and private functions of DiscountCalculator to our classification functions. It now becomes possible to unintentionally create dependencies and side effects.
It would be pretty easy to write an isRich() that reads the calculator’s internal state and triggers a side effect along the way.
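As a purely hypothetical illustration of that risk, the member version of isRich() below depends on mutable state and updates a counter. The richThreshold and richCustomersSeen properties are invented for this sketch.

```kotlin
data class Person(private val name: String, val age: Int, val income: Int)

class DiscountCalculator {
    // Hypothetical internal state, invented for this sketch.
    private var richThreshold = 1_000_000
    var richCustomersSeen = 0
        private set

    // No longer pure: the result depends on richThreshold,
    // and every positive result mutates richCustomersSeen.
    private fun isRich(person: Person): Boolean {
        val rich = person.income > richThreshold
        if (rich) richCustomersSeen++
        return rich
    }

    fun calculateDiscount(person: Person): Int = if (isRich(person)) 0 else 10
}
```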
I like to call this danger “reaching out sideways” and it can have painful consequences in terms of testing, compartmentalization, modularity, and refactoring. isRich() can no longer be tested in isolation. It has inputs that require setting up internal state, and outputs that require monitoring side effects. Another way to say this is that isRich() is no longer a pure function.
If we go back to having private extension functions outside of the DiscountCalculator but in the same file, these functions can no longer read internal properties or call private functions inside DiscountCalculator.
These functions don’t have access to private properties or functions on the Person instance either, like name.
Let’s take stock of the benefits of our extension function approach.
- DiscountCalculator can see the classification functions like isRich()
- isRich() can’t see the internals of DiscountCalculator
- isRich() can’t see the internals of Person
- isRich() is an easily testable, pure function
- CouponEligibility can have a different meaning for isRich()
- We’ve kept DiscountCalculator and CouponEligibility more decoupled by avoiding shared function calls
- calculateDiscount() is easy to read and expressive
- calculateDiscount() is only doing one thing: mapping discounts to classifications
- The business logic of classification, such as isRich(), is grouped near where it is used in DiscountCalculator (functional cohesion)
These are all good things that go beyond syntactic sugar. While many of the benefits can individually be achieved through other patterns, having all these benefits at once is pretty dope.
At this point, however, you may be asking yourself if you couldn’t just do all this with private top-level functions in the same file as the DiscountCalculator class. If you are asking yourself that, you’re right! That would give you all the same benefits above. But you would be missing out on one important benefit we haven’t mentioned yet.
Discoverability
The one advantage truly unique to extension functions is that when someone is working with the receiver type, the IDE will automatically suggest the functions appropriate for that type and your current scope.
Of course, discoverability is an interesting benefit to consider. In small, focused, single-purpose classes, there isn’t as strong a benefit. But for a type that is reused widely, discoverability becomes extremely important. This can be helpful for publishing libraries or reusing things like your team’s emojify() function (the importance of :meow-party: we saw earlier).
By grouping all of their functions together as extension functions, library writers keep the base types pure while adding their library-focused functionality in a discoverable way.
So while many of our touted benefits came from how we set up our extension functions (private, in the same file, outside the class), they do offer some truly unique benefits by adding functionality to types in a discoverable way. One need only look at the stdlib to see how JetBrains themselves view this pattern. A large share of its functions are defined as extension functions.
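As a small example of the stdlib doing this, filter and sum below are not members of List. The standard library declares them as extension functions on Iterable, which is why the IDE offers them on any list.

```kotlin
// Both calls resolve to stdlib extension functions on Iterable.
val evens = listOf(1, 2, 3, 4).filter { it % 2 == 0 }
val total = evens.sum()
```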
If you enjoy cats and Kotlin you can find me on the twitters: https://twitter.com/alostpacket