Designing a Kotlin memory safe mode

In my previous article, I shared my frustration about how the Kotlin/Native memory model is enforced at run-time, and proposed the beginning of a language-level solution: having const in the type system.

Designing a language, even theoretically, is a very interesting challenge, so let’s put on our fake language designer hat, and see how we could offer a solution to the Kotlin/Native memory model.

Warning: this is a long article, describing a language feature that does not exist! I wish it did, though. This article describes the reasoning behind the proposal.

I am going to draw some comparisons with the Rust language, especially how Rust handles ownership, lifetime, and concurrency. You don’t have to read these links to understand this article, but if you want to dig deeper into these concepts, I highly recommend you do. Rust is an amazing language, with amazing documentation.

When we want to extend an existing language, we first have to understand the framework of rules that governs this language.

Rust has a very strict framework of rules: if it is not explicitly allowed, then it is forbidden. Objects are immutable by default unless specified mutable; you cannot pass pointers around without explicitly handling ownership; you cannot have multiple mutable references; etc.

Java on the other hand has a very “relaxed” framework of rules. If it is not explicitly forbidden, then it’s allowed. Everything is mutable, every class can be extended unless explicitly final, generic parameters are optional, etc.

Kotlin lies somewhere between: its designers keep saying that it is first and foremost pragmatic, which leads to some decisions that may look contradictory:

Classes are final by default, but overrides are final only if explicit.
Objects are mutable but collection interfaces are first immutable, then mutable (MutableList extends List).
Nullability must be handled explicitly but all exceptions are implicit.
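These three contrasts can be observed in today's Kotlin. Here is a small sketch; the Shape, Circle and UnitCircle classes and the helper functions are names I made up for illustration.

```kotlin
// Classes are final by default: 'open' is required to allow subclassing.
open class Shape {
    open fun name(): String = "shape"
}

open class Circle : Shape() {
    // This override stays open for further overriding unless marked 'final'.
    override fun name(): String = "circle"
}

class UnitCircle : Circle() {
    override fun name(): String = "unit circle"
}

// List is only a read-only view: the object behind it may still change.
fun readOnlyViewSize(): Int {
    val mutable: MutableList<Int> = mutableListOf(1, 2)
    val readOnly: List<Int> = mutable   // allowed because MutableList extends List
    mutable.add(3)                      // the read-only view observes the change
    return readOnly.size
}

// Nullability is part of the type and must be handled...
fun parsedLength(text: String?): Int = text?.length ?: 0

// ...but nothing in this signature says that it can throw NumberFormatException.
fun mayThrow(text: String): Int = text.toInt()
```

Note that readOnlyViewSize() shows why a read-only List is not the same thing as an immutable one, which matters for everything that follows.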

So, I think that, when thinking of extending the Kotlin language, we must remember the following “pragmatic” framework of rules:

Constraints must not get “in the way” of programmers. If the constraint is “worked around” more times than it is used, then it should not be enforced.
Code must be concise and explicit, meaning that the default behavior should be the more frequent one, but any change in behavior must be explicitly defined and readable.

These two rules are the reason Kotlin enforces its most infamous rule: “everything is public by default”.

Finally, of course:

New additions to the language must be backward compatible, not suddenly changing the semantics of existing Kotlin code.

Let’s start.

First, from the famous Kotlin/Native memory model rule “a datum is either mutable or shared”, we can infer the exact opposite rule that describes the exact same concept: “a datum is either thread local or immutable”.

Kotlin/Native provides a freeze API that renders an object (and its entire subgraph) immutable. I explained in my previous article why I think enforcing immutability at run-time is a very bad idea. So let’s introduce the const keyword in Kotlin.

C++ has an interesting approach to this: a type annotation. In Kotlin, it would look like this:

fun description(foo: const Foo): String = TODO()

The description function states that it will not mutate the foo object; it will only read values from it. To enforce this, the compiler would only allow the function to access const val getters and const functions, something that in Kotlin would look like:

class Foo {
    const val answer = 42
    const fun getName(): String = TODO()
}

Here, the getName method is essentially pure: the compiler will ensure that it does not mutate the object (it has no side effect on it).

This const type annotation is a very bad idea in Kotlin, for a lot of reasons:

It would mean going over the entire standard library code-base to add the const keyword to every pure function such as map, filter, etc.
There would be const all over the place. The vast majority of functions and methods do not mutate data. Sure enough, the const keyword in C++ is everywhere, impairing readability.

So why not make it the opposite? Say that everything is immutable unless explicitly defined mutable. Rust does that with the mut keyword. Unfortunately, while this is very appealing to programmers who love constraints (a group with which I proudly associate myself), we are not going to change the semantics of the entire existing Kotlin code-base. This would require everybody to go over all the Kotlin code that was ever written in order to add the mut keyword everywhere it is needed.

Not going to happen.

Here is something that could happen, though: a const class annotation. Something like this:

const class User(val firstName: String, val lastName: String) {
    fun fullName() = "$firstName $lastName"
}

At compile-time, the compiler ensures that const classes contain no var properties and only primitive or const values.
At run-time, in Kotlin/Native, all instances of const classes, and their sub-trees, are frozen by default.

This eliminates the need for the freeze API. Objects are either const and frozen, or they’re not.
Some will argue that this removes the ability to create and configure an object, then freeze it later. I believe this is a huge code smell (you cannot know when the object has been frozen and mutating it has become forbidden), and would redirect you to the builder pattern. Also: most garbage collectors have become experts at handling short-lived objects.
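To make the builder argument concrete, here is a minimal sketch in today's Kotlin. ServerConfig, ServerConfigBuilder and serverConfig are hypothetical names of mine: all mutation happens in the builder, and the built object never changes afterwards.

```kotlin
// Immutable result: only 'val' properties holding immutable values.
data class ServerConfig(val host: String, val port: Int, val tags: List<String>)

// Hypothetical builder: every mutation happens here, before the final object exists.
class ServerConfigBuilder {
    var host: String = "localhost"
    var port: Int = 8080
    private val tags = mutableListOf<String>()

    fun tag(value: String) {
        tags += value
    }

    // toList() copies, so later changes to the builder cannot reach the built object.
    fun build(): ServerConfig = ServerConfig(host, port, tags.toList())
}

// Small DSL entry point: configure, then get an object that is complete from the start.
fun serverConfig(configure: ServerConfigBuilder.() -> Unit): ServerConfig =
    ServerConfigBuilder().apply(configure).build()
```

With this shape there is no "configured but not yet frozen" state to reason about: the short-lived builder is the only mutable object.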

Yes, I’m aware this would still need some standard library changes: standard classes such as String would need to be annotated with const.

The next question is about polymorphism. Can we extend const classes? Are const interfaces allowed? If so, are all of their implementations required to be const?
These are very relevant questions, but irrelevant in the scope of this article. They do not need to be answered for this demonstration.
Also, I don’t know ;)

Side benefit: const data classes could have their toString() and hashCode() value generated at initialization and cached, which would greatly improve the performance of using data classes as Map keys. My last benchmark showed that caching hash code speeds up Map lookup by a factor of 3.
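This is roughly what the compiler could generate for a const data class, written by hand in today's Kotlin. CachedUser is a made-up name, and the sketch says nothing about the actual speed-up.

```kotlin
class CachedUser(val firstName: String, val lastName: String) {
    // Computed once at construction. This is only correct because both
    // properties are immutable, which a const class would guarantee.
    private val cachedHash: Int = 31 * firstName.hashCode() + lastName.hashCode()

    override fun hashCode(): Int = cachedHash

    override fun equals(other: Any?): Boolean =
        other is CachedUser &&
            firstName == other.firstName &&
            lastName == other.lastName
}
```

Every Map lookup then reads a field instead of re-hashing all the properties.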

Let’s go back to our primary rule: “a datum is either thread local or immutable”. This means that the compiler should not allow non-const global values. This is the part where existing Kotlin code becomes invalid.
Fortunately, Kotlin/Native is still in its early days, so I believe it would be acceptable to introduce a compiler safe mode that would be optional when using Kotlin/JVM or Kotlin/JS, but mandatory in any multi-platform project containing Kotlin/Native targets.

This “safe mode” would simply forbid mutable shared global state (but allow thread-local root variables):

class Foo { /*...*/ }
const class Bar { /*...*/ }

val foo = Foo() // Forbidden, Foo is not const
var bar = Bar() // Forbidden, global vars are not allowed

val global = Bar() // Allowed
@ThreadLocal
var local = Foo() // Allowed (because thread-local)

fun main() { TODO() }
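To illustrate why a thread-local root variable is harmless, here is a JVM-only approximation using java.lang.ThreadLocal (the @ThreadLocal annotation above is the Kotlin/Native one). The names localCounter and incrementElsewhereThenRead are mine.

```kotlin
import kotlin.concurrent.thread

// JVM stand-in for a @ThreadLocal root variable: each thread lazily gets its own cell.
val localCounter: ThreadLocal<IntArray> = ThreadLocal.withInitial { IntArray(1) }

// Mutates the worker thread's copy, then returns what the calling thread sees.
fun incrementElsewhereThenRead(): Int {
    val worker = thread { localCounter.get()[0] += 10 }
    worker.join()
    // The calling thread's copy was never touched by the worker.
    return localCounter.get()[0]
}
```

The worker's mutation is invisible to the caller, so no mutable state is actually shared.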

“Don’t communicate by sharing memory, share memory by communicating”.
— The Go language designers.

This has become a motto for a lot of modern programming languages and programmers. At some point, threads need to communicate information between them, and can do so either by mutating shared data (which is bad, that’s the entire point) or by sending data from one to another.

Rust has the notion of ownership & lifetime, which ensures that only one variable owns the data it contains and that the lifetime of the variable ends when it loses ownership of its value. This makes passing mutable objects between threads possible and safe.
These notions cannot make it into the Kotlin language: it’s way too late.

However, we have introduced the notion of const classes. Instances of const classes are frozen from creation, so they can be shared between threads. All we need is a way for a function that handles thread message passing to declare that it only accepts const values as parameters.
Generics can help us here:

const class ConstChannel<T : const Any> { /*...*/ }

fun <T : const Any> newConstChannel(): ConstChannel<T> = TODO()

The ConstChannel is a const class, so it can be safely shared between threads. It allows sending and receiving values of type T, which must themselves be const.
How can ConstChannel be a const class, and yet handle inter-thread message passing internals (which definitely need mutability)? Keep reading!
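Since const does not exist, the closest approximation I can write in today's Kotlin/JVM uses a marker interface. In this sketch, Const and GameEvent are hypothetical names, nothing verifies the immutability promise, and java.util.concurrent.LinkedBlockingQueue plays the role of the hidden mutable internals.

```kotlin
import java.util.concurrent.LinkedBlockingQueue
import kotlin.concurrent.thread

// Hypothetical marker standing in for the proposed 'const' class modifier.
// Nothing checks it: implementers merely promise to be deeply immutable.
interface Const

data class GameEvent(val level: Int) : Const

// Only values whose type carries the marker can travel through the channel.
class ConstChannel<T : Const> {
    // The mutable internals stay private; a thread-safe queue does the real work.
    private val queue = LinkedBlockingQueue<T>()

    fun send(value: T) = queue.put(value)

    fun receive(): T = queue.take()
}

// Sends three events from a worker thread and sums them on the calling thread.
fun sumLevelsAcrossThreads(): Int {
    val channel = ConstChannel<GameEvent>()
    val producer = thread {
        for (level in 1..3) channel.send(GameEvent(level))
    }
    var total = 0
    repeat(3) { total += channel.receive().level }
    producer.join()
    return total
}
```

The difference with the proposal is that here the compiler trusts the marker blindly, whereas a real const class would be verified.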

In Kotlin, lambda functions can capture outer scope values and variables. In essence, they are really closures and should not be called lambdas.
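A tiny example in today's Kotlin shows that the lambda captures the variable itself and not a copy of its value (captureDemo is just an illustrative name):

```kotlin
fun captureDemo(): Int {
    var counter = 1
    // The lambda closes over the variable 'counter', not over the value 1.
    val read = { counter }
    counter = 42
    // The lambda observes the mutation made after it was created.
    return read()
}
```

If that lambda ran on another thread, the mutation and the read would be a data race.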

What if we are using the Executor Pattern (or the Kotlin/Native worker API) and want to schedule a lambda to run on an executor? There is a high probability that the code will be executed on another thread, which means that this lambda must not be able to capture non-const values.

Here is how you would declare the executor class:

const class ConstExecutor {
    fun schedule(block: const () -> Unit) = TODO()
}

Here, block is a const lambda, meaning that it can only capture const values. The following example illustrates its usage:

const data class GameState(val level: Int)

val executor = ConstExecutor()

fun save(level: Int) {
    val state = GameState(level)
    executor.schedule { writeToFile(state) }
}

Because state is of type GameState, and because GameState is a const class, the lambda can capture it.
By contrast this code would not compile:

fun save(level: Int) {
    var saveLevel = level
    if (level > 10)
        saveLevel -= 1 // Games become hard and frustrating!
    executor.schedule {
        writeToFile(GameState(saveLevel))
        //                    ^^^^^^^^^
        // Cannot capture a mutable value!
    }
}

How can ConstExecutor be a const class, and yet handle inter-thread scheduling internals (which definitely need mutability)? Keep reading!

We need a way to screw up. Or more specifically, we need a way to go around these limitations. Something like the !! operator, where we can say to the compiler: “you can’t guarantee this value’s nullability but I can, trust me; I accept the possibility of a run-time crash if I’m mistaken”.

Kotlin/Native already does that with the Atomic* classes, which are classes that allow mutation while frozen.

We have defined the possibility of a const lambda, which can only capture const values, but there’s nothing preventing such lambdas from returning a non-const value. Consider the following definition:

const class Attachable<T>(creator: const () -> T) {
    fun <R : const Any?> attach(block: const (T) -> R): R = TODO()
}

Let’s break this down.

Attachable is a const class, so it can be shared between threads.
The creator constructor parameter is a const lambda, which ensures that it returns either a captured const value (safe to share) or newly created mutable data with no other references (since the lambda cannot capture mutable data), which can safely be detached from and reattached to threads.
The attach function executes the block const lambda attaching the mutable data to the current thread for the duration of the lambda’s execution.
Because block is a const lambda with a const return type, references to the mutable data cannot escape or “leak” outside of the lambda.
If multiple threads attach at the same time, it will crash at run-time in Kotlin/Native but not in Kotlin/JVM (where race conditions may occur instead).
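Here is what a JVM-only sketch of this class could look like in today's Kotlin. It is an assumption-laden approximation: AttachableSketch is my own name, none of the const guarantees are checked, and I chose to fail fast on a concurrent attach (mimicking the Kotlin/Native behaviour described above) rather than let a race happen.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

class AttachableSketch<T>(creator: () -> T) {
    // Created once, never handed out except inside attach().
    private val value: T = creator()

    // True while some caller is inside attach().
    private val attached = AtomicBoolean(false)

    fun <R> attach(block: (T) -> R): R {
        // Throws IllegalStateException if another attach is already in progress.
        check(attached.compareAndSet(false, true)) { "Already attached" }
        try {
            return block(value)
        } finally {
            // Always detach, even if the block throws.
            attached.set(false)
        }
    }
}
```

Note that this simple flag also rejects a nested attach from the same thread, and that nothing stops block from leaking the value: that is exactly the part only the compiler could enforce.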

So, why oh why? Why get a headache designing a beautiful safe system only to provide the baseball bat to destroy it all? We wanted to ensure no crash at run-time, we wanted to ensure similar semantics whatever the target, and we destroyed these two dreams in 3 lines of code.

We’ve put everything we wanted to avoid in a single class. In essence, we’ve confined unsafety into Attachable. This leads to the following point: do not use Attachable if you’re an application developer. The Attachable class exists only for library developers to provide application developers higher level & safer tools such as ConstChannel & ConstExecutor.

Rust does something very similar with the Sync & Send traits: implementing them by hand requires unsafe code. Use Attachable only if you’re designing a higher-level tool that properly handles safety. Remember that when using it, you’re in unsafe territory: here be dragons.

With this Attachable class, it becomes trivial to create a multi-platform mutex that uses ReentrantLock in Kotlin/JVM, and pthread_mutex in Kotlin/Native, which leads to the possibility of deadlocks.
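On the JVM side, such a mutex-guarded holder could be sketched as follows with ReentrantLock and the standard kotlin.concurrent.withLock extension. Guarded, locked and countWithFourThreads are hypothetical names, and the Kotlin/Native pthread_mutex counterpart is not shown.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.thread
import kotlin.concurrent.withLock

class Guarded<T>(creator: () -> T) {
    private val lock = ReentrantLock()
    private val value: T = creator()

    // Runs the block with exclusive access to the guarded value.
    fun <R> locked(block: (T) -> R): R = lock.withLock { block(value) }
}

// Four threads increment the same counter; the lock serializes every increment.
fun countWithFourThreads(): Int {
    val counter = Guarded { IntArray(1) }
    val workers = List(4) {
        thread {
            repeat(1_000) { counter.locked { cell -> cell[0]++ } }
        }
    }
    workers.forEach { it.join() }
    return counter.locked { cell -> cell[0] }
}
```

Instead of crashing when two threads want the value at the same time, the second one waits, which is where deadlocks become possible.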

Mutexes are not a bad thing, they are often way faster than inter-thread message passing, and are needed for low-level, high performance stuff that, for example, Kotlin/Native embedded developers might need to write.
Most deadlocks are easy to debug. Just pause the app in your debugger and see which threads are waiting on which mutexes.
Race conditions, on the other hand, are very hard to find and correctly debug, because you cannot pause when they happen. You cannot reliably detect them.
This is why I am perfectly OK with a framework of programming rules that allows for deadlock, but makes race conditions impossible.

I sincerely hope the Kotlin language & the Kotlin/Native compiler can evolve to a safer, better framework of rules that:

Encourages safe patterns & practices.
Enforces constraints at compile time.
Allows unsafe code if needed.

All this const thing won’t make it into the Kotlin language (I work neither at JetBrains nor on Kotlin compilers), but I do hope this can be an example of a way to “fix” the current situation.
Both the Kotlin Multiplatform & the Kotlin/Native stories can be improved, and we, as a community, may be the spark that creates the discussion, that creates the movement, that leads to the solution.

Designing a Kotlin memory safe mode - ITNEXT
