Shutdown order consistency: how Rust helps

Some Java code with bugs

Here’s my main method (in Java). Can you guess the bug?

Db db = new Db();
Monitoring monitoring = new Monitoring();
Monitoring mon2 = new Monitoring();
Billing billing = new Billing(db, monitoring);
monitoring.setDb(db);

runMainLoop(billing, mon2);

db.stop();
billing.stop();
monitoring.stop();

If you would like to hunt down the 2 bugs manually, try reading the full code here: ShutdownOrder.java

But maybe you have an idea already? Maybe you’ve seen code like this before? If you have, you probably have an instinct that there’s some kind of bug, even if you can’t say for sure what it is. Code like this almost always has bugs!

This code compiles fine, but it contains two bugs.

First, we forgot to setDb() on mon2. This causes a NullPointerException, because Monitoring expects always to have a working Db.

Second, and in general harder to spot, we shut down our services in the wrong order. It turns out that Monitoring uses its Db during shutdown, so we get an exception. Even worse, if some other code needed to run after monitoring.stop(), it won’t, because the exception prevents us getting any further.

Of course, this is toy code, but this kind of problem is common (and much harder to spot) in real-life code. In fact, my team dealt with a similar bug this week.

It’s fundamentally hard to figure out your shutdown order. It’s complicated further if classes have start() methods too, which I have seen in lots of Java code.

Given that this is just a hard problem, maybe there’s no point looking for tools to make it easier?

Some Rust code without those bugs

Let’s try writing this code in Rust. Here’s the main method:

let db = Db::new();
let monitoring = Monitoring::new(&db);
let mon2 = Monitoring::new(&db);
let billing = Billing::new(&db, &monitoring);

run_main_loop(&billing, &mon2);

// drop() is called automatically on all objects here

Here’s the full code: shutdown_order.rs

This code shuts down all the services automatically at the end, and any mistakes we make in the order are compile errors, not things we find later when our code is running.

The code to shut down each service looks like this:

impl Drop for Monitoring<'_> {
    fn drop(&mut self) {
        // [Disconnect from monitoring API]
        self.db.add_record("MonitorShutDown");
    }
}

This is us implementing the Drop trait for the struct Monitoring (traits are a bit like Java Interfaces). The Drop trait is special: it indicates what to do when an instance of this struct is dropped. In Rust, this is guaranteed to happen when the instance goes out of scope, which is why our comment at the end of the main method sounds so confident.

Furthermore, Rust’s compiler shuts down everything in the reverse order in which it was created, and guarantees that nothing gets used after it has been dropped.

Rust’s lovely world gives us two relevant treats: no unexpected nulls, and lifetimes.

Treat number 1: no unexpected nulls

First, in Rust, like in other modern languages like Kotlin, we have to be explicit about items that could be missing. In our example, we were able to re-arrange the code so that db can never be missing (or null), and the compiler encouraged us to do so. If we really needed it to be missing some of the time, we could have used the Option type, and the compiler would have forced us to handle the case when it was missing, instead of unexpectedly getting a NullPointerException like we did in Java. (In fact, if we’d structured our code to use final in as many places as possible, we could have been encouraged towards basically the same solution in Java too.)

Treat number 2: lifetimes

Second, if you look a bit more closely at the full code of shutdown_order.rs you’ll see lots of confusing-looking annotations like <'a> and &'a:

struct Monitoring<'a> {
    db: &'a Db,
}

The approximate meaning of those annotations is: a Monitoring holds a reference to a Db, and that Db must last longer than the Monitoring.

This “lasts longer than” wording is what Rust Lifetimes are for. Lifetimes are a way of saying how long something lasts.

Lifetimes are really confusing when you start with Rust, and have caused me a lot of pain. Code like this is where they are both most painful and most helpful. As I mentioned earlier, the problem of shutdown order is fundamentally hard. Rust gives you that pain at the beginning, and until you understand what’s going on, the pain is very confusing and acute. But, once your code compiles, it is correct, at least as far as problems like this are concerned.

I love the sense of security it gives me to write Rust code and know the compiler has checked my code for this kind of problem, meaning it can’t crop up at 3am on Christmas Day…

Final note/caveat

This Rust code is probably over-simplified, because all the references are immutable (you can’t change the objects they point to). In practice, we may well have mutable references, and if we do we’re going have to deal with the further difficulty that Rust won’t allow two different objects to hold references to an object if any of those references are mutable. So it would object to Billing and Monitoring using the Db object at the same time. We’d need to make it immutable (as we have here), or find a different way of structuring the code: for example, we could hold the Db instance only within the run_main_loop code, and pass it in temporarily to the Billing and Monitoring objects when we called their methods. A large part of the art, fun and pain of learning Rust is finding new patterns for your code that do what you need to do and also keep the compiler happy. When you manage it, you get amazing benefits!

Profile a Java unit test (very quickly, with no external tools)

I have a unit test that is running slowly, and I want a quick view of what is happening.

I can get a nice overview of where the code spends its time by adding this to the JVM arguments:

-agentlib:hprof=cpu=samples,lineno=y,depth=3,file=hprof.samples.txt

and running the test as normal.

Now I can look at the file that was created, hprof.samples.txt, and looking at the bottom section I can see how much time is spent in each method.

This worked for me within IntelliJ IDEA community edition by clicking “Run” then “Edit Configurations” and adding the above code to “VM options” for my test.

It should also work in Gradle by editing gradle.properties and adding something like this:

org.gradle.jvmargs=-agentlib:hprof=cpu=samples,lineno=y,depth=3,file=hprof.samples.txt

and should also work in Maven. In fact, I found this information in this stackoverflow question: How do you run maven unit tests with hprof?.

Example Android project with repeatable tests running inside an emulator

I’ve spent the last couple of days fighting the Android command line to set up a simple project that can run automated tests inside an emulator reliably and repeatably.

To make the tests reliable and independent from anything else on my machine, I wanted to store the Android SDK and AVD files in a local directory.

To do this I had to define a lot of inter-related environment variables, and wrap the tools in scripts that ensure they run with the right flags and settings.

The end result of this work is here: gitlab.com/andybalaam/android-skeleton

You need all the utility scripts included in that repo for it to work, but some highlights include:

The environment variables that I source in every script, scripts/paths:

PROJECT_ROOT=$(dirname $(dirname $(realpath ${BASH_SOURCE[${#BASH_SOURCE[@]} - 1]})))
export ANDROID_SDK_ROOT="${PROJECT_ROOT}/android_sdk"
export ANDROID_SDK_HOME="${ANDROID_SDK_ROOT}"
export ANDROID_EMULATOR_HOME="${ANDROID_SDK_ROOT}/emulator-home"
export ANDROID_AVD_HOME="${ANDROID_EMULATOR_HOME}/avd"

Creation of a local.properties file that tells Gradle and Android Studio where the SDK is, by running something like this:

echo "# File created automatically - changes will be overwritten!" > local.properties
echo "sdk.dir=${ANDROID_SDK_ROOT}" >> local.properties

The wrapper scripts for Android tools e.g. scripts/sdkmanager:

#!/bin/bash

set -e
set -u

source scripts/paths

"${ANDROID_SDK_ROOT}/tools/bin/sdkmanager" \
    "--sdk_root=${ANDROID_SDK_ROOT}" \
    "$@"

The wrapper for avdmanager is particularly interesting since it seems we need to override where it thinks the tools directory is for it to work properly – scripts/avdmanager:

#!/bin/bash

set -e
set -u

source scripts/paths

# Set toolsdir to include "bin/" since avdmanager seems to go 2 dirs up
# from that to find the SDK root?
AVDMANAGER_OPTS="-Dcom.android.sdkmanager.toolsdir=${ANDROID_SDK_ROOT}/tools/bin/" \
    "${ANDROID_SDK_ROOT}/tools/bin/avdmanager" "$@"

An installation script that must be run once before using the project scripts/install-android-tools:

#!/bin/bash

set -e
set -u
set -x

source scripts/paths

mkdir -p "${ANDROID_SDK_ROOT}"
mkdir -p "${ANDROID_AVD_HOME}"
mkdir -p "${ANDROID_EMULATOR_HOME}"

# Download sdkmanager, avdmanager etc.
cd "${ANDROID_SDK_ROOT}"
test -f commandlinetools-*.zip || \
    wget -q 'https://dl.google.com/android/repository/commandlinetools-linux-6200805_latest.zip'
unzip -q -u commandlinetools-*.zip
cd ..

# Ask sdkmanager to update itself
./scripts/sdkmanager --update

# Install the emulator and tools
yes | ./scripts/sdkmanager --install 'emulator' 'platform-tools'

# Platforms
./scripts/sdkmanager --install 'platforms;android-21'
./scripts/sdkmanager --install 'platforms;android-29'

# Install system images for our oldest and newest supported API versions
yes | ./scripts/sdkmanager --install 'system-images;android-21;default;x86_64'
yes | ./scripts/sdkmanager --install 'system-images;android-29;default;x86_64'

# Create AVDs to run the system images
echo no | ./scripts/avdmanager -v \
    create avd \
    -f \
    -n "avd-21" \
    -k "system-images;android-21;default;x86_64" \
    -p ${ANDROID_SDK_ROOT}/avds/avd-21
echo no | ./scripts/avdmanager -v \
    create avd \
    -f \
    -n "avd-29" \
    -k "system-images;android-29;default;x86_64" \
    -p ${ANDROID_SDK_ROOT}/avds/avd-29

Please do contribute to the project if you know easier ways to do this stuff.

Dependency Injection frameworks: reasons to avoid them video

At my job we have done a great deal of work to remove Guice from our codebase. Here I try to explain why we did that, and try to apply my reasoning to dependency injection frameworks in general.

Slides: Dependency Injection frameworks: reasons to avoid them slides