Rust 101 – 23 Exercises for module B (q3)

Testing, benchmarking and optimising a small program that plays FizzBuzz.

Series: Language basics, More syntax, Traits and generics, Building applications, Concurrency and parallelism, Trait objects, Async, Unsafe

This section (Building applications): 18: Dependencies, 19: API design, 20: Tests, 21: Exercise B1, 22: Exercise B2, 23: Exercise B3

Links:

The course materials for this series are developed by tweede golf. You can find more information at github.com/tweedegolf/101-rs and you can sponsor the work at github.com/sponsors/tweedegolf. They are released under the Creative Commons Attribution Share Alike 4.0 International license.

This series of videos is copyright 2024 Andy Balaam and the tweede golf contributors and is released under the Creative Commons Attribution Share Alike 4.0 International license.

[Fixed in FF 123] Deleting an Indexed DB store can be incredibly slow on Firefox

Update: as confirmed in the bug I logged, this was fixed in Firefox 123!

See also: Keep your Indexed DB keys and values small if you want good performance! and Don’t store arrays of numbers in Indexed DB – use base64 instead.

We had performance problem in Element Web when upgrading the Indexed DB schema, and it turned out that in Firefox, deleting an object store can be incredibly slow. It can take tens of minutes or even hours.

(In Chromium the same operation can take tens of seconds, but it’s way, way faster.)

I was expecting this to be a near-instant operation, so this was a definite surprise to me.

Note that my analysis is based on widely-available browsers in February 2024, and will probably become out-of-date.

You can see the full graphs here: artificialworlds.net/indexed-db-perf/delete.html. Source code is at codeberg.org/andybalaam/indexed-db-perf.

Here’s what I learned:

Headline 1: Firefox can take a very long time to delete an object store

Graph showing that Firefox took 1.2 million milliseconds to delete an object store containing 200K records.

Firefox took 20 minutes to delete an object store containing 200K records, with one index.

In Chromium, I found that deleting a similar object store took 21 seconds, which is still slow, but rather more acceptable.

Consider not deleting stores you don’t need any more. If you must delete them, you will need to provide feedback to your users, especially if they are using Firefox.

Headline 2: Clearing the store before deleting it really helps(!)

Graph showing that Firefox takes 1.2 million ms to delete a large object store (200K records), or 300 thousand ms if you clear it before deleting it.

Graph showing that Chromium takes 20 thousand ms to delete a large object store (200K records), or 10 thousand ms if you clear it before deleting it.

On both Firefox and Chromium, running objectStore.clear() before upgrading the db and deleting the store made a significant improvement to the total time. On Chromium it more than halved the time to delete the store, and on Firefox (where the numbers were huge) it reduced the time by about 3 times!

Thanks to michael-p on Stack Overflow for giving me the idea.

Clear your store before deleting it.

Headline 3: With no indices, this is fine

Firefox deletes object stores fairly fast if there are no indices (and so does Chromium).

Graph showing that even for 200K records, Firefox can delete an object store in under 500ms if there is no index on it.

Even for 200K records, Firefox can delete an object store in under 500ms if there is no index on it.

If you need fast deletion, don’t use indices.

Observation: It’s not done until the DB is closed

In my timing experiments, I found that objectStore.delete() completed, but the operation was not really done. When I called the close() method on my IDBDatabase I had to wait a whole lot longer. (The close time is included in the measurements above.)

Even when I refreshed the browser, I found I had a long wait to open the Indexed DB database. After the wait, it worked fine and the schema update was complete.

Expect long close times after a deletion.

Don’t store normal arrays of numbers in Indexed DB – use UInt8Array instead

Following on from Keep your Indexed DB keys and values small if you want good performance!, here is another thing I’ve learned about Indexed DB performance (in July 2024):

Update: Thanks to richvdh, we now know UInt8Array is much better than base64!

If you have a long list of numbers to store, don’t put them in a JavaScript array – instead put them into a UInt8Array. Even encoding them to base64 strings is better than normal arrays.

Here are the numbers:

Graph showing that arrays of numbers are much slower to count than the same numbers encoded as base64, and UInt8Arrays are even better, in Firefox.

On Firefox, storing the same set of numbers as an array is much slower than encoding them as base64, and both are easily beaten by UInt8Arrays.

Graph showing that arrays of numbers are much slower to count than the same numbers encoded as base64, and UInt8Arrays are even better, in Chromium.

On Chromium, storing the same set of numbers as an array is much slower than encoding them as base64, and both are easily beaten by UInt8Arrays.

These graphs show that in both Firefox and Chromium (at time of writing), it is much faster to count the records in an Indexed DB store if the list of numbers inside each record is encoded as a base64 string instead of being included directly as a JavaScript array, but the best performance comes from using a UInt8Array.

The interactive 3D(!) graphs are here: artificialworlds.net/indexed-db-perf/arrays.html and the source code is at codeberg.org/andybalaam/indexed-db-perf.

See also: Keep your Indexed DB keys and values small if you want good performance! and Deleting an Indexed DB store can be incredibly slow on Firefox.

Keep your Indexed DB keys and values small if you want good performance!

In our work recently on Element Web (specifically attempting to replace our encryption code with our cross-platform Rust implementation) we’ve noticed some strange behaviour with the performance of our Indexed DB queries.

We’ve been aware of some slowdowns, and wondered whether it was related to the locks we take in the Rust code (which is compiled to WASM for this project), but actually we’ve tracked the latest problems down to time spent inside Indexed DB queries.

The first surprising thing we found was that several of our slow queries were just counting the number of records in a store. As we dug deeper, we began to suspect that the size of the values we were storing was affecting performance.

Click the link for interactive graphs

I designed a simple test page and we measured some results. You can look at interactive graphs and and record on your own device here: artificialworlds.net/indexed-db-perf. I am hoping to expand this page to more devices and more operations over time, so do check back and/or contribute results. Source code is at codeberg.org/andybalaam/indexed-db-perf.

Note that my analysis is based on widely-available browsers in January 2024, and may become out-of-date.

Here are my conclusions so far.

Headline 1: counting the records in your store can be slow

If we have large keys or values, and we try find out how many records there are in a store containing 200K of them, it will be very slow (6 seconds on my Chromium).

Try to avoid counting records when you can, and consider caching the count in your own code if you need it often.

The slowness of counting probably also indicates that operations that walk all records will also be slow, so think carefully about when and whether you need to do that, and expect to provide feedback to your users when you do it.

Don’t count or walk all records unless you have to.

Headline 2: long keys hurt performance

3D graph showing that long keys (2000) and large numbers of records (200K) result in 800ms time to count records.

On Firefox, as keys get longer, performance with large numbers of records becomes slower.

The time it takes to count records grows linearly with the length of the keys, and becomes large when keys go over 50 bytes (500 bytes in Firefox).

The shorter your keys, the better.

Headline 3: large values hurt performance

Graph showing it can take 6 seconds to count records with large values (50K) and lots of records (200K(

On Chrome, as values get larger, performance with large numbers of records becomes slower.

The time it takes to count records grows linearly with the size of the values you are storing, and becomes large when values go over 1K in size (10K in Firefox).

The smaller your values, the better.

Headline 4: the number of records matters

Graphs showing the performance on Firefox degrades rapidly after around 50K records.

On Firefox over 50K records (with large keys or values), performance degrades rapidly.

If you are storing large values (> 1K) or large keys (> 50 bytes), and you need to walk all records regularly, you should probably try to keep the number of records below 50K if you want interactive performance – over that it rapidly increases to the order of seconds.

In Firefox on Intel, it does seem reasonably stable after 50K, though, so if you exclusively target Firefox on Intel, larger stores are feasible, but you will need to manage the user experience around these ~1 second delays.

In Chromium, performance continues to degrade as the number of records increases. It reaches truly terrible times (6 seconds) with large keys (2000 bytes) and large numbers (200K).

Keep below 50K records if you can.

Headline 5: Indexed DB performance is currently very poor on Apple M1

Graph showing counting records can take 12 seconds on an Apple M1.

Chrome on Apple M1 hardware takes 12 seconds to perform operations that take 2.5 seconds on Intel.

Firefox’s indexed DB performance on Apple M1 is ~10 times slower than on Intel, taking 8 seconds to count 200K large records.

Chrome’s performance with large keys (> 1000 bytes) is fairly catastrophic when the number of rows approaches 50K. It takes 12 seconds to count 200K records with keys of length 2000, ~6x slower than on Intel.

Both browsers on M1 hit a wall at about 50K records, where their performance with either large keys or large values rapidly degrades to almost unusable levels. The only option for decent performance is to keep keys short and values small.

Test on Apple M1 – it has a serious problem right now!

Observation: Firefox faster than Chromium

Graph showing that for counting large numbers of large records, Firefox takes 0.8 seconds and Chromium takes 6 seconds.

Chrome takes 6 seconds to count large numbers of large values, and Firefox takes 0.8 seconds to do the same thing.

At time of writing, Firefox’s Indexed DB appears to be much faster than Chromium’s when dealing with large numbers of records. In some scenarios, over 7 times faster.

Side note: string values are better than arrays of numbers

Update: see Don’t store arrays of numbers in Indexed DB – use base64 instead.

I don’t have the formal evidence for this because I ran out of time for my investigation, but my initial exploration suggests that encoding a string as an array of numbers, one for each character, slows down performance by over 10 times.

(I think) you should use strings instead of arrays of numbers.

Conclusions

Watch out for Apple M1!

Meanwhile, if you need to improve the performance of your Indexed DB-based web site, you have three options:

  1. Store fewer records,
  2. Reduce the size of the records, or
  3. Reduce the length of the keys.

1 and 2 are probably harder than 3. So definitely consider 3 first.

If you keep key length and value length small, you can have lots of records.

Explore the graphs here: artificialworlds.net/indexed-db-perf

See also: Don’t store arrays of numbers in Indexed DB – use base64 instead and Deleting an Indexed DB store can be incredibly slow on Firefox.

Python Async basics video (100 million HTTP requests)

I found something difficult in Python, which was a bit of a first, so I wrote a whole blog series about it, and now a whole video:

Slides: Python Async Basics slides

Blog posts: asyncio basics, large numbers in parallel, parallel HTTP requests, adding to stdlib