My name is Philipp C. Heckel and I write about nerdy things.

Yearly Archives / 2021


  • Oct 19 / 2021
Distributed Systems

Lossless MySQL semi-sync replication and automated failover

MySQL is a really mature technology. It’s been around for a quarter of a century and it’s one of the most popular DBMSs in the world. As an engineer, one therefore expects basic features such as replication and failover to be fleshed out, stable and ideally even easy to set up.

And while MySQL comes with replication functionality out of the box, automated failover and topology management are not part of its feature set. On top of that, it turns out to be rather difficult not to shoot yourself in the foot when configuring replication.

In fact, without careful configuration and the right tools, a failover from a source to a replica server will almost certainly lose transactions that have already been acknowledged to the application as committed.

This is a blog post about setting up lossless MySQL replication with automated failover, i.e. ensuring that not a single transaction is lost during a failover, and that failovers happen entirely without human intervention.

  • Jun 20 / 2021
Programming

elastictl: Import, export, re-shard and performance-test Elasticsearch indices

At work, I use Elasticsearch a lot. Elasticsearch is pretty famous by now, so I doubt that it needs an introduction. But if you happen to not know what it is: it’s a document store with unique search capabilities and incredible scalability.

Despite its incredible features, though, it has its rough edges. And no, I don’t mean the horrific query language (honestly, who thought that was a good idea?). I mean the fact that without external tools it’s pretty much impossible to import, export, copy, move or re-shard an Elasticsearch index. Indices are very final, unfortunately.

This is often very inconvenient if you have a growing index whose shards are outgrowing their recommended size (2 billion documents). The same is true for the opposite problem: an ES cluster that has too many shards (~800 shards per host is the recommendation, I think) because you have too many indices.

This is why I wrote elastictl, a simple tool to export Elasticsearch indices to a file, import them from a file, and/or re-shard an index. In this short post, I’ll show a few examples of how it can be used.
