Replicating Jepsen Results

If you aren't aware, Kyle Kingsbury has a great series of posts testing whether databases live up to their claims. It's an invaluable resource, as many of the databases he has tested don't live up to their stated goals. That said, some of the posts are getting quite old at this point, so it's possible that the developers have since fixed the issues that caused those failures. Luckily, Kyle's Jepsen project is open source, and you're free to try to replicate his results.

This does take some setup, though. You'll need 5 database servers. It's easiest to use Debian Jessie for this, as that is what Kyle uses, and therefore all of the tests that he's written work against it. You do need to replace systemd with SysV init before the tests will be able to run. You also need a separate machine to run Jepsen on. You shouldn't try to reuse one of the database servers for this, as the tests will cut off access to some servers at certain points. For the easiest testing process, you'll want the database servers to be called n1 through n5. Each of them needs to be resolvable by all the other database servers and by the server running the tests. The server running the tests also needs to be able to ssh to all of the database servers using the same username and password or ssh key, and that user needs sudo access on them. These hosts must also exist in the known hosts file in the non-hashed format before Jepsen is able to execute a test. I'm unsure what default username and password Jepsen uses, but you can easily change the values that it uses for each test. Finally, the server running the tests will need JDK 8 and Leiningen installed.
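Two of these requirements (name resolution and non-hashed known hosts entries) are easy to get wrong, so here is a minimal preflight sketch in Python for the machine that runs the tests. It is not part of Jepsen; the function names are my own, and it assumes the OpenSSH known hosts format, where hashed entries start with "|1|".

```python
import socket

# Node names from the post: n1 through n5.
NODES = [f"n{i}" for i in range(1, 6)]


def unresolvable(nodes, resolver=socket.gethostbyname):
    """Return the node names that cannot be resolved from this machine.

    `resolver` is injectable so the check can be exercised without a network.
    """
    missing = []
    for name in nodes:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


def missing_from_known_hosts(nodes, known_hosts_text):
    """Return nodes that have no plain-text (non-hashed) known hosts entry."""
    seen = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and hashed entries ("|1|salt|hash ...").
        if not line or line.startswith("#") or line.startswith("|1|"):
            continue
        fields = line.split()
        hosts_field = fields[0]
        # Lines may begin with a marker such as @cert-authority.
        if hosts_field.startswith("@"):
            if len(fields) < 2:
                continue
            hosts_field = fields[1]
        # One entry can list several comma-separated names or addresses.
        seen.update(hosts_field.split(","))
    return [name for name in nodes if name not in seen]
```

Calling unresolvable(NODES) and then missing_from_known_hosts(NODES, text), where text is the contents of ~/.ssh/known_hosts, should both return empty lists before you try to run a test. This only covers the test-running machine; the database servers still need to resolve each other.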

That is quite a bit, isn't it? I thought so, and given the wonderful tooling we have to replicate these sorts of environments, I was sure that someone had already created a way to spin up a set of servers on AWS to run any of the tests you would like. I wasn't able to locate one, which likely just means that my search skills were lacking. Since I couldn't find one, I made one using Terraform. jepsen-lab is relatively simple, but it goes through the process of setting up all of the previously stated requirements. It creates all of the servers and configures them as required, and once that process is complete, it outputs the IP address that you can ssh into. It does leave a couple of steps for you to complete on your own: you need to clone the Jepsen repo, and you'll need to modify the test configuration for the username and password. The former is simply because I don't know what revision you may wish to use, and the latter is because the step depends on which tests you choose to run. For more information on how to use jepsen-lab, see the readme in the repository.

After getting everything set up, it's just a matter of running lein test from the correct directory and verifying the results. You can also make any modifications you like to see if they change the results of the tests. In future installments, I'll discuss the particular tests that I've tried to replicate, the modifications that I've made, and the results that I've gotten.