Hey Ruby, how’s your TOML?

At BetterStack, we use Ruby for a part of our infrastructure orchestration, and we use TOML for our script configuration needs. As you may know, TOML (Tom’s Obvious, Minimal Language) is a configuration file format that’s easy to read with obvious semantics.

Since I dabble in Rust development a little, and TOML is pretty popular over there, picking it over YAML or JSON was an easy choice thanks to its ease of use and straightforward syntax. Unfortunately, the state of TOML libraries in Ruby is a bit more complex at this point.

NOTE: This article was written at the start of 2022. The landscape of TOML gems may be a bit different, so treat the information presented here accordingly.

gem install toml and move on…?

TOML is a versioned file format, with its first major version (v1.0.0) released at the beginning of 2021. Checking out the project wiki shows that there are a lot of implementations compliant with the v1.0.0 spec. However, Ruby only shows up elsewhere in the list, five times total:

  • tomlrb gem by @fbernier, stated as v1.0.0-rc.1 compliant, and
  • toml-rb gem by @emancu, stated as v0.5.0 compliant

followed by

  • toml2 gem by @charliesome,
  • toml gem by @jm, and
  • tomlp gem by @sandeepravi,

all of which have unknown compliance based on the wiki.

A quick search on RubyGems lets us know there are even more gems - toml-ruby, multi_toml, ptolemy, tock, toml-rb-hs, and toby, to mention a few showing up at the top.

Most of these additional gems have their latest releases during 2013 (what a year for TOML that must’ve been!) - toby and toml-rb-hs are the exceptions, but toml-rb-hs is a derailed fork of toml-rb.

Quite a mess!

What’s in the box?

Let’s inspect tomlrb, toml-rb, toml2, toml, tomlp, and toby, as the remaining gems are clearly outdated and don’t support anything near the v1.0.0 TOML spec.

Most of these gems use some kind of grammar or a parsing library - like racc/yacc, citrus, or parslet. This makes sense - TOML even has the language specification in the ABNF format present in the root of the project, so it should be fairly straightforward to use that definition as a base for a decoder/encoder library.

Starting from the oldest, the tomlp gem (sandeepravi/tomlp) uses a Treetop grammar and was last updated 9 years ago. Although the author says (in bold and italics) that You can use this in production, I’m skipping this one.

Moving on to the toml2 gem (haileys/toml2), also last updated 9 years ago, it’s interesting to note that there’s no grammar present in this library - only a five-line, 340-character-long, perfectly aligned 😎 Ruby eval block. There’s also a note that this gem is a gold plated, production grade, high performance parser, which makes me a little sad to ignore it moving forward.

Here’s a summary of the remaining libraries (ordered alphabetically):

Gem      Grammar  Last update   Commits (last 12 months)  RubyGems downloads (latest version / all)
toby     Citrus   Jan 31, 2021  24                        723 / 10,205
toml     Parslet  Dec 1, 2021   11                        667,568 / 16,041,996
toml-rb  Citrus   Nov 22, 2021  30                        238,443 / 18,650,046
tomlrb   Racc     Feb 19, 2021  6                         2,866,930 / 23,665,130

It’s interesting to see that the total downloads are pretty similar between the top three gems - and none of them are abandoned. But it’s still not clear how one should select which gem to use…

Roll your own!

The obvious solution is to take the ABNF definition, transform it to a Ruby regex or to a different usable format, and roll a solution off that. I even have a great name - yatoml, as in Yet Another TOML. A great idea, right? … Right?

TOML compliance & performance comparison

Since I don’t want to pollute the space with yet another gem, I want to use (and optionally improve) the most mature one. I was able to compare decoder compliance against the v1.0.0 TOML standard using the toml-test test suite. I’ll summarize the results below, but if you want to run this yourself, here’s the repository: toml-comparison.

I measured the compliance against the latest toml-test suite, as well as a very basic performance benchmark (I compared the time it takes to run the test-suite on my MacBook Air).
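To make the measurement concrete, here is a minimal sketch of how such a run could be scored and timed. It is not the code from the toml-comparison repository: the `run_suite` helper, the case hashes, and the toy decoder are all hypothetical names made up for illustration. It also simplifies things by checking only whether a decoder accepts or rejects each input, whereas a full compliance check would also compare the decoded values against the expected output.

```ruby
# Toy stand-in decoder (NOT a real TOML parser): it accepts only
# `key = integer` lines and raises on anything else. In a real run this
# would be replaced by a call into the gem under test.
toy_decoder = lambda do |input|
  input.each_line.to_h do |line|
    match = line.strip.match(/\A([A-Za-z0-9_-]+)\s*=\s*([+-]?\d+)\z/)
    raise ArgumentError, "cannot parse: #{line.strip}" unless match

    [match[1], Integer(match[2], 10)]
  end
end

# Runs every case through the decoder. A case fails when the decoder's
# verdict (accepted or raised) disagrees with the expected validity.
def run_suite(cases, decoder)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  failed = cases.reject do |test_case|
    accepted = begin
      decoder.call(test_case[:input])
      true
    rescue StandardError
      false
    end
    accepted == test_case[:valid]
  end
  seconds = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { failed: failed.map { |test_case| test_case[:name] }, seconds: seconds }
end

# Hypothetical cases; a real suite would load these from files on disk.
cases = [
  { name: "integer",       input: "answer = 42",    valid: true },
  { name: "string",        input: 'name = "Tom"',   valid: true },
  { name: "missing-value", input: "answer = ",      valid: false },
  { name: "leading-zero",  input: "answer = 007",   valid: false }
]

result = run_suite(cases, toy_decoder)
puts "failed: #{result[:failed].inspect}"
# prints: failed: ["string", "leading-zero"]
```

The toy decoder fails in both directions: it rejects a valid string value and accepts an integer with leading zeros, which TOML forbids. Both kinds of mismatch count as failed cases in this sketch.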

There are 306 test cases in toml-test. Here are the comparison results:

Gem      Failed cases  Duration*
toby     89            121.9s
toml     89            58.97s
toml-rb  80            134.45s
tomlrb   23            41.1s

*I averaged three test suite runs for each gem.

Based on these test runs, tomlrb comes out on top in both v1.0.0 TOML spec compliance and speed.

Other notable facts: toby, toml, and toml-rb convert all times to a Time or DateTime class, while tomlrb uses its own internal classes LocalDate, LocalTime, and LocalDateTime. On a similar note, only toby implements internal classes for Binary, Octal and Hexadecimal numbers. All of these decisions make sense to me - the trade-off is usually between being more explicit and being easier to use.
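The trade-off is easiest to see with a TOML local date, which carries no time of day and no UTC offset. The snippet below is only an illustration in plain Ruby: the `LocalDate` struct here is a hypothetical stand-in that I define on the spot, not the class shipped by any of the gems.

```ruby
# Hypothetical value class for illustration only; it is not the
# implementation used by tomlrb or any other gem.
LocalDate = Struct.new(:year, :month, :day) do
  def to_s
    format("%04d-%02d-%02d", year, month, day)
  end
end

raw = "1979-05-27" # a TOML local date: no time of day, no UTC offset
year, month, day = raw.split("-").map { |part| Integer(part, 10) }

# Easier to use: a Time works everywhere, but the decoder has to invent
# a time of day and an offset that were never in the file.
as_time = Time.utc(year, month, day)

# More explicit: a dedicated class keeps exactly what was written.
as_local = LocalDate.new(year, month, day)

puts as_time  # prints: 1979-05-27 00:00:00 UTC
puts as_local # prints: 1979-05-27
```

The Time version gains a midnight and a UTC offset that the config file never specified, while the dedicated class round-trips to the original text.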

I didn’t test the encoders - tomlrb doesn’t have one, and it seemed superfluous since the decoders are not 100% v1.0.0 compliant.

Great, so what now?

Based on the comparison, it would seem tomlrb is the gem closest to v1.0.0 compliance.

So let’s get them there!

I submitted a PR to the tomlrb gem to cover a few of the remaining 23 cases, and hopefully with combined forces we can get it to 100%. The fact that it’s the fastest parser I tested is a nice cherry on top.

Till next time!