Hey Ruby, how’s your TOML?
At BetterStack, we use Ruby for part of our infrastructure orchestration, and we use TOML for our script configuration needs. As you may know, TOML (Tom’s Obvious, Minimal Language) is a configuration file format that’s easy to read and has obvious semantics.
Since I dabble in Rust development a little and TOML is pretty popular over there, choosing it over YAML or JSON was easy thanks to its straightforward syntax. Unfortunately, the state of TOML libraries in Ruby is a bit more complicated at this point.
NOTE: This article was written at the start of 2022. The landscape of TOML gems may be a bit different, so treat the information presented here accordingly.
gem install toml and move on…?
TOML is a versioned file format, with its first major version (v1.0.0) released at the beginning of 2021. The project wiki lists a lot of implementations that are compliant with the v1.0.0 spec. However, Ruby only shows up elsewhere in the list, five times total:
tomlrb gem by @fbernier, stated as v1.0.0-rc.1 compliant, and
toml-rb gem by @emancu, stated as v0.5.0 compliant,
toml2 gem by @charliesome,
toml gem by @jm, and
tomlp gem by @sandeepravi,
all of which have unknown compliance based on the wiki.
A quick search on RubyGems shows there are even more gems -
toby, to mention a few showing up at the top.
Most of these additional gems have their latest releases during 2013 (what a year for TOML that must’ve been!) -
toml-rb-hs are the exceptions, but
toml-rb-hs is a derailed fork of
Quite a mess!
What’s in the box?
toby, as the remaining gems are clearly outdated and don’t support anything near the v1.0.0 TOML spec.
Most of these gems use some kind of grammar or a parsing library - like racc/yacc, citrus, or parslet. This makes sense - TOML even has the language specification in the ABNF format present in the root of the project, so it should be fairly straightforward to use that definition as a base for a decoder/encoder library.
Starting from the oldest, the tomlp gem (sandeepravi/tomlp) uses a treetop grammar and was updated 9 years ago. Although the author says (in bold and italics) that "You can use this in production", I’m skipping this one.
Moving on to the toml2 gem (haileys/toml2), also updated 9 years ago, it’s interesting to note that there’s no grammar present in this library - only a five-line, 340-character-long, perfectly aligned 😎 Ruby eval block. There’s also a note that this gem is "a gold plated, production grade, high performance parser", which makes me a little sad to ignore it moving forward.
Here’s a summary of the remaining libraries (ordered alphabetically):
|Gem|Grammar|Last update|Commits in the last 12 months|RubyGems downloads (latest version / all)|
|---|---|---|---|---|
|toby|Citrus|Jan 31, 2021|24|723 / 10,205|
|toml|Parslet|Dec 1, 2021|11|667,568 / 16,041,996|
|toml-rb|Citrus|Nov 22, 2021|30|238,443 / 18,650,046|
|tomlrb|Racc|Feb 19, 2021|6|2,866,930 / 23,665,130|
It’s interesting to see that the total downloads are pretty similar between the top three gems - and none of them are abandoned. But it’s still not clear how one should select which gem to use…
Roll your own!
The obvious solution is to take the ABNF definition, transform it to a Ruby regex or to a different usable format, and roll a solution off that. I even have a great name - yatoml, as in Yet Another TOML. A great idea, right? … Right?
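To show how quickly the "just use a regex" idea runs into limits, here is a minimal sketch of a hand-rolled decoder. `TinyToml` is a hypothetical name, and the sketch assumes a tiny slice of TOML only: comment lines, single-level `[table]` headers, bare keys, basic strings without escapes, integers, and booleans. Everything else (dotted keys, arrays, dates, multi-line strings, trailing comments) raises an error, so it is nowhere near the v1.0.0 spec.

```ruby
# Hypothetical mini-decoder for illustration only; not spec compliant.
module TinyToml
  TABLE   = /\A\[([A-Za-z0-9_-]+)\]\z/        # [server]
  PAIR    = /\A([A-Za-z0-9_-]+)\s*=\s*(.+)\z/ # key = value
  STRING  = /\A"([^"\\]*)"\z/                 # basic string, no escapes
  INTEGER = /\A[+-]?\d+\z/                    # decimal integers only

  # Parses the supported subset into nested Hashes.
  def self.parse(source)
    root = {}
    current = root
    source.each_line do |raw|
      line = raw.strip
      next if line.empty? || line.start_with?("#")

      if (m = TABLE.match(line))
        current = (root[m[1]] ||= {})
      elsif (m = PAIR.match(line))
        current[m[1]] = parse_value(m[2].strip)
      else
        raise ArgumentError, "unsupported line: #{line}"
      end
    end
    root
  end

  # Converts a raw value string into a Ruby object.
  def self.parse_value(text)
    case text
    when STRING  then text[1..-2]
    when INTEGER then text.to_i
    when "true"  then true
    when "false" then false
    else raise ArgumentError, "unsupported value: #{text}"
    end
  end
end

config = TinyToml.parse(<<~TOML)
  title = "demo"

  [server]
  port = 8080
  enabled = true
TOML
# config => {"title"=>"demo", "server"=>{"port"=>8080, "enabled"=>true}}
```

Even this toy needs four patterns and still rejects most valid documents, which is a decent argument for improving an existing gem instead.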
TOML compliance & performance comparison
Since I don’t want to pollute the space with yet another gem, I want to use (and optionally improve) the most mature one. I was able to compare decoder compliance against the v1.0.0 TOML standard using the toml-test test suite. I’ll summarize the results below, but if you want to run this yourself, here’s the repository: toml-comparison.
I measured the compliance against the latest toml-test suite, as well as a very basic performance benchmark (I compared the time it takes to run the test-suite on my MacBook Air).
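For context, toml-test checks a decoder by feeding it TOML on stdin and comparing the "tagged JSON" it prints on stdout, where every primitive is wrapped as an object with a `type` and a string `value`. The sketch below shows one way that wrapping could look in Ruby. `to_tagged` is a hypothetical helper, not the code from my comparison repository, and it deliberately skips the date and time types because each gem represents those differently.

```ruby
require "json"

# Hypothetical helper: converts an already-parsed TOML document (nested
# Hashes and Arrays) into the tagged structure toml-test compares against.
def to_tagged(value)
  case value
  when Hash        then value.to_h { |k, v| [k.to_s, to_tagged(v)] }
  when Array       then value.map { |v| to_tagged(v) }
  when String      then { "type" => "string",  "value" => value }
  when Integer     then { "type" => "integer", "value" => value.to_s }
  when Float       then { "type" => "float",   "value" => value.to_s }
  when true, false then { "type" => "bool",    "value" => value.to_s }
  else raise ArgumentError, "no tagging rule for #{value.class}"
  end
end

parsed = { "name" => "demo", "ports" => [8000, 8001] }
puts JSON.generate(to_tagged(parsed))
```

A real decoder wrapper would read `$stdin`, call the gem's parse method, and exit with a non-zero status when parsing fails, so the suite can also check that invalid documents are rejected.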
There are 306 test cases in toml-test. Here are the comparison results:
*I averaged three test suite runs for each gem.
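The measurement itself is nothing fancy. A rough sketch of the idea, using only Ruby's standard `Benchmark` module, could look like this. Both helper names are hypothetical, and the sketch times a block in-process, whereas my actual numbers came from timing whole test-suite runs.

```ruby
require "benchmark"

# Hypothetical helper: runs the block `runs` times and returns the mean
# wall-clock time in seconds.
def average_runtime(runs: 3)
  total = 0.0
  runs.times { total += Benchmark.realtime { yield } }
  total / runs
end

# Hypothetical helper: fraction of cases for which the block returns a
# truthy value, e.g. "the decoder produced the expected output".
def pass_rate(cases)
  passed = cases.count { |test_case| yield(test_case) }
  passed.fdiv(cases.size)
end

seconds = average_runtime(runs: 3) { 10_000.times { "a = 1".split("=") } }
puts format("average: %.4fs", seconds)
```

Averaging a handful of runs smooths out some noise, but on a laptop the result is still only a ballpark figure.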
Based on these test runs, tomlrb comes out on top for both v1.0.0 TOML spec compliance and speed.
Other notable facts:
toml-rb converts all times to the DateTime class, while tomlrb uses its own internal classes such as LocalDateTime. On a similar note, only toby implements internal classes for Hexadecimal numbers. All of these decisions make sense to me - the trade-off is usually between being more explicit and being easier to use.
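To illustrate that trade-off, here is a small sketch. `LocalDateTimeSketch` is a hypothetical class written for this post, not the implementation from any of the gems. A TOML local date-time carries no offset, so a dedicated class can keep it that way, while Ruby's `DateTime` always has an offset and defaults to UTC when none is given.

```ruby
require "date"

# Hypothetical value class: stores exactly what the TOML document said,
# with no time zone attached.
LocalDateTimeSketch = Struct.new(:year, :month, :day, :hour, :minute, :second) do
  def to_s
    format("%04d-%02d-%02dT%02d:%02d:%02d", year, month, day, hour, minute, second)
  end

  # The caller has to choose an offset explicitly to get a DateTime.
  def to_datetime(offset = "+00:00")
    DateTime.new(year, month, day, hour, minute, second, offset)
  end
end

explicit = LocalDateTimeSketch.new(1979, 5, 27, 7, 32, 0)
implicit = DateTime.new(1979, 5, 27, 7, 32, 0) # offset defaults to UTC

puts explicit.to_s                          # 1979-05-27T07:32:00
puts implicit.iso8601                       # 1979-05-27T07:32:00+00:00
puts explicit.to_datetime("+02:00").iso8601 # 1979-05-27T07:32:00+02:00
```

The explicit class is more faithful to the file, while the plain `DateTime` is easier to pass around to other Ruby code.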
I didn’t test the encoders - tomlrb doesn’t have one, and it seemed superfluous since the decoders are not 100% v1.0.0 compliant.
Great, so what now?
Based on the comparison, tomlrb seems to be the gem closest to full v1.0.0 compliance.
So let’s get them there!
I submitted a PR to the tomlrb gem to cover a few of the remaining 23 cases, and hopefully with combined forces we can get it to 100%. The fact that it’s the fastest parser I tested is a nice cherry on top.
Till next time!