⁂ George Ho

Thoughts on Hanukkah of Data 2022

This blog post contains spoilers for Hanukkah of Data 2022.

This holiday season I’ve been doing the Hanukkah of Data, which is a puzzle suite by a group of hackers called the Devottys. It’s a sequence of programming puzzles, with one puzzle dropping for every day of Hanukkah. If you’re familiar with Advent of Code, it’s very similar to that, except (a) it only lasts 8 days instead of 25, and (b) it’s more data-oriented, rather than coding- or algorithms-oriented.

I did it in VisiData, which is a tool I’ve been using a lot recently (both at work and for my side projects) that I really wanted to develop expert proficiency with.

Here were my solve statistics:

 Puzzle | Solve Time | # Attempts
      0 |  3 minutes |          2
      1 | 82 minutes |          2
      2 | 20 minutes |          1
      3 | 20 minutes |          5
      4 | 37 minutes |          1
      5 |  7 minutes |          1
      6 |  6 minutes |          1
      7 | 24 minutes |          1
      8 |  5 minutes |          1

Overall Impressions

Hanukkah of Data is much shorter than Advent of Code, which I think is a hugely underrated benefit — in previous years, Advent of Code sometimes felt more like homework than a puzzle suite.

It’s also more of a puzzle than Advent of Code. For example, Puzzle 2 required non-trivial reading comprehension and logical inference to realize that you were looking for (a) a customer with the initials JD (b) who had bought coffee and bagels in the same order at Noah’s market (c) in 2017. I found this much more enjoyable than Advent of Code, where the solution is usually straightforward, and the implementation is the meat of the challenge.
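To make the Puzzle 2 criteria concrete, here is a rough sketch in plain Python (I actually did this interactively in VisiData, so this is not what I ran). The record types, field names and the tiny dataset are all made up for illustration and are not the real puzzle data or schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:  # hypothetical record layout, not the puzzle's schema
    customer_id: int
    name: str


@dataclass
class OrderItem:  # one line item of an order (hypothetical layout)
    order_id: int
    customer_id: int
    ordered: date
    description: str


def initials(name):
    """First letter of the first and last word of a name."""
    parts = name.split()
    return parts[0][0] + parts[-1][0] if parts else ""


def find_candidates(customers, items, wanted_initials, year, keywords):
    """Customers with the given initials who bought every keyword
    item within a single order placed in the given year."""
    # Group item descriptions by (order, customer), keeping only the target year.
    by_order = {}
    for item in items:
        if item.ordered.year == year:
            key = (item.order_id, item.customer_id)
            by_order.setdefault(key, []).append(item.description.lower())

    # An order qualifies if each keyword appears in at least one of its items.
    buyers = {
        customer_id
        for (_, customer_id), descriptions in by_order.items()
        if all(any(k in d for d in descriptions) for k in keywords)
    }
    return [
        c for c in customers
        if c.customer_id in buyers and initials(c.name) == wanted_initials
    ]


# Made-up toy data covering each way a customer can fail the criteria.
CUSTOMERS = [
    Customer(1, "Jane Doe"),
    Customer(2, "John Dent"),
    Customer(3, "Alice Baker"),
]
ITEMS = [
    OrderItem(10, 1, date(2017, 4, 2), "Coffee, Drip"),   # same order, 2017
    OrderItem(10, 1, date(2017, 4, 2), "Caraway Bagel"),
    OrderItem(11, 2, date(2017, 5, 1), "Coffee, Drip"),   # split across orders
    OrderItem(12, 2, date(2017, 5, 3), "Sesame Bagel"),
    OrderItem(13, 3, date(2017, 6, 9), "Coffee, Drip"),   # wrong initials
    OrderItem(13, 3, date(2017, 6, 9), "Sesame Bagel"),
    OrderItem(14, 2, date(2018, 1, 5), "Coffee, Drip"),   # wrong year
    OrderItem(14, 2, date(2018, 1, 5), "Sesame Bagel"),
]

matches = find_candidates(CUSTOMERS, ITEMS, "JD", 2017, ["coffee", "bagel"])
print([c.name for c in matches])
```

The code itself is short; the hard part of the puzzle was working out that these three conditions were what the prose was asking for.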

On VisiData: for those comfortable with command line interfaces, Vim-style key bindings, or simply willing to put in the time to learn a mini-language of keyboard shortcuts, I think VisiData is the best tool for doing well-scoped, one-off data explorations or analyses.

Towards the end of Hanukkah, the puzzles became less conceptually ambiguous and more technically difficult (in terms of the sophistication of the data wrangling required). As someone already experienced with querying data, I was pleased to finish these puzzles in single-digit minutes — an achievement that I credit almost entirely to VisiData, which makes visualizing, filtering and aggregating data seamlessly interactive.

I see only two downsides to VisiData: the sparse documentation of advanced features (more on that below) and performance. Performance is most obviously an issue when you’re doing joins: joining two tables with a few thousand rows each takes a noticeably long time. I’m looking forward to vdsql, which is VisiData’s sibling project that skins various databases with a VisiData interface (via Ibis), and should therefore be as performant as the underlying database.
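As a rough illustration of what "as performant as the underlying database" means, here is a sketch of a join done by the database engine itself, using Python’s built-in sqlite3 module. This is not vdsql’s or Ibis’s API, and the table and column names are invented for the example.

```python
import sqlite3


def orders_per_customer(conn):
    """Count orders per customer, letting SQLite perform the join."""
    return conn.execute(
        """
        SELECT c.name, COUNT(o.order_id) AS n_orders
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY c.name
        """
    ).fetchall()


# Hypothetical in-memory tables, just large enough to show the join.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Jane Doe'), (2, 'John Dent');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
    """
)

print(orders_per_customer(conn))
```

The join runs inside the database engine and only the small result set comes back to the client.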

Some Miscellaneous Thoughts

#data #visidata