December-February Recap

3/18/2025

I decided to take a break from posting (and coding) for a while, both to celebrate the holiday and new-year season and to match my natural work schedule over the year. Despite the break, the work must continue, and the past few months have produced a few fun things to report back on.

Project: saint

January produced a few changes for saint that can mostly be categorized as maintenance and bug fixing. Saint continues to be my primary observability tool and I'm very happy with how it's scaled. I'm able to spin up dashboards pretty quickly, although with plenty of boilerplate and copy-paste that I want to reduce. The missing pieces right now are a small SDK for more cleanly querying and interfacing with the ClickHouse data produced by yamon, and a better plotting SDK/API. I'm still deciding how exactly I want to implement those pieces, so it might be a while before any more significant progress is made.

The only added feature was support for auto-reloading harnesses (the backing deno processes that query data and generate panels for the frontend to render) and dashboards when commits get pushed to GitHub. This has been a great quality-of-life improvement and really encourages me to work on the dashboards.
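For anyone curious what the receiving end of a GitHub webhook involves, here is a minimal sketch in Go. It is not saint's actual code: the function names, the secret, and the watched branch are all made up for illustration. The parts that come from GitHub itself are the `X-Hub-Signature-256` header (an HMAC-SHA256 of the raw body, prefixed with `sha256=`), the `push` event name, and the `ref` field of the push payload.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// validSignature reports whether header (the X-Hub-Signature-256 value)
// matches the HMAC-SHA256 of the raw request body under the shared secret.
func validSignature(secret, body []byte, header string) bool {
	hexSig, ok := strings.CutPrefix(header, "sha256=")
	if !ok {
		return false
	}
	got, err := hex.DecodeString(hexSig)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil)) // constant-time comparison
}

// sign builds the header value a sender would attach; only used for the demo.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// shouldReload reports whether the event is a push to the watched branch.
// A real handler would read the event name from the X-GitHub-Event header.
func shouldReload(event string, body []byte, branch string) bool {
	if event != "push" {
		return false
	}
	var payload struct {
		Ref string `json:"ref"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return false
	}
	return payload.Ref == "refs/heads/"+branch
}

func main() {
	secret := []byte("example-secret") // hypothetical shared secret
	body := []byte(`{"ref":"refs/heads/main"}`)
	header := sign(secret, body)

	ok := validSignature(secret, body, header) && shouldReload("push", body, "main")
	fmt.Println("reload harnesses:", ok)
}
```

Checking the signature before parsing the payload means an unauthenticated request never gets far enough to trigger a reload.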

Changes

  • fix: track & manage harness state explicitly (#12)
  • fix: cleanup harnesses properly
  • fix: time out alert waiting for harness
  • fix: debug log for alerts
  • fix: report more errors from git
  • ref: move deno to sheath repo
  • fix: harness router should return 500s
  • fix: incorrect sandbox config in alert harness
  • feat: github webhook endpoint (#11)

Project: sheath

Nothing major here apart from support for returning a 404 on some errors in the URLParamMiddleware. I do have a branch and PR open playing around with an ECS library (based off my previous work inbe), but it's in a very early prototype form.

Project: yamon

I've been impressed with the adoption of yamon across both of my primary clients. This adoption of course comes at the cost of feature requests and bug reports. The biggest changes this month are a few added collectors, the GPU one being the most notable as it was a direct request from a client dabbling in more AI workloads, where monitoring the GPU power draw and performance is critical.

The other main change was related to how collectors work. A few clients wanted to reduce the metric load for some collectors without disabling them or changing the collection interval globally. To support this I've added per-collector configuration which both replaces the old method of disabling collectors and also allows you to override the collection interval and the collector timeout.

Changes

  • fix(debug): missing newline
  • fix(gpu): use scanner
  • feat(debug): print duration collector took
  • feat: add apt collector
  • feat(debug): add info for listing collectors
  • feat: add gpu collector
  • fix(prometheus): handle and drop nan
  • feat: dynamic collector intervals (#3)
  • chore: update some more go versions
  • chore: upgrade to go 1.24 + dep bump

Project: carto

carto is a small project I made to render a web-based Minecraft map like the one I have hosted here.

It's heavily inspired by a friend's project, landis, but is a bit more complete and polished. I actually wanted to use it again recently, so I've done a maintenance pass: updating dependencies and adding a simple Dockerfile to make it easy to run.

I did also spend some time playing around with a rewrite since the current version is pretty much a lightly polished prototype. Sadly I don't have the time right now, and at least for the moment it works well enough for my needs.