
Run a Program on your dedicated Nvidia graphics card on Linux

I've got a new laptop, and in it I have an Nvidia graphics card. The initial operating system installation went ok (Ubuntu 20.10 Groovy Gorilla is my Linux distribution of choice here), but after I'd done a bunch of configuration and initial setup tasks it inevitably came around to the point in time that I wanted to run an application on my dedicated Nvidia graphics card.

Doing so is actually really quite easy - it's figuring out the how that was the hard bit! To that end, this is a quick post to document how to do this so I don't forget next time.

Before you continue, I'll assume here that you're using the Nvidia proprietary drivers. If not, consult your distribution's documentation on how to install and enable them.
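As a quick sanity check that the driver is loaded and can see the card at all, the nvidia-smi tool (which ships with the proprietary driver) is handy:

nvidia-smi

If it prints a table listing your GPU, you're good to continue.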

With Nvidia cards, it depends greatly on what you want to run as to how you go about doing this. For CUDA applications, you need to install the nvidia-cuda-toolkit package:

sudo apt install nvidia-cuda-toolkit

...and then there will be an application-specific menu you'll need to navigate to tell it which device to use.

For traditional graphical programs, the process is different.

In the NVIDIA X Server Settings, go to PRIME Profiles, and then ensure NVIDIA On-Demand mode is selected. If not, select it and then reboot.

The NVIDIA X Server Settings dialog showing the correct configuration as described above.

Then, to launch a program on your Nvidia dedicated graphics card, do this:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia command_name arguments

In short, the __NV_PRIME_RENDER_OFFLOAD environment variable must be set to 1, and the __GLX_VENDOR_LIBRARY_NAME environment variable must be set to nvidia. If you forget the latter here, or try using DRI_PRIME=1 instead of both of these (as I advised in my previous post about AMD dedicated graphics cards), you'll get an error like this:

libGL error: failed to create dri screen
libGL error: failed to load driver: nouveau

The solution is to set the above environment variables instead of DRI_PRIME=1. Thanks to this Ask Ubuntu post for the answer!
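A handy way to verify that the offload is actually working is to ask glxinfo (from the mesa-utils package - sudo apt install mesa-utils if you don't have it) which renderer is in use when the environment variables are set:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxinfo | grep "OpenGL renderer"

If everything is working, it should print something like OpenGL renderer string: NVIDIA ... rather than the name of your integrated GPU.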

Given that I know I'll be wanting to use this regularly, I've created a script in my bin folder for it. It looks like this:

#!/usr/bin/env bash
export __NV_PRIME_RENDER_OFFLOAD=1;
export __GLX_VENDOR_LIBRARY_NAME=nvidia;

# "$@" (quoted) passes all the arguments through, preserving any spaces in them
exec "$@";

Save the above to somewhere in your PATH (e.g. your bin folder, if you have one). Don't forget to run chmod +x path/to/file to mark it executable.
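For example, if you saved the script as ~/bin/nvidia-run (the name is entirely up to you), launching something on the dedicated card then looks like this:

nvidia-run glxgears

...where glxgears can be replaced with any graphical program and its arguments.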

Found this useful? Does this work on other systems and setups too? Comment below!

Adventurous Accounts: The Pepperminty Wiki Android App

I've found recently that there are a growing number of things I haven't blogged about on here yet because I either haven't finished them, or I have run into difficulties. I'd like to start blogging about these projects here and there, so in this post I'm going to talk about the Pepperminty Wiki Android App.

For those not in the know, Pepperminty Wiki is a wiki engine I've built myself from scratch that is all packed into a single file using a custom build system. It's pretty stable, and I'm committed to supporting it in the long term - as I have more than one rather important wiki hosted using the software myself.

Anyway, as a companion to this I have also implemented a simple Android companion app that reads information from the Pepperminty Wiki REST API. It's read-only at the moment, but long-term I'd like to build an app that can write changes back as well.

3 screenshots of the Pepperminty Wiki Android App

In its current form, it works quite well actually. It's functional (although there are a few bugs lurking around). You can even download it from the Google Play Store: https://play.google.com/store/apps/details?id=com.sbrl.peppermint

I haven't really worked on it for a while though, and I'd like to talk about why that is.

Trying to implement additional features is where things get complicated though. The app is written in Kotlin, which I found much easier to use than Java (example: spawning a new thread is 3 lines instead of 30!). For some crazy reason when I was reading the guidance on building such an app, one of the guidelines was to keep the class structure as flat as possible.

This was a bad idea.

For so many reasons as well. Without using lots of files and multiple classes, a codebase gets difficult to follow and build upon. Taking this advice has severely impacted my ability to continue to add new features to the project, so a large refactor is needed to bring this under control. Add to this the complexity and ambiguity of the Android APIs themselves (how about an OnListFragmentInteractionListener, or an android.support.design.widget.BottomNavigationView? The API for displaying a list of things is also particularly complicated), and you can see how it can spiral out of control.

In addition, I've discovered that I have some serious problems updating it for Android 10. When you build an Android app, you have to regularly update the target platform version (I forget the exact name) - otherwise you get nagging emails from Google Play about this (I've had several already). I'm pretty sure you can't use newer APIs either if you don't keep the platform version updated.

It's been a while since I've tried to update it so I can't remember the exact error messages, but I do remember doing some pretty extensive searching around on the Internet and having no luck.

To this end, the app works quite well in its current form, but at some point (eventually) I'm going to make another attempt at writing a replacement for it. To do this though, I'd like to find some kind of framework that eases the process just a little bit. Building an Android app at the moment is unhelpfully complicated and a larger time investment than I can currently afford, given the regular maintenance required: updates have to be made just to keep the app buildable, which don't seem to serve much purpose to me other than being a bother and creating extra work - since updating the dependencies and build system behind an Android app often breaks everything, which you then have to fix.

This is different from the maintenance for Pepperminty Wiki itself, which is still regular - but constitutes fixing bugs as they crop up and occasionally implementing new features - and it has settled into a delightfully sedate pace that allows me to follow my inspiration. I find this to be much more enjoyable.

If anyone has any suggestions of alternative approaches I could try, I'm definitely interested. Comment below!

Lua in Review 2

The Lua Logo Back in 2015, I reviewed the programming language Lua. A few months ago I rediscovered the maze generation implementation I ported as part of that post, and since then I've been writing quite a bit of Lua - so I thought I'd return to the thoughts in that original post and write another language review now that I've had some more experience with the language.

For those not in the know, Lua is a lightweight scripting language. You can find out more here: https://www.lua.org/

In the last post, I mentioned that Lua is very lightweight. I still feel this is true today - and it has significant advantages in that the language is relatively simple to understand and get started in - and feels very predictable in how it functions.

It is often said that Lua is designed to be embedded in other programs (such as to provide a modding interface to a game, for example) - and this certainly seems to hold true. Lua definitely seems to be well-suited for this kind of use-case.

The lightweightness comes at a cost though. The first of these is the standard library. Compared to other languages such as C♯ and even Javascript, the standard library sucks. At least half of the time you find yourself reimplementing some algorithm that should have been packaged with the language itself:

  • Testing if a string starts with a given substring
  • Rounding a number to the nearest integer
  • Making a shallow copy of a table

Do you want to do any of these? Too bad, you'll have to implement them yourself in Lua. While these really aren't a big deal, my point here is that with functions like these it can be all too easy to make a mistake when implementing them, and then your code has a bug in it. If you find and fix an obscure edge case for example, that fix will only apply to your code and not the hundreds of other ad-hoc implementations other developers have had to cook up to get things done, leading to duplicated and wasted effort.
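To illustrate, here's roughly what hand-rolled versions of the 3 helpers above might look like (a quick sketch, not battle-tested production code):

-- Test if a string starts with a given substring
local function starts_with(str, prefix)
    return str:sub(1, #prefix) == prefix
end

-- Round a number to the nearest integer (rounds .5 upwards)
local function round(number)
    return math.floor(number + 0.5)
end

-- Make a shallow copy of a table
local function shallow_copy(tbl)
    local result = {}
    for key, value in pairs(tbl) do
        result[key] = value
    end
    return result
end

They're all small - which is exactly the problem: every Lua codebase ends up carrying its own copies of them.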

A related issue I'm increasingly finding is that of the module system and the lack of reusable packages. In Lua, if you want to import code from another file as a self-contained module, you use the require function, like this:

local foo = require("foo")

The above will import code from a file named foo.lua. However, this module import here is done relative to the entrypoint of your program, and not the file that's requesting the import, leading to a number of issues:

  • If you want to move a self-contained subsection of a codebase around, suddenly you have to rewrite all the imports of not only the rest of the codebase (as normal), but also of all the files in the subdirectory you've just moved
  • You can't have a self-contained 'package' of code that, say, you have in a git submodule - because the code in the submodule can't predict the path to the original entrypoint of your program relative to itself

While LuaRocks attempts to alleviate this issue to some extent (and I admit I haven't yet investigated it in any great detail), as far as I can tell it installs packages globally, which doesn't help if you're writing some Lua that is going to be embedded inside another program, as the global package may or may not be available. Even if it is available, it's debatable as to whether you'd be allowed to import it anyway, since many embedded environments have restrictions in place here for security purposes.

Despite these problems, I've found Lua to be quite a nice language to use (if a little on the verbose side, due to syntactical structure and the lack of a switch statement). Although it's not great at getting out of your way and letting you get on with your day (Javascript is better at this I find), it does have some particularly nice features - such as returning multiple values from a single function (which mostly makes up for the lack of exceptions), and some cute table definition syntax.
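The multiple-return-values pattern looks something like this (read_config and the filename here are made up for the example):

local function read_config(filename)
    local handle, err = io.open(filename, "r")
    if not handle then
        return nil, err -- return nil plus an error message instead of throwing
    end
    local content = handle:read("*a")
    handle:close()
    return content
end

local config, err = read_config("settings.conf")
if not config then
    print("Failed to read config: " .. err)
end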

It's not the kind of language you want to use for your next big project, but it's certainly worth experimenting with to broaden your horizons and learn a new language that forces you to program in a significantly different style than you would perhaps use normally.

New website for Pepperminty Wiki

By now, Pepperminty Wiki is quite probably my longest running project - and I'm absolutely committed to continuing to support and improve it over time (I use it to host quite a lot of very important information myself).

As part of this, one of the things I'm always looking to improve is the installation process and the first impression users get when they first visit Pepperminty Wiki. Currently, the project's main public presence is its GitHub repository. This is great (as it shows people that we're open-source), but it isn't particularly user-friendly for those who are less technically inclined.

To this end, I've built a shiny new website to introduce people to Pepperminty Wiki and the features it has to offer. I've been thinking about this for a while, and I realised that despite the fact that I haven't incremented the version number to v1.0 yet (as of the time of posting, the latest stable release is v0.22), Pepperminty Wiki is actually pretty mature, easy to deploy and use, and stable.

The new website for Pepperminty Wiki (link below)

(Above: The new Pepperminty Wiki website. Check it out here!)

The stability is a new one for me, as it isn't something I've traditionally put much of a focus on - instead focusing on educational purposes. Development of Pepperminty Wiki has sort of fallen into a pattern of 2-3 releases per year - each of which is preceded by one or more beta releases. I always leave at least 1 week between releasing a beta and the subsequent stable release to give myself and beta testers (of which Pepperminty Wiki has some! If you're reading this, I really appreciate it) time to spot any last-minute issues.

Anyway, the website can be found here: https://peppermint.mooncarrot.space/

Share it with your friends! :D

The initial plan was to buy a domain name like pepperminty.wiki for it, but after looking into the prices (~£36.29 per year) I found it was waaay too expensive for a project that I'm not earning a penny from working on (of course, if you're feeling that way inclined I have a Liberapay setup if you'd like to contribute towards server costs, but it's certainly not required).

Instead, I used a subdomain of one of my existing domains, mooncarrot.space (I use this one mostly for personal web app instances on my new infrastructure I'm blogging about in my cluster series), which is a bit shorter and easier to spell/say than starbeamrainbowlabs.com if you're not used to it.

After a few false starts, I settled on using Eleventy as my static site generator of choice. I'm not making use of all its features (not even close), but I've found it fairly easy to use and understand how it ticks - and also flexible enough that it will work with me, rather than attempting to force me into a particular way of working.

Honourable mentions here include Hugo (a great project, but if I recall correctly I found it confusing and complicated to set up and use), and documentation (an epic documentation generator for JS projects, but not suitable for this type of website - check out some of the docs I auto-generate via my Laminar CI setup: powahroot, applause-cli, terrain50).

The Pepperminty Wiki website light theme

(Above: The light theme for the website - which one you see depends on your system preference, via prefers-color-scheme. I prefer the dark theme myself, as it's easier on my eyes)

The experience of implementing the website was an interesting one. Never having built a website to 'sell' something before (even if this is for a thing that's free), I found the most challenging part of the experience determining what text to use to appropriately describe the features of Pepperminty Wiki.

From the beginning I sort of had a vision for how I wanted the website to look. I wanted an introductory bit at the top (with a screenshot at a cool angle!), followed by a bit that explained the features, then some screenshots with short descriptions, followed finally by a download section. I also wanted it to be completely mobile-friendly.

A screenshot of the website as viewed by a mobile device

(Above: A screenshot of the website as viewed by a mobile device. The Firefox Developer Tools were useful for simulating this)

For the most part, this panned out quite well. Keeping the design relatively simple enabled me to support mobile devices as I went along, with minimal tweaks needed at the end of the process (mobile support really needs to be part of the initial design process).

The cool screenshot at the top and the fancy orange buttons you'll see in various places across the site were especially fun to put together - the iterative process of adding CSS directives to bring the idea in my head to life was very satisfying. I think I'll use the same basic principle I used for the fancy buttons again elsewhere (try hovering over them and clicking them to see the animations).

The bottom of the website, showing the fancy orange buttons

(Above: The bottom of the website, showing the fancy orange buttons)

I did contemplate the idea of using a CSS framework for the website, but never having seriously used one for a personal project before, combined with the advent of CSS grid, resulted in the decision to abandon the use of a framework once again (I'll learn one eventually, I'm sure).

So far my experience with frameworks is that they just get in the way when you want to do something that wasn't considered when the framework was built, but I suppose that given their widespread use elsewhere that I really should make an effort to learn at least one framework to get that experience (any suggestions in the comments are welcome).

All in all the experience of building the Pepperminty Wiki website was an enjoyable one. It took a number of hours over a number of days to put together (putting the false starts aside), but I feel as though it was definitely worth it.

Find the website here: https://peppermint.mooncarrot.space/

If I end up moving it at a later date, I'll ensure there's a redirect in place so the above link won't break.

Found this useful? Got a comment about or a suggestion to improve the website? Comment below! I'd love to hear from you.

Stitching videos from frames with ffmpeg (and audio/video editing tricks)

ffmpeg is awesome. In case you haven't heard, ffmpeg is a command-line tool for editing and converting audio and video files. It's taken me a while to warm up to it (the command-line syntax is pretty complicated), but since I've found myself returning to it again and again I thought I'd blog about it so you can use it too (and also for my own reference :P).

I have 2 common tasks I perform with ffmpeg:

  1. Converting audio/video files to a more efficient format
  2. Stitching PNG images into a video file

I've found that most tasks will fall into 1 of the 2 boxes. Sometimes I want to do something more complicated, but I'll usually just look that up and combine the flags I find there with the ones I usually use for one of the 2 above tasks.

horizontal rule made up of an orange rotating maze cube

Stitching PNG images into a video

I don't often render an animation, but when I do I always render to individual images - that way if it crashes half way through I can more easily restart where I left off (and also divide the workload more easily across multiple machines if I need it done quickly).

Individual image files are all very well, but they have 2 problems:

  1. They take up lots of space
  2. You can't easily watch them like a video

Solving #1 is relatively easy with optipng (sudo apt install optipng), for example:

find . -iname "*.png" -print0 | xargs -0 -P4 -n1 optipng -preserve

I cooked that one up a while ago and I've had it saved in my favourites for ages. It finds all png images in the current directory (recursively) and optimises them with optipng.

To resolve #2, we can use ffmpeg as I mentioned earlier in this post. Here's one I use:

ffmpeg -r 24 -i path/to/frame_%04d.png -c:v libvpx-vp9 -b:v 2000k -crf 31 path/to/output.webm

A few points on this one:

  • -r 24: This sets the frames per second.
  • -i path/to/frame_%04d.png: The path to the png images to stitch. %04d refers to 4 successive digits in a row that count up from 1 - e.g. 0001, 0002, etc. %d would mean 1, 2, ...., and %02d for example would mean 01, 02, 03....
  • -c:v libvpx-vp9: The codec. We're exporting to webm, and vp9 is the latest available codec for webm video files.
  • -b:v 2000k: The bitrate. Higher values mean more bits will be used per second of video to encode detail. Raise to increase quality.
  • -crf 31: The encoding quality. Should be a number between 0 and 63 - higher means lower quality and a smaller file size.

Read more about encoding VP9 webm videos here: Encode/VP9 - FFmpeg

Converting audio / video files

I've talked about downmuxing audio before, so in this post I'm going to be focusing on video files instead. The above command for encoding to webm can be used with minimal adjustment to transcode videos from 1 format to another:

ffmpeg -hide_banner -i "path/to/input.avi" -c:v libvpx-vp9 -c:a libopus -crf 31 -b:v 0 "path/to/output.webm";

In this one, we handle the audio as well as the video all in 1 go, encoding the audio with the opus codec, and the video as before. This is particularly useful if you've got a bunch of old video files generated by an old camera. I've found that old cameras often like to save videos as raw uncompressed AVI files, and that I can reclaim a significant amount of disk space if I transcode them to something more efficient.

Naturally, I cooked up a one-liner that finds all relevant video files recursively in a given directory and transcodes them to 1 standard efficient format:

find . -iname "*.AVI" -print0 | nice -n20 xargs --verbose -0 -n1 -I{} sh -c 'old="{}"; new="${old%.*}.webm"; ffmpeg -hide_banner -i "${old}" -c:v libvpx-vp9 -c:a libopus -crf 30 -b:v 0 "${new}" && rm "${old}"';

Let's break this down. First, let's look at the commands in play:

  • find: Locates the target files to transcode
  • nice: Runs the following command at a lower scheduling priority, so it doesn't hog the CPU
  • xargs: Runs the following command over and over again for every line of input
  • sh: A shell, to execute multiple commands in sequence
  • ffmpeg: Does the transcoding
  • rm: Deletes the old file if transcoding completes successfully

The old="{}"; new="${old%.*}.webm"; bit is a bit of gymnastics to pull in the path to the target file (the -I {} tells xargs where to substitute it in) and determine the new filepath by replacing the file extension with .webm.
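If you haven't seen that parameter expansion syntax before, here's a standalone illustration of what it does:

old="holiday/beach.AVI"
new="${old%.*}.webm" # chop off everything after the last dot, then append .webm
echo "${new}" # prints holiday/beach.webm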

I've found that this does the job quite nicely, and I can set it running and go off to do something else while it happily sits there and transcodes all my old videos, reclaiming lots of disk space in the process.

Found this useful? Got a cool command for processing audio/video/image files in bulk? Comment below!

Monitoring latency / ping with Collectd and Bash

I use Collectd as the monitoring system for the devices I manage. As part of this, I use the Ping plugin to monitor latency to a number of different hosts, such as GitHub, the raspberry pi apt repo, 1.0.0.1, and this website.

I've noticed for a while that the ping plugin doesn't always work: Even when I check to ensure that a host is pingable before I add it to the monitoring list, Collectd doesn't always manage to ping it - showing it as NaN instead.

Yesterday I finally decided that enough was enough, and that I was going to do something about it. I've blogged about using the exec plugin with a bash script before in a previous post, in which I monitor HTTP response times with curl.

Using that script as a base, I adapted it to instead parse the output of the ping command and translate it into something that Collectd understands. If you haven't already, you'll want to go and read that post before continuing with this one.

The first order of business is to sort out the identifier we're going to use for the data in question. Collectd assigns an identifier to all the data coming in, and it uses this to determine where it is stored on disk - and subsequently which graph it will appear in on screen in the front-end.

Such identifiers follow this pattern:

host/plugin-instance/type-instance

This can be broken down into the following parts:

  • host: The hostname of the machine from which the data was collected
  • plugin: The name of the plugin that collected the data (e.g. memory, disk, thermal, etc.)
  • plugin instance: The instance name of the plugin, if the plugin is enabled multiple times
  • type: The type of reading that was collected
  • type instance: If multiple readings for a given type are collected, this instance differentiates between them.

Of note specifically here is the type, which must be one of a number of pre-defined values that can be found in a text file located at /usr/share/collectd/types.db. In my case, my types.db file contains the following definitions for ping:

  • ping: The average latency
  • ping_droprate: The percentage of packets that were dropped
  • ping_stddev: The standard deviation of the latency (lower = better; a high value here indicates potential network instability and you may encounter issues in voice / video calls for example)

To this end, I've decided on the following identifier strings:

HOSTNAME_HERE/ping-exec/ping-TARGET_NAME
HOSTNAME_HERE/ping-exec/ping_droprate-TARGET_NAME
HOSTNAME_HERE/ping-exec/ping_stddev-TARGET_NAME

I'm using exec as the plugin instance here to cause Collectd to store my ping results separately from those of the internal ping plugin. The type instance is the name of the target that is being pinged, resulting in multiple lines on the same graph.

To parse the output of the ping command, I've found it easiest if I push the output of the ping command to disk first, and then read it back afterwards. To do that, a temporary directory is needed:

temp_dir="$(mktemp --tmpdir="/dev/shm" -d "collectd-exec-ping-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

This creates a new temporary directory in /dev/shm (shared memory in RAM), and automatically deletes it when the script terminates by scheduling an exit trap.

Then, we can create a temporary file inside the new temporary directory and call the ping command:

tmpfile="$(mktemp --tmpdir="${temp_dir}" "ping-target-XXXXXXX")";
ping -O -c "${ping_count}" "${target}" >"${tmpfile}";

There are a number of variables in that second command. Let me break them down:

  • ${ping_count}: The number of pings to send (e.g. 3)
  • ${target}: The target to ping (e.g. starbeamrainbowlabs.com)
  • ${tmpfile}: The temporary file to which to write the output

For reference, the output of the ping command looks something like this:

PING starbeamrainbowlabs.com (5.196.73.75) 56(84) bytes of data.
64 bytes from starbeamrainbowlabs.com (5.196.73.75): icmp_seq=1 ttl=55 time=28.6 ms
64 bytes from starbeamrainbowlabs.com (5.196.73.75): icmp_seq=2 ttl=55 time=15.1 ms
64 bytes from starbeamrainbowlabs.com (5.196.73.75): icmp_seq=3 ttl=55 time=18.9 ms

--- starbeamrainbowlabs.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 15.145/20.886/28.574/5.652 ms

We're only interested in the last 2 lines of output. Since this is a long-running script that is going to be executing every 5 minutes, to minimise load (it will be running on a rather overburdened Raspberry Pi 3B+ :P), we will be minimising the number of subprocesses we spawn. To read in the last 2 lines of the file into an array, we can do this:

mapfile -s "$((ping_count+3))" -t file_data <"${tmpfile}"

The -s here tells mapfile (a bash built-in) to skip a given number of lines before reading from the file. Since we know the number of ping requests we sent and that there are 3 additional lines that don't contain the ping request output before the last 2 lines that we're interested in, we can calculate the number of lines we need to skip here.
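To make the skip calculation concrete, here's a standalone illustration (ping-output.txt here stands in for the temporary file written above): with ping_count=3, the file contains 1 header line, 3 reply lines, a blank line, and the statistics header - 6 lines in total - before the 2 summary lines we care about.

ping_count=3
mapfile -s "$((ping_count+3))" -t file_data <"ping-output.txt"
echo "${file_data[0]}" # 3 packets transmitted, 3 received, 0% packet loss, time 2002ms
echo "${file_data[1]}" # rtt min/avg/max/mdev = 15.145/20.886/28.574/5.652 ms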

Next, we can now parse the last 2 lines of the file. The read command (which is also a bash built-in, so it doesn't spawn a subprocess) is great for this purpose. Let's take it 1 line at a time:

read -r _ _ _ _ _ loss _ < <(echo "${file_data[0]}")
loss="${loss/\%}";

Here the read command splits the input on whitespace into multiple different variables. We are only interested in the packet loss here. While the other values might be interesting, Collectd (at least by default) doesn't have a definition in types.db for them and I don't see any huge benefits from adding them anyway, so I use an underscore _ to indicate, by common convention, that I'm not interested in those fields.

We then strip the percent sign % from the end of the packet loss value here too.

Next, let's extract the statistics from the very last line:

read -r _ _ _ _ _ _ min avg max stdev _ < <(echo "${file_data[1]//\// }");

Here we replace all forward slashes in the input with a space to allow read to split it properly. Then, we extract the 4 interesting values (although we can't actually log min and max).

With the values extracted, we can output the statistics we've collected in a format that Collectd understands:

echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_droprate-${target}\" interval=${COLLECTD_INTERVAL} N:${loss}";
echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping-${target}\" interval=${COLLECTD_INTERVAL} N:${avg}";
echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_stddev-${target}\" interval=${COLLECTD_INTERVAL} N:${stdev}";

Finally, we mustn't forget to delete the temporary file:

rm "${tmpfile}";

Those are the major changes I made from the earlier HTTP response time monitor. The full script can be found at the bottom of this post. The settings that control the operation of the script are at the top, allowing you to change the list of hosts to ping and the number of ping requests to make.

Save it to something like /etc/collectd/collectd-exec-ping.sh (don't forget to sudo chmod +x /etc/collectd/collectd-exec-ping.sh it), and then append this to your /etc/collectd/collectd.conf:

<Plugin exec>
        Exec    "nobody:nogroup"        "/etc/collectd/collectd-exec-ping.sh"
</Plugin>
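Don't forget to restart Collectd so that it picks up the new exec plugin definition, and keep an eye on the logs for a few minutes to check the script is being executed without errors. On systemd-based systems (the service name may vary slightly by distribution):

sudo systemctl restart collectd.service
sudo journalctl -u collectd.service --since "10 minutes ago"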

Final script

#!/usr/bin/env bash
set -o pipefail;

# Variables:
#   COLLECTD_INTERVAL   Interval at which to collect data
#   COLLECTD_HOSTNAME   The hostname of the local machine

declare targets=(
    "starbeamrainbowlabs.com"
    "github.com"
    "reddit.com"
    "raspbian.raspberrypi.org"
    "1.0.0.1"
)
ping_count="3";

###############################################################################

# Pure-bash alternative to sleep.
# Source: https://blog.dhampir.no/content/sleeping-without-a-subprocess-in-bash-and-how-to-sleep-forever
snore() {
    local IFS;
    [[ -n "${_snore_fd:-}" ]] || exec {_snore_fd}<> <(:);
    read ${1:+-t "$1"} -u $_snore_fd || :;
}

# Source: https://github.com/dylanaraps/pure-bash-bible#split-a-string-on-a-delimiter
split() {
    # Usage: split "string" "delimiter"
    IFS=$'\n' read -d "" -ra arr <<< "${1//$2/$'\n'}"
    printf '%s\n' "${arr[@]}"
}

# Source: https://github.com/dylanaraps/pure-bash-bible#use-regex-on-a-string
regex() {
    # Usage: regex "string" "regex"
    [[ $1 =~ $2 ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# Source: https://github.com/dylanaraps/pure-bash-bible#get-the-number-of-lines-in-a-file
# Altered to operate on the standard input.
count_lines() {
    # Usage: count_lines <"file"
    mapfile -tn 0 lines
    printf '%s\n' "${#lines[@]}"
}

# Source https://github.com/dylanaraps/pure-bash-bible#get-the-last-n-lines-of-a-file
tail() {
    # Usage: tail "n" "file"
    mapfile -tn 0 line < "$2"
    printf '%s\n' "${line[@]: -$1}"
}

###############################################################################

temp_dir="$(mktemp --tmpdir="/dev/shm" -d "collectd-exec-ping-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

# $1 - target hostname or IP to ping
check_target() {
    local target="${1}";

    tmpfile="$(mktemp --tmpdir="${temp_dir}" "ping-target-XXXXXXX")";

    ping -O -c "${ping_count}" "${target}" >"${tmpfile}";

    # readarray -t result < <(curl -sS --user-agent "${user_agent}" -o /dev/null --max-time 5 -w "%{http_code}\n%{time_total}\n" "${url}"; echo "${PIPESTATUS[*]}");
    mapfile -s "$((ping_count+3))" -t file_data <"${tmpfile}"

    read -r _ _ _ _ _ loss _ < <(echo "${file_data[0]}")
    loss="${loss/\%}";
    read -r _ _ _ _ _ _ min avg max stdev _ < <(echo "${file_data[1]//\// }");


    echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_droprate-${target}\" interval=${COLLECTD_INTERVAL} N:${loss}";
    echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping-${target}\" interval=${COLLECTD_INTERVAL} N:${avg}";
    echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_stddev-${target}\" interval=${COLLECTD_INTERVAL} N:${stdev}";

    rm "${tmpfile}";
}

while :; do
    for target in "${targets[@]}"; do
        # NOTE: We don't use concurrency here because that spawns additional subprocesses, which we want to try & avoid. Even though it looks slower, it's actually more efficient (and we don't potentially skew the results by measuring multiple things at once)
        check_target "${target}"
    done

    snore "${COLLECTD_INTERVAL}";
done

PhD Update 4: Hyper optimisation and frustration

Hello there again! It's been longer than I anticipated since the last proper post in this series. Before I continue, here's a list of all the (proper) posts in this series so far:

I haven't managed to get as much done since last time as I was hoping (partly due to the fact that I'm currently having to work from home, which is more challenging than I expected), but I have finished my implementation of the Temporal CNN, and am now working on hyperparameter optimisation. I've also fixed a number of issues in my rainfall radar data downloader and processing programs - which I'll talk about in more detail below.

HAIL-CAESAR and the iterative improvements

Someone at the University recently approached me (if you are reading this and have a blog, comment below and I'll update) to ask if they could use my rainfall radar data downloader program to download some rainfall radar data for their project. Naturally, I helped them out. This turned out to be a great thing for me as well, as with their help I managed to uncover a number of very nasty issues with the data pipeline I had been building up to that point:

  • The hydro index file that HAIL-CAESAR uses was completely scrambled
  • The date on the data downloaded was a month out
  • The data downloaded was (and still is) rotated by 90° on disk
  • The data was out by a factor of 32

While fixing each of these bugs was a (relatively) simple process, I can't help but wonder how they managed to escape my notice until (for all but 1 of them) someone else told me about them.

The other issue was that because of the amount of data I'm working with, it took forever to re-run the program to test to see if I had managed to fix the problem - and if I had, I'd encounter another problem. This long iteration process makes implementing a new feature or fixing a bug a very time-consuming process.

Despite fixing all these issues, I'm still experiencing issues with my latest refactoring of the rainfall radar data downloader (namely a hang in the event system when reading tar files). My current thinking is that I'm going to completely reimplement it (using snippets from the old program) if I need to use it again in the future, as it is currently neither particularly efficient (it's single-threaded) nor easy to bugfix (it's pretty complicated).

I've got an idea for a parallel system that processes each tar file separately first, and then only after all the tar files have been converted separately are they strung together into the actual files the existing implementation spits out currently.

Temporal CNN delight

Last time, I had just started my implementation of a Temporal CNN. This is now pretty much complete, and I've also been able to run it and get some results! Check out this graph:

A graph showing the root mean squared error while training my Temporal CNN implementation - more details below.

This graph shows the root mean squared error when training on 1000 time steps of data (about 3 days 11 hours or so). Epochs are along the X axis, and the root mean squared error is on the Y axis.

A few things to note here. Firstly, the implementation I've come up with essentially does video-to-image translation. The original model in the paper I've linked to demonstrates a classification task (specifically land use over time) - so what I'm doing is a little different.

Secondly, I've omitted the root mean squared error for the first epoch. It was so high that it made the rest of the graph impossible to see - hence the omission.

I'm pretty pleased with this result so far - as I have a nice downwards curve indicating that the model is (probably) learning something useful.

I am still rather nervous about the output though, as due to the way I've implemented the network I haven't actually been able to 'see' the output of the network at all as an image yet. Doing so would take a while to implement, so I haven't done so for now (although I really should do this soon). It would be really cool to see a short video (maybe at ~10fps) of the network output as the epochs move forwards to visualise the network training process.

Hyperparameter frustrations

Lastly, at the suggestion of my supervisor I've been working on hyperparameter optimisation. In short, this consists of training the model with random combinations of hyperparameters and seeing which ones work best.

A hyperparameter is a tunable parameter that controls an aspect of a model. In my case, I have 2 key hyperparameters I need to tune:

  • Filter count: CNN layers in Tensorflow.js have a filter count associated with them. I theorise that increasing this will increase the model's ability to learn spatial information.
  • Temporal depth: The number of time steps to push through the model at once. Increasing this will allow the model to make predictions based on events that occur further in the past.

My eventual aim here is to create a heatmap that has the above hyperparameters along the X and Y axes, and the colour showing the accuracy of the model that was trained - similar to the one I created previously.

To do this, I implemented a program that tries random combinations of hyperparameters - but never the same combination twice. It starts the model in a subprocess and passes the chosen filter count and temporal depth values in as CLI arguments, which the child process picks up, parses, and then trains a model based on. This CLI is the same one as the one I developed that generated the above graph in the previous section of this blog post.

This approach has the advantage that it isolates the model in a subprocess, so when the subprocess exits and a new one spawns for the next combination of hyperparameters, the environment is completely clean and there isn't anything that might interfere with it.

Unfortunately though, while I set off a run of this implementation before I took a 'holiday' - and even checked on it to ensure it was running as expected (multiple times) - it still managed to crash when I wasn't looking.

After some debugging, I discovered that the problem was that the model ran out of memory while training. This was something I had expected - so I used the --unhandled-rejections=strict option for Node.js, which tells Node.js to crash and exit when an UnhandledPromiseRejection is thrown - like this one:

2020-08-04 15:31:26.174395: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at conv_grad_ops_3d.cc:1783 : Resource exhausted: OOM when allocating tensor with shape[8,2,104,348,210] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
(node:62355) UnhandledPromiseRejectionWarning: Error: Invalid TF_Status: 8
Message: OOM when allocating tensor with shape[8,2,104,348,210] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    at Object.<anonymous> (<anonymous>)
.....

Unfortunately though, I used this flag on the parent process (that drives the hyperparameter optimisation) and not the child process - leading to a situation whereby the child process crashed due to the aforementioned error and then just hung around doing nothing. Even more frustratingly, the solution is as simple as doing a quick export NODE_OPTIONS="--unhandled-rejections=strict" before running the hyperparameter optimisation program, to ensure that the flag propagates to the child processes.

Very frustrating indeed - especially considering I calculate that it will take multiple weeks to gather enough data to create a meaningful heatmap.

Conclusion

Reading back over this post, I have got more done than I expected. I've started and finished my Temporal CNN implementation, and fixed lots of bugs in the existing code.

However, the long iteration times when testing code I've written (despite using a small slice of the dataset to test with), the large datasets I'm working with, having to work on a system remotely via SSH by pushing and pulling code with git (many times over), working from home all the time, and the continued bugs I've been facing (and will likely continue to face) have caused, and are causing, significant unexpected slowdowns moving forwards.

At least the VPN is no longer dropping out every 5 minutes!

Sources and Further Reading

3D mazes with Lua, OpenSCAD, and Blender

Way back in 2015, I posted a language review about Lua. In that post, I ported an even older 2D maze generator I'd implemented in Python when I was in secondary school on a Raspberry Pi (this was one of the first experiences I had with the Raspberry Pi). I talked about how Lua was easy to get started with, but difficult to do anything serious in because everything starts from 1, not 0 - and that immutable strings are awkward.

Since then, I've gained lots more experience with the language. As an aside, I discovered a nice paradigm for building strings:

local function string_example()
    local parts = {} -- Create a table
    table.insert(parts, "This is ") -- Add some strings
    table.insert(parts, "a ")
    table.insert(parts, "string")
    return table.concat(parts, "") -- Concatenate them all at once and return
end

Anyway, before I get too distracted, I think the best way to continue this post is with a picture:

Fair warning: This blog post is pretty media heavy. If you are viewing on your mobile device with a limited data connection, you might want to continue reading on another device later.

An awesome render of a 3D maze done in Blender - see below for an explanation.

Pretty cool, right? Perhaps I should explain a little about how I got here. A month or two ago, I rediscovered the above blog post and the Lua port of my Python 2d maze generator. It outputs mazes like this:

#################
#   #     #     #
### ##### ##### #
# #   #       # #
# # # # # ##### #
# # #   #       #
# ### # ### #####
#     #   #     #
#################

(I can't believe I didn't include example output in my previous blog post!)

My first thought was that I could upgrade it to support 3d mazes as well. One thing led to another, and I ended up with a 3D maze generator that output something like this:

#################
#################
#################
#################
#################
#################
#################
#################
#################

#################
#   #   #       #
# ### ###########
#           # # #
# ####### ### # #
#       #   # # #
# ### ####### # #
#   #       # # #
#################

#################
##### ### #######
#################
############### #
#################
############# # #
#################
# ########### ###
#################

#################
#               #
# ### ###########
#   #         # #
# ###############
#               #
##### # ####### #
#   # #     # # #
#################

#################
#################
#################
#################
#################
#################
#################
#################
#################

Each block of hash (#) symbols is a layer of the maze. It's a bit hard to visualise though, so I decided to do something about it. For my masters project, I used OpenSCAD to design a housing for an Internet of Things project I did. Since it's essentially a programming language for expressing 3D models, I realised that it would be perfect for representing my 3D mazes - and since said mazes use a grid, I can simply generate an OpenSCAD file full of cubes for all the locations at which I have a hash symbol in the output (the data itself is stored in a nested table setup, which I then process).
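To give a flavour of what that conversion step looks like, here's a minimal sketch (the function and variable names are invented for this post - the real multimaze code will differ):

-- Assumes maze[z][y][x] is a nested table of single-character strings,
-- where "#" means there's a solid block at that grid position.
local function maze_to_openscad(maze, cube_size)
    local lines = { "union() {" }
    for z, layer in ipairs(maze) do
        for y, row in ipairs(layer) do
            for x, cell in ipairs(row) do
                if cell == "#" then
                    table.insert(lines, string.format(
                        "\ttranslate([%d, %d, %d]) cube(%d);",
                        (x - 1) * cube_size, (y - 1) * cube_size, (z - 1) * cube_size,
                        cube_size
                    ))
                end
            end
        end
    end
    table.insert(lines, "}")
    return table.concat(lines, "\n")
end

Write the returned string to a .scad file, open it in OpenSCAD, and the preview appears.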

A screenshot of OpenSCAD showing a generated maze.

This is much better. We can clearly see the maze now and navigate around it. OpenSCAD's preview controls are really quite easy to pick up. What you see in the above screenshot is an 'inverted' version of the maze - i.e. instead of carving out a solid block, the algorithm walks around an empty space inside a defined region.

The algorithm that generates the maze itself is pretty much the same as the original algorithm I devised myself in Python (which I've now lost, sadly - as I didn't use Git back then).

It starts in the top left corner, and then does a random walk around the defined area. It keeps track of where it has been in a node list (basically a list of coordinates), and every time it takes a step forwards, there's a chance it will jump back to a previous position in the nodes list. Once it can't jump anywhere from a position, that position is considered complete and is removed from the nodes list. Once the node list is empty, the maze is considered complete and it returns the output.
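Here's a rough 2D sketch of that idea in Lua, to make the description concrete (this is a paraphrase of the algorithm as described above, not the actual multimaze code):

math.randomseed(os.time())

local width, height = 17, 9 -- odd numbers, so the outer walls line up
local maze = {}
for y = 1, height do
    maze[y] = {}
    for x = 1, width do maze[y][x] = "#" end
end

local nodes = { { x = 2, y = 2 } } -- start near the top-left corner
maze[2][2] = " "

-- Step 2 cells at a time, carving out the wall cell in between
local directions = { { 0, -2 }, { 0, 2 }, { -2, 0 }, { 2, 0 } }

local function unvisited_neighbours(node)
    local result = {}
    for _, d in ipairs(directions) do
        local nx, ny = node.x + d[1], node.y + d[2]
        if nx > 1 and ny > 1 and nx < width and ny < height and maze[ny][nx] == "#" then
            table.insert(result, { x = nx, y = ny })
        end
    end
    return result
end

while #nodes > 0 do
    -- Sometimes carry on from the newest position, sometimes jump back to an older one
    local i = math.random() < 0.5 and #nodes or math.random(#nodes)
    local node = nodes[i]
    local neighbours = unvisited_neighbours(node)
    if #neighbours == 0 then
        table.remove(nodes, i) -- this position is complete
    else
        local next_node = neighbours[math.random(#neighbours)]
        maze[(node.y + next_node.y) / 2][(node.x + next_node.x) / 2] = " " -- knock down the wall
        maze[next_node.y][next_node.x] = " "
        table.insert(nodes, next_node)
    end
end

for y = 1, height do print(table.concat(maze[y])) end

The 3D version is the same idea, just with a z axis and 2 extra directions to walk in.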

A divider made up of orange renders of a small 7x7x7 maze rotating

As soon as I saw the STL export function though, I knew I could do better. I've used Blender before a little bit - it's a production-grade free open-source rendering program. You can model things in it and apply textures to them, and then render the result. It is using a program like this that many CGI pictures (and films!) are created.

Crucially for my case, I found the STL import function. With that, I could import the STL I exported from OpenSCAD, and then have some fun playing around with the settings to get some cool renders of some mazes:

(Above: Some renders of some of the outputs of the maze generator. See the full size image [3 MiB])

The sizes of the above are as follows, in grid squares as generated by the Lua 3d maze generator:

  • Blue: 15 x 15 x 15
  • Orange: 7 x 7 x 7
  • Purple: 17 x 15 x 11, with a path length of 4 (i.e. the generator jumps forwards by 4 spaces instead of 2 during the random walk)
  • Green: 21 x 21 x 7

Somehow it's quite satisfying to watch it render, with the little squares gradually spiralling their way out from the centre along a Hilbert curve - so I looked into how to create a glass texture, and how to set up volumetric rendering. It was not actually too difficult to do (the most challenging part was getting the lights in the right place with the right strength). Here's a trio of renders that show the iterative process of getting to the final image you see at the top of this post:

(Above: Some renders of the blue 15x15x15 maze from the previous image, this time with a glass texture. See the full size image [3.4 MiB])

From left to right:

  1. My initial attempt using clear glass
  2. Frosting the glass made it look better
  3. Adding volumetric lighting makes it look way cooler!

I guess that you could give the same treatment to any STL file you like.

Anyway, the code for my maze generator can be found here on my private git server: sbrl/multimaze

The repository README contains instructions on how to use it. I won't duplicate that here, because it will probably change over time, and then this blog post would be out of date.

Before I go, I'll leave you with some animations of some mazes rotating. This whole experience of generating and rendering mazes has been really fun - it's quite far outside what I've been doing recently. I think I'd like to do some more of this in the future!

Update: I've re-rendered a new version at a lower quality. This should help mobile devices! The high-quality version can still be accessed via the links below.

(High-quality version: webm - vp9, ogv - ogg theora, mp4 - h264)

Partitioning and mounting a new disk using LVM

As I've been doing my PhD, I've been acquiring quite a lot of data that needs storing. To this end, I've installed a new 2 TiB hard drive in my Lab PC. Naturally, this necessitates formatting it so that I can use it. I've been using LVM (Logical Volume Management) for my OS disk, so I decided to use it for my new disk too.

Unfortunately, I don't currently have GUI access to my Lab PC - instead for the past few months I've been limited to SSH access (which is still much better than nothing at all), so I can't really use any GUI tool to do this for me.

This provided me with a perfect opportunity to get into LVM through the terminal instead. As it turns out, it's not actually that bad. In this post, I'm going to take you through the process of formatting a fresh disk: from creating a partition table to mounting the LVM logical volume.

Let's start by partitioning the disk. For this, we'll use the fdisk CLI tool (install it on Debian-based systems with sudo apt install fdisk if it's not available already). It should be obvious, but for this tutorial root access is required to execute pretty much all the commands we'll be using.

Start fdisk like so:

sudo fdisk /dev/sdX

Replace X with the index of your disk (try lsblk - no sudo required - to list your disks). fdisk works a bit like a shell: you enter letters (or short sequences) followed by hitting enter to give it commands. Enter the following sequence of commands:

  • g: Create new GPT partition table
  • n: Create new partition (allow it to fill the disk)
  • t: Change partition type
    • L: List all known partition types
    • 31: Change to a Linux LVM partition
  • p: Preview final partition setup
  • w: Write changes to disk and exit

Some commands need additional information - fdisk will prompt you here.

With our disk partitioned, we now need to get LVM organised. LVM has 3 different key concepts:

  • Physical Volumes: The physical disk partitions it should use as a storage area (e.g. we created an appropriate partition type above)
  • Volume Groups: Groups of 1 or more Physical Volumes
  • Logical Volumes: The volumes that you use, format, and mount - they are stored in Volume Groups.

To go with these, there are also 3 different classes of commands that LVM exposes: pv* commands for Physical Volumes, vg* for Volume Groups, and lv* for Logical Volumes.

With respect to Physical Volumes, these are physical partitions on disk. For legacy MSDOS partition tables, these must have a partition type of 8e. For newer GPT partition tables (such as the one we created above), these need the partition id 31 (Linux LVM) - as described above.
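Before the partition can join a Volume Group, it needs to be initialised as a Physical Volume. Recent versions of vgcreate will often do this for you when handed a bare partition, but it doesn't hurt to do it explicitly (and it gives you a chance to spot mistakes early):

sudo pvcreate /dev/sdXY
sudo pvdisplay

As elsewhere in this post, replace /dev/sdXY with your new partition.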

Next, we create a new volume group that holds our physical volume:

sudo vgcreate vg-pool-name /dev/sdXY

....replacing /dev/sdXY with the partition you want to add (again, lsblk is helpful here). Don't forget to change the name of the Volume Group to something more descriptive than vg-pool-name - though keeping the vg prefix is recommended.

If you like, you can display the current Volume Groups with the following command:

sudo vgdisplay

Then, create a new logical volume that uses all of the space in the new volume group:

sudo lvcreate -l 100%FREE -n lv-rocket-blueprints vg-pool-name

Again, replace vg-pool-name with the name of your Volume Group, and lv-rocket-blueprints with the desired name of your new logical volume. tldr (for which I review pull requests) has a nice page on lvcreate. If you have a tldr client installed, simply do this to see it:

tldr lvcreate

With our logical volume created, we can now format it. I'm going to format it as ext4, but you can format it as anything you like.

sudo mkfs.ext4 /dev/vg-pool-name/lv-rocket-blueprints

As before, replace vg-pool-name and lv-rocket-blueprints with your Volume Group and Logical Volume names respectively.

Finally, mount the newly formatted partition:

sudo mkdir /mnt/rocket-blueprints
sudo mount /dev/vg-pool-name/lv-rocket-blueprints /mnt/rocket-blueprints

You can mount it anywhere - though I'd recommend mounting it to somewhere in /mnt.

Auto-mounting LVM logical volumes

A common thing many (myself included) want to do is automatically mount an LVM partition on boot. This is fairly trivial to do with /etc/fstab.

To start, find the logical volume's id with the following command:

sudo blkid

It should be present as UUID="THE_UUID_HERE". Pick out only the UUID of the logical volume you want to automount here. As a side note, using the UUID is generally a better idea than the name, because the name of the partition (whether it's an LVM partition or a physical /dev/sdXY partition) might change, while the UUID always stays the same.

Before continuing, ensure that the partition is unmounted:

sudo umount /mnt/rocket-blueprints

Now, edit /etc/fstab (e.g. with sudo nano /etc/fstab), and append something like this following to the bottom of the file:

UUID=THE_UUID_YOU_FOUND_HERE    /mnt/rocket-blueprints  ext4    defaults,noauto 0   2

Replace THE_UUID_YOU_FOUND_HERE with the UUID you located with blkid, and /mnt/rocket-blueprints with the path to where you want to mount it to. If an empty directory doesn't already exist at the target mount point, don't forget to create it (e.g. with sudo mkdir /mnt/rocket-blueprints).

Save and close /etc/fstab, and then try mounting the partition using the /etc/fstab definition:

sudo mount /mnt/rocket-blueprints

If it works, edit /etc/fstab again and replace noauto with auto to automatically mount it on boot.

That's everything you need to know to get up and running with LVM. I've included my sources below - in particular check out the howtogeek.com tutorial, as it's not only very detailed, but it also has a cheat sheet containing most of the different LVM commands that are available.

Found this useful? Still having issues? Got a suggestion? Comment below!

Sources and further reading

Cluster, Part 9: The Border Between | Load Balancing with Fabio

Hello again! It's been a while since the last one (mainly since I've been unsure about a few architectural things), but I'm now ready to continue writing about my setup. Before we continue, here's a refresher of everything we've done so far:

In this post, we're going to look at tying off our primary pipeline. So far, we've got job scheduling with Nomad, (superglue!) service discovery with Consul, and shared storage backed with NFS (although I'm going to revisit this eventually), with everything underpinned by a WireGuard mesh VPN with wesher.

In order to allow people to interact with services that are running on the cluster, we need something that will translate from the weird and strange world of anything running somewhere anywhere, and everywhere in-between into something that makes sense from an outside perspective. We want to have a single gateway by which we can control and manage access.

It is for these purposes that we're going to add Fabio to our stack. Its configuration is backed by Consul, and it is relatively simple and easy to understand. Having the config backed by Consul nets us multiple benefits:

  • It can run anywhere on the cluster we like in a pinch
  • We can configure new routes directly from a Nomad job spec file (although we still need to update the Unbound config)
  • The configuration of Fabio gains additional data redundancy by being stored on multiple nodes in the cluster

Like in previous parts of this series, Fabio isn't available to install with apt directly, so I've packaged it into my apt repository. If you haven't yet set up my apt repository, up-to-date instructions on how to do so can be found at the top of its main page - just click the aforementioned link (I'm not going to include instructions here, as they may go out of date at a later time).

Once you've set up my apt repository (or downloaded the Fabio binary manually, though I don't recommend that as it's more difficult to keep up-to-date), we can install Fabio like so:

sudo apt install fabio

This should be done on your primary (controller) node in your cluster. You can do it on a secondary node too if you'd like to increase redundancy - just follow these instructions on both nodes, one at a time. I'll be doing this soon myself: I've just been distracted with other things :P

Next, we need a service file. For systemd users (I'm using Raspbian at the moment), I have an apt package:

sudo apt install fabio-systemd

With this installed, we need to create a (very) minimal configuration file. Here it is:

proxy.addr = :80;proto=http
proxy.auth = name=admin;type=basic;file=/etc/fabio/auth.admin.htpasswd

Pretty short, right? This does 2 things:

  1. Tells Fabio to listen on port 80 for HTTP requests (we'll be tackling HTTPS in a separate post - we need Vault for that)
  2. Tells Fabio about the admin auth realm and where it can find the .htpasswd file that corresponds with it

Fabio's password authentication uses HTTP Basic Auth - which is insecure over unencrypted HTTP. We'll be working towards improving the situation here, and I'll insert a reminder to change all your passwords when we get to that point - but there are quite a number of obstacles between here and there that we have to deal with first.

With this in mind, take a copy of the above Fabio config file and write it to /etc/fabio/fabio.properties. Next, we need to generate that htpasswd file we reference in the config file. There are many tools out there that can be used for this purpose - for example the htpasswd tool in the apache2-utils package (the -c flag here tells it to create the file, since it doesn't exist yet):

sudo htpasswd -c /etc/fabio/auth.admin.htpasswd username
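If you need to add more users to the same realm later, run htpasswd again without the -c flag (another_username is just a placeholder here):

sudo htpasswd /etc/fabio/auth.admin.htpasswd another_username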

I like this authentication setup for Fabio, as it allows one to have a single easily configurable set of realms for different purposes if desired.

If you're setting up Fabio on multiple servers, you'll want to put your config file in your shared NFS storage and create a symlink at /etc/fabio/fabio.properties instead. Do that like this:

sudo ln -s /mnt/shared/config/fabio/fabio.properties /etc/fabio/fabio.properties

...updating the /mnt/... path accordingly. Don't forget to adjust the /etc/fabio/auth.admin.htpasswd path in fabio.properties as well.
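Alternatively, if you'd rather leave the path in fabio.properties alone, you could symlink the htpasswd file across in just the same way - a quick sketch, assuming the same /mnt/shared/config/fabio/ directory as above:

sudo mv /etc/fabio/auth.admin.htpasswd /mnt/shared/config/fabio/auth.admin.htpasswd
sudo ln -s /mnt/shared/config/fabio/auth.admin.htpasswd /etc/fabio/auth.admin.htpasswd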

Now that we've got the configuration file out of the way, we can start Fabio for the first time! Do that like this:

sudo systemctl start fabio.service
sudo systemctl enable fabio.service
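If you want to check that it came up cleanly, systemd will tell you:

sudo systemctl status fabio.service
sudo journalctl -u fabio.service --no-pager -n 50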

Don't forget to punch a hole in the firewall:

sudo ufw allow 80/tcp comment fabio
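A quick curl from the node itself should confirm that it's listening - though don't expect anything useful back just yet:

curl -i http://localhost:80/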

Fabio is running - but it's not particularly useful, as we haven't configured any routes! Let's add some routes now. The first few routes we're going to add will be manual routes, which allow us to tell Fabio about a static route we want it to add to its routing table.

Fabio itself actually has a web interface, which will make a good first target for testing out our new cool toy. I mentioned earlier that Fabio gets its configuration from Consul - and it's now that we're going to take advantage of that. Consul isn't just a service discovery tool, you see - it's also a shared configuration manager, via a fancy hierarchical distributed key-value data store.

In this datastore Fabio looks in particular at the keys in the fabio directory. Create a new key under here with the Consul CLI like so:

consul kv put "fabio/fabio" 'route add fabio fabio.bobsrockets.com/ http://NODE_NAME.node.mooncarrot.space:9998 tags "mission-control" opts "auth=admin"'

Replace NODE_NAME with the name of the node you're running Fabio on, and bobsrockets.com with a domain name you've bought. Once done, update your DNS config to point fabio.bobsrockets.com to the node that's running Fabio (you might want to refer back to my earlier post on Unbound - don't forget to restart unbound with sudo systemctl restart unbound).
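You can verify that the key made it into Consul like so:

consul kv get fabio/fabio

If you're using Unbound like I am, the DNS side is just another local-data record - something along these lines, with a made-up IP address (adjust it to match however you set things up in the earlier Unbound post):

local-data: "fabio.bobsrockets.com. IN A 172.16.230.100"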

When you have your DNS server updated, you should be able to point your browser at fabio.bobsrockets.com. No reloading of Fabio is needed - it picks up changes dynamically and automagically! It should prompt you for your password, and then you should see the Fabio web interface. It should look something like this:

The Fabio web interface

As you can see, I've got a number of services running - including a few that I'm going to be blogging about soon-ish, such as Vault (but I haven't yet learnt how to use it :P) and Docker Registry UI (which is useful but has some issues - I'm going to see if HTTPS helps fix some of them up as I'm getting some errors in the dev tools about the SubtleCrypto API, which is only available in secure contexts).

Those services with IP addresses as the destination are defined through Nomad, and auto-update based on the host upon which they are running.

In the web interface you can click on overrides on the top bar to view and edit the configuration for the static routes you've got configured. You can't create new ones though, which is a shame.

Using the same technique as described above, you can create manual routes for Nomad and Consul - as they have web interfaces too! If you haven't already, you'll need to enable them with ui = true in the Nomad and Consul server configuration files respectively. For example, you could use these definitions:

route add nomad nomad.seanssatellites.io/ http://nomad.service.seanssatellites.io:4646 tags "mission-control" opts "auth=admin"
route add consul consul.billsboosters.space/ http://consul.service.billsboosters.space:8500 tags "mission-control" opts "auth=admin"

If you do the Consul one first, you can use the web interface to create the definition for Nomad :D
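If you'd rather do it from the CLI instead, pushing these into Consul works just like before. For example, for the Nomad route (the key name fabio/nomad is just my choice - going by the explanation above, any key under the fabio directory should do):

consul kv put "fabio/nomad" 'route add nomad nomad.seanssatellites.io/ http://nomad.service.seanssatellites.io:4646 tags "mission-control" opts "auth=admin"'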

It's perhaps worth making a quick note of some parts of the above route definitions:

  • opts "auth=admin": This bit activates HTTP Basic Auth with the specified realm
  • consul.billsboosters.space/: This is the domain through which outside users will access the service. The trailing slash is very important.

From here, the last item on the list for this post is automatic routes via Nomad jobs. Since the Docker Registry is the only job we've got running on Nomad so far, let's use that as an example. Adding a Fabio route in this manner requires 3 steps:

  1. Find the service stanza in your Docker Registry Nomad job file, and edit the tags list to include a pair of tags something like urlprefix-registry.tillystelescopes.fr/ and auth=admin (again, the trailing slash is important, and the urlprefix- bit instructs Fabio that it's the domain name to route traffic from to the container) - see the sketch after this list.
  2. Save the edits to the Nomad job file and re-run it with nomad job run path/to/file.nomad
  3. Update your DNS with a new record pointing registry.tillystelescopes.fr at the IP address(es) of the node(s) running Fabio
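For reference, the relevant part of the job file might end up looking something like the following. This is only a rough sketch: the service name and port label are guesses at what your particular Docker Registry job uses, so adjust them to match your own file.

service {
  # "registry" and "http" are placeholders - match these to your existing job file
  name = "registry"
  port = "http"

  tags = [
    "urlprefix-registry.tillystelescopes.fr/",
    "auth=admin"
  ]
}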

Also pretty simple to get used to, right? From here, step 4 of the official quickstart guide is useful. It explains the different service tags (like the urlprefix- and auth=admin ones we created above) that are supported. Apparently raw TCP forwarding is also supported - though personally I'm waiting eagerly on UDP forwarding for some services I'd like to run.

The rest of the Fabio docs are a bit of a mess, but I've found them more understandable than those of Traefik - the solution I investigated in frustration before turning to Fabio on a recommendation from someone over in the r/selfhosted subreddit (whoever says "Traefik is simple!" is lying - I can't make sense of anything - it might as well be written in hieroglyphs.....).

Looking into the future, our path is diverging into 2 clear routes:

  • Getting services up and running on our new cluster
  • Securing said cluster to avoid attack

While relatively separate goals, they do intertwine at intervals. Moving forwards, we're going to be oscillating between these 2 goals. Likely topics include Vault (though it'll take several blog posts to realise any benefit from it at this point), and getting some Docker container infrastructure set up.

Speaking of Docker container infrastructure, if anyone has any ideas as to how to auto-rebuild docker containers and/or auto-restart Nomad jobs to keep them up-to-date, I'd love to know in a comment below. I'm currently scratching my head over that one....

Found this interesting? Got an idea that would improve on my setup? Confused about something? Comment below!

Art by Mythdael