Viktor Vilhelm Sonesten

Implementing a NixOS router with Tailscale DNS, adblocking, and impermanence

Table of Contents

  1. Figuring IPMI and SOL out
  2. Checking in on BIOS settings
  3. Installing quieter fans
  4. Bootstrapping the router via nixos-anywhere and disko
  5. Configuring NixOS for IPMI SOL
  6. SrvOS
  7. router.nix
  8. Using the X8 for Tailscale DNS
  9. Implementing network-wide adblocking
  10. Impermanence
  11. Summa summarum

I have replaced my OpenWrt-based home router with a NixOS configuration on a Supermicro X8DTU. OpenWrt comes with enough friction that I didn't want to interact with it, but the main pain point was the imperative configuration. I've now used NixOS long enough to discard any system that I cannot configure declaratively. And now I've finally replaced my router with a system I feel I can actually keep track of, and have much better control over.

Below I'll outline and summarize the steps I took to transform the X8 into a decent home router. Code snippets are a mix of NixOS modules and flake-parts modules.

Figuring IPMI and SOL out

The Supermicro X8DTU that I have (henceforth just "X8") is an older non-UEFI system. It has three Ethernet NICs, one of which is a dedicated IPMI interface. IPMI is… not perfect, but ipmitool makes it usable. The IPMI firmware also exposes a web interface on the regular ports, but every browser I try rejects it due to deprecated TLS ciphers.

For debug purposes I wanted IPMI SOL (serial-over-lan) available. This allows remote access to BIOS settings, grub, and a tty in the case of SSH being unavailable.

I already had an IPMI user set up.

The ipmitool commands below are an alias of

ipmitool -H <IP> -U tmplt -P <password> -I lanplus

Some exploration led to the following:

  • SOL is set up via some "payload" indices; ipmitool user list showed that tmplt had index 6.
  • ipmitool sol payload enable 1 6 was required to allow my user to use SOL. Without it I got a "payload disabled" error.
  • ipmitool sol activate is then used to jump into a SOL session. Sometimes this fails (usually during power cycles), and I'll have to ipmitool sol deactivate first.

A lot of reboots were performed via ipmitool chassis power cycle.
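The exploration above can be condensed into a short transcript. The BMC address and password below are placeholders; substitute your own:

```shell
# Placeholder BMC address and credentials.
alias ipmi='ipmitool -H 192.168.1.100 -U tmplt -P hunter2 -I lanplus'

ipmi user list 1              # find the user's payload index (6 for tmplt)
ipmi sol payload enable 1 6   # allow user 6 on channel 1 to use SOL
ipmi sol activate             # attach to the serial console
ipmi sol deactivate           # clear a stale session if activation fails
ipmi chassis power cycle      # reboot the machine
```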

Checking in on BIOS settings

With SOL and remote BIOS access working, I changed some system settings (after the regular quick-time event of spamming F4/Del during boot):

  • booting from USB was prioritized, and iPXE disabled;
  • fan speeds were set to the lowest setting, but the stock Nidec 40mm jet engines were still too loud at 6k RPM;
  • the BIOS was verified to be the latest version (unfortunately the system is too old for UEFI).

Installing quieter fans

The Nidecs could be heard through the solid-wood door the server rack is hidden behind, so I ordered a bunch of Noctua NF-A4x20 fans as replacements. These are 5.5 CFM at 15 dBA, down from 24 CFM at 56 dBA. Much quieter, but less airflow; good enough for a router.

Unfortunately, now I hear the PSU fans instead. Go figure.

Bootstrapping the router via nixos-anywhere and disko

I've deployed a bunch of systems lately and the combination of nixos-anywhere and disko is a significant time saver. In my infra.git I have a manual-install-medium.nix that I use a bunch. It is just the regular installation-cd-minimal.nix from nixpkgs, but with a more descriptive hostname and my public SSH key:

# Declares a manual installation medium, with SSH key set.
{ inputs, ... }: {
  perSystem = { config, system, pkgs, ... }: {
    packages.manual-install-medium = (inputs.nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ({ modulesPath, ... }: {
          imports =
            [ "${modulesPath}/installer/cd-dvd/installation-cd-minimal.nix" ];

          users.extraUsers.root.openssh.authorizedKeys.keys = [
            # ...
          ];

          networking.hostName = "nixos-livecd";
        })
      ];
    }).config.system.build.isoImage;
  };
}

Writing the built ISO to a USB drive and booting from it, we're just one nixos-anywhere invocation away from a fully installed (and configured) system.

I stumbled upon some issues with the lack of UEFI support, but thanks to the disko issue tracker, I arrived at a hybrid approach (almost a verbatim copy):

{
  boot.loader.grub = {
    enable = true;
    zfsSupport = true;
  };

  disko.devices = {
    disk.main = {
      type = "disk";
      content = {
        type = "gpt";
        partitions = {
          grub = {
            size = "1M";
            type = "EF02"; # for grub MBR
            priority = 1;
          };
          boot = {
            size = "1G";
            type = "EF00";
            content = {
              type = "filesystem";
              format = "vfat";
              mountpoint = "/boot";
            };
            priority = 2;
            hybrid.mbrBootableFlag = true;
          };
          root = {
            size = "100%";
            content = {
              type = "zfs";
              pool = "zroot";
            };
            priority = 4;
          };
        };
      };
    };

    zpool.zroot = let
      unmountable = { type = "zfs_fs"; };
      filesystem = mountpoint: {
        type = "zfs_fs";
        options = {
          canmount = "noauto";
          inherit mountpoint;
        };
        inherit mountpoint;
      };
    in {
      type = "zpool";

      rootFsOptions = {
        "com.sun:auto-snapshot" = "false";
        canmount = "off";
        xattr = "sa";
      };
      options = { compatibility = "grub2"; };
      datasets = {
        "local" = unmountable;
        "local/root" = filesystem "/" // {
          postCreateHook = "zfs snapshot zroot/local/root@blank";
        };
        "local/nix" = filesystem "/nix";
        "local/state" = filesystem "/state";

        "safe" = unmountable;
        "safe/persist" = filesystem "/persist";
      };
    };
  };
}

With this disk-config.nix, the one drive in the system is split into three partitions: 1M for grub with legacy MBR, 1G for a regular /boot, and the remainder for a ZFS pool. The zpool.zroot.options.compatibility = "grub2"; is critical here, lest grub halt and catch fire.
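If you want to double-check the feature restriction after installation, the property can be read back from the pool on the running system (a sketch; the expected value is "grub2"):

```shell
# Confirm the pool was created with only grub2-compatible ZFS features.
zpool get compatibility zroot
```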

When the system has been installed, ZFS reports the following:

$ zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot  3.62T  2.03G  3.62T        -         -     0%     0%  1.00x    ONLINE  -

$ zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
zroot               2.03G  3.51T    96K  none
zroot/local         2.01G  3.51T    96K  none
zroot/local/nix     2.01G  3.51T  2.01G  /nix
zroot/local/root     796K  3.51T   716K  /
zroot/local/state     96K  3.51T    96K  /state
zroot/safe          6.64M  3.51T    96K  none
zroot/safe/persist  6.54M  3.51T  6.54M  /persist

With the postCreateHook, we create a snapshot of the empty root filesystem. I will use it later to configure impermanence.

I summarize the installation procedure:

  1. boot into manual install medium;
  2. find name of drive: /dev/disk/by-id/...;
  3. update configuration by setting disko.devices.disk.main.device; commit; and
  4. nix run github:nix-community/nixos-anywhere -- --flake .#system --target-host root@nixos-livecd.
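As a shell transcript (the flake attribute ".#system" follows the summary above; the device path is the elided one from step 2):

```shell
# Step 2: on the live medium, identify the target drive.
ssh root@nixos-livecd ls -l /dev/disk/by-id/

# Step 4: after committing the device path into
# disko.devices.disk.main.device, install in one go.
nix run github:nix-community/nixos-anywhere -- \
  --flake .#system \
  --target-host root@nixos-livecd
```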

I believe this is how I will install all my systems from now on.

Configuring NixOS for IPMI SOL

In order to get IPMI SOL working when entering grub and Linux proper, we need to tell them to use COM2, which SOL is connected to:

# With this config, we get GRUB, kernel and a user tty over IPMI SOL.
let
  serialSettings = {
    # Corresponds to COM2, which is what IPMI SOL is connected to.
    deviceIdx = "1";
    speed = "115200";
    wordSize = "8";
  };
  ss = serialSettings;
in {
  boot.loader.grub.extraConfig =
    "serial --speed=${ss.speed} --unit=${ss.deviceIdx} --word=${ss.wordSize} --parity=no --stop=1; terminal_input serial; terminal_output serial";
  boot.kernelModules = [ "lanplus" ];

  boot.kernelParams =
    [ "console=tty0" "console=ttyS${ss.deviceIdx},${ss.speed}n${ss.wordSize}" ];

  # This gives us a user login. This is only needed if sshd fails.
  systemd.services."serial-getty@ttyS${ss.deviceIdx}".wantedBy =
    [ "multi-user.target" ];

  srvos.boot.consoles = [ "tty0" "ttyS${ss.deviceIdx},${ss.speed}" ];
}

The serial-getty@ttyS1 service also gives us the ability to log in over SOL. It is painfully slow, but it did come in handy at a few points during iteration.

SrvOS

I stumbled upon Numtide's SrvOS recently and had been meaning to try it out. Now was a good time. In short, it exposes opinionated system profiles distilled from Numtide's own deployments. After exploring the modules I found the configuration sane for my use-case, too. Enabling it is easy:

{
  imports = [ inputs.srvos.nixosModules.server ];

  # Ensure our SOL support isn't broken
  srvos.boot.consoles = [ "tty0" "ttyS1,115200" ];
}

router.nix

Now to the meat and potatoes: the actual router configuration. The NixOS module below follows from a mix of sources, all of which I have forgotten and unfortunately never wrote down. Summarized, the effect of the configuration is:

  • one interface is set as the WAN interface;
  • all other interfaces (here, just one) are bridged as a br-lan interface;
  • DHCP serves IPv4 (10.0.0.0/16) and IPv6;
  • an intranet domain is set, in.tmplt.dev;
  • a bunch of intranet static addresses are set;
  • the default firewall is disabled in favor of nftables;
  • nftables allows LAN->WAN but limits WAN->LAN to ICMP, if not already established traffic;
  • dnsmasq is setup for local DNS, forwarding cache-missed queries to Cloudflare.

Here it is:

{ lib, pkgs, ... }:
let
  wan = "enp1s0f0";
  lans = [ "enp1s0f1" ];
  hostName = "praecursoris";
  domain = "in.tmplt.dev";
in {
  # Enable routing
  boot.kernel = {
    sysctl = {
      # Forward on all interfaces
      "net.ipv4.conf.all.forwarding" = true;
      "net.ipv6.conf.all.forwarding" = true;

      # By default, do not automatically configure any IPv6 addresses.
      "net.ipv6.conf.all.accept_ra" = 0;
      "net.ipv6.conf.all.autoconf" = 0;
      "net.ipv6.conf.all.use_tempaddr" = 0;

      # On wired WANs, allow IPv6 autoconfiguration and temporary address use.
      "net.ipv6.conf.${wan}.accept_ra" = 2;
      "net.ipv6.conf.${wan}.autoconf" = 1;

      "net.ipv4.conf.br-lan.rp_filter" = 1;
      "net.ipv4.conf.${wan}.rp_filter" = 1;
    };
  };

  networking = {
    inherit hostName domain;
    useNetworkd = true;
    useDHCP = lib.mkForce false;

    # Don't use the standard-issue firewall: we configure our own via
    # nftables below.
    nat.enable = false;
    firewall.enable = lib.mkForce false;

    nftables = {
      enable = true;
      checkRuleset = false;
      ruleset = ''
        table inet filter {
          chain input {
            type filter hook input priority 0; policy drop;

            iifname { "br-lan" } accept comment "Allow local network to access the router"
            iifname { "${wan}" } ct state { established, related } accept comment "Allow established traffic"
            iifname { "${wan}" } icmp type { echo-request, destination-unreachable, time-exceeded } counter accept comment "Allow select ICMP"
            iifname "${wan}" counter drop comment "Drop all other unsolicited traffic from wan"
            iifname "lo" accept comment "Accept everything from loopback interface"
          }

          chain forward {
            type filter hook forward priority filter; policy drop;

            iifname { "br-lan" } oifname { "${wan}" } accept comment "Allow LAN to WAN for all systems"
            iifname { "${wan}" } oifname { "br-lan" } ct state { established, related } accept comment "Allow established back to LANs"
          }
        }

        table ip nat {
          chain postrouting {
            type nat hook postrouting priority 100; policy accept;
            oifname "${wan}" masquerade
          }
        }
      '';
    };
  };

  systemd.network = {
    enable = true;

    # Allow interfaces to be plugged and unplugged dynamically.
    wait-online.anyInterface = true;

    # Create the bridge interface
    netdevs = {
      "20-br-lan" = {
        netdevConfig = {
          Kind = "bridge";
          Name = "br-lan";
        };
      };
    };

    networks = let
      enslaveToBridge = name: {
        "30-${name}" = {
          matchConfig.Name = name;
          networkConfig = {
            Bridge = "br-lan";
            ConfigureWithoutCarrier = true;
          };
          linkConfig.RequiredForOnline = "enslaved";
        };
      };
      # Enslave all Ethernet ports other than WAN for the LAN; connect
      # them to the bridge.
      enslavedLans = lib.mergeAttrsList (map enslaveToBridge lans);
    in enslavedLans // {
      "10-wan" = {
        matchConfig.Name = "${wan}";
        networkConfig = {
          # Start a DHCP client for IPv4
          DHCP = "ipv4";
          # Accept SLAAC
          IPv6AcceptRA = true;
          DNSOverTLS = true;
          DNSSEC = true;
          IPv6PrivacyExtensions = false;

          IPv4Forwarding = true;
        };
        # make required dependency for network-online.target
        linkConfig.RequiredForOnline = "routable";
      };

      # Configure bridge
      "40-br-lan" = {
        matchConfig.Name = "br-lan";
        bridgeConfig = { };
        address = [ "10.0.0.1/16" ];
        networkConfig = { ConfigureWithoutCarrier = true; };
        linkConfig.RequiredForOnline = "no";
      };
    };
  };
  services.resolved.enable = false;

  services.dnsmasq = {
    enable = true;
    settings = {
      server = [ "1.1.1.1" "1.0.0.1" ];
      # Forward queries to all servers; use the fastest response.
      all-servers = true;

      # Don't cache negative responses.
      no-negcache = true;

      # Don't forward queries for plain names; it's a hostname on the
      # tailnet.
      domain-needed = true;

      # Don't forward reverse lookups of private IP ranges.
      bogus-priv = true;

      # Disregard /etc/{hosts,resolv.conf}
      no-resolv = true;
      no-hosts = true;

      # Same cache size as used by Pi-Hole
      cache-size = 10000;

      dhcp-range = [ "br-lan,10.0.0.50,10.0.0.254,24h" ];
      interface = [ "br-lan" ];
      dhcp-host = [
        # Static leases based on MAC/hostname
        "10.0.0.2,switch,infinite"
        # For all other hosts, acquire dynamic IP
        "10.0.0.1"
      ];

      local = "/${domain}/";
      inherit domain;
      expand-hosts = true;

      # Map addresses for domain
      address = [
        "/${hostName}.${domain}/10.0.0.1"
        "/router.${domain}/10.0.0.1"
      ];
    };
  };
}

Using the X8 for Tailscale DNS

I connect all my systems via Tailscale. This homogenizes things a bit: when away from home, I can use the same settings as if I were on my LAN, which very much streamlines access to my intranet services. But all this requires a DNS service where I can map domains to Tailscale's 100.64.0.0/10 network. I had previously used NextDNS for this (it is natively supported by Tailscale), but the nameserver kept dropping out and the adblocking was too aggressive.

I have dnsmasq on the X8 now, so I figured I'd use that for my tailnet too. There was apparently an official guide for this (kinda), but what I did was:

  • allow incoming from Tailscale in nftables;
  • make dnsmasq listen to the same interface;
  • under DNS on Tailscale's admin interface, override DNS servers with the tailnet IP of the X8; and
  • modify dnsmasq for my intranet services.

Configuration-wise:

{
  services.dnsmasq.settings = {
    interface = [ "tailscale0" ];

    address = let
      domain = "in.tmplt.dev";
      # these are Tailscale IPs
      dulcia = "...";
      temeraire = "...";
      zfs-offsite = "...";
    in [
      "/git.${domain}/${dulcia}"
      "/torrent.${domain}/${dulcia}"
      "/feeds.${domain}/${dulcia}"
      "/music.${domain}/${dulcia}"
      "/immich.${domain}/${temeraire}"
      "/fs.${domain}/${dulcia}"
      "/zfs-offsite.${domain}/${zfs-offsite}"
    ];
  };

  # Confer with previous configuration. This configuration probably
  # does not build.
  networking.nftables.ruleset = ''
    table inet filter {
      chain input {
        iifname { "br-lan", "tailscale0" } accept comment "Allow local/ts network to access the router"
        iifname { "${wan}", "tailscale0" } ct state { established, related } accept comment "Allow established traffic"
        iifname { "${wan}", "tailscale0" } icmp type { echo-request, destination-unreachable, time-exceeded } counter accept comment "Allow select ICMP"
      }
    }
  '';
}

And that was it. Surprisingly easy: aside from the address list, it was just a matter of adding "tailscale0" in a few places.

Implementing network-wide adblocking

With dnsmasq set up as the name server for my whole infrastructure, I figured I'd see if I could implement an adblocker. This too turned out to be easier than expected, thanks to StevenBlack's hosts:

{
  services.dnsmasq.settings.addn-hosts = let
    blocklist = pkgs.fetchFromGitHub {
      owner = "StevenBlack";
      repo = "hosts";
      rev = "3.16.59";
      hash = "sha256-gPG7wu3K0wLwpV0nPJt7sIrLP3PrgOS/4POM5zwerVs=";
    };
  in [
    "${blocklist}/hosts" # use the unified list of domains
  ];
}

I'll have to bump this in the future. Perhaps I should make it a flake input?
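As a sketch, pinning the blocklist as a non-flake input could look like this; the input name "adblock-hosts" is my own invention:

```nix
# flake.nix (excerpt)
{
  inputs.adblock-hosts = {
    url = "github:StevenBlack/hosts/3.16.59";
    flake = false;
  };
}
```

A module with access to inputs could then replace the fetchFromGitHub call with "${inputs.adblock-hosts}/hosts", and nix flake lock --update-input adblock-hosts would bump it without juggling hashes.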

Impermanence

I've been meaning to try impermanence out. The first half of grahamc's Erase your darlings provides a good rationale. In short: after the first installation, a system (even NixOS) will drift due to filesystem changes. Such changes are not tracked, but perhaps they should be. System state cannot be tracked in version control, but we can put it on a dataset that persists across reboots. Any data not whitelisted onto this dataset is wiped, and the system is restored to a known state on every boot.

For a router, we persist the following and roll back the root filesystem on boot:

{ lib, pkgs, ... }: {
  fileSystems."/persist".neededForBoot = true;

  environment.persistence."/persist" = {
    hideMounts = true;
    directories = [
      "/var/log"
      "/var/lib/nixos"
      "/var/lib/systemd/coredump"
      "/var/lib/dhcpcd"
    ];
    files = [
      # Required for system logs and WAN DHCP state
      "/etc/machine-id"

      # Required for LAN DHCP
      "/var/lib/dnsmasq/dnsmasq.leases"

      "/var/lib/tailscale/tailscaled.state"

      "/etc/ssh/ssh_host_ed25519_key"
      "/etc/ssh/ssh_host_ed25519_key.pub"
      "/etc/ssh/ssh_host_rsa_key"
      "/etc/ssh/ssh_host_rsa_key.pub"
    ];
  };

  # Roll back to root@blank at system boot
  boot.initrd.systemd.enable = lib.mkForce true;
  boot.initrd.systemd.services.zfs-rollback-root = {
    description = "Roll / back to @blank for impermanence";
    wantedBy = [ "initrd.target" ];
    before = [ "sysroot.mount" ];
    after = [ "zfs-import.target" ];

    serviceConfig.Type = "oneshot";

    # @blank is created at system install via disko
    script = ''
      ${pkgs.zfs}/bin/zfs rollback -r zroot/local/root@blank
    '';
  };
}

Rolling back the filesystem on boot allows us to reset even if the system crashes.
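To audit what would be wiped at the next boot, the current root can be compared against the blank snapshot (run on the router; a sketch):

```shell
# List files created, modified, or removed on / since installation.
zfs diff zroot/local/root@blank
```

Anything that shows up here and should survive belongs in environment.persistence."/persist".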

Summa summarum

This was an exercise in NixOS, filesystems, and networking in general. The labour was fun and bore fruit, too: I now have much more control over my routing hardware, and it is noticeably faster than my previous network setup. DNS feels a magnitude faster. Perhaps partly placebo, but at least it is stable.