LUKS Encryption with ZFS Root on Void Linux

The list of non-systemd operating systems that run ZFS on the root partition is a short list, but a valued one. Today, we install Void Linux. The documentation for this OS is a little lacking. Parts of the OS documentation are decent, especially the advanced chroot-based installation page. There are also separate pages for installing Void Linux with LUKS and installing Void Linux with a ZFS root, but not both at the same time. Let's fix that.

This is going to be very similar to how we installed Devuan jessie, except that instead of using the debootstrap tool, we'll use the rootfs tarball that Void Linux provides, which we will extract and configure. There are a couple of gotchas that can render your zpool unbootable if you follow the wiki, but by now these steps should seem really familiar: make a LUKS container, make a zpool in the LUKS container, extract a base OS into the zpool, chroot into the base OS, install ZFS, install the bootloader, cross your fingers, and reboot. That's really all we're doing.

Start with an Ubuntu live CD. Ubuntu 16.04+ includes the ZFS kernel modules but not the userland utilities. We start with Ubuntu because it's faster to "apt-get install" these packages than to download and build ZFS DKMS modules from scratch (twice), but if that's what you feel like doing, hey man, go for it.

Wipe the MBR, create a new partition table, and create one partition for the LUKS container. Assuming your disk is /dev/sda:

sudo su
DEVICE=/dev/sda # set this accordingly

# Names used throughout this walkthrough; adjust to taste.
# These match the example fstab and the "chroot /mnt" step below.
LUKSNAME=cryptroot
ZPOOLNAME=zroot
ZFSROOTBASENAME=${ZPOOLNAME}/ROOT     # example dataset names
ZFSROOTDATASET=${ZPOOLNAME}/ROOT/void
TARGET=/mnt
KEYDIR=${TARGET}/boot                 # LUKS key file location
KEYFILE=rootkey.bin

wipefs --force --all ${DEVICE}
# or do this, or do both:
dd if=/dev/zero of=${DEVICE} bs=1M count=2

# Set MBR
/sbin/parted --script --align opt ${DEVICE} mklabel msdos
/sbin/parted --script --align opt ${DEVICE} mkpart pri 1MiB 100%
/sbin/parted --script --align opt ${DEVICE} set 1 boot on
/sbin/parted --script --align opt ${DEVICE} p # print

# Create LUKS container and open/mount it
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --use-urandom ${DEVICE}1
cryptsetup luksOpen ${DEVICE}1 ${LUKSNAME}

# We put this UUID into an env var to reuse later
CRYPTUUID=`blkid -o export ${DEVICE}1 | grep -E '^UUID='`
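To see what that grep captures, here is the shape of `blkid -o export` output, using a made-up UUID (yours will differ):

```shell
# Hypothetical sample of `blkid -o export /dev/sda1` output:
sample='DEVNAME=/dev/sda1
UUID=93a7dbeb-2ae0-48b2-bd00-c806ae9066df
TYPE=crypto_LUKS'

# The grep keeps the whole "UUID=..." line, which is exactly the
# device-spec format that crypttab accepts later on.
CRYPTUUID=$(printf '%s\n' "$sample" | grep -E '^UUID=')
echo "$CRYPTUUID"
# UUID=93a7dbeb-2ae0-48b2-bd00-c806ae9066df
```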

Necessary on Ubuntu: install ZFS utils in live CD session:

apt-get install -y zfsutils-linux
/sbin/modprobe zfs # May not be necessary

Create your new ZFS zpool and datasets. This example will create multiple datasets for the system root directory, /boot, /home, /var, and /var/log.


/sbin/zpool create -f \
  -R ${TARGET} \
  -O mountpoint=none \
  -O atime=off \
  -O compression=lz4 \
  -O normalization=formD \
  -o ashift=12 \
  ${ZPOOLNAME} /dev/mapper/${LUKSNAME}

/sbin/zfs create -o canmount=off        ${ZFSROOTBASENAME}
/sbin/zfs create -o mountpoint=/        ${ZFSROOTDATASET}
/sbin/zfs create -o mountpoint=/boot    ${ZPOOLNAME}/boot
/sbin/zfs create -o mountpoint=/home    ${ZPOOLNAME}/home
/sbin/zfs create -o mountpoint=/var     ${ZPOOLNAME}/var
/sbin/zfs create -o mountpoint=/var/log ${ZPOOLNAME}/var/log

/sbin/zpool set bootfs=${ZFSROOTDATASET} ${ZPOOLNAME} # Do not skip this step

/sbin/zpool status -v # print zpool info

Fetch the Void Linux rootfs. Get it from any of the project's mirrors, and point ${VOIDMIRROR} at the one you choose. Assuming your architecture is x86_64, fetching the latest Void rootfs at the time of writing looks like this:

wget -N ${VOIDMIRROR}/void-x86_64-ROOTFS-20171007.tar.xz
wget -N ${VOIDMIRROR}/sha256sums.txt
wget -N ${VOIDMIRROR}/sha256sums.txt.sig

Validate the rootfs checksum. You should also fetch and verify its GPG signature, but you probably won't.

sha256sum ./void-x86_64-ROOTFS-20171007.tar.xz

Compare this checksum with the value from sha256sums.txt. If it matches, untar its contents into ${TARGET}:

tar xf ./void-x86_64-ROOTFS-20171007.tar.xz -C ${TARGET}

Create a new ${TARGET}/etc/fstab that matches your ZFS datasets. For example:

cat ~/fstab.new

/dev/mapper/cryptroot /        zfs  defaults,noatime 0 0
zroot/boot            /boot    zfs  defaults,noatime 0 0
zroot/home            /home    zfs  defaults,noatime 0 0
zroot/var             /var     zfs  defaults,noatime 0 0
zroot/var/log         /var/log zfs  defaults,noatime 0 0

chmod 0644 ~/fstab.new
mv ~/fstab.new ${TARGET}/etc/fstab

Create a LUKS key file, add it to the LUKS container, and put its info into a crypttab:


# Create key file:
dd if=/dev/urandom of=${KEYDIR}/${KEYFILE} bs=512 count=4
# or, faster:
# openssl rand -out ${KEYDIR}/${KEYFILE} 2048
chmod 0 ${KEYDIR}/${KEYFILE}

cryptsetup luksAddKey ${DEVICE}1 ${KEYDIR}/${KEYFILE} # This prompts for the LUKS container password

ln -sf /dev/mapper/${LUKSNAME} /dev

# Set crypttab:
echo "${LUKSNAME} ${CRYPTUUID} /${KEYFILE} luks" >> ${TARGET}/etc/crypttab
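With the example names from the fstab above and a made-up UUID, the resulting crypttab line looks like this:

```shell
# Assumed example values; your UUID will differ.
LUKSNAME=cryptroot
CRYPTUUID=UUID=93a7dbeb-2ae0-48b2-bd00-c806ae9066df
KEYFILE=rootkey.bin
echo "${LUKSNAME} ${CRYPTUUID} /${KEYFILE} luks"
# cryptroot UUID=93a7dbeb-2ae0-48b2-bd00-c806ae9066df /rootkey.bin luks
```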

Mount some special mountpoints into the new FS:

for i in /dev /dev/pts /proc /sys; do
  echo -n "mount $i..."
  mount -B $i ${TARGET}$i
  echo 'done!'
done

Copy /etc/resolv.conf into the new system. You need this to resolve network endpoints.

cp -p /etc/resolv.conf ${TARGET}/etc/

chroot into ${TARGET}:

chroot ${TARGET}

Configure the system. There are documented post-installation steps you can perform now: setting the hostname, adding users, installing software, et cetera. At a minimum, set the root password and a locale:

echo "LANG=en_US.UTF-8" > /etc/locale.conf
echo "en_US.UTF-8 UTF-8" >> /etc/default/libc-locales
xbps-reconfigure -f glibc-locales

Update the Void Linux software package repository. You may want to pick a faster mirror first, as I've done here:

# optional:
echo 'repository=http://lug.utdallas.edu/mirror/void/current' > /etc/xbps.d/00-repository-main.conf


xbps-install -Su

The Void Linux rootfs is tiny, only about 35 MB, and very minimalistic. Install some packages that are not in the base install: a kernel, "cryptsetup" so you can unlock your LUKS container, and the "GRUB" and "ZFS" packages so you can boot your system and access your zpool:

xbps-install linux cryptsetup grub zfs

A note about kernels: Void Linux has a number of kernels available. Check your mirrors for all of your options. The default kernel package is "linux", which will give you a modern kernel, but you can also select "linux-lts", which will install an older, presumably more stable, kernel. If neither of these suits you, you can review the Linux kernel packages available and install the one that best fits. For instance, Linux kernel version 4.17.1 was released on 2018-06-11 and had a corresponding Void package available within 3 days, so "xbps-install linux4.xx", for any available value of "xx", will get you a plausible kernel to use here. Caveat: not all kernels are created equal, and for your ZFS-root Linux machine to work your kernel needs to understand ZFS, which means a new kernel will need to compile new kernel modules. This can fail, and often does. Be careful about mixing and matching your kernels with your DKMS modules, or you may lose the ability to import your zpools. If you install a specific kernel, make sure to install the matching kernel headers or you will be unable to build your ZFS kernel modules.

Ensure that GRUB can read your ZFS root dataset:

grub-probe /

The output of this command must be "zfs". If it isn't, stop and correct your install.

Edit the dracut settings so your initrd will contain the LUKS key file.

vi /etc/kernel.d/post-install/20-dracut

Make the following change:

- dracut -q --force boot/initramfs-${VERSION}.img ${VERSION}
+ dracut -q --force --hostonly --include /boot/rootkey.bin /rootkey.bin boot/initramfs-${VERSION}.img ${VERSION}

Adjust the "/boot/rootkey.bin" and "/rootkey.bin" values as needed. These should match ${KEYDIR}/${KEYFILE} and the /${KEYFILE} value you put into ${TARGET}/etc/crypttab, respectively.
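To make that mapping concrete with the names assumed earlier in this walkthrough (a key at /boot/rootkey.bin, embedded at /rootkey.bin inside the initramfs):

```shell
# Assumed example names; substitute your own key path.
KEYDIR=/boot
KEYFILE=rootkey.bin
# dracut --include takes <source on disk> <destination inside the initramfs>:
echo "--include ${KEYDIR}/${KEYFILE} /${KEYFILE}"
# --include /boot/rootkey.bin /rootkey.bin
```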

Build your new initrd. This requires you to know your exact kernel version:

xbps-reconfigure -f linux4.16

Adjust the name of your linux4.xx package accordingly. Your DKMS modules will have been built when you installed the "zfs" XBPS package, but the reconfiguration step here will attempt to re-compile them if they aren't already present. Be aware: if your "spl" and "zfs" DKMS builds fail, you will not be able to boot your machine. Stop now and fix your kernel before proceeding.

Edit /etc/default/grub. You will want to edit or add the following lines:

  • GRUB_CMDLINE_LINUX_DEFAULT # add the "boot=zfs" option
  • GRUB_CMDLINE_LINUX # add the "cryptdevice=" option pointing at your LUKS container

As an example, changes to your /etc/default/grub might look like this:

- GRUB_CMDLINE_LINUX_DEFAULT="loglevel=4 slub_debug=P page_poison=1"
+ GRUB_CMDLINE_LINUX_DEFAULT="loglevel=4 slub_debug=P page_poison=1 boot=zfs"
+ GRUB_CMDLINE_LINUX="cryptdevice=UUID=93a7dbeb-2ae0-48b2-bd00-c806ae9066df:cryptroot"

Install the bootloader.

mkdir -p /boot/grub
grub-mkconfig -o /boot/grub/grub.cfg
grub-install /dev/sda

Exit the chroot.

exit # leave the chroot

Unmount your mountpoints.

for i in sys proc dev/pts dev; do
  umount ${TARGET}/$i
done

Unmount your ZFS mountpoints and change their mount option to "legacy". This is a start-time mounting reliability thing that may or may not be necessary for you, but I've found that some systems that use ZoL have problems with letting ZFS automatically manage mounting their datasets.

/sbin/zfs unmount -a

for dataset in boot home var/log var; do
  /sbin/zfs set mountpoint=legacy ${ZPOOLNAME}/${dataset}
done

/sbin/zpool export -a -f



There are some scary-looking error messages in the init sequence that I haven't figured out how to fix, but they seem to be benign. The method given here boots a Void Linux system (seemingly) without trouble.

Final thoughts on Void Linux: I've been playing around with getting a LUKS+ZFS-on-root configuration in Void for at least a couple of months without success until recently. The OS itself is a nice example of a Linux distro that isn't a typical Debian/Ubuntu/Red Hat fork. It appears to have been created in 2008 by a (former?) NetBSD developer to showcase the XBPS package management system, which itself appears to be an ideological re-design of pkgsrc. The project lead went missing in January 2018, so the Void team has had to scramble to regain control over their own project in his absence. They are, for lack of a better term, forking themselves. Since Void uses a rolling release model and there are no regularly-scheduled release milestones to be blessed by the guy in charge, this doesn't really affect you as an end user, but I thought it was worth mentioning that enough people care about Void not to let one person's disappearance kill it.


A "review" of The Carrier (1988)

Your touch is sweet and kind
But there's something on your mind
I can't see your eyes, I can't see your eyes....

My Blu-ray copy of The Carrier came in the mail today. I tell you this because now that I have my copy, you should get yours.

I like good movies, but I loooove bad ones. I thought The Carrier would be one of the latter and discovered it to be closer to the former. I dismissed it on its face as a conventional horror movie: low budget, no-name cast, shot over two weekends with a small town of extras who had a camera, a microphone on a stick, and a lot of "can do" enthusiasm.

Boy, was I wrong. So, so wrong. If you choose to watch The Carrier, you should not make the same mistake I did. It is not your run-of-the-mill late-80s horror film, not by any stretch of the imagination.

The movie starts with a square dance scene and an off-brand Rebel Without a Cause/The Outsiders-lookin' young man crashing the party. A group of locals is casually talking about "the black thing" that has begun appearing on the edge of town and debating whether to fear it or doubt its existence. "Oh," I thought. "This is going to be a by-the-numbers horror movie with a not-so-subtle social message about bigotry in it." I lost interest. I let the movie play and went off into the other room to prepare dinner, occasionally poking my head in to see if the creature feature special effects would be goofy enough to make me chuckle.

Do not do this. Sit down. Watch the film. There's a lot going on in this movie that is easy to miss, even if you think you're paying close attention.

I first saw The Carrier last month, and I haven't stopped thinking about it since. I half-heartedly watched it out of the side of my eye at first and then became so engaged and so haunted by its imagery that I sat down and watched it again the next day. Then I kicked myself for missing so much of it the first time around.

Then I went out and bought it on Blu-ray.

I don't even own a Blu-ray player.

Upon first glance, I considered The Carrier to be a fun, low-budget horror flick, something MST3K-worthy in the same vein as The Bloodwaters of Dr. Z or Boggy Creek (I or II, take your pick), or even The Giant Spider Invasion. Especially The Giant Spider Invasion. All of these films are so-bad-they're-good in the "get drunk with some friends and make fun of the hillbillies" way.

I was wrong. Very, very wrong.

The Carrier is not a perfect film and it certainly contains flaws that plague every low-budget movie. If you give it a chance, you will bear witness to an intelligent story of alienation and fear. Part The Outsiders, part The Crucible, part The Purge, this film is the story of a young man estranged from his community who becomes, very literally, toxic to everyone around him. I dare not say more. Perhaps I've already said too much. Don't spoil this movie for yourself. Just push play and observe.

The film is a delight to watch. It was shot on that grainy sorta-expired film stock from the mid 1980s that makes it look like it could have been filmed in the 1970s. The acting is poor but heartfelt and earnest. The special effects are exemplary for the budget of the movie, and just when you start thinking this is going to be a fun, cheesy "guy in a rubber suit" movie, it veers you down a sharp left turn of paranoia and old-fashioned, home-grown, utterly volatile clan warfare. What?

The Hatfields and the McCoys were a spat over a Scrabble game compared to The Carrier. This movie gets bonkers. Full-on mob mentality, red scare, wrap-yourself-in-plastic-to-stay-pure, murder-your-neighbor-for-his-stuff, end-of-days mania. And it is engrossing and frightening and bizarre and wonderful. Every so often you find a gem like The Carrier, where a humble band of folks just try to make a little cinematic enterprise for funsies and end up with a secret masterpiece of dread and creeping, amorphous, "fear of the unknown" terror. The film underscores the horrifying notion that anything can kill you at any time and you can't know what or how, but you have to do something — anything — to protect yourself and your loved ones at any cost.

And in that mad rush to find your own tentative safety, what are you willing to sacrifice to reach it?

10 out of 10. Make sure your cats are safely upstairs, wrap yourself in a big plastic sheet, and watch The Carrier. This movie knocked my socks off and that was when I was only half paying attention to it. The full experience is shocking, bizarre, and will stay with you for days.

The Blu-ray contains two cuts of the movie, a director's cut and a theatrical cut. The theatrical cut contains a 2-second error at the half-way point that forces the video and audio unwatchably out of sync, and I still bought it anyway because it's just that good. If you want to see a small town of hillbillies lose their god-damned minds over barn cats, it's on YouTube in its entirety.


Ansible Week - Bonus - Automating OpenBSD Installs

Manually setting up your OpenBSD VMs is for chumps.

The recent release of OpenBSD 6.3 gave me an excuse to finally sit down and start teaching myself how to use the built-in OS autoinstall feature.

OpenBSD has supported installation templates for a few years now, but I was always mired in the artisanal mindset. I believed that setting up a new machine was a labor of love, and that in spite of the simplicity of the install wizard, one needed to spend at least that long minute or two crafting the hard and fast rules by which the system will live forever.

ZFS sure would be nice to have on the platform, but no. There's no way in hell that's gonna happen.

You may want to take that time thinking about the disk layout of your next mail server or firewall or whatever, but when it comes to a VM image you want to run at scale in the cloud, there are advantages to finding ways to streamline the process after you've made those decisions the first time.

The central tool of OpenBSD autoinstallation is the "install.conf" file, which contains answers to every question the install wizard would normally ask you interactively.

An example install.conf would look like this:

System hostname = mymachine
Which network interface do you wish to configure = hvn0
IPv4 address for hvn0 = dhcp
IPv6 address for hvn0 = none
Which network interface do you wish to configure = done
Password for root = $2b$08$sjHcRpZW2Jg7ryPxeHEBNu7DsyA3Fg8FrDvqLSqkx7TFmbUST9z/C
Public ssh key for root account = none
Start sshd(8) by default = no
Do you expect to run the X Window System = no
Do you want the X Window System to be started by xenodm(1) = no
Change the default console to com0 = no
Setup a user = no
Allow root ssh login = no
What timezone are you in = UTC
Which disk is the root disk = sd0
Use (W)hole disk MBR, whole disk (G)PT, (O)penBSD area or (E)dit = Whole
Use (A)uto layout, (E)dit auto layout, or create (C)ustom layout = A
URL to autopartitioning template for disklabel = none
Location of sets = cd
Set name(s) = done
Directory does not contain SHA256.sig. Continue without verification = yes
Location of sets = done

This is enough to set up a machine in short order. You can customize it to your wishes, and there's even a disklabel template format you can provide in a separate file:

/		250M 
swap		80M-256M 10% 
/tmp		120M-4G	8%

This is really nice: you can put this disklabel template online, set its URL in the "URL to autopartitioning template for disklabel" line of install.conf, and get a very-close-to-hands-free OpenBSD install using just two config files on a trusted internal webserver and the default OpenBSD installXX.iso.
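As a sketch of the staging step (the paths and docroot here are assumptions, not anything OpenBSD mandates), the webserver just needs to serve the two files:

```shell
# Stage install.conf and the disklabel template in a webserver docroot.
mkdir -p www
cat > www/install.conf <<'EOF'
System hostname = mymachine
Location of sets = cd
EOF
printf '/\t250M\nswap\t80M-256M 10%%\n/tmp\t120M-4G 8%%\n' > www/disklabel.template

# Sanity check: count the answered questions we just staged.
grep -c ' = ' www/install.conf
# 2
```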

You can even embed the install.conf into custom install media to make it totally automated if you want.

So, in conclusion: OpenBSD's autoinstallation features, an Ansible system-setup playbook, and scriptable Azure utilities can combine to create a very nice cloud service platform. Reshape the world as you see fit.


Ansible Week - Part 4

Ansible is powerful, so let's put together a real-world example of how to use it for installing some software.

We have recently been blessed with a new crypto library aimed at maintaining strong security that doesn't depend on the difficulty of factoring composite numbers.

It's so new, there isn't a package for it, and its installation steps are quirky, to say the least. It boils down to:

  1. Install some pre-requisite tools (OpenSSL, GMP, Python3, gcc)
  2. Create a new user, libpqcrypto
  3. Fetch the software as that user
  4. Create some symlinks
  5. Compile the library

Ansible was made to handle all of this, and to demonstrate the power of Ansible roles in our playbook, we're going to split the prerequisite installation steps out into their own separate pieces, or "roles". Roles are useful for when we want a set of commands we can use and reuse to set up a build environment, even if we end up not using that environment to build this exact libpqcrypto library again in the future.

By putting your work into roles, Ansible allows you to group your tasks into distinct phases and sort them based on your defined dependencies between them. In other words, if you have a role called "fasten-seatbelt", you can define other roles as unique dependencies for it, maybe ones called "sit-in-seat", "have-keys-in-hand", and "buy-a-car". Each of these roles is generalizable into other things, so if you ever write "ride-rollercoaster", you can reuse the tasks of "sit-in-seat", though perhaps without the car-buying dependency.
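To make the analogy concrete: a role's dependency list lives in its meta file, so a toy "roles/fasten-seatbelt/meta/main.yml" (all names hypothetical) might read:

```yaml
dependencies:
  - { role: buy-a-car }
  - { role: sit-in-seat }
  - { role: have-keys-in-hand }
```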

Maybe we should just build the role and show you.

First, we set up a target host. We've done this before, but for this exact example we're going to install our OS (a Debian-based Linux in this case), patch it to current, add our ansible user, and create our OpenSSH key:

sudo apt install -y openssh-server
sudo groupadd ansible
sudo useradd -g ansible -m ansible

sudo -iu ansible # do the key setup as the ansible user
mkdir -p ~/.ssh && chmod 0700 ~/.ssh
ssh-keygen -t ed25519 -N '' -q -f ~/.ssh/id_ed25519
cd .ssh
cp -p ./id_ed25519.pub ./authorized_keys.new
chmod 0600 ./authorized_keys.new
mv ./authorized_keys.new ./authorized_keys
exit # back to your own account

sudo visudo
#add line:

Sync this key, ~ansible/.ssh/id_ed25519, to your Ansible host and build your inventory file (the IP here is an example):

$ cat ./hosts.libpqcrypto

[libpqcrypto]
203.0.113.40

[libpqcrypto:vars]
ansible_ssh_user=ansible

We run ansible -i ./hosts.libpqcrypto -m ping libpqcrypto and our ping gets ponged, so we can write our first role. It's a .YML file outlining which packages we want to install before we go about doing anything else with our machine:

- name: install compiler and libpqcrypto pre-reqs (apt)
  become: yes
  apt:
    name: "{{item}}"
    state: present
    cache_valid_time: 86400
    update_cache: yes
  with_items:
    - build-essential
    - gcc
    - libssl-dev
    - libgmp-dev
    - make
    - python3
  when: ansible_pkg_mgr == "apt"

This is just a normal Ansible playbook. We turn it into a role by putting it in an exact location on our Ansible machine: ./roles/libpqcrypto-prereqs/tasks/main.yml.

This creates a role called "libpqcrypto-prereqs" we can reference in our role that will fetch the libpqcrypto source, configure the host to create a user, make some symbolic links, and compile the code as per the instructions on the web site. Let's make another role to do these steps. If our previous role has run successfully, we know we have our compiler and dev libraries on the target host and can just do the other steps. So we make a role, "libpqcrypto-build", and put this into "./roles/libpqcrypto-build/tasks/main.yml":

- name: create group
  become: yes
  group:
    name: libpqcrypto
    state: present

- name: create user
  become: yes
  user:
    name: libpqcrypto
    createhome: yes
    group: libpqcrypto
    home: /home/libpqcrypto
    shell: /bin/false
    state: present

- name: fetch latest version string
  become: yes
  become_user: libpqcrypto
  get_url:
    url: https://libpqcrypto.org/libpqcrypto-latest-version.txt
    dest: /home/libpqcrypto/libpqcrypto-latest-version.txt
    validate_certs: false # ouch

- name: read latest version string
  shell: cat /home/libpqcrypto/libpqcrypto-latest-version.txt
  register: version

- name: stat libpqcrypto file
  stat:
    path: /home/libpqcrypto/libpqcrypto-{{version.stdout}}.tar.gz
  register: st

- name: fetch libpqcrypto
  become: yes
  become_user: libpqcrypto
  get_url:
    url: https://libpqcrypto.org/libpqcrypto-{{version.stdout}}.tar.gz
    dest: /home/libpqcrypto/libpqcrypto-{{version.stdout}}.tar.gz
    validate_certs: false
  when: st.stat.exists == False

# never use unarchive
- name: untar libpqcrypto
  become: yes
  become_user: libpqcrypto
  shell: tar -xzf /home/libpqcrypto/libpqcrypto-{{version.stdout}}.tar.gz
  args:
    chdir: /home/libpqcrypto/
    creates: /home/libpqcrypto/libpqcrypto-{{version.stdout}}/

- name: create symlinks
  become: yes
  become_user: libpqcrypto
  file:
    src: /home/libpqcrypto
    dest: /home/libpqcrypto/libpqcrypto-{{version.stdout}}/{{item}}
    owner: libpqcrypto
    group: libpqcrypto
    force: yes
    state: link
  with_items:
    - link-build
    - link-install

- name: remove clang compiler option
  become: yes
  become_user: libpqcrypto
  lineinfile:
    path: /home/libpqcrypto/libpqcrypto-{{version.stdout}}/compilers/c
    regexp: "^clang.*"
    state: absent

- name: timestamp
  shell: date
  register: timestamp

- name: start compile libpqcrypto
  debug:
    msg: "{{timestamp.stdout}}"

- name: compile libpqcrypto
  become: yes
  become_user: libpqcrypto
  shell: ./do
  args:
    chdir: /home/libpqcrypto/libpqcrypto-{{version.stdout}}

- name: timestamp
  shell: date
  register: timestamp

- name: end compile libpqcrypto
  debug:
    msg: "{{timestamp.stdout}}"

There's a lot going on here, but you can pretty much tease out what each of these steps is doing to your target host. Many Ansible tasks have an argument called "state" that can be either "present" or "absent". The task doesn't necessarily perform the work if it's already been done, so what we're really setting up is a "configuration outlining the desired state of the system" or a "desired state configuration" for short. This is a term I just now invented all by myself. You're welcome.

We ensure there's a group called "libpqcrypto" and a user in that group with the same name. We fetch the libpqcrypto version string and, optionally, fetch that particular version of the software package if that tarball doesn't exist on the target host. We check the existence of that file with the "stat" module and use a "when:" conditional to tell Ansible to run the fetch task only when the condition is satisfied.

Then we create some symlinks with the "file" module, and then we go off script for a second to make a one-line change to the C compilers setting to remove the clang line. This can be skipped if the host has clang installed. We could tailor this task with a "when:" conditional, either checking for "/usr/bin/clang" on the machine, or by comparing the task against what Ansible determines the machine's OS to be. You can pull the list of values that Ansible checks by running the "setup" module: "ansible -i ./hosts.file -m setup hosts-group-name".

We print the date by creating a "register" called "timestamp" populated with the output of the date command. We do this again after the compile task runs, which tells us how long the task in between took.

Finally (except for that last timestamp task), we compile the libpqcrypto software by running ./do (with the "shell" module), as the libpqcrypto user (with become_user), in a specific directory (identified with the "chdir" argument).

Great! But how do we actually run this role? We need to point out that this role has a dependency on installing the pre-requisites, so we list them in "./roles/libpqcrypto-build/meta/main.yml":

dependencies:
  - { role: libpqcrypto-prereqs }

Then we put our "libpqcrypto-build" role, complete with its listed pre-req role(s), into a new playbook file, libpqcrypto.yml:

- hosts: libpqcrypto
  roles:
    - { role: libpqcrypto-build }
  tasks:
    - group_by:
        key: "{{ansible_distribution}}"

Note that we never call the libpqcrypto-prereqs role directly. We call one role in the "roles:" section and with its dependency file in .../meta/main.yml Ansible figures out what to do and in which order to do it.

Naturally you can make this a fairly complicated web of dependencies: "car" requires "tires", "tires" requires "hubcap", and so on as I explained earlier. I haven't seen Ansible have a problem with sorting a dependency chain so long as it can all eventually be collapsed into a linear sequence per host.
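If it helps to see the whole picture, here's a sketch of the control-machine layout this post has built up (file names as used above):

```shell
# Recreate the directory skeleton used in this post.
mkdir -p roles/libpqcrypto-prereqs/tasks
mkdir -p roles/libpqcrypto-build/tasks roles/libpqcrypto-build/meta
touch hosts.libpqcrypto libpqcrypto.yml
touch roles/libpqcrypto-prereqs/tasks/main.yml
touch roles/libpqcrypto-build/tasks/main.yml
touch roles/libpqcrypto-build/meta/main.yml

find roles -type f | sort
# roles/libpqcrypto-build/meta/main.yml
# roles/libpqcrypto-build/tasks/main.yml
# roles/libpqcrypto-prereqs/tasks/main.yml
```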

Note also that we execute our tasks by groups of their distributions. I started doing this in order to avoid having to create specific conditionals for various target hosts:

- hosts: webservers
  roles:
     - { role: debian_stock_config, when: ansible_os_family == 'Debian' }

We can set up different roles on different machines based on how we know they'll need to execute our playbook tasks. If all your machines are homogeneous, you can skip grouping your tasks, since you won't have variants. "group_by" is much more powerful than this, since you can use it to create groups on the fly by adjusting the value of "key".
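As a sketch of creating a group on the fly and then targeting it in a later play (the "os_" prefix and role name here are hypothetical):

```yaml
- hosts: webservers
  tasks:
    - group_by:
        key: "os_{{ansible_distribution}}"

# A later play can target the dynamically-created group:
- hosts: os_Debian
  roles:
    - { role: debian_stock_config }
```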

Ansible Week - Part 3

We have a target machine and we've put its access credentials into our inventory. We can ping it with Ansible, but it's time for us to do something. You can define these steps in a "playbook", which is just a specially-formatted file that outlines your chosen actions and the machines upon which you want to act. This file is in YAML, which Isn't Terrible so you Shouldn't Be Afraid of It:

$ cat ./playbook.yml
- hosts: mymachines
  tasks:
  - name: Create a new file
    shell: echo "Hello world" >> {{ansible_ssh_user}}/hw.txt
    args:
      creates: "{{ansible_ssh_user}}/hw.txt"

I'm fairly certain no one actually understands the technical syntax of YAML. Everyone simply takes an existing valid YAML file and edits it as they desire into a new YAML file. When they need another change, they edit the previous one and make a new YAML file, and so on. The YAML specification, then, is just a matter for people writing YAML parsers. YAML writers can just steal old .YML files and go on with their lives, which is nice.

This is a simple playbook that just creates a new file on the target host. It defines your target hosts ("mymachines") and defines a task to run on it. Tasks have a name and a type, in this case the type is "shell" and the shell type takes one argument called "creates", which tells ansible that this task has a defined end result: it creates a specific file. Therefore, if this file already exists, Ansible will skip performing this task.

Now we run it:

ansible-playbook --inventory=./hosts.my ./playbook.yml

Note that the actual action of this task is echo "Hello world" and it is appended to a file. Without the "creates" line in this task, running this playbook multiple times will continue to add "Hello world" to that file over and over again. These are the kinds of tricks and gotchas that will occupy your mind as you create bigger and more complicated playbooks and which I began learning and correcting in my own sysadminning back in the days of mailcom scripting.
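The append-forever gotcha is easy to demonstrate in plain shell (the file names here are just for illustration):

```shell
# Unguarded: every run appends another line.
f=hw.txt
rm -f "$f"
echo "Hello world" >> "$f"
echo "Hello world" >> "$f"

# Guarded, the way "creates:" behaves: skip the work if the file exists.
g=hw-guarded.txt
rm -f "$g"
[ -e "$g" ] || echo "Hello world" >> "$g"
[ -e "$g" ] || echo "Hello world" >> "$g"

wc -l < "$f"   # 2 -- the append ran twice
wc -l < "$g"   # 1 -- the second run was a no-op
```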

Note, too, that we are using a variable in the playbook called "ansible_ssh_user", which we defined in our inventory file. This means that we can reuse this playbook against a number of different inventories, across any number of environments, anywhere we want to have this new "hw.txt", without having to customize a playbook for every environment. Ansible strives to be modular, reusable, and composable, which are fancy words that really just mean "easy to mix and match pieces so I don't end up reinventing the wheel all the time".

This modularity really starts to take effect when you stop putting your actions into playbooks and evolve into putting them into roles. What's a role? It's tasks on steroids.

Next time: Role playing.


Ansible Week - Part 2

We have previously added a service account with a new SSH key to a target host. With that key, we can start using our Ansible setup on it to make administrative changes.

Remember that Ansible is agentless. The service account and SSH key you created are the only elements needed to authenticate your changes against your target host from a central command machine. Because this machine literally holds the keys to your proverbial kingdom, you want to take extra-special care that it doesn't fall into the wrong hands. I would run my Ansible node in a dedicated VM, and I'd use full disk encryption on it so its data is protected when not in use.

First thing to do when you're setting up your Ansible config is test that you can remote into your machines. Start with a simple hosts file to describe your environment. Ansible calls this an "inventory":

cat ./hosts.my

[mymachines]
203.0.113.10

[mymachines:vars]
ansible_ssh_user=ansible
ansible_ssh_port=22
ansible_ssh_private_key_file=~/.ssh/id_ed25519.mymachines
ansible_become_method=doas

You can see that there is a "[mymachines]" section with an IP address listed, and a "[mymachines:vars]" section with some values defined for it. The IP address is my first target host, and the vars are how I'd set up ansible: using the "ansible" user, connecting via SSH to port 22/tcp, and using a specific SSH key called "id_ed25519.mymachines". This "id_ed25519.mymachines" key was created on the target host in the last post; its corresponding public key will be on the target machine in "~ansible/.ssh/authorized_keys".

You can tell that this box is an OpenBSD machine because the "become" method is "doas", which is a system-specific replacement for "sudo". When the "ansible_ssh_user" account needs root permissions, Ansible supports a number of built-in privilege escalation options, including "doas" and "sudo"; it calls this the designated "become method". In other words, this variable answers the question "Which method do I use to become root?"
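For reference, the doas rule on the OpenBSD target that makes this work might look like the following (a minimal sketch; tighten it to taste):

```
# /etc/doas.conf
permit nopass ansible as root
```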

Now we test:

$ ansible --inventory=./hosts.my --module-name=ping mymachines

Run ansible with the inventory file that defines your target host and the credentials used to reach it. We are using the "ping" module, and we are targeting the "mymachines" section of the inventory. You can infer from this that an inventory file can be enormous, with multiple sections in one file that you can then reference individually as needed, like so:

$ cat ./hosts.full-inventory
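Hypothetically, a fuller inventory with multiple groups might look something like this (all addresses and group names are illustrative, apart from the [web] and [dns] groups referenced below):

```ini
[web]
192.0.2.20

[dns]
192.0.2.30
192.0.2.31

[mymachines]
192.0.2.10

[all:vars]
ansible_ssh_user=ansible
ansible_become_method=doas
```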



If we just wanted to do something with our [web] machine among the full inventory, we could single it out:

$ ansible --inventory=./hosts.full-inventory -m ping web

Or just the [dns] hosts:

$ ansible --inventory=./hosts.full-inventory -m ping dns

The module we're using, "ping", is the Ansible version of a network test. Can you remote into your targets? Ansible-ping them. The results should be simple:

$ ansible --inventory=./hosts.my --module-name=ping mymachines
192.0.2.10 | SUCCESS => {
  "changed": false,
  "ping": "pong"
}
Ansible reports back (1) the host it reached, (2) success or failure of the operation, and (3) the result: no changes were made to the target machine, and the response to "ping" was "pong". You're all set to manage this machine with Ansible now.

But just pinging machines is boring. You want to add software, add and remove users, and make configuration changes. Ansible is good for setting up new machines and pushing changes to existing boxes. And the art of crafting a new Ansible configuration lies in how well you write a "playbook": a sequence of instructions that you define and run against one or more hosts in your inventory.
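A minimal playbook sketch, assuming the "mymachines" group from the inventory and a hypothetical task (the file name and service are illustrative):

```yaml
# site.yml -- a hypothetical minimal playbook
- hosts: mymachines
  become: true            # escalate via the configured become method (doas here)
  tasks:
    - name: Ensure ntpd is enabled and running
      service:
        name: ntpd
        state: started
        enabled: true
```

You would run it with "ansible-playbook --inventory=./hosts.my site.yml".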

Are you getting a sense of how powerful Ansible can be now?

Next time: Yet Another Yet Another Markup Language post.

Ansible Week - Part 1

Let's get the bad news out of the way right now: Ansible is built on Python.

I know, I know. We'll get through it, guys... somehow.

I found that a number of other things I like have Python as a prerequisite, including but not limited to the Microsoft-blessed WALinuxAgent you need on non-OpenBSD/LibreSSL Azure VM images. I have also found that I am, like, the only guy on the planet who hates Python with a bright, fiery passion. So adopting Ansible is probably going to be easier for you than it was for me.

Back to mailcom for a sec. mailcom was very much a sysadmin's tool, written by a sysadmin for sysadmins. It was sharp, unbalanced, and unapologetic. You needed to learn how to "read" mailcom's logs to see if your plan worked as expected, and there was zero chance it would hold your hand or ask you for clarification before obliterating your service like it was so much your hopes and dreams. There was a skill in "seeing" the mailcom activity in your mailcom logs.

When I started looking at kinder, gentler mailcom tools, I started with Puppet. Puppet does really interesting things, but Puppet's design requires an agent be present on the host. As I recall from years back when I last worked with Puppet, you needed to install Puppet, run it as root, and have it phone home regularly. From a security perspective, this could be A Bad Idea if you don't do it right and, as a neophyte Puppeteer, that was very likely.

Ansible's one big advantage, in my opinion, is that it is totally agentless. You still need a superuser account with sudo or doas permissions, but the remote access is managed entirely through SSH. (Ansible apparently also supports Windows via WinRM. Whatever.)

So maintaining your Ansible army is an exercise in SSH key management. Modern security experts wring their hands panicking over what they call "lateral movement", where an attacker on your network who has compromised one machine can access more machines with the same set of credentials. This is why people tell you not to use the same password on multiple accounts, but when it comes to your SSH keys, it's unbelievably easy to create one key on your base image and that's the key you use on every host cloned from that image.
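One way out of that trap is a separate key per inventory group, so a leaked key only unlocks one slice of the fleet. A sketch, where the key directory and group names are illustrative:

```shell
#!/bin/sh
# Generate a distinct ed25519 keypair for each inventory group so a
# compromised key cannot be replayed against the rest of the fleet.
keydir=./ansible-keys    # illustrative location
mkdir -p "$keydir"
for group in web dns mymachines; do
    ssh-keygen -t ed25519 -N '' -q -f "$keydir/id_ed25519.$group"
done
ls -1 "$keydir"
```

Each group's vars section then points its "ansible_ssh_private_key_file" at the matching key.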

In general, you want to do as little as possible as root. So to prep your target machine(s), create the following:

groupadd ansible
useradd -g ansible -m ansible

You may need to add your new local service account to the root or wheel group on your host. Add it to your /etc/sudoers file or, on OpenBSD, your doas.conf:

cp -p /etc/doas.conf /etc/doas.conf.bak
cp -p /etc/doas.conf /etc/doas.conf.new
echo permit nopass ansible as root >> /etc/doas.conf.new
mv /etc/doas.conf.new /etc/doas.conf

Once you have your service account, generate an SSH key that Ansible will use to authenticate into that host:

# su -l ansible
$ ssh-keygen -t ed25519 -N '' -q -f ~/.ssh/id_ed25519
$ cd ~/.ssh
$ cat ./id_ed25519.pub >> ./authorized_keys
$ exit

You may need to fix the permissions on your authorized_keys file (chmod 0600) if they are not already correct. Ensure that your sshd service is running, and get ready to hook this new SSH key up to your Ansible machine.
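With sshd's default StrictModes setting, loose permissions on the key file or the directory will cause the key to be silently ignored, so the tighten-up looks like this (run as the ansible user):

```shell
#!/bin/sh
# Ensure the directory and key file exist, then tighten permissions so
# sshd (with default StrictModes) will accept the authorized_keys file.
mkdir -p ~/.ssh && touch ~/.ssh/authorized_keys
chmod 0700 ~/.ssh
chmod 0600 ~/.ssh/authorized_keys
```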

Next time: Verifying your distance to the target. One ping only.