This is my IT Infrastructure wiki. This is where I talk about all things infrastructure.

This feed contains some of my blog posts that are tagged with infrastructure and other essays that link here.

James Richardson
Different Way of Thinking About Systems

As you know, I've been migrating my systems to Guix, which at the moment consists of packaging things that aren't yet present. I'm starting to wonder if the current standard of putting systems together is wrong. I know, we've been doing it this way since the beginning, and saying this is wrong is perhaps too strong a statement. But maybe it is not.

What is a system?

Before I make a statement such as "The current standard of putting systems together is wrong.", I should probably define what is meant by system. I am using the word system to refer to a thing composed of a kernel and an operating system, which is in turn composed of packages. I'm thinking of a Linux kernel with a GNU operating system, but the arguments can be abstracted to UNIX-like systems, such as any of the proprietary ones left (e.g. AIX, Solaris, or HP-UX), the *BSDs, and possibly systems around the GNU Hurd. For the sake of concreteness, I'll assume a Linux kernel and a GNU operating system, typically any mainline GNU/Linux distribution available. The vast majority of my GNU/Linux work has been with Debian, so much of this may be colored by that experience.

A system is then a Linux kernel and a GNU operating system, which is composed of packages. The Linux kernel is also usually packaged along with the operating system and there is typically a bootstrap process which facilitates installing the operating system and kernel onto a machine, either physical or virtual.

So by saying the current standard of putting systems together is wrong, what I'm really getting at is that the way we package things is wrong.

Why are mainstream package managers doing things the wrong way?

In short, they depend on /usr and its children to maintain state. What's wrong with that? /usr is mutable and its mutability is not strictly managed by the package manager. Although package managers allow mutation in a controlled manner, they largely ignore users. If I as a user am developing a Python application that depends on a particular version of a Python library, and the system administrator upgrades that library to a newer version with a different API, my application just broke. [1] If I read the FHS, it says /usr/local is reserved for local software or, by de facto standard, software that is not part of the operating system. It doesn't say how to handle cases where software in /usr/local depends on packages in the operating system, or on specific versions of those packages. There typically doesn't seem to be a way of marking operating system packages as required by locally installed software, or of handling the case of having multiple versions of a given package installed.

Is there a better way?

Probably. Typical package managers are imperative, maintain a database of meta-state (e.g. which packages are installed, at which versions, and with which dependencies), and keep state in /usr.

Instead of an imperative approach (e.g. how to install package X), can we use a declarative approach (e.g. install package X)? Can we also remove state from the system and have state contained within the package definition? The answer to both questions is yes. I'm not going to attempt to argue the reasons in this blog post, because I wouldn't be able to do so without spending much more time than I would like. Instead I will refer you to Eelco Dolstra's PhD thesis, The Purely Functional Software Deployment Model. For a much lighter read, see NixOS: A Purely Functional Linux, which speaks to building an entire stateless operating system.
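To give a rough feel for what "state contained within the package definition" might look like, here is a minimal sketch. It assumes a made-up `/store` prefix and a made-up hashing recipe, and the names `store_path`, `somelib`, and `myapp` are hypothetical; this is not the scheme Nix or Guix actually use, only an illustration of the idea that a package's location can be derived from its inputs.

```python
import hashlib


def store_path(name, version, deps=()):
    """Build an illustrative store path from a package's name, version, and dependencies.

    ASSUMPTION: "/store" and this hashing recipe are invented for the example;
    they are not the scheme Nix or Guix actually use.
    """
    # Every input that influences the build goes into the hash, so changing
    # any of them produces a different path instead of overwriting the old one.
    material = "\n".join([name, version, *sorted(deps)])
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
    return f"/store/{digest}-{name}-{version}"


# Two versions of the same (hypothetical) library live side by side.
lib_old = store_path("somelib", "1.0")
lib_new = store_path("somelib", "2.0")

# The user's application names the exact library path it was built against.
app = store_path("myapp", "0.1", deps=[lib_old])

# "Upgrading" means building against the new library, which yields a new
# application path; the old application and its library stay untouched.
app_upgraded = store_path("myapp", "0.1", deps=[lib_new])

print(lib_old)
print(lib_new)
print(app != app_upgraded)  # the two builds never collide
```

Under these assumptions, the administrator installing the newer library cannot break my application, because nothing is ever replaced in place; the old library keeps its own path for as long as something refers to it.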

  1. I know about language-specific package managers (e.g. pip, gems, cpan), but I don't think they help outside of the single-user case.

James Richardson
Why Don't I Document Things

Einstein is often credited with defining insanity as doing the same thing repeatedly and expecting different results. By this definition, I'm probably insane. [1] I add services to my infrastructure, say for instance, MediaGoblin. I didn't document what I had to do to install it, get it to run, or anything. I can remember. Well, I didn't remember last time I did something. That's ok, I'll remember this time as I was paying better attention. Oh, I understand now, I'm insane, expecting I'll remember why I did something next time, unlike every previous time. As another example, I'm sure there is a reason I run the house with a 10/25 netmask. Maybe to keep game consoles off of my main network, I don't know. I think sane people would use a couple of 10/8 networks or even something like 10.0.0/24 and 10.0.1/24 and squash these into a 10.0/16 at the edge if needed. But being insane, I split the network and I neglected to document why I did this.
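For the record, the subnet arithmetic above can be checked with Python's standard `ipaddress` module. This is a small sketch, with the shorthand prefixes written out in full and with the assumption that a /25 split means cutting a single /24 into two halves; the variable names are only for illustration.

```python
import ipaddress

# The two /24 networks mentioned above, written out in full.
lan_a = ipaddress.ip_network("10.0.0.0/24")
lan_b = ipaddress.ip_network("10.0.1.0/24")

# The summary route that could be advertised at the edge.
edge = ipaddress.ip_network("10.0.0.0/16")
covered = lan_a.subnet_of(edge) and lan_b.subnet_of(edge)

# The tightest single prefix that covers exactly these two networks.
tightest = list(ipaddress.collapse_addresses([lan_a, lan_b]))

# ASSUMPTION: a /25 split means cutting one /24 into two halves.
halves = list(lan_a.subnets(prefixlen_diff=1))

print(covered)
print(tightest)
print(halves)
```

Both /24s fall inside 10.0.0.0/16, and the smallest prefix covering just the two of them is 10.0.0.0/23. Splitting a /24 with a /25 mask gives two segments of 128 addresses each, which may well be how the game consoles ended up on their own half.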

But documentation is hard

Writing documentation is hard. Writing good documentation is even harder. Harder still is writing good documentation that is actually useful. On top of being hard, documentation is also not fun.

Now that we know documentation is hard and not fun, and that we are insane to think we can rely on our memories in lieu of documentation, how do we resolve this dilemma?

Removing (some) insanity

Well, the obvious thing would be to document everything. I know, that is hard and not fun. I can't do much about the hard part. Perhaps practice helps. I know from this blogging thing I'm doing lately that blogging is becoming easier; perhaps the same works for documentation.

The easiest way for me to write text is with Emacs. I like org-mode and use it for most everything else. Why not use org-mode for system documentation? Well, I've done so in the past, even to the point of publishing a web site with said documentation. It wasn't really workable and was rather cumbersome. My new idea is instead to use markdown and publish documentation to an ikiwiki instance, the same software that powers this bliki. I get to use the same workflow as I use for publishing this site; I have nothing new to learn, and there is no impedance mismatch. I use Emacs to create markdown files, commit them into git, and push to the remote, which builds the website.

I have created a site for my own use on my intranet, which seems to be working out quite well. As I think through these things, I am realizing that we perhaps need a new approach to thinking about system construction.

  1. I'm probably insane by other definitions, also.

James Richardson
Restructured DNS Zones

Well, it is time for my DNS infrastructure to evolve, again. I run services in the domain behind my cable modem (shhhh! don't tell my provider). Initially I used the DNS services provided by my registrar and only published public names. I quickly discovered running a mail server behind a cable modem is, well, nigh impossible. The IP is listed in a dynamic pool, and most (if not all) mail servers treat such an IP as a spam source from which mail will not be accepted.

I purchased a small Linode and moved my email, followed soon by many other services I run. This worked out quite well, as the Linode has a higher uptime than my servers ;). DNS became rather interesting. I still wanted to keep all the hosts behind my cable modem in DNS. I needed to keep hosts and various (SRV, MX) records in DNS for public facing services. I never really liked split-horizon or split-brain configurations. It always seemed like a small error could either break the system or expose internal names to the internet.

I decided to run a public DNS managed by Linode's DNS servers and a private DNS server only accessible from nodes behind the cable modem. This worked quite well at first. The issue came when I added a new public service. I had to update the DNS at Linode and I had to update the DNS locally. I would usually forget one or the other. There had to be a better way.
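To make the failure mode concrete, here is a sketch of the kind of check that would have caught a forgotten update. It is not something I actually ran; the function name `out_of_sync`, the `example.com` names, and the addresses are all hypothetical, and each zone is modeled as a plain dictionary rather than being read from a real DNS server.

```python
def out_of_sync(public_zone, private_zone):
    """Return the public records that the private zone lacks or answers differently."""
    return sorted(
        key for key, value in public_zone.items()
        if private_zone.get(key) != value
    )


# ASSUMPTION: hypothetical names and documentation-range addresses.
public_zone = {
    ("www.example.com", "A"): "203.0.113.10",
    ("mail.example.com", "A"): "203.0.113.10",
    ("media.example.com", "A"): "203.0.113.10",  # newly added public service
}
private_zone = {
    ("www.example.com", "A"): "203.0.113.10",
    ("mail.example.com", "A"): "203.0.113.10",
    ("nas.example.com", "A"): "10.0.0.5",  # internal-only host, not an error
}

stale = out_of_sync(public_zone, private_zone)
print(stale)
```

The check only looks in one direction on purpose: internal-only names are expected to be missing from the public zone, while every public name should resolve the same way from inside.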

I added another subdomain into the mix. Everything behind the cable modem went into the domain, my Linode became The DNS server for is behind my cable modem, only accessible to my private network. I am now able to publish my private network to its own DNS server without leakage or interference with my public zones. I still create A records in the space to advertise public services. I could do CNAMEs into, but I'm not a big fan of CNAMEs. I think CNAMEs should be reserved for redirecting a name to a name in another zone under different control. I control and, so I don't really see the point of a CNAME, as it does put extra load on resolvers.

I've been running this way for a while now. It seems to work much better for me than trying to keep everything in a flat namespace. I did have to add to Linode. I still have a service that goes through my cable modem and that I need to be able to access from outside. All that is left now is update-dns-with-cable-modem-ip.