This is my wiki where I talk about GNU/Linux systems engineering. It is mostly thoughts and ideas. I'll share code where I can, but my professional bits are trapped in $WORK, so this is mostly the crazy stuff I do at the house. The idea is that if I write about it, I can organize my thoughts better, identify corner cases, and achieve a better understanding. I publish this on the web just in case someone else finds it useful (or a headhunter or hiring manager finds my skill set useful ;) ).

I had a rather nasty system failure from which I am still sort of recovering. I don't think I've lost anything, but nothing is where it should be, and I'm finding it rather hard to get things working the way I want. I wish I already had cfengine set up so it could just fix things. My wireless router is handling firewall, DNS, and DHCP. While this is functional in some sense of the word, it is not very configurable. Ideally cfengine would help me out here, but I don't have a good version right now.

So how do I get out of this mess and be able to manage my systems again? How do I prevent this from happening again?

Well, I blogged about building an infrastructure previously, and now seems like a good time to actually build such a thing here. So what do I really want to accomplish?

  1. Survive a system failure with no data loss.
  2. Recover from a failure by updating a few config files and netbooting a box.
  3. Refresh hardware by updating a few config files and netbooting a box.
  4. Everything documented so if any of the above fails, I'm still ok.

Survive a system failure

I think this is ok, maybe. I have amanda backing up important data from servers and laptops (inasmuch as laptops are actually on the network when amanda runs). The idea is that I can automatically build a server by netbooting it, after which I just need to recover user data. Phones and tablets aren't covered yet, unfortunately.

Recover from a failure

This is the hard part, especially when a file server fails. To start thinking here, I need tools like a VCS, configuration management, system builders, and directory servers. It becomes particularly difficult when there is nothing to start with.

Refresh hardware

I will argue this is just a degenerate case of recovery. If I get new kit, I just boot it into the current infrastructure.

Everything documented

Well that is the purpose of this page.

Revision Control

I think revision control is an absolute requirement to do this thing properly. I would argue that a precondition is that all configurations should be described via text files (contrary to the systemds and Object Data Managers of the world). I have even gone to great lengths to put my $HOME into git. I chose git and gitolite for these reasons. My public code repositories are here.

$ sudo aptitude install git/wheezy-backports
$ sudo aptitude install git-annex/wheezy-backports
$ sudo aptitude install gitolite

should do the particulars.
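As a rough illustration of the "everything in text files, everything in git" idea, here is a minimal sketch of a helper that snapshots a directory of configuration files into a git repository. The function name `snapshot_config` is made up for this example, and it assumes git's `user.name` and `user.email` are already configured and that the directory is writable by whoever runs it.

```shell
# snapshot_config DIR
# Hypothetical helper: keep a directory of plain-text config under git.
snapshot_config() {
    conf_dir=$1
    # Create the repository on first use.
    [ -d "$conf_dir/.git" ] || git init -q "$conf_dir" || return 1
    # Stage additions, modifications, and deletions.
    git -C "$conf_dir" add -A || return 1
    # Commit only when the index differs from the last commit
    # (or, in a brand new repository, when anything is staged at all).
    if ! git -C "$conf_dir" diff --cached --quiet; then
        git -C "$conf_dir" commit -q -m "snapshot $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    fi
}
```

Running it twice in a row without touching any files should leave the history unchanged, which makes it reasonably safe to call from cron. Pushing the result to a gitolite remote is left out here on purpose.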

Might as well install nginx and gitweb.

$ sudo aptitude install nginx-full/wheezy-backports libcgi-fast-perl libgd2-xpm -t wheezy-backports
$ sudo aptitude markauto libgd2-xpm
$ sudo aptitude install gitweb/wheezy-backports
$ sudo aptitude install fcgiwrap spawn-fcgi

Remember to make www-data a member of the git group; see here.
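A minimal sketch of that group change, assuming the repositories really are owned by a group called git as described above (adjust the name if your gitolite install uses something else). The helper names `in_group` and `ensure_www_data_in_git` are made up for this example.

```shell
# in_group USER GROUP
# Succeeds if USER is a member of GROUP.
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}

# Add www-data to the git group only if it is not already a member.
# "git" is the group name from the text above; yours may differ.
ensure_www_data_in_git() {
    in_group www-data git || sudo adduser www-data git
}
```

Call `ensure_www_data_in_git` once after installing the packages. Processes already running as www-data keep their old group list, so the FastCGI wrapper will need a restart before gitweb can read the repositories.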

Firewall

At the moment I like shorewall, so that is what I will use. I'm looking at shorewall-lite, but I don't think I can publish the configuration and still be secure.

Directory servers

Current projects

automation, cfengine, freeradius, kerberos, openldap, linux containers

Notes

I finally have a split horizon DNS functioning properly.

Miscellaneous

It seems I once created a sf.net project to do similar things. It has stagnated since it was registered.

Blog entries tagged sysadm


Split Horizon DNS
FreeRADIUS, OpenLDAP, Kerberos, oh my!
Version Control For System Administration
gpxe-as-lan-boot-rom-in-virtualbox
Building an Infrastructure