Robin Lee Powell's Free-Form Resume
Senior UNIX/Linux Systems Administrator
After leaving my job at LookSmart, I realized that I've reached the point in my career where a normal resume doesn't make a whole lot of sense, at least once it's in the hands of a hiring manager rather than a recruiter.
That deserves some explaining, but first I note that from here on out, I'm assuming that I'm talking to a hiring manager. Here's the thing: I've been a sysadmin for so long that a lot of the signs you'd look for in a junior or intermediate sysadmin no longer apply. If you are wondering "How many years of DNS experience does he have?", or "Has he ever configured sendmail?", or "What flavour of Linux is he best with?", you don't want me. If you're looking through my resume for buzzwords, you don't want me. If you're adding up my various jobs to try to figure out how many years of experience I have with Solaris, you don't want me.
If you're looking for someone like me, you care about very different things. You want to know, can you hand me a broken computer, or a hundred of them, or a thousand, explain the problem, and simply walk away, knowing that the problem will be fixed, and that no further steps on your part are required to get it fixed just about as fast as it possibly can? You want to know, can you give me a specification for the performance requirements of your giant server farm and expect back a coherent, well documented series of steps required to go from where you are to where you want to be? You want to know, after three months on the job will you be able to come to me and ask me which part of the overall infrastructure is bottlenecking performance?
My answer to all those questions is yes.
Now, obviously, this isn't going to be true for every possible system; I'm a highly skilled sysadmin, not a god. If you're still reading, you've decided to give me the benefit of the doubt, and now you want to know which sorts of systems I can be relied on to work magic with.
The rest of this resume answers that in the best way I know how: by telling you about the things I've done that I'm particularly proud of.
As an aside, though, I do ask that you don't take my word for any of this. Talk to my former co-workers and bosses if you wish, but better than that, if you're considering me for a position, please bring me in, sit me down in front of a machine you've broken for that purpose, and see what I can do. Regular interviews (asking me about my greatest weakness, or whatever) aren't going to show you my skills in any useful way. Why not actually test me properly?
Things I'm Proud Of
Not a complete list, obviously, but these are the things that come to mind as exceptional. In approximately reverse date order (i.e. newest first).
FAI Doesn't Like Our Environment
At LookSmart we had been using a couple of different internally developed imaging applications, generally without benefit of programmers assigned to them. This is about as fun as it sounds.
I decided to leave LookSmart (and let people know I would be leaving) a fair bit before I actually did, and I wanted everybody to have fond memories, so I spent several weeks (including weekends, for the most part) making FAI work for installing Debian Linux, both etch and lenny, and both 32-bit and 64-bit.
This shouldn't have been a serious undertaking, except for a few details:
Presumably because of the unusual things we were doing, I hit quite a number of bugs in the various packages that FAI uses. I ended up having to insert a sort of patching system into FAI so that after it created the client image, it would overwrite certain files with fixed versions.
FAI expects every machine's interface information to be managed by DHCP both before and after installation. We used DHCP only during installations. This meant that scripting needed to be written to modify the network information on the host at installation time.
FAI is designed for having one single client image. Making it deal with 4 different images was quite a task all by itself, especially since each test takes 30+ minutes (client image generation + installation).
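To give a flavour of the DHCP-to-static switch described above, here is a minimal sketch in Python. It is not the scripting I actually wrote for FAI; the function name and the example addresses are made up for illustration. It only renders the kind of static stanza that Debian's /etc/network/interfaces expects, so that a host no longer depends on DHCP once installation is done.

```python
def render_static_interface(iface, address, netmask, gateway):
    """Return a Debian /etc/network/interfaces stanza for a static address.

    Hypothetical helper for illustration; the real install-time scripting
    is not reproduced here.
    """
    lines = [
        f"auto {iface}",
        f"iface {iface} inet static",
        f"    address {address}",
        f"    netmask {netmask}",
        f"    gateway {gateway}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only (documentation address range).
    print(render_static_interface("eth0", "192.0.2.10", "255.255.255.0", "192.0.2.1"))
```

An install-time hook would write something like this into the freshly installed system in place of the DHCP configuration used during imaging.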
On top of all this, I did everything in cfengine, so that simply by creating a host with the same naming convention as the FAI server I built, you would end up with a working FAI server to our specifications. I proved this by re-imaging the server before final testing.
Furl Actually Worked
Furl was one of the products we ran at LookSmart. I say "worked" because it's been sold to Diigo. Before that, though, I was basically the sole sysadmin on the project. I also ended up working with Furl longer than anyone else, including the creator.
When I arrived, LookSmart had just acquired Furl and didn't really know how to lay it out in terms of what parts to put on what servers. About a week after I arrived, it had a catastrophic failure and I had to figure out how to regenerate a bunch of the data.
Furl was always really hard to optimize, especially since it never had any money, so we couldn't simply throw hardware at the problem. It had a number of features that required collating data from every user, so we couldn't separate users out into groups across different servers. On top of that, management was basically actively hostile to doing the things needed to keep it running smoothly. When I left in March 2009, we had ~1.5TiB of user cache and index data, ~100GiB of MySQL data, and we were still running MySQL 4.0.24, which had had active support terminated in September of 2006. I had been requesting upgrades to 5.0 or better since I arrived in December 2004, but that required developer time, and management never allowed it.
When I left, however, it was running smoothly, and I take a lot of the credit for that. I engaged in immense amounts of MySQL tuning over the years. I implemented our load balancing scheme in haproxy when our hardware load balancing couldn't play well enough with Ruby On Rails. I routinely found bad database queries and forwarded them to the developers, often along with suggestions for how to fix them.
For the first year or so of my time with LookSmart, Furl was one of the two biggest offenders for oncall time. By the time I left, Furl was running so smoothly that on the rare occasions that it did wake somebody up, they usually had to call me anyways because something completely bizarre and unique had happened.
If some sysadmin hadn't made Furl their pet project, it would very quickly have become totally unusable. Furl was my baby, and I'm proud of how well it ran until they took it away.
Multilingual MUDding
This isn't something I did for money, and it mostly isn't sysadminning, but it was one hell of a hard problem to solve and I'm still proud of it, so here it is.
MUDs are text-based virtual worlds; like MMOGs but text-only. I'm heavily involved with the Lojban project, and wanted to make a MUD for it. Yes, I'm a giant nerd.
Anyway, MUDs are generally tied tightly to whatever language they are designed to process (i.e., English, almost always). Like, the language parsing is implicit throughout the code. Converting to another language is usually quite tricky. But I decided that wasn't good enough for me; I wanted a MUD that could do multiple languages (theoretically, any languages, and any number of them) at the same time. This means that when a user enters the room, they are presented with a description in their language, if such a thing exists.
This is a lot harder than it sounds, because a room consists of a bunch of objects, any of which can be made by any user on the MUD. A given user may not realize (or care) that the MUD is multilingual, and may or may not have the capability to translate into all the languages commonly used there even if they do. So you can have cases where for one user you want to display the information from the object itself, because it's in the best language for them, but for other users you want to get the attribute from the object's parent, even though the attribute is defined on the object, which is a rather substantial violation of the normal OO flow.
Figuring out what to do algorithmically was hard, but I managed it (with help), and I managed to code it, and it works. I used the mooix server base, and this makes mooix the only generally multilingual MUD on the planet. Believe me, I checked.
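For the curious, here is a toy sketch in Python of the kind of lookup involved. This is not the algorithm that went into mooix; the class, the function, and the fallback rule are all assumptions made for illustration. The idea it shows is that a user's language preference is tried across the whole parent chain before moving on to the next language, so a parent's text can beat the object's own text.

```python
class MudObject:
    """Toy object with per-language descriptions and a single parent."""

    def __init__(self, parent=None, descriptions=None):
        self.parent = parent
        self.descriptions = dict(descriptions or {})


def describe(obj, preferred_languages):
    """Return the best description of obj for a user's language preferences."""
    # Try each language in order of preference, walking up the parent chain
    # before giving up on that language.
    for lang in preferred_languages:
        node = obj
        while node is not None:
            text = node.descriptions.get(lang)
            if text is not None:
                return text
            node = node.parent
    # Assumed fallback: the nearest description in any language at all.
    node = obj
    while node is not None:
        if node.descriptions:
            return next(iter(node.descriptions.values()))
        node = node.parent
    return None


if __name__ == "__main__":
    generic_chair = MudObject(descriptions={"en": "a chair", "jbo": "lo stizu"})
    my_chair = MudObject(parent=generic_chair, descriptions={"en": "a wobbly chair"})
    print(describe(my_chair, ["en"]))
    print(describe(my_chair, ["jbo", "en"]))
```

In the example, an English reader gets the object's own text, while a Lojban reader gets the parent's text even though the object defines a description itself. That is exactly the violation of normal OO flow described above.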
For relevance to my career, mooix actually uses UNIX itself (in this case Debian Linux) as its infrastructure. That is, users of the MUD are system users, code is run as actual UNIX processes, etc. Love the security implications! This led to my first real experience with virtualization, so I could figure out how to make a separate instance of Debian for the MUD to run in all by its lonesome. It uses Linux VServer.
Imaging In Boxes With dd
That's not a typo; I actually do mean imaging in boxes. We ended up in a situation at Recourse where it was actually not in our best interests to take the machines out of their cardboard boxes to image them. You see, we produced fully-configured honeypot machines for our customers. Sales would generally give us about a day of warning to get them shipped, which wasn't even enough time to get them to the office. We didn't have the office space, or money, to keep them on hand, and we didn't have room or time to rack them.
It wasn't like we could just load the OS onto them like normal people, either: for reasons that remain not entirely clear to me, they had to be Solaris, but we had gone with x86 machines for cheapness. Better still, they were Dell servers with (for Solaris x86 anyways) weird hardware. Figuring out how to get the first one imaged at all actually took me something like a week. Even once I had it down to a science, you had to swap floppies (yes, floppies; Solaris x86, at least at the time, was a grotesque abomination that absolutely required all the driver loading to happen from floppies as far as I ever could determine) something like 4 times.
So eventually, since the drives were always identical, I got into the routine of simply keeping a copy of a finished drive around and a copy of tomsrtbt on floppy, and running dd to do the copy. As an aside, did you know that dd's runtime performance changes dramatically if you futz with the block size? Try a large copy with no bs= argument, and then run it again with bs=16M. Anyways, this method worked fairly well for our purposes.
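If you'd rather see the block size effect than take my word for it, here is a small Python sketch. It is not dd and not what I ran at Recourse; the function names and scratch file names are made up for illustration. It copies a file with unbuffered reads of a chosen size and counts how many reads it took. dd defaults to 512-byte blocks, so a large copy with no bs= argument issues a great many small reads and writes.

```python
import os
import tempfile


def copy_with_block_size(src_path, dst_path, block_size):
    """Copy src to dst using unbuffered reads of block_size bytes.

    Returns the number of non-empty reads performed.
    """
    reads = 0
    with open(src_path, "rb", buffering=0) as src, \
         open(dst_path, "wb", buffering=0) as dst:
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break
            reads += 1
            # Unbuffered writes may be partial, so loop until the chunk is out.
            view = memoryview(chunk)
            while view:
                written = dst.write(view)
                view = view[written:]
    return reads


def compare_block_sizes(size_bytes, block_sizes):
    """Copy a scratch file once per block size; return {block_size: reads}."""
    results = {}
    with tempfile.TemporaryDirectory() as workdir:
        src_path = os.path.join(workdir, "src.img")
        dst_path = os.path.join(workdir, "dst.img")
        with open(src_path, "wb") as src:
            src.write(os.urandom(size_bytes))
        for block_size in block_sizes:
            results[block_size] = copy_with_block_size(src_path, dst_path, block_size)
            # Make sure the copy is byte-for-byte identical.
            with open(src_path, "rb") as a, open(dst_path, "rb") as b:
                assert a.read() == b.read()
    return results


if __name__ == "__main__":
    for block_size, reads in compare_block_sizes(1 << 20, [512, 1 << 20]).items():
        print(f"bs={block_size}: {reads} reads")
```

Fewer, larger reads mean far less per-call overhead, which is where most of the speedup from a bigger bs= comes from.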
Then one day 44 servers showed up, and we were told that we needed to ship them out by close of business that day.
I was unimpressed, but it needed to be done. So I started with the first box, made 3 drive copies (there were 4 bays in these boxes), and basically commandeered the rest of the operations team, cracking open boxes, running power cables to them, slotting in drives, etc. I'd then go around and get the dd commands started.
It was a hell of a day, but we got them all out of the door. I was pretty pleased with myself, because there was nothing about our previous requirements that had led me to set up a one-command imaging system for these boxes: I had done it solely because it seemed like the right thing to do.
Created by rlpowell. Last Modification: Tuesday 14 of April, 2009 12:27:38 PDT by rlpowell.