Anthea Bell Interview: I stumbled across an interview with the English translator of Asterix. I grew up reading these, but it was not until I was older that I could fully appreciate the sophistication of the translation work. Peerless.
This old blog post about the contrasting approaches to programming to solve a particular problem shot past my RSS reader recently. It's a lovely read about a dialogue-by-article that occured between Donald Knuth and Doug McIlroy, as guest columnists in Communications of the ACM back in the day.
The post is short, wonderfully written, and serves as something of a meta-commentary about the nature of writing about code, and how to communicate the intent and the implementation of a computer program by way of documenting it. It has a wonderful flavour of a parable from the ages, because it's re-telling a story of how the giants from the old days solved a problem in a witty and entertaining way.
You could easily read this as a clash of ancient demigods with the victor being the last man standing, but I think that's probably a mistake. 'What problem are you trying to solve?' is one of my favourite pat-rules about program design, and I don't think the two authors here are trying to solve exactly the same problem.
What problem is literate programming trying to solve? Did it solve it? Are there any better ways to solve that? Wasn't UNIX designed for use in interactive text-processing?
It makes me think about man vs horse races . What problem are they trying to solve?
It is quiet again in the house, in the small hours of the morning, since I rediscovered the time locks in the kids kindle profile. 05:00 Minecraft is noisy
Now I can post short updates from the most basic client possible, which is pretty low friction. I should be able to extend this all the way to full articles eventually
Further work at hooking into micro.blog. If I extend my slightly moribund 'linkblog' special case formatting to a new 'short post' class, and then generalize this to cover indieweb 'notes' style posting, then I should be able to build a dedicated feed for notes that fits in better with micro.blog. These short updates will just appear inline on the blog site as de-emphasised text, without article formatting.
I finally wired up micro.blogs! I have a micro blog. I'm not really sure what it is for but it's there. I like it anyway, because it's called cms. In order to get it working, I had to make RSS work slightly better than 'barely', and so now I have an RSS 2.0 feed. The 00's are back! It's all about microformats and POSSE and syndication and decentralization, and taking back the web.
I appreciate it's an outside chance, but should you have a micro.blog account you can follow me on there, and reply back, and be friends, and stuff. Should you not have a micro.blog account, but think you might like one, HMU, I probably have some invites or something The indie web can never die!
I love making New Year's resolutions. I'm not sure I am good at them, but I enjoy the tradition. Every year, I tend to set at least a small handful. Sometimes they're entirely private, sometimes I publicise them. I don't know that they ever tend to be that successfully realised, but to me that's part of the system. I like to think there's much that is useful to learn from how I fail to achieve them, and maybe that's some part of the benefit. Sometimes I pull them off with aplomb, and that always feels pretty good.
This year I set myself an extraordiary reading challenge. A couple of years ago, on some kind of whim, I think prompted by a positive review in either The Guardian, or Word Magazine (RIP), I purchased, and read Lives of the Novelists by John Sutherland. This is an engaging biographical chronology of the English novel. It covers 294 novelists of significance in historical sequence. The format is readable, there's a 3-4 page potted biography of each author, and then a short summary. So you end up with a sort of illuminated canon starting in the seventeeth century, and leading you up to the roughly present day. The author choices are not always obvious, but neither are they deliberately obscure, and the tone is light, cheeky and erudite, and it works as an entertainment just as much as a reference piece, I unreservedly recommend it.
Part of the summary notes at the end of each chapter, suggests a Must Read Tome, for each writer. These themselves are not always the most obvious work, and carry a little paragraph of justification alongside the choice. So for 2017, in a fit of optimism, alongside a grab-bag of other self-improvement goals, I decreed I would attempt to read every MRT in sequence. Clearly this was an overreach going in. 294 books is approaching a novel a day, and I was unlikely to chew through all of them in the year, but I figured I could keep going with it and see how well I did.
We've passed the halfway mark now, and I'm happy to report it's going awfully. I have managed four books. The most recent one is not even half-finshed. I guess I kind of suck at this. I think there are a couple of mistakes I made on the surface. One I have already skirted around - it's too big a target. Factor into that the fact that I have relatively little spare time for reading fiction, and I also decided to take on a pile of other resolutions that demand daily hobby time (I'm teaching myself a foreign language! Badly!) and it's even more daunting a target. One book a month would be quite a feat, if I'm honest.
I think I might organised little potted reviews as I finish each book, to try and gee myself along a bit. Also, I made a resolution to do more blogging in 2017. That's right, does it show? Book reports seem like easy-reach fruit. Watch this space.
Perhaps the most egregious error though, was to fix myself to the chronology. Whilst this means that I am starting out firmly in the lands of the copyright expired public domain, which makes legal book aquistion very economical, it also means that I've front-loaded the material with tough going books. We start out with challenging archaic language, and structure, and this makes an already sluggish project a little more slower going. I don't believe in changing the rules midstream, however, and I intend to persist.
I thought I'd broken the back of the slightly-too-hard-to-read-comfortably years, when I got through to Defoe, but I hadn't taken into account another pitfall of the early romantic novel. Verbiage. After Defoe I get Samuel Richardson. And the MRT is Clarissa. That's nine fricking volumes of epistolary marriage plot. It might be the longest novel published in the English language. I've nearly finished book one, and it's taken me three months. I sincerely doubt I'll get through this bugger before 2018. It's fascinating, heady stuff though, I am enjoying it. Many tribulations, and archaic mores. A terrifying insight into the political lot of even the priviliged eighteenth century englishwomen. Also, much swoon.
Could this be a ten year project? I'm not quitting.
2026 is the year of Linux (in Linux (in Windows)) on the desktop
I'm not a very Windows-focused computer user. In fact I think the last era I used it really extensively, with any expertise, was in the 16-bit era of Windows 3.1 - 3.11. I quite liked those versions, which operated somewhat more like an integrated interface SDK for MS-DOS than an independent operating system. As the commercial internet boom took hold, and work became focused almost entirely on building web applications targetting UNIX-like deployment environments, (typically linux), I shifted over to working natively in that environment, and never looked back. So, I don't really know how to use modern Windows at all comfortably, and I don't have any personal ease when working with it's native interface.
How I typically organise my development environments
Normal development tooling for me is a simple GUI windowing environment, running on Debian linux. I (still) use GNU emacs or increasingly zed for almost everything that isn't in a web browser, alongside a handful of running terminals, an email client, and some kind of file browser, and music player and that's about it. My development projects I tend to isolate into containers, using lxc/lxd and more lately, incus, to give me "lightweight" virtual hosts nested on my computer, connected with a virtual ethernet LAN. It effectively is a local 'cloud', offering dependency and process isolation for your work, and powerful features like snapshot/checkpointing, image templating. What I particularly like about the incus approach, compared say to other container systems, like docker, is the abstraction is well suited to longer-lived, stateful development systems - you partition one linux system into a few dozen smaller more specialised linux systems with the state hidden from each other. The unit of containment is 'operating system'. Whereas with docker-like containment, I think the unit of abstraction is something more like 'one process, with all of it's dependencies bundled', which makes more sense to me for one-shot tasks, or often as a deployment target.
Previously
Back in the 90s(!), when computers were a lot less powerful than they are today, I commonly used to use emacs 'tramp' or Ange-FTP modes, to develop remotely on a development or staging server, from a thinner client. Now I use the same approach, and the same tools, to develop on my container environments, effectively treating them like a 'remote' host, even though they're only conceptually remote. Tramp, just like most of emacs, is a kludgey wonder - it encodes the information about remote endpoints using pseudo-paths, like /ssh:user@host:/path/to/something , and emacs just works out how to edit your files. Under the hood, it strings together a glue of subprocesses and temporary file copies and the like, to take your editing and reflect it on the remote environment. And not just files. This is emacs! Almost everything in emacs works tramp-aware, so you can browse the remote filesystem using dired - launch processes for compilation or linting, use git workspaces with magit-mode, run interactive shells and debuggers, build and run your projects, it has extremely high levels of DWIM. The main price you pay is occasional latency, as things shuttle back and forth, or buffer in and out of pipes, but I’m very used to this. And compared with the days when I used to use tramp to work over dialup(!) links to servers, this modern container approach is practically turbo-charged. Eat my dust!
Zed simililarly offers remote editing using ssh connections, with a slightly different architecture. The zed remote feature is a little more modern, like VSCode - it downloads a headless install of the editor on the remote after you ssh to it, and then proxies to that backend using it's own protocols over ssh. The net effect is the same though, you work on your local keyboard screen and mouse, but your working environment is in the remote hosts. An advantage zed offers over tramp's elegant hackaround is that the latency is considerably reduced, it's not copying files backward and forwards and slowly lauching tasks in shells. A disadvantage zed offers, is that it's not emacs and it's tooling for lots of things (like git, or file browsing) are not as advanced or comprehensively scriptable. It's not uncommon for me to have both zed and emacs buffers attached to the same remote development context.
So, that's what development tends to look like to me. One or two graphical editors, on my main desktop. Many persistant projects mapped to running containers (and maybe remote hosts), with various different projects open in them for work. I get a persisting, consistent user interface over diverse projects, all of them in a full linux environment, each completely isolated from the others, but networked.
Windows re-enters the conversation
In my current job, we're using Windows, and the whole Microsoft business stack, and we have a IT managed network. It's a bit of a change from what I'm used to. But, unlike a lot of software developers, I've found that I like change! (Often, you learn stuff. So what have I learned?)
I've been issued with a pretty nice Microsoft Surface branded laptop. The hardware, at least, is nice (and higher spec than any of my own current computers). The software is, of course, Windows, which I remain suspicious of using. The surface runs it like a champ, of course.
Interestingly enough, modern Windows understands that the majority of software deployment is now to linux, or linux-like environments, and the developer tools include an integrated linux-based toolchain. It's called 'Windows subsystem for Linux' and it's on it's second major version - WSL-2. WSL-2 basically integrates a virtualized linux kernel running inside the windows enviroment, which can be used much as I've described my container approach above - you have a virtual linux host, with it's own filesystem and processes, and a convenient interface between this and the host system.
You can run your graphical applications, and even your IDE (Visual Studio Code, presumably :-)) , and browser (Edge, your AI-powered browser!) in your graphical desktop, and have a local 'server' for developoment. WSL integrates with Docker Desktop for Windows, allowing your docker containers to run natively in the linux environment, and you can even install and run multiple instances of WSL containers to have different isolated linux 'back-ends'. It's a compelling work narrative, but it is founded on the idea that your goal state is using Windows for all your user-facing software and interface. Howerver -- What if you don't want to?
It's a UNIX system, I know this!
Because the WSL environment is an optimised full linux VM, it seemed to me that I might even be able to treat the WSL environment like a remote linux system, and move my existing workflow over - use a linux desktop to remotely access a local linux "server", that just happens to be Windows, and run development inside there using my typical approach of multiple contexts isolated into separate system containers. That's more like my idea of best of both worlds - my work computer can be a locked down managed enterprise client, I can get good use from the fancy hardware, but still maintain the toolchains and client interfaces I'm most comfortable using. Assuming, of course, that I could get it all to work...
Well, it took a bit of fiddling, but I'm here to say it works, well! my Windows Laptop runs on my desk, my software projects run on it inside incus containers, and I access them from emacs or zed or terminal windows, on my ancient creaking linux desktop system. I can easily run as many of these containers as I can fit into my 24GB of WSL (actually quite a lot of headless containers, in my experience) - voila! My Windows laptop is now a cloud host provider!
Here's a detailed walkthrough of how I set it all up. Please note that you will need a local Windows admin account to make some of the necessary configuration changes to the Windows side of things, but aside from a couple of privileged config changes, all of this can be then run as a non-administrator user account, which is another great benefit.
(Because I set this up originally a year ago, my instructions are written from before Debian 13 was promoted to stable, which is why you'll see me using 'Bookworm' in a few places.)
The setup
First: Setup WSL
Firstly, you need to install a WSL2 environment. I picked Debian (which is supported), and I initialised this. I then system upgraded the Debian installation from bookworm (12/old-stable) to trixie (13/testing) using apt, because incus is packaged as part of trixie. I was then able to install incus using apt, and follow the incus initialisation and setup instructions from the project configuration page. I quickly launched a couple of bare bones containers to check that things were working as expected.
Inside your WSL shell, assuming your user account is in the incus and incus-admin groups (check this with the id command), you should just be able to run incus launch images:debian/12 - this should download a base debian image, and launch it with a generated container name.
You can see it running with incus list , and launch a shell within it with incus shell <container-name> - Please read the lovely docs for more such hints, this post is not intended to be an incus user guide ;-)
The next thing I did was add some additional WSL configuration by creating a .wslconfig file in my User home directory on Windows - this is a plain text ini file. I was pleased to find that Notepad.exe still exists in 2024, and can be used to create this file :-)
this is relatively self-explanatory - I'm giving most of my 32GB of RAM to the linux VM (because i'm not really using the windows side), I'm enabling nestedVirtualization, although I don't think this is a prequisite for running incus containers, it sounds like something I'll probably use at some point. Finally, and most importantly for this case, I'm setting the networkingMode up to use 'mirrored' networking mode - this replicates the windows networking devices and configuration inside the linux VM, meaning we can connect directly to the linux system from the network, without having to set up port forwarding or anything like this.
Once you've created the file you need to restart WSL in order for it to take effect. The easiest way to check if it's working is to look at your available system RAM in linux using free - it have changed to be 24GB. The next stage is to setup windows to allow client connections from the LAN.
Windows Networking
We also want to be able to connect to our virtual linux box conveniently from the LAN. This requires a few things. Firstly, we need a stable network address or name. Secondly, we need to allow incoming network connections. This part requires enough Admin privileges on the Windows host to change networking settings.
I redefined the network adapter settings in Windows to use a static IP for this LAN, and added a DNS name for it in my local resolver. I set this network configuration up as a 'Private' network profile. The next step is then to configure the Hyper-V firewall on windows to allow incoming connections to pass to the VM. Running a powershell window as admin, I added firewall rules to allow this for the private network profile. In this way, I can ensure that the host is only accesible like this on trusted networks.
The WSL vm has a fixed identity string (the VMCreatorId) , a GUID, which is 40E0AC32-46A5-438A-A0B2-2B479E8F2E90, so the command you need is something like
Now incoming connections on the windows IP interface will be receivable in the WSL VM. Enable sshd on the WSL environment, and then check that you can ssh to the network address. You should get a login inside the WSL environment. If you run incus list here, you should see any running incus instances.
Bonus networking: Setup ssh proxying, and multiplexing
You can now access incus containers on the WSL instance from a remote emacs, if you use the incus-tramp method, and tramp pipelining. Access a path like /ssh:[email protected]|incus:me@container-name:/path/to/project in emacs and everything should be there. Relying on additional tramp stages for proxy chaining, although it's a very neat trick, can bring problems with performance, and reliability, and it is more simple to push the extra hops into the ssh layer.
This involves a bit of glue code, which looks hideous, but works very well.
The ssh glue
Setup 'Control Master' for ssh, which allows repeated ssh connections to the same host to re-use an established ssh session. This will speed up the time taken to open new sessions, and noticeably improve the responsiveness of tramp for ssh remotes. Secondarily, use a ProxyCommand directive to connect a single ssh connection to a proxy session. Finally, you can use wildcard rules with a host suffix matching a certain host pattern straight through into an incus container on a specific host. Here's the relevant entries from my ~/.ssh/config
In ssh configuration files, the first applicable setting is the one that will be used, so we should order the file from most specific towards most general.
Here, we're using a fake 'domain' of .wsl, and then converting the command to an ssh to the windows host, that immediately launches incus, getting the container name by chopping off the '.wsl' from the provided hostname and running 'netcat' in the container to proxy our ssh session to the ssh server inside the container. With this little piece of ugly glue, we can run ssh container-name.wsl and immediately get an ssh session directly into the running container called container-name
The control master block ensures that we re-use an established ssh control session for all connections to the same host, and persist it for 10minutes after the last connection exits, to improve reconnection times.
With this piece in place, we can access files, shells, and processes from emacs buffers, using a tramp path of /ssh:my-container.wsl:/path/to/project - or laucnch a zed session in a remote project directory, using their ssh 'remote project' feature.
Surviving restarts
The main, and perhaps the only real downside of this approach is that Windows likes to restart, often. Usually in batches once or twice a week. This is a combination of updates from remote IT and Microsoft I suppose. Rather than get too aggravated about it, I prefer to think of it as a free chaos monkey.
I can alleviate most of the pain points by making everything as restart-able as possible. WSL can be set to start as soon as you login by tweaking a few settings in the cmd.exe application
set the default shell to be Debian to match my WSL container session
set cmd to start on login
This means that after a boot all I have to do is login to resume the WSL state
The incus container can be set to automatically start at boot with incus config
I have assigned both the incus container and the windows machine static IPs on my LAN, so they reboot with a stable address.
In WSL I have caddy set to reverse proxy from Debian to the incus address, and this is configured in /etc/caddy and run from a systemd unit, to restart on boot
The local "staging" version of any webservices I am working with I typically run in Docker , inside the inucs container (as my user account) - i usually write a shell script to launch docker with the right port forwarding and data persistence flags for whatever I want to be running (for more complex setups this could be a docker compose configuration) - I simply put this shell script into my user crontab, using the magic @reboot trigger directive to launch this script after multi-user init, as me, with just a one-liner.
with these configurations in place, all I need do is login to Windows (with my face 😘) to resume running services where I left them.
For the past year and a bit, I've been relying on a one-user GoTosocial server for my fediverse participation. Fediverse is the 'well, actually' technically correct name for the social network protocols that power an overlapping set of free, distributed social networks that a lot of people just call 'Mastodon'. Mastodon is the largest and most popular server software used in this network, and got a significant bump in popularity when that crazy space junkie guy started hacking on twitter.
Monoliths vs Microliths
Mastodon is a large Ruby on Rails project, with the typical kind of architecture you might expect from a classic LAMP-adjacent dynamic web thing that's used in production to run instances with thousands and thousands of active posting user accounts, and a hefty server footprint.
Gotosocial is a fediverse server that specifically targets a lower footprint installation. It's written in Go, which while not being the kewlest platform to build a modern web server application in, is to my eyes, a pleasingly pragmatic choice. (Something I often like to say is 'Go is actually kind of a DSL for building small network servers in'). It also targets full mastodon compatibility, so it's a drop in replacement for a mastondon account, and much simpler to run if you were interested in having your own fediverse service.
WASM-azing!
Whilst Gotosocial has a modest footprint, and a few moving parts, it's not without some interesting technical architectural decisions. In one of its simpler installable forms, rather than use an external relational database like PostgreSQL, it just uses good old SQLite3 😍 - and rather than pay the CGO / boundary penalties for linking directly into SQLite as a shared library, it can actually run SQLite as a contained WASM process inside the go application, using the Wazero runtime
I adore everything about this approach, it's exactly the kind of mad science I'd try to get a simpler working service. End result is you have a single static binary that you can run and install, and it manages its own fully compatible SQlite3 database store in-process, without any install-or-link-time dependencies.
So it's super simple to install for me on linux, I simply need to unpack a binary linux release tarball and then launch the newer binary with the old database file. Gotosocial applies database migrations on startup.
Official upgrade procedure
Here's their offical upgrade instructions taken from codeberg for the binary release
Stop GoToSocial
Back up your database! If you're running on SQLite, this is as simple as copying your sqlite.db file, eg., cp sqlite.db sqlite.db.backup
Download and untar the new release, including the web assets and templates, not just the binary
Edit your config.yaml file as necessary (see the release notes)
Start GoToSocial
Wait for migrations to run.
It's about as simple as a manual upgrade can be.
My Tweaked upgrade procedure.
Well, aside from scripting it, I think there's one small improvement that can be made. So far as I know, GoToSocial doesn't (yet?) run auto vacuum on its SQlite database. VACUUM on SQLite is a necessary maintenance procedure that's used to refresh, compact and optimize the database backing store after it's been amended in use for some time. You can think of it a bit like a 'defrag' or a 'garbage collecter' for your database.
Without auto-vacuum, vacuum is necessarily a blocking operation, you will block all other database changes until the vacuum is done. As such it's ideal for downtime. So vacuuming your GoToSocial database when you upgrade is a good idea, although it does extend your service downtime by a couple of minutes.
So, as well as copy your database to a backup, I suggest you also connect to it, with the sqlite3 command and run a complete VACUUM. But wait, we can be even cleverer.
VACUUM INTO
Vacuum already makes a complete copy of the database. Go back and read the VACUUM documentation I linked above. You might also notice that SQLite VACUUM supports a 'VACUUM INTO' form, which materializes this vacuum copy information into a fresh database file.
so my amended system upgrade is like this, pretending for the sake of example that it's a manual process.
chdir to my gotosocial directory /gotosocial
download the new tarball release
stop gotosocial with systemctl stop gotosocial
rename the old gotosocial binary to a versioned backup
untar the new release and assets into the directory
renamegotosocial.sqlite to a versioned backup name e.g. gotosocial.backup.sqlite
connect to the newly renamed sqlite database with the sqlite3 ./gotosocial.backup.sqlite command
run VACUUM INTO 'gotosocial.sqlite' (i.e. re-creating an optimised gotosocial database)
start gotosocial again systemctl start gotosocial
watch the migrations and startup with journalctl -f -u gotosocial
One of those things I generally expect to be part of the routine of running a linux desktop is a certain amount of manual effort necessary to keep things running smoothly. Sometimes this is the classic multi-decade horror story of "sound card" configuration. Sometimes its the inevitable friction between under-specified hardware and volunteer-maintained drivers. Sometimes it's the CADT principle, where everything changes around you, just because it can, as part of the generational cycle of collaborative software development. Sometimes it's just the self-induced consequence of having a system where you can tweak and configure everything to work however you'd like it to, and therefore you choose to, and sometimes that's quite a deep rabbit hole. This tale covers a small handful of these categories, although it's primarily a consequence of that lattermost case.
Mod life
Mod keys. Modifier keys, that is. Those would be the keys you hold down alongside other keys to change their behaviour. The most obvious and venerable of these is SHIFT. Hold that down and type an alpha-numeric key, and you generate a different character. With the alphabetical keys, you GET THE CAPITAL LETTER FORM. Other keys, like the numerals, get you punctuation. You may also be aware of the Alt/AltGr modifier keys, which hang around on most keyboards, and generally allow access to a different shift level, further symbols and accented characters. And then there's control, or CTRL. Maybe you know that guy as the menu-shortcut accelerator key, or maybe for a couple of shortcuts you might use in the shell, if you're a shell user. Actually the kids all call it 'terminal' these days, because that's what Apple does. And then they all use iTerm anyway, for I don't know why really but I'm sure it's great. Anyway, calm down Mac-loving readers, this story is about how terrible linux is, you're all wonderful. So CTRL in the shell - CTRL + C cancels things, CTRL + D ends a session. CTRL + A takes you to the start of the line. CTRL + E takes you back to the end. Assuming you're using a fairly standard bash shell. Those last two are slightly more interesting, and immensely relevant to this story.
Enter Emacs
They're readline bindings. Readline is a GNU library used to make command line shell editing a little more interactive. And because bash is the GNU shell, it uses readline by default. Those keybindings are the default readline bindings, and they work the same in any application that uses readline. Typically this means other interactive shells. These keybindings are taken from Emacs. Emacs is a text editor, that is to say it's an application for interactively working on so-called 'plain text' files. Not that's there's any such thing as a plain text file. Emacs is one of the most ancient, convoluted, complex, crufty, awkward pieces of software you're ever likely to encounter. It's one of the original fundamental components of the GNU system. You might say it was the standard editor. Emacs is also one of my all-time favourite things. So those readline keybindings we were discussing, are intended to bring some of the more capable text editing commands from the GNU text editor across to the GNU shell, making use of modifier keys. Emacs really really likes modifier keys.
Because Emacs is a very old piece of software, it's design was heavily influenced by the keyboards typically used on the systems of its time. Computer use was a lot more text and command oriented, and the large, pre-PC era keyboards tended to reflect this by having a large amount of mod keys and function keys available. Commonly cited examples are the MIT or symbolics lisp machine 'space cadet' keyboards, and the Knight keyboard. As a consequence of this emacs can understand a lot of different modifier keys, and has a UI that is organised around layering functionality onto different keyboard 'chord' operations. You really need at least two distinct mod keys as a bare minimum to get emacs to do anything useful at all. We already met CTRL a couple of paragraphs back, but you also need another key called META. Emacs uses them in fundamental, and interestingly composable ways. For example, you can move the cursor forward one position by typing CTRL + F, but you can move the cursor one word forward by typing META + F. It's powerful, and sort of intuitive once you understand the fundamentals quite well. Unfortunately, they mostly stopped making keyboards with META keys on them some while back.
Welcoming the X Window System to the fray
Now I am getting quite old, but I'm not ancient enough to have run emacs on pre-Internet era hardware. I did use it a little bit on 7-bit serial terminals, and limped along using ESC as a prefix modifier, like a farmer, but by the time I started really learning how to use emacs to any serious degree, I'd made the jump to UNIX machines using X11 as a graphical user terminal. Some of the UNIX workstations had a META key. Some of them didn't, but had a few other modifier keys. Increasingly, UNIX graphical workstation started to mean 'Linux and XFree86 on PC hardware'. Now IBM-derived PC keyboards don't have a META key and never did. The original PC keyboards didn't really offer many modifier keys, but by this time period, everything had mostly standardised on the 101/102 key IBM extended model archetype. This doesn't have META keys, but it does have a pair of prominent ALT modifier keys. And so, we begin to remap.
X11 is maybe one of the canonical reference points for design by committee. Fully intended to offer a portable graphical , networked user interface across a variety of dissimilar UNIX systems, it tries very hard to offer the broadest possible set of abstractions across similar base behaviours, trying to build a unifying API in all aspects. Screen dimensions and orientation, color model and layout, pointers, input devices, key-types, you name it. So you can usually configure your equipment in a bewildering, verging on frustratingly flexible manner. X11 is a very broad church and welcomes all kinds of keyboards. X11 allows you a whole byte for modifier keys (I think), so you can have Shift, Lock , Control, and then five others called Mod1 through Mod5. You can freely map key codes onto key symbols, and then assign key symbols to one or more modifiers. So obviously this all took fourteen hours to decipher in the first instance, but I gradually became reasonably adept at using the Xmodmap utility to set ALT to be both ALT and META, CAPSLK to be another CTRL and life was mostly good. You'd tweak your .Xmodmaprc file every time your keyboard changed significantly, load it in as part of your login, and everything would work. PC-104/105 keyboards came along, with windows keys, and this meant that you could perhaps add a SUPER or even a HYPER key, and bind those to other emacs macros. The system was working, and everyone got rich on the proceeds! Or not. Nonetheless, although linux desktop software was fairly terrible, it was a fine environment for running Emacs, and running Emacs was where most of the work got done after all.
A detour into Macintosh
Times have changed however, and systems have changed, and uses have changed, and so have I. Like everyone else, I started using laptops more. Modifier keys started getting scarce again, as the machines shrank down to be portable, and interfaces just grew ever more graphical. For about a decade, I used Macintosh systems, which represent their own series of keyboard configuration challenges. Macs are actually pretty good for modifiers, if I'm being fair - they have their own dedicated command key for all the system key shortcuts, and so you're free to map control and option as you see fit to control and meta. They even give you a little GUI configurator for managing and assigning modifier keys, which is way more convenient than spending hours searching for information about xmodmap. The main suckitude is that they don't have a symmetrical set of them on their laptop keyboards. You don't get a right hand CTRL. Symmetry is important for healthy typing habits with key chords, because it's vastly better for your hands if you use both of them for combinations. So you ideally want to be able to hold down CTRLMETASUPER or some combination of them with one hand whilst you type the activation keys. So CTRL + C is best expressed as a right hand finger holding CTRL, whilst your left middle finger taps the C. So my life as a Macintosh Emacs user was constantly blighted by crazy-ass schemes to find keyboard layouts that allowed unstressful ways to type CTRL key combinations.
Desktop Linux in the present day
For the last few years though, I've been back on the linux horse (and why is a different story, for another day), and my main laptop, a battered lenovo ThinkPad, has a full set of three modifiers either side of the space bar, where they were intended to go. The Debian GNOME 3 desktop is configured to use the windows and menu keys for desktop commands, and the ALTGR key, which I have on the right, as some kind of compose prefix. Even thought it's X.org now, not XFree86, and Xmodmap is heavily deprecated in favour of the XKeyboard and Xinput extensions, using the GNOME configurator and then some of my old Xmodmap ways, I could make this go away, and map ALTGR to a right meta, ALT to a left meta and the windows and menu key to SUPER and HYPER. The lenovo x220 I use has a particularly excellent keyboard and all was right in the world.
And then GNOME 3.22 switched to Wayland as a display server, rather than X. And this year's Debian defaulted to this. Even though there is an X11 compatibility layer, GTK+ and GNOME on Wayland do not talk to X11 directly for mediated key events any more, and this meant that Xmodmap can't be used to universally set modifier maps. GNOME 3 on wayland will still use xkb for key configurations, and this meant another fourteen hours of fiddling about in order to come up with a keyboard scheme that works for both GNOME and legacy X using the XKeyboard extension (XKB). This was not made any easier by the fact that all the attempts to search for information on this get bogged down in legacy explanations about Xmodmap or how to enable XKB for X11. But I got there in the end
It seems like there's not actually any supported, or easily documented way to load user configurations into GNOME 3 + Wayland's XKB environment, so I ended up slightly disappointedly hacking them into the system options files. Of course this meant that several months after I did this, a system upgrade overwrote all of my changes, and I was left without a keyboard, and a scant recollection of how I ever did it, or what any of the bits were even called.
Finally I fix it
So this morning I figured out how to assemble it all again from first principles. To make it more worth my while, this time I decided to transpose all of the mod keys as I went, so I can have CTRL on the inside of META as it was originally intended to be, and push the other modifiers to the outside edge. To save myself the bother the next time this breaks underneath me, I thought I'd write down the exact sequences here. I am not going to try and attempt to explain XKB here. There are a several documents on the web that do that job, to varying degrees of success. I'm not going to pretend that I understand how it all works, I just experimented with xsetkbmap and xkbcomp under an X11 desktop until I understood how to express what I needed to work under Wayland. Here are the steps.
System-wide keyboard configuration is fine for configuring the basic keyboard layout - using the Debian keyboard configurator, I can pick either a ThinkPad or a pc-105 model with a gb layout. The modifier layout can then be selected using xkboptions. I can tell GNOME what XKB options to apply from its database, using the dconf configuration key /org/gnome/desktop/input-sources/xkb-options.
If you're playing along at home, you may have spotted that cmswin is not the name of any valid xkb layout. The wrinkle is that none of the built-in options offer quite the right set of combinations. So this is how I added my own custom XKB option.
1: Define an option
I added a new file usr/share/X11/xkb/symbols/cmswin to define my partial keymap.
Further to this, I modified /usr/share/X11/xkb/rules/evdev
adding the line
cmswin:cms_modkeys = +cmswin(cms_modkeys)
to the section
! option = symbols
I believe this is adding an option named cmswin:cms_modkeys to the dataset assigning it to parsing the 'cms_modkeys' entry from the 'cmswin' file, in the symbols subdirectory. The convention in xkb is to name all the different symbols using the same substrings, and it's terribly confusing when you're trying to remember which part does what, although slightly helpful when you're trying to perform the reverse map and locate which file is responsible for which option, I suppose.
3: Make it available for GNOME
The final step is to add the line
cmswin:cms_modkeys fix keys for emacs
into the file /usr/share/X11/xkb/rules/evdev.lst
I think this does something like import the option into the environment. There is also an evdev.xml file in the rules directory, which looks like it marks up the options to be used by the GNOME gui, but I didn't bother with that one, because life is too short to hand write XML for computers to parse, and I'd already spent half a day setting this all up. To give you an idea of how tedious this all was, for a while I'd added the evdev option into the section marked !option = types rather than symbols, and this caused wayland to stick to a crash loop as soon as I loaded the XKB option into the dconf key (with no visible error logs! yum!)
4: Retire, rich from the proceeds
With all of this in place however, everything works fine. For now. GNOME seems to be in a bit of a transitory phase with regards to keyboard and input configuration, it looks like they're reworking everything to use IBUS in the long term, so I expect I'll be doing some form of this dance again within a year. Until then though, this document can serve as a reference for the next time I, or anyone else interested enough needs to figure out how to do this.
2017 then, and nothing seems to have really changed that much at all. Desktop Linux is still terrible, and desktop linux is still awesome. Emacs is still terrible, and Emacs is still the best tool I have.
I've been very gradually upgrading this site back to life for a few years now. Very gradually #amirite . However, after earlier this year having found myself accidentally on the front page of Reddit, HN etc. with my post about building the IMDb boards , I found myself slightly embarrassed, not only by the amount of attention ( 40k+ uniques in the first two days, holy shit! ), but also by people pointing out how clunky the site is to read. Often several times a day.
The styling on the blog section, much like the rest of the blog section, wasn't in a terribly well developed state of completion. I just threw together some hand-written CSS to approximate the look and colours of my last existing Wordpress theme, which I had been fairly happy with. Now that theme was set up maybe ten years ago, and my initial port over to this 'new', self-build CMS maybe four or five years old itself, and I had given no thought at all to mobile, or in fact any screen device very much different from my own laptop display. And my main laptop display is a 1024x768 pixel non- IPS Lenovo ThinkPad x220. That is probably a significantly worse screen than your phone has.
In 2017 it's pretty stupid to build web pages just to be viewed by desktop browsers, so today I'm pushing out a rebuild of the display layer and theme, that hopefully works a little more responsively across varied devices. It should also be easier for me to evolve. I hope it improves things for my handful of select readers. I'm not terrifically good at front-ending, and my heart isn't often in it, but I have tried my best.
I'd like to be updating this site more frequently again, he writes, like one of those bloggers apologising for never blogging , but a large part of getting any kind of schedule working there, is streamlining the publishing workflow. To that end, as well as a more modernised front-end and theme, today I've also released a new site deployment system, that allows me to update the site software more easily. This is clunky, but at least automated. Previously everything was just checked out into a home directory, hand compiled and run on the server. Now that's mostly still happening , but now it's all scripted with configuration management tools so I can release updates like this without having to remember exactly how to set it all up again by hand from first principles.
Of course, for writing articles, I'm still shelling into the server and hand writing html files like a farmer , but it's all steps in the right direction. Sometimes I don't shell in to the server, I author the posts directly using emacs tramp-mode which practically counts as using a GUI round here.
I posted about yesterday's post on LinkedIn, because it seemed like a low-key way to get people to consider my CV. Although I'm not exactly job-seeking at the moment, we ARE having a big surprise technology function re-org, and it feels like a great moment to be prepared for a role change. So, if you're reading this and you think - hey that guy sounds like someone I'd like to work with and you're hiring people for something fun, now might be the perfect time to reach out 🙏
Anyway, LinkedIn appears to be having a bit of a cultural resurgance of late, which is interesting. People are using it a lot more like a general purpose social network than ever they used to. I have a few thoughts about why but they can wait for a different blog post. I kind of like it, because I'm always sincerely happy when people make "content" 🤮 rather than just passively consume stuff. A lot of the content is a bit awkward, because there's obviously a motive toward self-promotion, as so much of the LinkedIn audience are job seekers, recruiters and sales people, but still my feed quite often has interesting content popping up. It's kind of 'business instagram'. Instagram but all the influencers are dads, dad-dancing at the school reunion disco. It's honestly not without charm. Not all posts are equally enjoyable though.
The robot Elephant in the room.
AI Hypers! Of course LinkedIn loves AI hype! Now, I'm quite fascinated/entangled/repelled/enchanted with the AI boom myself. We're certainly having an industry moment, and like each crazy boom cycle I've seen in this domain, it's a bundle of awful things mixed in with amazing things, and fun things and tragic things all at the same time. One thing is certain though, there's a lot of AI opinions surfacing on LinkedIn right now. Too much I think. They're a bit repetetive, and I'm not always seeing that much valuable content from them.
Typical AI boost content
Jane Q. Businessguy posts something like this. (Not their real name. I made them up. I made the content up. This is satire. I am British, cynical sarcasm is what we do)
🚀 HIRING TIP 2026: QA IS DEAD 🚀
Yesterday I blew my own mind. I was thinking out loud in the bathroom, and minutes later, after a couple of deep inhales, I had implemented one basic Selenium script from a prompt transcribed from a voice note that runs our smoke tests automatically and honestly? We don't need QA teams anymore!!! Think about it: automation = no bugs. It's that simple. The future of software is here and frankly, any company still paying QA professionals is throwing money away. We're living in 2026 but some of you are still hiring like it's 2015. This is the kind of paradigm shift that separates the industry leaders from the dinosaurs. We've moved past the era of "manual testing." Those days are OVER. If your organization hasn't eliminated your entire QA department by Q2, you're already behind. The math is simple:
Automation = efficiency Efficiency = no QA needed QA people = legacy cost
Welcome to the future. You're welcome. Don't @ me if you're not ready to hear hard truths.
I'm pleased for Jane, although I'm quite suspicious that something like this ever actually happened.
Wait! It didn't, I made it up! I made Jane up too!
Phew. But what if some of the LinkedIn posts I'm seeing are also... made up. Some of them might even be made up by AI. Could I do something about this?
I could really do with some kind of content filter. It's a shame LinkedIn doesn't allow me to train my own algorithm. Maybe something else could do this though. Maybe this is one of those mythical use cases that are great for AI?
Fight Fire with Fire!
LLMs are really well suited to content classification. I guess if I could find a way to put the LinkedIn content through an LLM and prompt it to identify boring unispired AI cheerleading, I could then reduce the frequency of it. Or maybe just filter it out. I guess I need a semantic ad-blocker. I wonder how hard that could be to build.
Time to talk with Claude. I fire up claude desktop.
How could I build a plugin that filters LinkedIn Posts ? I'd like to use an LLM to build something that removes low quality posts boosting AI or LLM from my feed, ironically
Claude grinds away on this for a few seconds, and pulls out a couple of options. The architecture suggestion is a browser extension. Makes sense. I've built browser extensions before, they're kind of annoying, but the best way to get code control over live browser content. OK Claude, I'm listening..
I scroll past the typescript skeleton code it's printed, unasked for. Another thing I don't want to do is let random LinkedIn content posters burn tokens and the planet for me on some kind of expensive remote AI API. But my (fairly old) desktop machine has a moderately OK recent-ish radeon (AMD Radeon RX 7600 XT iirc). I've used this quite a bit for playing with local models using transformers or llama.cpp or ollama I should be able to do sentiment analysis and tagging on this in something close to real time I reckon.
Ok, I hit the bottom of the claude window and follow up with this
let's do this - generate me a prompt for claude code. An extension - I'd like the UI to just dim post content on a linkedin home page if it strongly matches the criteria - lets use ollama and a small local model like gemma3:12b-it-qat
Ok, so I make a bare directory for the project, called 'linkedin-silencer', chdir to it and run claude , which launches vanilla claude code, updated today, no plugins or user config. Let's raw-dog this. I plop the claude generated prompt into the terminal, and sit there for five or six minutes agreeing to everything. When it's done I ask it to git init and commit what it has and then I quit claude code. Let's have a look at what it's generated for me in an editor.
To the Zedmobile!
I have an ongoing love/hate affair with the amazing zed editor. Much more on the love side than the hate side. What I particularly like about it is a couple of things
it has fantastic out of the box support for common tooling infra. Something like a browser plugin, I'll have the language servers, the linters, whatever all just immediately there and working.
it's not vscode
it's really responsive, the performance of the editor and the editing widgets is super-fast. (i guess i'm just repeating 'its not vscode')
i like how their UX for llm-assisted coding makes it super easy to feed whatever model you like (local, remote, different sizes), with context pulled in directly from the project files - e.g. , you can easily pop a dialog window and have a conversation that asks about just one function in the current file, with reference to a test suite in another file - using completable shortcuts, and it assembles the prompt context for you.
zed really deserves it's own blog post at some time, but once again this is not that blog post.
zed fires up on my project, and I immediately see a bunch of red diagnostics. Oh no, my code is full of bugs I guess. It's not even my code! 😭
I told you LLMs make buggy code!
Yup, this is still sadly true. This is one of the things I think upsets a lot of people who don't like to engage with them. The magic genie still doesn't perfectly perform the magic trick. I guess it would be cool if you could just say 'hey genie, make a thing that's awesome' and trust that it would just happen, but we're not quite there yet.
To be fair to the robot, I don't write code without bugs myself. Also, it takes me much much longer to write my bugs. And, because I'm a selfish egotistical human programmer, I tend to not believe that I've made any bugs and it's sometimes a bit difficult to get my head into the right mode to debug them, because I can't immediately conceive of how or where they would exist. However, I totally get that from a certain point of view, this is breaking the enjoyment model of programming. Lots of people like writing code. Lots of people hate debugging code. Me too, to a certain extent. So now I have to debug maintainence code written for me by a dumb robot. This seems like a bad trade.
That's not the only way to look at this though. One of my big problems is sustaining the motivation to carry through with ideas some of the time. (I have that kind of brain). Ideas are pretty cheap, but iterating on them and executing them can be quite costly. And I already wrote a bunch about the idea of higher order tools as a way to reduce programming adjacent work . This is key to the appeal boundaries I find with the current generation of assistants. The most useful stuff they offer is a way for me to delegate the really boring bits of work like 'setting up a tsc project' , or 'building a nice looking README.md' , or 'refactor that submodule name'. And they can deliver on some of this, some of it really well. Right now though, we have bugs. Well, at least zed will let me use some more assistants for debugging. Let's see how far this river goes.
Debugging
Zed gives me the typescript diagnostics on a separate pane. It's not too bad. there's about two dozen errors, most of them type assignment things.
One thing pops out, it's not happy that I'm using 'chrome' apis for things like local storage. Browser extensions are a bit odd. They basically work like a little web service running in your browser. There's roughly 3 bits to it - the content script (the UI stuff for the extension, works a lot like a traditional JS document), the background service (a service worker, a javascript that runs in the background inside your browser session), and the local store (a way to get data persistence for your plugin, it's basically a KV store kind of thing wrapped over SQlite) - I assume the apis for these are something i need to add to my typescript compiler configuration - they're not part of normal JS I guess.
I pop open a zed assistant window, set it to 'ask only' and 'claude sonnet 4.5' and type
@service-worker.ts - why are there so many diagnostic errors about missing 'chrome' - I'm trying to write a browser extension. I suspect I'm missing some node module or type definitions?
The assistant pulls in the diagnostics tab output and says
The issue is that TypeScript doesn't know about the chrome global API. For browser extensions, you need to install the Chrome extension type definitions. Let me check your project setup:
and tells me I can fix it with
npm install --save-dev @types/chrome
this sounds at least 99% right to me, so I do it. Yup, fixed. I commit this change to git.
The next, and only, non-type error is a bit odd 'duplicate function' against a function called classifyPost , which is a function in the service worker that takes in post text and returns the classification - either 'KEEP' or 'FILTER'. I can see that there's also a function with the same name i the content script. I guess that all the functions in the project are getting combined into the single top level namespace. There's no implied scope then. I guess I could fix this by changing one of the function names, probably using 'rename symbol' in the language server. But maybe there's a way to namespace things ?
I bring the assitant back and ask it
there's a name collision because in @service-worker.ts there's a @classifyPost() but there's also a @classifyPost() in the client/side content code. why is this an error , should these be using some kind of namespace mechanism ?
Claude says yeah, these files are getting pulled in as scripts not modules and everything is in the global scope. It suggests that if I simply add empty 'export ' statements to each they'll be loaded as modules. All I can really remember at this moment about ECMAScript modules is that they're a bit weird, relatively recent, and there's a few different ways to provide library scopes, and some back-compatibility quirks. This suggestion sounds about 75% right to me (hell, I'm no JavaScript expert), but the change is so small I figure it's worth it. It works, so much as the errors go away. Ok, makes sense, i don't need to share any symbols. Let's commit this.
The next small tranche of errors are all about assingments, and I can see they're all coming from storage fetches. It looks like claude has implemented a local LRU cache or something, using the aforementioned local store APIs, and it's pulling values out of this in helper functions and not bothering to consider that the key lookups might return nulls. Classic type fun with database nulls! Well, I know enough to fix this, so I just manually add some intermeidate x | undefined types to the fetches, and put if guards around the reads and return paths. Close enough for jazz
The final error flummoxes me a bit. There's a loop around the elements fetched from the page Document , and it's complaining that the ChildElementList type isn't presenting as an iterable. Sounds legit, but also the code looks fine to me. Assitant time again.
can you tell me what's wrong about this loop - it thinks the NodeList isn't iterable @linkedin-filter.ts
The assistant explains that my typescript configuration isn't including the necessary types to make NodeLists fully iterable. I hit a traditional search engine for confimation, and then I add "DOM.Iterable" to the lib: array in compilerOptions . Zero errors.
I pop open zed's integrated terminal and type npm run build. It builds!
The problem with browser extensions is you have to install them
I'm feeling a bit cocky so I decide to try the extension in firefox. I load it manually using the dev tools as a local unpacked extension. Firefox refuses, because it says the service worker can't be found in background.scripts . I've seen this bug before though, manifest v3 support is a bit variable between browesers.
I add background.scripts = ["service-worker.js"] to the manifest and rebuild and install.
It loads!!! What have I done?! Look on my works, ye mighty, and despair.
I quickly load my Linkedn Page. It's full of AI hypers though 😢 The extension doesn't work.
I pull up the extension UI from the toolbar, and it's enabled. But there is a little red error message saying 'ollama not connected'. Oh dear. I check ollama and the server is running. More debug time.
I figure this one out manually using the web dev tools on the service worker inspector, and playing in console. The requests that are being made to ollama are getting 403. Huh. I wasn't expecting that, it's some kind of auth problem. Ugh, I bet it's CORS stuff. That's always tricky with extensions, they're typically requesting from a weird origin.
I know this, it's a UNIX system!
A bit of searching and a quick chat with claude desktop, and yup, ollama is strict about request origins by default in server mode. I can't blame it. After a bit of fiddling with ollama config, and lots of cursing at systemd overrides. I figure out how to configure Ollama to trust a specific origin (firefox has generated an origin url for the extension automatically, like moz-extension://foadfadfoasdfasldjfadslfja) I can trust this origin really, because its an extension I've installed myself (although I'm not leaving it on like this until I've properly read all the code, i've been running it in the debugger and i can see the requests are pretty straightforward).
I spend about 35 minutes messing about with systemd and restarting ollama before i get the recipe right (this is actually the longest bit of debugging) , and then all of a sudden, I notice the requests have switched to 200 and I reload linkedin and ... almost every post is dimmed and tagged as AI hype. Hmm. It's sorta working. Looking at the posts though, a lot of them are just a bit hype, rather than AI hype. So I adjust the prompt I'm sending, and everything seems good. I now have a browser extension that tags and dims annoying AI hypefluencer posts.
I've pushed the whole thing to GitHub, if you're interested enough to peruse. I'm sure it still has plenty of errors, but it works well enough for a blog post, and maybe well enough that I'll refine it into something a bit more useful. I haven't even tested it in chrome browsers.
Now the only thing left to do is to write an hyped up linked in post about my little project, and see if it works on my own content!
Learnings
browser extensions can be pretty cool content filters
LinkedIn is great, but it sometimes can get a bit vapid.
Anyone who's still telling you code generation assistants are just autocompleters that don't actually work is probably uninformed
free local LLMs can do short text classification work very quickly with modest hardware
I like claude, but I get most value from using it from a mix of clients
Zed is pretty neat, if you're at all zed curious, I say .. do it.
Every January I update my CV. Not because my new year's resolution is automatically job-seeking—because experience has shown me you never exactly know when you might need one.
In the tech industry, the great opportunities appear unexpectedly. (And sometimes, so do the redundancies, sadly) Either way, you don't want to be suddenly rebuilding your professional narrative under time pressure, and fighting with aggravating document tools. It's also a useful forcing function for reflection: what did I actually ship last year? What's the thread connecting my work? Who am I exactly in 2026? What do I even do? What am I for? Why am I here? (Help!)
Rebuild it every year then. It's an easy habit I love, and I've tried to pass onto anyone I've ever mentored or managed, but it's undeniably annoying busywork to do sometimes.
The problem with CV maintenance
Keeping a polished CV that meets my expectations is often grief. And most of the large hard stuff isn't even the content, it's just boring document crap.
Templates make you invisible. If your CV looks like everyone else's, it reads like everyone else's. So don't get sidetracked by an off-the peg template, or layout app. But custom layouts can be really fiddly to maintain. (Ask me about that time I wrote a whole book renderer from scratch)
Two documents in one. A CV is both a timeline (experience) and an argument (the narrative about who you are). Edit one dimension and you often need to adjust the other. It's a constraint system. Meanwhile, asethetics matter! You need to have a nice looking document that will catch someone's eye. It sounds shallow, but you need to appeal to grab attention. First impressions are deep, it's just the way humans work. (It's also the way many robots work).
The two-page constraint. It's a hard rule for me, having spent a long time as a hiring manager, designing hiring pipelines. Nobody wants to spend much more than a few minutes scanning CVs at the earliest stages. Senior roles mean more history, but attention spans don't grow to account for this proportionally. You're constantly trading off: what earns its space? Additions and subtractions mean rewrites, and reflows.
Word processors fight you. Either you accept a simple basic paragraph layout, which puts all the pressure on your sparkling prose and legendary achievements to grab people in the first twenty seconds (no pressure) or you spend half your time wrestling with reflow, column balancing, and spacing. Every edit cascades.
The core issue: content and layout are tangled together. Change a job description, and suddenly your careful two-column balance breaks.
Things I've tried
Over the years I've gone through:
MS Word/LibreOffice with a custom template — works until you need to reflow something, then it's an hour of nudging margins and trying to understand headers and anchors again.
Markdown — great for content, and edits, but you lose precise layout control entirely without exact control for element ids and classes to style.
Custom typesetting software fed from a database — yes, obviously I built one; no, I don't recommend it unless you enjoy yak-shaving and are really into typesetting and PDFs. I don't know why I keep ending up writing PDF generators, but it does seems to be a pattern I often fall into 🤔.
Hand-written HTML/CSS with print stylesheets — surprisingly capable, but debugging print rendering across browsers is its own special pain. And tag soup is a nightmare to edit.
Each approach solves one problem while creating others.
Enter Typst!
For this year's update, I discovered Typst, and I'm a bit in love 😍.
Typst is a modern typesetting language—think LaTeX's goals but with a syntax that doesn't make you want to cry, or immediately switch careers to become a goat-herding shephard or a flat race jockey 🐴. It compiles instantly, has a live-preview online editor, and lets you define reusable components.
What makes it fit particularly for CV workflows:
Separation of content and style. e.g. Define a #jobrole() function once, use it everywhere. Change the formatting in one place, every job entry updates.
REPL-like editing. The live preview on typst.app means you see reflow as you type. No more "save, compile, check, swear, undo."
Proper typographic control. Columns, spacing, fonts, all without fighting a GUI. But also without LaTeX's baroque syntax.
Breakable/unbreakable blocks. Tell it "keep this job entry together" and it just... does. No more role titles orphaned at the bottom of a column.
Version control friendly. It's plain text, so your CV can live in git like any other project. Track changes, branch experiments, diff versions—all the workflows you're used to from code.
CI-ready. Typst has a CLI, which works as a compiler. typst compile mycv.typ takes in a source file and produces a nice PDF. Which I means you can build your CV in a pipeline. Push a change, get a fresh PDF pushed straight to your home page, or emailed to your phone, or whatever.
It probably goes without saying, but Claude/Gemini etc. are all trained in the syntax, and can help you get started or finesse things.
A practical example
Here's roughly what a job entry looks like in Typst:
#role("Platform.sh", "2018–2024", "Cloud Engineer → VP Tooling & Images")[ Six years from IC to VP. Built custom container images, worked on authentication and WAF, worked on hiring pipelines and engineering career frameworks; created the Engineering Excellence and Tools division. ]
One function call. The #role() definition lives elsewhere and controls all the formatting—fonts, spacing, whether it can break across columns. Change it once, every entry updates. You can have a look at my source code in the GitLab repo I have linked. I'm sure it's very amateurish, it's my first crack at using typst like this, and I've spent less than an hour on it, but quite quickly got close to my existing template with ease.
Compare that to copying and pasting styled paragraphs in Word, or fighting with CSS print margins.
A Concluding Point
Tools matter. The friction of your tools shapes what you actually do.
Despite my annual rule, updating my CV was annoying and consumed time. I was constantly reworking my approach trying to get it more automated. Now it's almost enjoyable. Maybe I'll start updating it quarterly. It will also free me up to focus much more on the content and the impact, which is really where I actually want to be spending my effort in this task.
If you've been putting off a CV refresh, and you recognize my complaints about tooling, why not give typst a look. I have no affiliations and nobody has paid me to say this! I just found it, tried it and think it's neat.
I was messing around with regular expressions on some default interpreters on my Debian machine, wondering about what the default encoding behaviour for string literals might be. As you do. So, given a string with a "pesky foreign accent" in it, how many characters do various languages think it has?
Unfortunately, this creaky old blog software I hand-cranked cannot render this amount of markdown source blocks (ROFL), and I so I collated a GitHub gist of my findings. Despite all it's many faults, GitHub excels at markdown-serving.
The languages I tested are Perl, Python, PHP, Ruby, Bash, Common Lisp, and JavaScript.
The test ? Does the string café (an English loan word with an accent character ) match the regular expression denoting 'a four character string'
On my computer, with a UTF-8 locale, this string, while clearly four characters long, occupies five bytes . So does the string match four characters or five characters? This is computers, so obviously the answer is it depends.
There's been a flurry of discussion on Hacker News and other tech forums about what killed Perl. I wrote a lot of Perl in the mid 90s and subsequently worked on some of the most trafficked sites on the web in mod_perl in the early 2000s, so I have some thoughts. My take: it was mostly baked into the culture. Perl grew amongst a reactionary community with conservative values, which prevented it from evolving into a mature general purpose language ecosystem. Everything else filled the gap.
I remember Perl
Something to keep in mind, is that although this is my personal take, and therefore entirely an opinion piece, I was there at the time. I stopped doing Perl properly when I left Amazon, I think this would have been around 2005. It's based on the first hand impressions of somebody who was very deeply involved in Perl in its heyday, and moved on. I have a lot of experience, from both inside and outside the tent.
Perl's roots are sysadmin
What culture? Perl always had a significant amount of what you might call "BOFH" culture, which came from its old UNIX sysadmin roots. All of those passive aggressive idioms and in jokes like "RTFM", "lusers", "wizards", "asking for help the wrong way" etc. None of this is literally serious, but it does encode and inform social norms that are essentially tribal and introverted. There implicitly is a privileged population, with a cost of entry to join. Dues must be paid. Cultural conservatism as a first principle.
This stems from the old locked-down data centre command culture. When computer resource was expensive, centralised, fragile, and manually operated, it was rigidly maintained by gatekeepers, defending against inappropriate use. I started my career as an apprentice programmer at the very end of this era, (late 80s) pre-web, and before microcomputers had made much inroads, and this really was the prevailing view from inside the fort. (This is a drawback about fort-building. Once you live in a fort, it's slightly too easy to develop a siege mentality). Computers are special, users are inconvenient, disruption is the main enemy.
An unfortunate feedback loop in this kind of "perilous" environment is that it easily turns prideful. It's difficult to thrive here, if you survive and do well you are skilled; you've performed feats; you should mark your rites of passage. This can become a dangerous culture trap. If you're not careful about it, you may start to think of the hazards and difficulties, the "foot guns", as necessary features - they teach you those essential survival skills that mark you out. More unkindly, they keep the stupid folk out, and help preserve the high status of those who survived long enough to be assimilated. Uh-oh, now you've invented class politics.
The problem with this thinking is that it's self-reinforcing. Working hard to master system complexities was genuinely rewarding - you really were doing difficult things and doing them well. This is actually the same mechanism behind what eventually became known as 'meritocracy'1, but the core point is simpler - if difficulty itself becomes a badge of honour, you've created a trap: anything that makes the system more approachable starts to feel like it's cheapening what you achieved. You become invested in preserving the barriers you overcame.
(This is the same mentality that built leetcode interview pipelines BTW, but let's leave that sidebar alone for now)
So the UNIX operator culture tended to operate as a tribal meritocracy (as opposed to the UNIX implementer culture, which fell out of a different set of cultural norms, quite an interesting side bar itself2), a cultural priesthood, somewhat self-regarding, rewarding of cleverness and knowledge hoarding, prone to feats of bravado, full of lore, with a defensive mentality of keeping the flame aloft, keeping the plebs happy and fed, and warding off the barbarians. As we entered the 90s it was already gently in decline, because centralised computing was giving way to the rise of the microcomputer, but the sudden explosive growth of the WWW pulled internet / Unix culture suddenly back into the mainstream with an enormous and public opportunity vacuum. Everyone suddenly has an urgent need to write programs that push text off UNIX file-systems (and databases) and into web pages, and Perl is uniquely positioned to have a strong first-mover advantage in this suddenly vital, novel ecosystem. But its culture and values are very much pulled across from this previous era.
(Springing out of this, Perl had an, at best grudging, tolerance for 'difficult genius' types, alongside this baseline culture. Unfortunately, this kind of toxic personality tends to thrive in the type of culture I've described, and they do set to help the tone. I'm not here to call out people specifically, because I'm trying to make a point rather than feed a culture war, or dig up gossip, but there were several significant examples, you can probably find lore if you like. I think the kindest way I can describe the compounding effect of this is that there was a strong cultural norm along the lines of "It's OK to be rude, as long as it's for a good cause".)
A fort within a fort
I remember this tension as always being tangibly there. Perl IRC and mailing lists were quite cliquey and full of venerated experts and in-jokes, rough on naivety, keen on robust, verbose debate, and a little suspicious of newcomers. And very cult-like. The "TIMTOWTDI" rule, although ostensibly liberal, literally means 'there is more than one way to do it in Perl' - and you can perhaps infer from that that there's little to no reason to do it using anything else. Elevating extreme flexibility like this is paradoxically also an engine of conservatism. If Perl can already do anything, flexibly, in multiple ways, then the language itself doesn't need to change - 'we already have one of those here, we don't need new things'. This attitude determined how Perl intended to handle evolution: the core language would remain stable (a fort inside a fort, only accessible to high level wizards), while innovation was pushed outward to CPAN. You could add features outside of core by writing and consuming third party libraries, you could bend language behaviour with pragmas without modifying Perl itself. The very best CPAN modules could theoretically be promoted into core, allowing the language to evolve conservatively from proven, widely-used features.
On paper, this sounds reasonable. In practice, I think it encoded a fundamental conflict of interest into the community early on, and set the stage for many of the later growth problems. I'm not going to pretend that Perl invented dependency hell, but I think it turned out to be another one of those profound misfeatures that their cultural philosophy lead them to mistake for virtue, and embrace.
An interesting thing I think has been missed discussing the context of the original blog piece, about whether Perl 6 significantly impacted Perl growth, is the fact that Perl 6 itself manifested out of ongoing arguments. Perl 6 is a schism. Here's a oft-cited note from Larry Wall himself about the incident that sparked Perl 6, at YAPCOSCON 2000
We spent the first hour gabbing about all sorts of political and organizational issues of a fairly boring and mundane nature. Partway through, Jon Orwant comes in, and stands there for a few minutes listening, and then he very calmly walks over to the coffee service table in the corner, and there were about 20 of us in the room, and he picks up a coffee mug and throws it against the other wall and he keeps throwing coffee mugs against the other wall, and he says "we are f-ed unless we can come up with something that will excite the community, because everyone's getting bored and going off and doing other things".
(Pause a second and ask yourself about the sort of social culture that both allows this kind of behaviour at public events, and then chooses to embrace it as a key piece of cultural lore)
The impact of Perl 6
Perl 6 was really a schism. Perl was already under a great amount of strain trying to accommodate the modernising influx of post dot-com mainstream web application building, alongside the entrenched conservatism of the core maintainers, and the maintenance burden of a few years exponential growth of third-party libraries, starting to build a fractal mess of slightly differentiating, incompatible approaches of those multiple ways to do things that were effectively now table-stakes language features, as the deployment landscape started to tiptoe towards a more modern, ubiquitous WWW3.
So, while I agree that it's wrong to generalise that 'Perl 6 killed Perl', I would say that Perl 6 was a symptom of the irreconcilable internal forces that killed Perl. Although, I also intend to go on to point out that Perl isn't dead, nothing has actually killed Perl. Killed Perl is a very stupid way to frame the discussion, but here we are.
So... Perl 6 is created as a valve to offset that pressure, and it kind of works. Up to a point. Unfortunately I think the side effect really is that the two branches of the culture, in the process of forking, double down on their encoded norms. Perl 5.x beds down as the practical, already solved way to do all the same things, with little need to change. Any requirements for more modern application patterns that are emerging in the broader web development environment, like idk, Unicode, REST clients, strict data structures, asynchronous I/O, whatever? That can either wait for Perl6 or you can pull things together using the CPAN if you want to move right now. Perl 6 leans the other way - they don't need to ship immediately, we have Perl 5 already here for doing things, Perl 6 is going to innovate on everything, and spend its time getting there, designing up-front.4 They spend at least two years writing high level requirement specs. They even spin out a side-project trying to build a universal virtual machine to run all dynamic programming languages that never delivers5
This is the landscape where Perl's central dominance of 'back end' web programming continues to slip. Unfortunately, alongside the now principled bias toward cultural conservatism, Perl 5 has an explicit excuse for it. The future is over there, and exciting, and meanwhile we're working usefully, and getting paid, and getting stuff done. Kind of OK from inside the fort. Some day we'll move to the newer fort, but right now this is fine. Not very attractive to newcomers though, really. And this is also sort of OK, because Perl doesn't really want those sort of newcomers, does it? The kind that turns up on IRC or forums and asks basic questions about Perl 6 and sadly often gets treated with open contempt.
Meanwhile, over there
Ruby has sprouted "Ruby on Rails", and it's taken the dynamic web building world by storm. Rails is a second generation web framework, that's proudly an 'opinionated web framework'. Given that the web application architecture is starting to stabilise into a kind of three-tier system , with a client as a web browser, a middle tier as a monolithic application server, and a persistence layer as a relational database , and a split server architecture serving static and dynamic content from different routes, here is just one way to do that, with hugely developer friendly tooling turning this into a cookie-cutter solution for the 80% core, and a plugin and client-side decoration approach that allows for the necessary per-site customisation.
Ruby is interesting as well. Ruby is kind of a Perl6 really. More accurately it's a parallel universe Perl5. Ruby comes from Japan, and has developed as an attempt to build something similar to Perl, but it's developed much later, by programming language enthusiasts, and for the first ten years or so, it's mostly only used in Japan. To my line of thinking this is probably important. Ruby does not spring from decades of sysadmin or sysop culture. Ruby is a language for programmers, and is at this point an sensible candidate for building something like Rails with - a relatively blank canvas for dynamic programming, with many of the same qualities as Perl, with less legacy cruft, and more modern niceties, like an integrated object system, exceptions, straightforward data structures. Ruby also has adopted 'friendliness' as a core value, and the culture over there adopts a principled approach to aggressively welcoming newcomers, promoting easiness, and programmer happiness and convenience as strong first class principles.
Rails is a huge hit. At this point, which is around about the time I stopped significantly using Perl (2004-2005) (because I quit my job, not out of any core animosity toward it, in fact, in my day, I was really quite a Perl fan), Rails is the most appealing place to start as a new web programmer. Adoption rate is high, community is great, velocity of development is well-paced, and there's a lovely , well-lit, onboarding pipeline for how to start. You don't even really need to know ruby. It has a one-shot install tool, and generates working websites from templates, almost out of the box. It's an obvious starting point.
Perl being Perl, develops several analogue frameworks to Rails, all of them interdependently compatible and incompatible with each other and each other's dependencies, all of them designed to be as customisable and as user configurable as they possibly can be6
PHP
There are also the other obvious contenders. PHP has been there all along, and it's almost coming up from entirely the opposite cultural background of Perl. PHP is a users language. It's built to be deployed by copying script files to your home directory, with minimal server side impact or privileges. It's barely designed at all, but it encounters explosive growth all the way through the first (and through into the second) web era, almost entirely because it makes the barrier to onboarding so low as to be non-existent. PHP gets a couple of extra free shots in the arm
Because its architecture is so amenable to shared-server hosting, it is adopted as the primary implementation language of the blogging boom. An entire generation of web developers is born of installing and customising WordPress and text-pattern et. al by installing it directly into your home directory on a rented CPanel host account. It's the go-to answer for 'I'm not a programmer really but how do I get a personal web site'7 This zero gate-keeping approach keeps the PHP stack firmly on the table of 'basic' web programmers all through the history of the web up to the current day.
Because of these initially lightweight deployment targets, PHP scales like little else, mostly because it's execution model leans strongly towards idempotent execution, with each web request tearing up and tearing down the whole environment. In a sense, this is slower than keeping hot state around, but it does lend itself extremely well to shared-nothing horizontal scaling, which as the web user base increases gigantically throughout the 2000s era, is the simplest route to scaling out. Facebook famously, is built in PHP at this point in time.
Python
There is of course one other big horse in the race in this era, and it's a particularly interesting one in many ways, certainly when contrasted with Perl. This is of course, Python. Python is a close contemporary of Perl's but once again, its roots are somewhere very different. Python doesn't come from UNIX culture either. Python comes from academia, and programming language culture. It's kind of a forgotten footnote, but Python was originally built for the Amoeba operating system, and its intention was to be a straightforward programming language for scripting this8. The idea was to build a language that could be the 'second programming language' for programmers. Given that this is the 1980s, early 1990s, the programmers would be expected to be mostly using C / C++ ,perhaps Pascal. Python was intended to allow faster development for lighter weight programs or scripting tasks. I suppose the idea was to take something that you might want to build in a shell script, but provide enough high level structured support that you could cleanly build the kind of things that quickly become a problem in shell scripts. So, it emphasises data structures, and scoped variables, and modules, and prioritises making it possible to extend the language with modules. Typical things that experienced programmers would want to use. The language was also designed to be portable between the different platforms programmers would use, running on the desktops of the day, but also on the server. As a consequence, it had a broad standard library of common portable abstractions around standard system features - file-systems, concurrency, time, FFI. For quite a long time, one of python's standard mottoes was 'batteries included'.
Python never set the world on fire at any particular moment, but it remained committed to a clear evolutionary incremental development, and clean engineering principles. Again, I think the key element here is cultural tone. Python is kind of boring, not trying to be anyone's best language, or even a universal language. Python was always a little fussy, maybe snobby, slightly abstracted away from the real world. It's almost as old as Perl and it just kept incrementally evolving, picking up users, picking up features, slowly broadening the standard library. The first time I saw Python pick up an undeniable mainstream advantage would also have been around the early 2000s, when Google publicly adopted it as one of their house standard languages. Never radical, just calmly evolving in its environs.
Nature abhors a vacuum
When I sketch out this landscape, I remain firmly convinced that most of Perl's impedance to continued growth were cultural. Perl's huge moment of relevance in the 90s was because it cross-pollinated two diverging user cultures. Traditional UNIX / database / data-centre maintenance and admin users, and enthusiastic early web builders and scalers. It had a cultural shock phase from extremely rapid growth, the centre couldn't hold, and things slowly fell apart.
Circling back though, it's time to address the real elephant in the room. Perl manifestly did not die. It's here right now. It's installed I think by default, on almost every single computer I own and operate, without me doing a single thing to make that happen. It's still used every day by millions of people on millions of systems (even if that isn't deliberate). It's still used by many people entirely deliberately for building software, whether that's because they know it and like it and it works, or because they're interfacing with or working on legacy Perl systems (of which there are still many), or maybe they're using it still in its original intentional role - A capable POSIX-native scripting language, with much better performance and a broader feature-set than any shell or awk. I still occasionally break it out myself, for small scripts I would like to use more than once, or as parts of CLI pipelines.
What I don't do any more is reach for Perl first to make anything new. In my case, it's just because I typically am spoilt for options that are a better fit for most tasks, depending on whatever it is I'm trying to achieve. By the time I came to Perl, (1998-ish), I was already on my third career phase, I had a strong UNIX background, and had already built real things in lisp, java, pascal, visual basic and C++. My attitude to languages was already informed by picking a tool to fit the task at hand. Boy did I love Perl for a few years. The product/market-fit for those early web days was just beautiful. The culture did have too much of the negative tropes I've been pointing at, but that wasn't really a problem personally for me, I'd grown up amongst the BOFHs inside the data centres already, it wasn't too hard for me to assimilate, nor pick up the core principles. I did occasionally bounce off a couple of abrasive characters in the community, but mostly this just kept me loosely coupled, I enjoyed how the language solved the problems I needed solving quickly, I enjoyed the flexibility, and I also enjoyed the way that it made me feel smart, and en-route to my wizard's robes and hat, when i used it to solve harder problems in creative ways, or designed ways around bugs and gremlins. For a good 3-4 years I would have immediately picked it as my favourite language.
So as I say, I didn't fall out of it with any sense of pique, I just naturally moved to different domains, and picked up tools that best fit. After Amazon, I spent a lot of time concentrating on OS X and audio programming, and that involved a lot of objective C and C++. The scripting tools in that domain were often in ruby, sometimes python. For personal hacking, I picked up lisp again9 (which I'd always enjoyed in school). I dipped in and out of Perl here and there for occasional contract work, but I tended to gravitate more towards larger database stuff, where I typically found C, java and python. The next time I was building web things, it was all Rails and Ruby, and then moving towards the web services / REST / cloud era, the natural fits were go, and of course node and JavaScript or TypeScript. I've always been a polyglot, and I've always been pretty comfortable moving between programming languages. The truth of the matter is, that the majority of programming work is broadly similar, and the specific implementation details of the language you use don't matter all that much, if it's a good fit for the circumstances.
I can't imagine Perl disappearing entirely in my lifetime. I can remember entire programming environments and languages that are much, much deader than I can ever see Perl becoming.
Pascal used to be huge for teaching and also for desktop development in the 8/16 bit era
Objective C - only really useful inside the Apple ecosystem, and they're hell bent on phasing it out.
Before I got into the Internet, I used to build application software for 16 bit Windows (3.11) which was a vast market, in a mixture of database 4GLs (like PowerBuilder, Gupta/Centura SQLWindows) and Win16 C APIs. This entire universe basically no longer exists, and is fully obsolete. There must be many similar cases.
I mean who the hell realistically uses common lisp any more outside of legacy or enthusiast markets? Less people than Perl I'm sure.
Perl also got to be if not first, then certainly early to dominate a new market paradigm. Plenty of things never manage that. It's hard to see Perl as anything other than an enormous success on these terms. Perl innovated and influenced languages that came after in some truly significant ways.
Tightly embedding regular expressions and extending regular expressions (the most commonly used regular expression dialect in other tools is Perl)
CPAN, for package/library distribution via the internet, with dependency resolution - and including important concepts like supply chain verification with strong package signatures
A huge emphasis on testing, automated test harnesses, and CI. Perl test format (TAP) is also widely found in other CI/harness systems
Blending the gap between shell / scripting / and system programming in a single tool. I suppose this is debatable, but the way Perl basically integrated all the fundamental POSIX/libc as native built-ins with broadly the same semantics, but with managed memory and shell conventions was really revolutionary. Before this, most languages I had ever seen broadly tended to sit in one box, afterwards, most languages tended to span across several.
Amazing integrated documentation, online, in-tool and also man pages. POD is maybe the most successful ever implementation of literate programming ideas (although most of the real docs don't intertwingle the documentation very much iirc)
Just these points, and I'm sure there are many others that could be made, are enough of a legacy to be proud of.
Counterfactuals are stupid (but also fun). If I squint, I can imagine that a Perl with a less reactionary culture, and a healthier acceptance of other ideas and environmental change might have been able to evolve alongside the other tools in the web paradigm shift, and still occupy a more central position in today's development landscape. That's not the Perl we have, though, and that didn't happen. And I'm very confident that without the Perl we did have, the whole of modern software practice would be differently shaped. I do think Perl now lives in a legacy role, with a declining influence, but that's really nothing to feel shame or regret for. Nobody is going to forcibly take Perl away as long as POSIX exists, and so far as I can see, that means forever. In 2025 too, I can see the invisible hand creeping up on some of these other systems I've mentioned. Rust is slowly absorbing C and C++. Ruby (and of course Rails) is clearly in decline, in a way that probably consigns it to become a similar legacy state. From a certain angle, it looks a lot like TypeScript is slowly supplanting Python. I won't be entirely surprised if that happens, although at my age I kind of doubt I'll live to see the day.
Footnotes
1 : Meritocracy is a fun word. It was originally coined as a pejorative term to describe a dystopian mechanism by which modern i.e. Western / British society entrenches and justifies an unfair and unequal distribution of privilege
2 : The UNIX implementer culture, is scientific/academic and fell out of Bell Labs. I guess you could extend this school of thought as a cultural sweep towards building abstracted cloud operations, toward plan 9/ Inferno / go
3 : Web 2.0 was first defined in 1999 by Darcy DiNucci in a print article , the term didn't become mainstream until it was picked up and promoted by Tim O'Reilly (then owner/operator of perl.com, trivia fans), an astute inside observer of the forces driving web development
4: Another unfortunate bit of luck here. Right at the point of time that 'agile' starts getting some traction as a more natural way to embrace software development - i.e. iterating in small increments against a changing environment and requirements, Perl 6 decides to do perhaps the most waterfall open source development process ever attempted. . It is fifteen years before Perl 6 ships something resembling a usable programming language.
5 : The Parrot VM, a lovely quixotic idea, which sadly fizzled out, after even Perl 6 stopped trying to target it. Interestingly enough, both python and ruby both made relatively high profile ports to the JVM that were useful enough to be used for production deploys in certain niches.
6 : A side effect of this degree of abstraction, is that as well as being very hard to get started, it's easy to fall foul of performance overhead.
7 : This ubituitious ecosystem of small footprint wordpress custom installs gives birth to the web agency model of commercial website building / small ecommerce sites, which thrives and is suprisingly healthy today. Recent, and slighly optimistic surveys have pitched WordPress as powering over 40% of all websites today. Now this is certainly inflated, but even if the realistic number is half of that, that's still pretty damn healthy.
8 : It's often repeated that Python was designed as a teaching language, but as far as I know, that's not actually the case. The designer of Python, Guido Van Rossum was previously working on a project that was a intended as training language, called ABC, and many of ABC's syntax and structural features influenced or made their way into Python.
9 : Common lisp is a better answer to an infinitely flexible 'everything' chainsaw language than perl, IMHO
Calm down, not gigs I'm playing, gigs I'm going to.
Festival season is upon us, I realised, as I found myself watching the Amazon Prime streaming footage of this years Primavera Sound. Primavera seems huge now, and I'm surprised/pleased to notice I'm not really getting any pangs of FOMO. It's fun to dip in and out of the streams though, and the Amazon presenting team / live mixing are endearingly sloppy and untogether. I guess PS has outgrown me, or I've aged out? I think it's them not me, it's only just about recognisable as the buzzy hip live music mash-up I first started going to almost twenty years ago. Maybe I'll go for 2027 (which by my count will be the 25th anniversary year) - or maybe I'll find something a bit smaller and leaner and odder. I've been eyeing OFF in Poland for a while...
Live music has changed so much recently. I was gently flabbergasted when the Beta Band, a band I used to gently follow around the country, announced a reunion cash-in tour, and sold out the early access, and the general access tickets in about five minutes flat, while I sat there bewildered trying to figure out which Ticketmaster entry-point seemed the least likely to meltdown while I was using it. This being a band I regularly used to pay a fiver on the door to see play in a half-empty student union bar, back when they were hot stuff 🤣.
That appears to be the way things are nowadays. You have to pre-register for the pre-register, and get a code from pre-ordering some merch, and know how to game the electronic queues. And then you have to remember which one of sixteen different apps (all owned by Ticketmaster) your electronic tickets are squirrelled away in them, when you finally arrive at the show dates, up to a year later. That's if you remembered to put in your diary, and didn't just forget about it entirely. Anyway, I'm not too sad about missing the Betas, might as well let some other people have a chance - although I expect I'd have had all the nostalgia, I'm quite sure they'll do the festivals in another year or two after this level of interest. While I was in Ticketmaster, failing to get Beta-d, I saw some St. Vincent tickets for Somerset house in July, on sale before any announcement, so I picked up a couple of those instead.
I have had a bit more success with some of my other pre sales. Upcoming I have
I think that's it so far. I might have forgotten some, as I said. At my age I need to write things down. Like this.
Two things all of these have in common - I had to buy all of them months and months in advance using a 'pre-sale code', and, somehow they're all female artists. I don't think those two facts are related. It's fairly noticeable that women artists really dominate popular music these days. I cynically suspect this might be related to the fact that all the money fell out of it a couple of decades back. The boys have all moved on to being startup founders.
I'm not going to see Oasis, but I will be tracking the integrity of their reunion over at my hand made personal oasis bust-up tracker https://haveoasissplitupyet.lol
My children are going to Glastonbury without me 😬. Most likely, I will watch some of it on the TV.
Oh, I am going to Latitude again, although I'm so spectacularly uninterested in almost all of the musical acts playing, I hardly count it as a gig. I expect I'll be down the front for Alison Moyet though.
Some time on Friday, IMDb announced that they intended to shut down their message board system, permanently. I don't find this to be a particularly surprising decision. I'm more surprised that the message boards are still there, in 2017, seemingly essentially unchanged for the last fifteen or so years. They've had a few coats of paint, and a handful of feature improvements, but they largely seem to be backed by the same system design developed by the in-house tech team, way back at the dawn of the century. And for the bulk of that early development time, I was the primary developer. As it has said on my homepage for many years, 'you can blame me for the message boards'.
A long time ago in a galaxy far, far away
I was incredibly excited to be asked to join the IMDb developer team at the end of 2001. Aged 30, with almost a decade of professional software development under my belt already. Although 2001 sounds today like it was the relative stone-age of the modern web, which of course in many ways it was. At this point I had already spent several years working on basic web applications in the original dot-com boom, and I was in-awe of the IMDb , which even back then was a somewhat venerable internet institution. Founded in 1990, it thus predates the invention of the World Wide Web by several years, having started out as lists of data shared via USENET posts. At the time I joined, they were a couple of years into their Amazon ownership, and starting to expand the team.
As I started, they were just on the cusp of launching IMDbPro and had an ambitious roadmap to completely rebuild the main website from the inside out, using the shiny new technology stack the small development team had built from the ground up to power the IMDbPro application server. This, I thought was a very clever hack - imdb.com was a hugely popular website, and this approach of adding industry focused features to a subscription remix of the site built on top of the same data feeds (still basically formatted text lists, using the conventions of the old USENET based tools) meant that in effect we could use the far smaller user base of the pro site as a test-bed for the new tech, and gradually port sections of this across to the terrifyingly high volume 'consumer' site, without having to do a rewrite and a relaunch. To further sweeten the deal, if you look at this arrangement, this meant that the test-bed users would actually be paying to break in the newer software, and helping you iron out the bugs.
In 2001, a shiny new high performance web stack meant perl . Apache 1.3.x running mod_perl to be more precise. In case you don't know what mod_perl is, it's a piece of semi-deranged brilliance that wraps the perl language interpreter into an apache module as a persistent runtime and exposes the internal API of the HTTP server to it. This lets you write applications that are now effectively themselves apache webservers, with direct access to every part of the HTTP serving lifecycle. Furthermore, by using the other neat hack, Registry.pm you could use modules or scripts that had been designed to work as CGI scripts, and get the some of the same speed boosts, unmodified. With these techniques, you could write perl applications that went almost as fast as Apache could, and in the late 90s/early 00s it was this or PHP. PHP back then was pretty grotty, I thought, and the cool kids were all using perl. Perl had libraries, and excelled at gluing existing bits of UNIX together. This meant you had to write far less of the application by hand. Yup, by hand . Let me dig into that a little bit
It's the pictures that got small
Writing web software back then was a fairly different prospect. In my circles, we didn't really have much in the way of frameworks. There were a few enterpris-ey things floating around that converted your big IBM and Oracle and Microsoft client/server application into some kind of terrible intranet suite that required ActiveX support to load any pages, and I'd poked around with Zope with some interest, but by and large if you were doing anything interesting, you used FreeBSD, or linux (2.2, with SMP support!).
You'd most likely use Apache 1.3, forking, and write your site as a combination of static pages, server side templating and CGI exec-d programs, in some kind of UNIX scripting language (usually perl, but any of the usual suspects were relatively common, including actual honest-to-god shell scripts), or maybe you'd write a performance critical CGI as binary in C.
For data processing, you might connect your application directly to a pre-existing company RDBMs, if you had such a thing and your DBA, if you had such a thing, let you, or you might deploy a SQL db on or nearby to your web host - usually MySQL 3.22 with ISAM and a quasi-religious intolerance for foreign key support but that was OK you could do all the data validation in application code. ( A bit like JavaScript databases in 2017 )
We had libraries for common tasks, like parsing wire protocols and file formats, and wrapping utilities to do things like generate or resize graphics, but you'd stitch a selection of these together in an ad-hoc fashion to make a 'system'. A typical web stack would be table-based HTML with attribute styling and inlined images for typography and spacing , possibly pre-rendered, but maybe dynamically generated, then some CGI scripts for user management full of hand coded cookie and session tracking. A relational database for persistence, using hand coded SQL and a custom database schema. Page generation via a self-written templating system, gluing skeletons of layout-oriented HTML around variable interpolation with inline conditionals. This part would often run as server-side includes, but sometimes this would also have just been handled by CGI scripts.
Maybe you'd have a hand built filesystem cache in front of this. 'Front-end' back then would often build static page representations, first in Photoshop or Illustrator , which would then be converted into single HTML page masters in Dreamweaver or FrontPage and then handed over to the back-end coders to clean up and crack apart into templated fragments, by hand. Single byte string encodings through-out, no threading, a light veneer of Object Orientation over internal data structures - you'd have a small cluster of actual physical servers, perhaps in a data center, but often on-premises, sometimes in racks, sometimes actual tower servers in the corner, directly connected to an internet router of some pitiful capacity. Sometimes your cluster was as small as one machine.
Architecturally you'd have a webserver, perhaps two if you wanted to split 'heavy' dynamic serving from lighter or static content. Your database might end up on its own box with better IO and networking. If you had enough web servers you might put some kind of load balancer in front, perhaps a HTTP reverse proxy as an accelerator cache (often another Apache, sometimes Squid ). In 2001 I'm not sure I fully understood what a CDN even was . You'd deploy with FTP or maybe rsync , sometimes the production filesystems were locally mounted via NFS or SMB and you'd just copy stuff over, or edit it in place. Version control, if you even had any might just be renaming files, perhaps SCCS or RCS. Advanced users might have CVS. Designers might have a pre-OS X Macintosh , suits would use Windows , developers had something more of a free-for-all - windows 2k , desktop linux , I used BeOS for several years whilst that was still a thing, and seemingly everybody , but everybody used emacs to write code - GNU emacs was common, but the cool kids were using XEmacs . Sometimes a remote XEmacs client on your deploy host attached to your local X11 server over the wire . Crazy days.
My God, it's full of stars
So that's the scene in 2001 when I joined the amazon.com family as an SDE , working on the new IMDb platform. I was a fairly hot perl programmer, having spent a good few years designing and rewriting custom web 'frameworks' and optimising mod_perl architectures. I was really good at SQL, at least I thought I was in comparison to most of my peers, and I had developed a particular fondness for the then slightly uncommon PostgreSQL database engine . I'd done quite a few web things - early corporate intranet portals, hobby sites , moderately popular dot-com publishing houses , but this was a step change into an entirely bigger league.
In reality, especially as I look back with hindsight, I can see I had very little idea what I was doing, but hardly anyone did. There wasn't a lot of published material on architecture - everyone read Greenspun , but there was nothing like the modern tech web, scalability porn, conference circuit. No HN , no Reddit , no twitter , no Facebook, and looking things up on StackOverflow was still almost a decade away. It wasn't even that easy to find what scant information there was, you have to remember that Google was barely yet a thing. Information sharing tended to happen on mailing lists, using actual email, or maybe still on USENET. ( Paul Graham hadn't yet written ' A plan for spam ', and we didn't really have functional automated spam filtering).
IMDb had an unusual working setup for the day, as befitted it's birth from a federation of USENET correspondents. Everyone worked completely remotely, scattered around the world. At the time I joined, there was an express preference for staff who could attend a weekly company meeting over lunch, near Bristol ( in a cafeteria, attached to a swimming pool ), and the majority of the tech team building the software was now based around this area. Home Internet connectivity was still largely 56kbps or lower dial-up , possibly metered, although I was lucky enough to be in a part of Bristol eligible for an insanely fast 1Mbps cable connection .
Anticipating having to work on significant amounts of DP, potentially offline, I asked if I could be provided with a small server with SMP and RAID capacity, and was rather surprised by a small tower HP Proliant rig turning up at my house, cocooned onto a loading pallet too big to fit through the front door. I had to unglue it piece by piece and carry it up to my 'home office', a box bedroom full of IKEA tables, slightly too tall to be comfortable desks, and assemble it in place. I christened it mavis.imdb.com, and installed Debian stable on it, which involved most of a day figuring out the hardware RAID drivers, and from that point on it's shrieking fans and disks were a constant part of my daily life for the next half-decade. Eventually a house move allowed me to get it into a makeshift server cupboard where I could deaden this persistent din behind a door and blankets and curtains. I occasionally wonder now, in my middle-age, if I have a frequency gap in my hearing to match that particular pitch, but if so, it's not affected me enough to care to get it measured. As the noise tended to interfere with music, for the first few years I developed a habit of listening to BBC Radio 4 morning to midnight, and therefore, when there wasn't a test match to listen to, for a brief period of my life I developed an unusual degree of expertise in the comings and goings of 'The Archers' .
One consequence of the remote working, and patchy connectivity was that the development work in the tech team was informally silo-ed up into sub-systems that individual engineers had ownership over. The very first task I worked on, after getting a working build of the entire stack onto mavis, was porting the statistics page across to the new web stack (internally known as 'mayhem', after project mayhem , everyone was big on movie references, naturally) by way of familiarising myself with the application and infrastructure. I made a perfunctory stab at that, and then I was searching around for something more substantial to own. The forums, or 'message boards' seemed to be a natural candidate.
The most recent piece of work I'd done at my previous gig , had been to contribute a threaded discussion system to our general purpose content management system, which allowed a tree of conversations to be attached to any content id in the catalogue, so the site users could have a threaded comments section attached to any content. This had worked pretty out well. By contrast, IMDb had a pretty threadbare generic forum system, a standalone phpbb installation, almost entirely isolated from the rest of the system, organised into a few dozen general purpose with I think even a separate login system.
A business goal for the next year was to drive up user registrations, and the forums system seemed like a good feature to assist with this. It offered additional site value that was only viable to registered users. Another target was to integrate the boards system more directly into the movie database, allowing people to have conversations directly attached to the pages for movies and shows. Another important requirement was to allow for a system that would let the data contributors directly communicate with the data management team. So I was tasked to do something with the forums to meet these broad goals, and the implementation and design of it was largely up to me, informed by regular feedback from the wider team onto weekly progress reports and via the team lunch meeting.
We're going to need a bigger boat
I considered a number of approaches.
I could have extended the PHP forum system as was, to support the new features, but I didn't really consider that for more than a couple of minutes - it was PHP, which I didn't know terribly well, and disliked, and would be harder to tightly integrate with the rest of the mayhem app, which was a domain optimised mod_perl web service.
I wondered about wrapping a USENET service, which had a lot of appeal, in as much as a lot of the base mechanics of hierarchy would be already covered, and a highly scalable architecture with a portable standard with several existing back-end implementations. I really liked this idea a lot, but I rejected it eventually when I realised that it would be difficult to build an integrated web front end that offered as much functionality as a stand-alone newsreader. If I had been able to find a decent open-source web NNTP client I might very well have done this.
Another alternative would have been to find an alternative forum system that was more amenable to customisation. I considered using the slash system that powered slashdot.org, but I rejected that because at that time it had a reputation for poor performance and uptime, and was struggling with coping with trolls. I really should have paid more attention to these ideas , both of which would come back to haunt me
eventually using a mixture of naivety, hubris, ego, enthusiasm and pragmatism I decided I'd build something custom, scaling up over the ideas I'd used for the comments module in my previous job.
The basis for that system was something I was quite proud of, and in some senses it was quite a clever hack. We had wanted threaded discussions, but it's famously tricky to model trees in SQL. My first attempt, with hydrating flat lists into trees at runtime from a SQL result set was computationally a little bit expensive for the hardware of the time, and slowed up page rendering in the articles with comments.
So I came up with an ingenious scheme. I'd store several sort fields against the comment records - one representing the vertical position in the thread, and one representing the indentation level, and every time a reply was inserted into a comment thread, I'd compute the correct indent level by adding one to the parent reply, set the vertical position to one larger than the parent, and then update every larger sort sequence to increment it by one; so that they were sequentially stored in thread order when read by that index. As I was storing the timestamp, and a sequential post id, I thereby had a system that could trivially read back conversations by order of time, order of posting or order of reply. This meant that posting was relatively computationally expensive, but only on the database server, whereas reading was simple and fast. I reasoned that reads were many times more frequent than writes, and biasing the system this way would optimise it for the common case, and avoid the need to build a cache invalidation system .
This system actually had worked out pretty well in practice, at least for Accounting Web comments sections. Although it's conceptually neat, it's also actually pretty fucking dumb for a couple of reasons.
this system means that adding comments becomes linearly more expensive as threads grow in size. The more popular a system gets, the work needed to post an individual comment increases in a polynomial fashion
Oops.
I wasn't entirely stupid, I had calculated this downside, and I'd done some scaling calculations on paper to see what the cost of implementing this for the IMDb would be, and here I made my first actually stupid mistake, I used the metrics of the existing forum system to try and predict the capacity of the new one. I can't remember the exact numbers now, and I've long misplaced the notebooks, but it was something lower than a thousand posts a day, and the average thread length was a few dozen posts. Amazon could afford a useful database server, and it seemed like I easily had a couple of orders of magnitude of headroom. Telling myself that premature optimisation was the root of all evil , and conveniently ignoring the fact that this design was literally entirely borne of an optimisation hack, I decided to proceed with this scheme.
Show me all the blueprints
I gave the design a lot of thought. I had been a USENET user back in the glory days before spam and binaries had rendered it toxically uninhabitable. I adored slashdot. I'd used a lot of shitty web forums since then, and I had designed a flexible engine that could handle any kind of post based discussion grouping. I thought this was a great opportunity to design a discussion system that I'd want to use myself. scratch your own itch . I think I already mentioned, I didn't really have much idea what I was getting myself into. Ah, youth.
I thought that most of the grief and spam I'd seen in other systems, was primarily because of the cheapness and disposability of user identity. I figured we could tie that down by disallowing anonymous posts, which was aligned with the goal of increasing user registrations already - maybe ultimately we could link them into amazon.com accounts, and therefore real identities. I wanted to give the users the ability to personalise and curate their site home page, so they'd have an investment in a community they valued, and would be publicly accountable to.
Another thing I'd noted about other forums was how quickly they stagnated into a dominant clique, and deterred new joiners. I decided this was in part because of the permanent record; the conversations got stale because everything had already been said, and the groups then tended to be dominated by handfuls of high-status members with visible post-history. Groupthink dominates, outsiders are shunned, filter-bubbles prevail. I thought that an interesting solution to this would be to actively expire user posts. IMDb already had a system of user reviews for more static user content attached to database entries. The boards were for conversation - so we'd just periodically remove older content, and make no secret about it. This should stop the entropy lock-down, and also give us a mechanism to keep a lid on the database / thread size to help with performance. Everything should stay fresh and sparkling and self-rejuvenate.
I know lots of this was naive thinking and with 2017 hindsight, it's easy to see the flaws. In 2001 though, there was much less experience of online community management. We thought we knew about trolling, because we'd experienced previous communities, but I don't think anyone yet had a handle on the scale and the scope of it in a significantly mass-medium consumer Internet.
I really wanted nested threading, which is a very good, perhaps too good, way to promote reply-oriented posting and reading. For that same reason, I didn't want threading to be the implied default mode, because I thought it promoted point-by-point refutation, which lead to arguments and flame-wars. So I envisioned a system that could seamlessly move between a flat or a nested view, with a cookie to fix it to your individual preference.
Each post would have two actions - a new top level post in the thread, or a reply to the particular post, and the different view options would allow you to see how the thread timeline fitted together from each point of view. I felt this would encourage replies, without mandating them as the only form of discourse. This meant that the organisational system was topic ( either a generic, or a database object ) , consisting of a thread - which was defined by the opening post made by any user at the topic level. This then collected numerous replies, which themselves could have sub-threads of reply.
Mindful of the fact that this was still an era of expensive and slow dial-up and low end computers, I wanted the ability to view in narrow or expanded views. I didn't want to force people to download gigantic pages of browser and modem-choking deeply-nested table layouts, so we would flip between outline and expanded views as well as flat or nested. I wanted people to have a static, but customisable home page that they could add content, style and flair to; hoping to give them a sense of curation and ownership and identity, that should help act as a brake on too much antisocial or negative behaviour. I'm not sure I was even smart enough to wonder if people would use their home page to host offensive content. (Of course, some did).
So I started to build it. Initially it went really well. On the data model and storage engine side of things, I was on a pretty solid footing, it was familiar ground. I carried on using PostgreSQL, and we specified a decent (for the times) server to host it on. No H/A or replication at all. I'm shocked at that idea now, but at the time I had reasoned that we were building an ancillary, purposefully ephemeral side-car discussion system with a different storage layer to the main site, and we'd be fine with regular hot backups - in the case of disaster we could shut them down without affecting the main site, and restore from backup. In the case of total and utter catastrophe, we could just reset them to zero and start again, they weren't designed for permanence anyway. Feedback about the design and features from the rest of the team was positive, with plenty of enhancements and suggested tweaks, and the system started to take shape.
The UX layer was way harder than I'd anticipated, and because of this, I started to get a bit bogged down in the 'second 90%' of the first deliverable. The mayhem engine that the team had built (a really clever piece of software design, that I don't really have time to give justice to here) had never yet really had to cater to highly dynamic pages - it's core purpose was to serve flexible views of an almost read-only statically compiled dataset of movies and people. It was originally built around doing that in a particularly optimised way.
I had to build up my own HTTP POST and form handling layers that would integrate with the existing HTTP handlers, from a somewhat lower starting point than I was used to doing, and this soaked up quite a bit of testing and debugging time. Even worse was the display code. We didn't really have much facility for dynamic page layout in the templating system - which was both highly customised, and complex; the site page templates were used to drive the static build system, via a custom compiler - the markup in the template specified what data views would be generated by the build, which directed the data builders that compiled the binary movie database- the pages were effectively just compiled to a stub handler for a specific route which would seek to the object index in a particular data index, and then basically sprintf the data out port 80 as a hydrated web page. This was a fast way to serve varying pages with identical structure, but not immediately well suited to highly adaptive constantly updated live pages or submission forms. Still, I wanted the boards system in the existing stack as well as I could manage, and so I laboured to build the missing features into my system in a way that could integrate well, which involved at least one complete abandoning and rewriting of the internal API.
The actual boards display templates themselves were a significant time soak. We had a great designer, who took my ugly box tables prototype output, and turned out nice looking blueprint designs for all the various view modes and forms as static web pages. This was of course the era of the browser wars, and we were expected to support a bewildering array of user agents from the Netscape 3.x era onward, inclusive of weird-ass things like AOL clients and MSN web-tv set top boxes and goodness knows what else...
Busting these intricate table-based views apart and back together again into a cryptic markup and logic language, adding the various ( session global ) mode flags such that all the different view combinations rendered as functional pages that degraded gracefully took me weeks . I was slipping past shipping dates and entering a terrible crunch death-march to just try and get something out of the door. Unhelpfully, this was all happening at a time when I was having a few strains in my family life, and also struggling a little bit to balance this into a sensible routine of working from home, I was ping-ponging between getting distracted away from 9-5 and then overcompensating by working across nights and weekends. Eventually we had to pull out features to ship.
I drastically cut back the home page customisation, abandoned all the planned but unstarted work for a search index, and only had time to add the most rudimentary admin features. I had wanted to migrate the existing posts across to the new system, but I'd not even begun to start on that, and that also hit the cutting room floor. With a lot of assistance from the rest of the tech team to get it over the line, we hit publish on the initial TNG boards system some time in the summer of 2002, later than planned by some months. This pattern of the message boards being more work than expected for all parties that touched them would be the prevailing tone for the next several years.
A test designed to provoke an emotional response
User feedback was immediately negative, and highly vocal. Lobbying started instantly for the reinstating of the previous system. People complained about the new designs, the complexity of the new display options, the inevitable launch bugs. I was silly enough to join in the conversation to help explain the launch and solicit feedback, and from that point on I had an onslaught of direct contact messages and emails, occasionally positive and friendly, but more often than not weird and offensive, sometimes abusive. You do try to tell yourself that you can just ignore the trolls, but in truth it is quite difficult to remain completely unaffected by emails that compare you to a child rapist and calling for your death in offensive terms, even if it was only provoked by you breaking a font size in a particular version of Internet Explorer 3 . You never quite get used to that, I find. I was pretty crestfallen with all the negativity after all that work, although the team were positive and assured me that some of the board users could be like that, and that in general people are more vocal when they're complaining, and are naturally somewhat resistant to change. I still felt pretty down.
My mood did soon change after a few weeks. The new boards were kind of a hit. Maybe a smash hit . They quickly overshot my scribbled calculations of scale in a slightly worrying manner. With some judicious database tuning, the performance stayed OK though. For now. Then we added links from every title page (IMDb pages were sub-grouped into title pages, for tv shows and movies identified by a key called a tconst which looked like tt1234567 and name pages, for people, robots, animals etc. from cast and crew which were identified by a key called an nconst which look like nm1234567 ; top level boards un-linked to other database objects therefore got a new key type called a bdconst , somewhat inconsistently, these looked like bd1234567 and didn't matter very much because there was only ever a few dozen standalone boards) and the numbers started to properly hockey stick .
At the time we used to compute the page views in a weekly report which broke out the top N subsections according to first level directory. We never shared traffic numbers publicly, and so even after all this time I will be respectfully coy, but the highest chart topping positions were obviously things like /title, /name, /search /news /chart etc. At launch, the boards were lurking down the bottom, nowhere to be seen, but after we started the title conversations they were solidly into the top five, where they remained with ever-accumulating numbers, and user registrations clocking up correspondingly.
From that point on, I spent a significant amount of my waking life 'doing the boards' for the next several years. Initially I was scrambling to put in the missing features we'd pulled before launch - post editing, markup for posts and then profiles in a hand-rolled version of BBCode ; again with a stupid insistence on display time optimisation, I converted this to HTML at write time, which meant that when we added post editing, I had to backward parse the HTML back into bbcode to be re-edited, all with a misconceived series of chained regular expressions . This lead to an endless sea of parse bugs that pretty much guaranteed that the markup and emoji (although they weren't called that yet, we called them 'smileys') set would be once fixed effectively sealed forever, even though I'd taken the trouble to add an admin edit tool, that allowed for updates to markup to be made by non-developers through the CMS API.
I'd thrown together a naïve search API, entirely based on un-indexed SQL substringing, which I'd fully intended to replace after launch. It never worked, and the system filled up so quickly that it killed the page cache entirely by constantly table scanning the texts, so much so that I spiked it in the first week, and never got a chance to work on it's planned replacement. I was still getting emails complaining about that five years later after I'd left.
With the surging popularity, came increasing amounts of negative user behaviour, and I had to increasingly devote development time to adding abuse processing tools for our small moderation team, onto what had only ever been an afterthought of an admin system. We never proceeded to link up the user accounts to amazon accounts, and I'd never planned to add user-driven moderation. My quixotic hopes for user killfiles (renamed to 'ignore lists', which is a far better and kinder name), global killfiles (known as the 'Phantom Zone ', because I love Superman ) with account history purging and deletion weren't enough on their own, and the tooling for processing abuse reports were too clunky and slow, largely because I hadn't planned enough for them from the offset.
I was now fighting a constant war on two fronts. With the popularity of the system way beyond my original estimate of a few thousand posts a day. We quickly escalated to a point where the really popular off-topic boards were ersatz real-time chatrooms, accepting hundreds of posts a second at peak-times. All of this in a cursor-pooled synchronously blocking database directly attached to the HTTP display servers. I spent a great deal of my work time just constantly rewriting sections of it all to squeeze efficiency out of this setup. First with indexes and schema changes, then with hardware upgrades and tuned and profiled system software, then with a complete rewrite of all of the database logic to use stored procedures, and finally a long overdue table sharding so we could cluster boards between different tables and tablespaces to balance the IO and garbage loads. At the same time on the other front we were trying to come up with ways to lower the proportionally increasing cost of trolling and abuse.
My partner was temporarily stationed away in London by this point, so I was home alone, aside from the dog . Workdays at this point quite often consisted of walking 12 paces from the bedroom, still brushing my teeth at about 09:30, getting a support email, starting to poke at something interesting with the boards, and then not giving up until the small hours of the next morning. I was fairly obsessed with all of it, and my health was suffering, although I was too close to all of it to properly see this at the time. I developed a weird collection of neurological symptoms which stubbornly refused diagnosis, and subsequently appear to have been entirely stress-induced.
We still were choking out at peak load times, and it was starting to have a knock-on effect to the rest of the site. Eventually, a super-talented colleague helped me out by implementing a workable version of my poorly articulated designs for a caching database proxy; implemented seemingly overnight by him in C, it spoke postgresql wire protocol and cached result sets in a filesystem that we mounted on ramdisk. Kind of a home-brewed combination of memcached and pgbouncer . The simplicity and effectiveness of this just took my breath away, just as much the lesson that if a software thing doesn't exist, you can just make it yourself. Everything is just ones and zeroes, as I am very fond of saying to this day.
With this addition we got to a place where the system was in enough of a steady state. We implemented more banning and reporting, added a reputation score based system that slowed the rate of posting for users with lower reputation scores, which also helped reduce the saturation write loads at peak. Eventually we added an automatic moderation robot with a learning capacity and pluggable rulesets. I called him Spike . He worked fairly well, if a little bluntly at some times.
I hope I'm not giving out the impression here that it was all entirely negative. It was definitely a rollercoaster few years. Exhilarating, and also very entertaining. The boards were a living thing that had sprung out of nowhere, literally something I'd created in my spare bedroom. It sort of felt like a Pacific-Ocean sized colony of sea-monkeys eternally fizzing away with unexpected activity right there in my spare room.
Although they were often frustrating, the users were also inspiring, and creative, and surprising, and occasionally pretty funny, even some of the (gentler) trolls. On top of an understandable level of frustration and annoyance, I generally found I felt a sense of sympathy for them, and their complaints and frustrations with the system. All of this was before the age of 'social media', and I could almost feel the shape of it hanging there, slightly beyond where we were heading, off-piste and in a direction we probably shouldn't venture into.
A consistent surprise was the amount of effort people put into curating their limited patch of profile space, and how social and to us off-topic, it all was. We were constantly running into people trying to use the boards for personal social spaces - I argued for providing individual personal boards for every user at one point, but the management team explained that we weren't really in the core business of general social networking. It confused me at the time, and I had to think about it for a while, but I think that was correct thinking, and there's a lot of wisdom there. You simply can't do all possible things well. With a small team, and a big world, you benefit most from focusing entirely on the things you're best at and the things you want to be better at.
A few of the sillier trolls stand out. There was one early griefer , who we very easily IP traced to a school library, I think based in Canada. We waited until he was in mid-session one afternoon, and then if I recall correctly, management called his head teacher, who was then able to apprehend them in the act. There was another, very silly catfish troller called tabitha_cyeg , with an obviously manufactured identity. Their M.O. was posting bizarre conspiracy theories about the site technology, and myself, during which they'd claimed to have hacked into using l33t -sounding but completely irrelevant NetBIOS vulnerabilities replete with faked server logs, and on one occasion 'hacked' emails from myself revealing my true name to be something along the lines of 'Claude M. Savoire'. Quite a few users were seemingly entirely convinced, but to me it was pythonesque.
Getting contacted by the Feds to deal with users who'd been posting death threats about President Bush was weird , at least it was the first time, and I got a few PMs and emails from actual industry figures, which was always quite exciting. I personally banned a moderately famous Hollywood producer this one time, for abusive posting, which is something of a curiosity. I remember going to watch Jay and Silent Bob Strike Back at the cinema around this time frame, and getting a particular kick from the sub-plot where they individually visit all the internet forum posters who have been rude about their previous films.
I watched people fight and friend. Saw a few romances and a marriage or two emerge from the regulars. I read, and occasionally got involved, against my better judgement, in fascinating and productive conversations. I still bump into people IN REAL LIFE who reminisce about the boards and are to this day impressed with me when I tell them I had a big hand in their genesis. I once spent an evening in a darkened restaurant patio overwhelmed to tears as a kind man explained to me his young daughter, hospital-bound and dying of cancer, had used the Harry Potter IMDb boards as her main social life in her last year, and how much that had meant to him and her. Stories like that are just a profound privilege to have had even the most tangential involvement in.
And I learned so much. Working with such a smart team, on such a great and special piece of the internet. Learning about every aspect of scaling a web stack from the disk blocks up to the network and back down again. This era was still 32-bit Intel hardware, and I learned a huge amount about that, and UNIX profiling , and the linux virtual memory system and file system , and HTTP caching . I made so many mistakes, because there just wasn't any other way for me to learn, and I did figure out how to fix or improve on many of them.
I learned about PostgreSQL internals from the wire protocols all the way down to the storage models in some detail, and to this day I'm a pretty great PostgreSQL DBA , when I need to be. I learned a lot about UX influence and steering behaviour, albeit by mostly getting it wrong. I learned about building search engines, and service orientated architecture, and why you really shouldn't hang responsive systems off of blocking I/O, and maybe message queues are useful. I learned how to measure system performance all the way down to the CPU cache level . I learned how to keep focused on problems I didn't yet know how to solve, or perhaps didn't yet understand. I learned lots and lots of things about movies and cinema history, much of it just by osmosis poring over the data sources. I learned how to better manage my own time and projects, and I learned what it feels like to burn out , and what you should do about it when you know that you are. Since I left Amazon.com, I've had a great and varied career , and I think at least 75% of the useful things I know how to do well I learned first-hand on that gig, and I've always treasured, and respected that.
Always. Be. Closing.
And now they're shutting the boards down. I first heard about it via text message, oddly enough; but shortly after that it was all over my news feeds followed by a slow stream of emails, checking in. Friends, ex-colleagues, some of them from former boards users. I felt an odd sense of shock about it, in a way, and slightly emotional. Sixteen years is a ridiculously long time in Internet years. The web itself wasn't sixteen years old when I joined Amazon, and nor was the even older still IMDb. I don't use the boards myself any more, although I do occasionally look over them, perhaps once or twice a year. It's been clear for a while that they're not getting a fraction of the use that they once did, and that's fair. The web is a different thing in 2017 entirely, and that's also a good thing.
Communications technology evolves, and hopefully improves all the time. People have all kinds of social networking now for communicating, and the bulk of this is happening on different, smaller screens than anything I could have envisioned when I was first sketching out some pencil ideas in a gridded notebook. An actual Filofax I believe. It was very humbling to see the amount of twitter traffic noting the IMDb announcement, as well as the number of actual proper news sites that wrote this up as something significant. The Verge report seemed to think the IMDb message boards were era-defining. That's something, I guess. All things must pass.
There's just one more thing that's bothering me
' Mjeyds '. On the imdb board bbcode syntax, there's a particular smiley that you markup using this bizarre word. People occasionally ask what the term means, and I've always enjoyed the mystery, being one of very few people in the world to have any claim to know the answer. I guess it's now or never for the reveal.
The emoticon set was curated, uploaded and configured by my erstwhile designer colleague. He took responsibility for naming them. He wasn't English, hailing from Denmark, I believe via several other countries. When I pressed him for an explanation of 'mjeyds', he said it was supposed to be an onomatopoeic of the way the late Graham Chapman said a languorous 'yes' whilst sucking on a pipe in a scene from Monty Python's the meaning of life . If it is, I guess it works better if you're using a Danish alphabet? If you've got all this way through this post only to find out the answer to that question, then I am sorry if it is an anticlimax, but thank you for reading. Maybe some things are better left mysterious. Another lesson learned.
Crazy Credits
this is a personal web page, and an entirely personal and subjective retelling of my own experience building and maintaining a small section of IMDb.com a long time ago. Whilst I'm happy to take personal responsibility for a large amount of the boards creation and inspiration, I don't want people to get the impression that this was in any way a solo effort. All of the work outlined above was produced in the context of a small dedicated team, and although I've refrained from naming names, and attributing ideas elsewhere this is borne more from a desire not to miss anyone out - after this amount of time there's simply no way I can credit individuals for parts I can remember without failing to attribute others for equally important contributions I have forgotten. I've done my best to be honest about facts and timelines, and tried not to infer too much about third party motivations, but I know I've forgotten things and misremembered others. Working from memory, after this amount of time, such errors are only human. If you spot anything terribly wrong, or have any questions or corrections, please get in touch. I'd like to thank the entirety of the IMDb team 2001-2005 for working with me on all the aformentioned things, and more. Great team, great times