RFC: IRC Server

Hey. Yewnyx here. I admin the Parahumans servers, meaning the Ward Wordpress as well as the Parahumans IRC server. This note is about the Parahumans IRC server; Ward has remained more or less smooth since it was first set up.

So. As some of you have noticed (particularly WeaverDice players), there have been issues connecting through IRCCloud.

The reason for this is that there is a session limit per IP connecting to the server. This helps us avoid spammers. Obviously, IRCCloud should be exempted from this session limit: they a) identify their users uniquely to us (so moderation actions can stick) and have their own antispam measures, and b) are a popular way to stay connected remotely for a great many users.

The problem we’re having is that for whatever reason, Anope, which provides IRC Services such as OperServ, ChanServ, and NickServ, is on the fritz and I don’t know why. It forgets the exemptions every so often for no reason I can discern. I add the exemptions, they work, and a couple of hours later, poof. Anope notices, and kills the extra sessions as it thinks it is supposed to do.
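
To make the failure mode concrete, here is a minimal sketch of a per-IP session limit with exemptions. It is a toy model, not Anope's actual implementation: the class name, the limits, and the addresses (taken from the reserved documentation ranges, not IRCCloud's real ones) are all assumptions for illustration.

```python
import ipaddress
from collections import Counter


class SessionLimiter:
    """Toy model of a per-IP session limit with exemptions (not Anope's code)."""

    def __init__(self, default_limit):
        self.default_limit = default_limit
        self.exemptions = {}       # network -> raised session limit
        self.sessions = Counter()  # IP string -> number of open sessions

    def add_exemption(self, cidr, limit):
        """Give every address inside `cidr` a higher session limit."""
        self.exemptions[ipaddress.ip_network(cidr)] = limit

    def limit_for(self, ip):
        """Return the limit that applies to `ip`, honoring any exemption."""
        addr = ipaddress.ip_address(ip)
        for network, limit in self.exemptions.items():
            if addr in network:
                return limit
        return self.default_limit

    def connect(self, ip):
        """Return True if a new session is allowed, False if it would be killed."""
        if self.sessions[ip] >= self.limit_for(ip):
            return False
        self.sessions[ip] += 1
        return True


# Hypothetical usage: 203.0.113.0/24 stands in for a shared gateway.
limiter = SessionLimiter(default_limit=2)
limiter.add_exemption("203.0.113.0/24", 100)
allowed = [limiter.connect("203.0.113.5") for _ in range(50)]

# If the exemptions are forgotten, the same address is suddenly far over
# the default limit and any further session is refused.
limiter.exemptions.clear()
refused = not limiter.connect("203.0.113.5")
```

In this model, losing the exemption table is enough to turn a well-behaved shared gateway into something that looks like a flood of sessions from one IP, which matches the symptom described above.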

Restarting Anope might fix this temporarily (and whatever corrupt state its databases turn out to be in, we would scramble to fix afterwards).

However, at this point, I’m a bit burnt out. I’ve been walking on eggshells for a couple of years: not in terms of what I say, but in terms of being very, very conservative in what changes I make to the IRC server. The most recent troubles hit me right as I was sick in Tokyo, then flew back to the west coast, and was sick and jet lagged there. It caused a lot of stress, and it’s an experience I’m not keen on repeating.

I’m not going to be happy if I continue to maintain the IRC server as such. It’s been pretty stable: we can count the number of times we’ve put you all through a server reboot on one hand. But going forward, we have 3 options:

  1. Train up some current trusted people in how to run all aspects of an IRC server,
  2. Put out feelers for someone outside the server to take it on (and build up trust), or
  3. Migrate to a system which is more stable and easier to admin (heavily implies Discord).

Personally, I heavily favor #3, as Discord:

  • Is easier to write bots for
    • We could script WeaverDice channel creation
  • Makes moderation less janky and messy
  • Allows channels to be bundled together in categories
  • Makes channels more discoverable
  • Integrates well with Patreon (however this is pretty much off the table where a Parahumans discord is concerned)

Additionally, any transition to a different IRC setup is going to be messy and difficult, no matter how smooth we try to make the switch for you.

Migrating to Discord isn’t without risk, however. As intimidating as IRC is for some people, and as accessible as Discord is, IRC’s barrier to entry is, from a certain perspective, a blessing in disguise. The IRC is more often than not a pleasant, intimate area where like-minded fans can chat together, and where many names are mutually recognizable. With increased accessibility, the community may expand much faster than our ability to moderate it or keep track of discussion; an “Eternal September” scenario is not difficult to imagine.

I’m interested in people’s feedback on this - I’m about at the end of my rope on supporting the IRC server, but the last thing I want to do is leave anyone in the lurch.

–Yewnyx

Addendum

It’s been pointed out to me that I ought to explicitly include some relevant motivating context for this. As I pointed out, I was involved in some rather difficult server administration last week, and in the midst of addressing the IRCCloud connection difficulties, also had to reconfigure the server to remove certain permissions and issue moderator actions. Previously, I’ve accidentally kick-banned many tens of blameless users due to messing up the different forms of user/host specifications in InspIRCd and Anope, so I was extra careful. This meant that, while sick and jet lagged, I had to work very cautiously. I ended up passing out for several hours between starting and finishing putting these measures in place.

In more principled (rather than anecdotal) terms, I believe the problematic channel itself owed its continued existence to a general lack of discoverability of channels and lack of logging: there was no review for it for a very long time. While personal responsibility (and not living up to it) was a large factor here, the system was also structured in a way that enabled it. In principle I have nothing against private channels, but there should be a reviewable standard of behavior that goes along with that privilege, in my opinion. I favor a system in which on-server conduct is by default reviewable by admins, and that is a notable difference in Discord vs. IRC.

[Status] Rebooting the server

The server stopped responding for some reason. I’m rebooting and looking into what happened. Sorry for the downtime.

Update 1:

Okay. So I think that EFS is causing massive delays in WordPress presenting a page, and traefik gives up. On the off chance that it doesn’t, a page will show, which explains why some entries showed WordPress getting hit, but for the most part erroring. This would also explain why htop shows a bunch of apache processes in uninterruptible sleep (blocking on IO), and why the server needed a hard reset before it could be accessed once the issue was seen (everything blocking on NFS IO, computer blew up; a common scenario).
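
For anyone wanting to check for the same symptom without htop, here is a small sketch that picks out processes in uninterruptible sleep (state `D`) from the output of `ps -eo pid,stat,comm`. The function name and the sample output are made up for illustration; the sample is not a capture from this server.

```python
def find_uninterruptible(ps_output):
    """Return (pid, command) pairs for processes whose state starts with 'D'.

    Expects text in the three-column layout of `ps -eo pid,stat,comm`.
    """
    stuck = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore malformed or empty lines
        pid, stat, command = fields
        if stat.startswith("D"):  # 'D' = uninterruptible sleep, usually IO
            stuck.append((int(pid), command.strip()))
    return stuck


# Made-up sample output, for illustration only.
SAMPLE = """\
  PID STAT COMMAND
    1 Ss   systemd
  812 D    apache2
  813 D    apache2
  901 S+   htop
"""

stuck_processes = find_uninterruptible(SAMPLE)
```

A pile of web server workers showing up in that list at once is a strong hint that they are all waiting on the same slow or hung filesystem.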

Dammit, EFS.

Update 2:

Yep. Moved that 💩 back into a docker volume and it’s back to its speedy self.

Favicons!

I’ve added favicons. Many thanks to /u/Aurnyx for the outstanding base design, and Nick (on IRC) for vectoring and exporting them in various sizes! The icon reads surprisingly well even at 32x32, so I’ve used the full design at every size down to that one, except for 16x16, where the symbol reads best as the dot and circle.

Note that this may not show up in many browsers until their caches expire and they try to load the icons again.

Here’s a small sampling of the icon at different sizes: 16x16, 32x32, 310x150, and 310x310.

[Status] Rebooting the Server

I’m rebooting the server to downsize the instance. Last time, despite the termination protection, this deleted the instance anyways. Accordingly, I’ve backed up WordPress as well as the container volume itself.

Cloudflare should handle keeping the caches alive while the server reboots.

EDIT: This has completed.

Integrating with Wordpress.com

I’m planning on installing Jetpack this weekend, which should bring back some of the integration that people who like to subscribe on wordpress.com are familiar with. I want to be ready to fix things in case stuff goes haywire, hence the delay: I have a day job. I’ll also need to plan to take some of Wildbow’s time for this, since he’s needed to turn the key, as it were.

[Status] Bringing up Kubernetes Cluster

I’m bringing up the Kubernetes cluster in the same VPC as the current instance so I can start poking at it. No downtime or interruption is expected at this time, but noting it here just in case.

EDIT: Okay, kops seems to have brought it up fine. Now for the tedious configuring.

EDIT 2: Actually, Kubernetes is fun but I can get by with something simpler for now, as long as I document it.

[Status] Changing Database Size

I’m downsizing our database instance since I over-provisioned it for launch, and things are quiet and otherwise going well. Some possible minor bumpiness is expected.

EDIT: This has completed. Looks to be all clear.

Addressing Readability

I’m deleting the theme switcher. Previously, the theme switcher would set a cookie to modify which theme WordPress would serve you. Unfortunately, Cloudflare doesn’t cache on cookies; it caches on URL (with the query string). So…this caused some caching issues: whichever theme a page was first cached with is the version Cloudflare would keep serving for that URL, regardless of which theme your cookie asked for.
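
The caching problem can be modeled in a few lines. This is a toy sketch of a cache keyed on URL only, not Cloudflare's actual behavior in detail; the cookie name `theme`, the theme names, and the URLs are assumptions made up for the example.

```python
class UrlKeyedCache:
    """Toy edge cache that keys on URL only and ignores cookies."""

    def __init__(self, origin):
        self.origin = origin  # callable(url, cookies) -> page body
        self.store = {}       # url -> cached page body

    def get(self, url, cookies):
        if url not in self.store:
            # Cache miss: the origin sees the cookie and picks a theme.
            self.store[url] = self.origin(url, cookies)
        # Cache hit: the cookie is never consulted.
        return self.store[url]


def origin(url, cookies):
    """Stand-in for the site: renders with the theme named in the cookie."""
    theme = cookies.get("theme", "dark")  # hypothetical cookie and default
    return f"{url} rendered with {theme} theme"


cache = UrlKeyedCache(origin)
first = cache.get("/chapter-1", {"theme": "light"})  # fills the cache
second = cache.get("/chapter-1", {})                 # wants the default theme
```

Here `second` comes back with the light theme even though that visitor never asked for it, because the first visitor's version is what got stored under the URL.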

Unfortunately, dealing with Wordpress on theme switching has proved an unforeseen headache while I try to focus on other aspects of the site, so I’m removing the alternate theme and theme switcher, and recommending the bookmarklet or browser extension approach. They work well and don’t impose a potential maintenance overhead.

Helpfully, m1el offered up a solution in their reddit post here. This bookmarklet works reasonably well; to use it, drag this link to your bookmarks bar and click it when you want to whiten the page: Make it Readable

Thanks for bearing with me while I iterate on this stuff!

–Yewnyx