Another hiccup this time - a bunch of intermittent 500 errors on the site.
Basically, the problem was that php-fpm was configured to let its worker processes run until they hit some large maximum number of requests, and also to keep a very large number of those processes alive to serve requests.
So, when demand spiked and a bunch of pages started getting loaded, php-fpm tried to keep up by launching more processes. Problem is, each page load (request, technically) has a timeout (due date), and if it’s late the server goes “oh well, that’s an error” and gives up - otherwise it could be exploited by a malicious client making it keep trying to serve the page forever.
So what happened is that php-fpm wanted to catch up, and launched so many processes that they ended up competing with each other for the server's resources, and each individual one had trouble making that due date - hence the site being laggy or down.
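To put rough numbers on why more processes can mean more timeouts, here's a toy model in Python. It assumes the busy workers share the CPU cores evenly and that every request needs the same amount of CPU time. The core count, CPU time per page and timeout are made-up figures for illustration, not measurements from this server, and the function names are my own.

```python
def wall_time(cpu_seconds, workers, cores):
    """Wall-clock time per request when `workers` busy processes share `cores` evenly."""
    # With more busy workers than cores, each worker only gets a slice of a core.
    slowdown = max(1.0, workers / cores)
    return cpu_seconds * slowdown


def meets_deadline(cpu_seconds, workers, cores, timeout):
    """True if a request still finishes before the timeout (the due date)."""
    return wall_time(cpu_seconds, workers, cores) <= timeout


# Hypothetical figures: 4 cores, 0.5 s of CPU per page, 30 s timeout.
CORES, CPU_SECONDS, TIMEOUT = 4, 0.5, 30.0

for workers in (4, 40, 200, 400):
    seconds = wall_time(CPU_SECONDS, workers, CORES)
    ok = meets_deadline(CPU_SECONDS, workers, CORES, TIMEOUT)
    print(f"{workers:>4} workers: {seconds:6.1f} s per request ({'ok' if ok else 'timeout'})")
```

In this simplified picture, piling on workers past the number of cores doesn't serve pages any faster; it just stretches every request until all of them miss the deadline together. It ignores memory pressure and swapping, which would only make things worse.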
Fixing it was a matter of keeping fewer processes around, and forcing them to relaunch after a fixed number of requests so that garbage piling up in their memory would get thrown out every so often.
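For anyone wanting to do something similar, here's a rough Python sketch of how one might pick the numbers. It is not what I actually ran: the RAM budget, per-worker memory, the `max_requests` default of 500 and the quarter/half ratios for spare servers are all assumptions for illustration. The directive names (`pm.max_children`, `pm.max_requests` and friends) are the standard php-fpm pool settings.

```python
def max_children(ram_for_php_mb, avg_worker_mb):
    """Largest worker count that fits in the RAM set aside for PHP."""
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    return max(1, ram_for_php_mb // avg_worker_mb)


def pool_settings(ram_for_php_mb, avg_worker_mb, max_requests=500):
    """Suggest php-fpm pool directives; the ratios below are arbitrary rules of thumb."""
    children = max_children(ram_for_php_mb, avg_worker_mb)
    spare_min = max(1, children // 4)
    spare_max = max(spare_min, children // 2)
    return {
        "pm": "dynamic",
        "pm.max_children": children,        # hard cap on worker processes
        "pm.start_servers": spare_min,      # workers launched at startup
        "pm.min_spare_servers": spare_min,  # idle workers kept around
        "pm.max_spare_servers": spare_max,  # idle workers above this get stopped
        "pm.max_requests": max_requests,    # relaunch a worker after this many requests
    }


# Hypothetical server: 1024 MB reserved for PHP, about 64 MB per worker.
for key, value in pool_settings(1024, 64).items():
    print(f"{key} = {value}")
```

The point of capping `pm.max_children` by memory is that php-fpm can never launch more workers than the box can actually hold, and `pm.max_requests` is what forces the periodic relaunch.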
For good measure, I lowered MariaDB’s connection limit since it’s firewalled off from everything but WordPress, and shouldn’t need that many (especially as it should be connecting over a Unix socket).