It's been one year since ThisWebsiteWillSelfDestruct.com launched. I wanted to take a moment to write up some thoughts on running the site for the past year.
I made the site for Ludum Dare, a game jam. The theme was "Keep it Alive." I made the first version over four hours before bed. I posted it, went to bed, and wondered if it would last until the end of the Ludum Dare judging period.
When I woke up, there had been around 20 posts, and I showed it to my wife. She thought it was neat, and we read through the posts. We were kind of shocked how thoughtfully people were posting to the site, and we loved reading through it. I posted a few times, myself.
While I was out doing errands, I noticed that there were even more people posting. I think it got up to 100 posts, and I was honestly a little surprised that people were being so open and honest in their messages.
Then I didn't think about it until my phone started getting downtime messages. When I looked into it, I found the site was getting a lot of traffic, and it was coming from a popular Reddit post.
Without that post, I don't think the website would be alive today. It showed off the site to tons of people, and it forced me to take the site's architecture, content, and moderation policies far more seriously.
The site was built with Rails and Vue.js. It is essentially just a web form with a couple of APIs. There were three things most people could do on the original site:
There's no way to read a particular message by design. This was unintentionally useful in moderation, as I'll discuss later.
One of the problems was the way I'd implemented fetching the time the last message was written. I just had an API that the website automatically requested every 5 seconds to update the time at the top of the page. This meant you'd very quickly notice when someone new posted and the website's life jumped back up to 86,400 seconds (24 hours). This was useful for debugging, and was very easy to implement. Since the whole thing, from backend and database to design, was built in one evening, speed of implementation was key to getting it done. And since I expected at most hundreds of people to view the site over its life, I wasn't too worried about architecture.
But at its peak, there were well over 20,000 people concurrently on the site. That means that every second, there were 4,000 requests just to get the time the last message was written. That's a lot of requests, and each one was adding a database hit.
To get this fixed quickly, I just upgraded the database server, and increased the time between requests from 5 seconds to 60 seconds.
There would be further improvements to this, such as caching the information in Redis, but I'll save that for another day.
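To put rough numbers on the polling change described above, here is a small back-of-the-envelope calculation. It assumes every open page polls exactly once per interval and that the 20,000 concurrent visitors figure holds steady; the function name is just for illustration.

```python
def requests_per_second(concurrent_visitors: int, poll_interval_seconds: float) -> float:
    """Average request rate if every open page polls once per interval."""
    return concurrent_visitors / poll_interval_seconds


# Figures from the post: 20,000 concurrent visitors, polling every 5 s vs. every 60 s.
before = requests_per_second(20_000, 5)
after = requests_per_second(20_000, 60)

print(f"every 5 s:  {before:,.0f} requests/second")
print(f"every 60 s: {after:,.0f} requests/second")
```

Under those assumptions, moving from a 5 second to a 60 second interval cuts the load on that one endpoint by a factor of twelve, from 4,000 to roughly 333 requests per second.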
With Reddit came a moderation headache. I'd prepared for some level of moderation, and a very quickly thrown-together backend gave me the ability to hide comments from view, but that just wasn't enough.
For about 18 hours, overnight, following the Reddit post, I worked on keeping the content on the site clean, kind, and thoughtful.
Some people quickly figured out there was no rate limiting on the API to post a message, and started mass-posting hate speech. I added a new step to the posting API: it incremented a per-IP counter in a Redis cache, tracking how many posts that IP had made in the past X minutes. If the count was more than a fairly low number, the API would silently not post the message. The user was not warned that their post didn't go through. And since there's no API to view a particular post, there's no obvious way to check if your mass-posting worked or not...
Unless you're also spamming the message request API. Which started to happen, so I added rate limiting through Redis across the board. It was basically copying and pasting the same half-dozen lines of code to all three APIs, slightly tweaking their limits. This whole process took about an hour, during which I was also doing manual mass removals of posts through SQL queries.
I'd just see what a spam poster was posting, search for it, and delete it directly out of the database. This was all very real-time and went on throughout the night.
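The silent, per-IP rate limiting described above could look roughly like the sketch below. This is not the site's actual Rails code. It is a fixed-window counter written in Python, and `FakeRedis` is a tiny in-memory stand-in for the two Redis commands it relies on (INCR and EXPIRE) so the example runs without a server. The limit of 3 posts per 600 seconds, the key format, and the function names are all made up for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class FakeRedis:
    """In-memory stand-in for the two Redis commands used here: INCR and EXPIRE."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (count, expires_at)

    def incr(self, key: str) -> int:
        count, expires_at = self._data.get(key, (0, float("inf")))
        if self._clock() >= expires_at:  # key has expired, so start a fresh counter
            count, expires_at = 0, float("inf")
        count += 1
        self._data[key] = (count, expires_at)
        return count

    def expire(self, key: str, seconds: int) -> None:
        if key in self._data:
            count, _ = self._data[key]
            self._data[key] = (count, self._clock() + seconds)


def allow_post(store: FakeRedis, ip: str, limit: int = 3, window_seconds: int = 600) -> bool:
    """Count this request against the IP and report whether it is under the limit."""
    key = f"posts:{ip}"  # hypothetical key format
    count = store.incr(key)
    if count == 1:
        # First request in this window: start the clock on the counter.
        store.expire(key, window_seconds)
    return count <= limit


def submit_message(store: FakeRedis, ip: str, text: str, saved: List[str]) -> dict:
    """Save the message only if the IP is under its limit; always answer the same way."""
    if allow_post(store, ip):
        saved.append(text)
    return {"status": "ok"}  # identical response, so a spammer cannot tell it was dropped
```

The key design choice is that `submit_message` returns the same response whether or not the message was saved, which matches the "silently not post" behavior. The same counter can be reused for other endpoints by changing the key prefix and limits.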
While I was working on the spam problem, I was simultaneously working on content filtering. Some content, I realized, I did not want to post online. I was not interested in creating a place for hate speech to thrive.
I built a quick search tool, and made some modifications to the database so that I could do faster full text searching on what was quickly becoming a huge database of letters. (The database, at its height, had well over 2 million posts, thanks to the rampant spam before my spam filtering. At this point, I'd just been toggling off the spam but keeping it in the database so that I could reference it when writing rules about what should and shouldn't be filtered.)
The process worked like this: I would manually read every post that came into the site, and when I saw one that wasn't appropriate, I would hide it, and attempt to write a simple filter to block posts like it. This involved finding all the various misspellings of offensive words used in an attempt to bypass a filter. Then I'd search for those strings, and see if blocking them would have removed legitimate posts. If it would, I'd work at massaging my filter until it would mostly only filter out bad posts.
Then I'd update the code to add the filter, and move on.
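The write-a-filter, check-it-against-real-posts loop described above can be sketched in a few lines. This is an illustration under assumptions, not the site's filter: the `LOOKALIKES` table, the helper names, and the mild placeholder word "heck" are all hypothetical. The idea is to build a pattern that tolerates common character swaps and separators, then list which known-good posts it would have blocked before shipping it.

```python
import re
from typing import Iterable, List

# Hypothetical table of character swaps people use to dodge a word filter.
LOOKALIKES = {"a": "a@4", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}


def build_pattern(word: str, whole_word: bool = True) -> "re.Pattern[str]":
    """Build a regex matching `word` despite look-alike characters and separators."""
    parts = []
    for ch in word.lower():
        chars = LOOKALIKES.get(ch, ch)
        parts.append("[" + re.escape(chars) + "]+")  # also tolerates repeated letters
    body = r"[\W_]*".join(parts)  # allow dots, spaces, underscores between letters
    if whole_word:
        # Refuse matches that are embedded inside a longer word.
        body = r"(?<![a-z0-9])" + body + r"(?![a-z0-9])"
    return re.compile(body, re.IGNORECASE)


def false_positives(pattern: "re.Pattern[str]", known_good: Iterable[str]) -> List[str]:
    """Return the legitimate posts this pattern would have blocked."""
    return [post for post in known_good if pattern.search(post)]


good_posts = ["Please check on your friends today.", "I miss my grandmother."]

loose = build_pattern("heck", whole_word=False)
strict = build_pattern("heck", whole_word=True)

print(false_positives(loose, good_posts))   # the loose pattern trips on "check"
print(false_positives(strict, good_posts))  # the tightened pattern leaves both posts alone
```

The loose pattern catches "heck" inside "check", which is exactly the kind of legitimate post a first draft of a filter can remove. Tightening it to whole words clears the false positive while still matching variants such as "h3ck" or "H.E.C.K". Allowing separators between letters can still cause matches across word boundaries, so checking against a sample of real posts remains the important step.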
After doing this process for about a day, the number of problematic posts was vanishingly small. It's almost certain some positive posts are also filtered out in the process, but I've tried my best to limit that when possible. Fortunately, it's impossible for anyone to know if their post has been filtered, so bad actors don't have a quick way to figure out how to work around the filter, and well-intentioned and incorrectly filtered users don't have a bad experience when they post their message.
One side note that I have to mention here: I had to write a separate filter for long messages because a lot of people posted the full script of the Bee Movie. It was everywhere. Eventually, I would also filter non-unique messages. But I do think it was entertaining how severely the long text of the Bee Movie script would break the style of the website. For a short time, a few hours into the Reddit traffic, after filtering out a lot of the hate speech, the site was made up of real content and Bee Movie scripts, in equal parts.
After a night without sleep, and with the content mostly moderating itself, I finally went to bed. When I woke up, a person reported a problem: there was a trending Reddit post in a local American city about a possible threat to a mall that was posted through ThisWebsiteWillSelfDestruct.com. The title of the website made the threat all the more sinister. People were concerned, and a few mentioned calling the local police, and said that the police were investigating.
I like to be up front when it comes to this kind of thing, so I called the police myself and explained the situation. We had a nice chat, and I gave them all the (limited) information I could. Nothing came of the threat, but I think it's worth mentioning as an unexpected experience that came from the site.
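A long-message filter and a non-unique-message filter like the ones mentioned above can both be quite small. The sketch below is one possible shape, not the site's code: the 2,000 character cutoff, the class name, and the choice to compare messages by a hash of their lowercased, whitespace-collapsed text are all assumptions made for illustration.

```python
import hashlib
from typing import Set

MAX_LENGTH = 2000  # hypothetical cutoff; the post does not give the real one


class SubmissionFilter:
    """Hide overly long messages and repeats of messages already seen."""

    def __init__(self, max_length: int = MAX_LENGTH) -> None:
        self.max_length = max_length
        self._seen: Set[str] = set()

    @staticmethod
    def _fingerprint(text: str) -> str:
        # Lowercase and collapse whitespace so trivial edits still count as duplicates.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def should_show(self, text: str) -> bool:
        if len(text) > self.max_length:
            return False  # pasted movie scripts and similar walls of text
        fingerprint = self._fingerprint(text)
        if fingerprint in self._seen:
            return False  # an equivalent message was already accepted
        self._seen.add(fingerprint)
        return True
```

For example, a filter created with `SubmissionFilter()` would accept "Thank you for this site." once, reject a second copy that differs only in capitalization or spacing, and reject any message longer than the cutoff. In a real deployment the set of fingerprints would need to live in shared storage rather than in one process's memory.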
The filtering has mostly worked for the past year with only minor improvements. I'm always keeping an eye on things, but I've been blown away by the supportiveness of the community.
It's been awesome reading the things that are posted to the site, and I thank everyone who's been a part of it.