reader feedback needed
Post #396 • October 27, 2004, 7:55 AM • 14 Comments
It finally happened. I expected it, but not this soon. Yesterday, a comment spammer put up a post with 600 - yes, 600 - links to a porn site.
In a way, I take it as a compliment that Artblog.net is now widely read enough to become a desirable target for comment spam. Movable Type sites get hit all the time, by robots, because they have a standard interface. I authored my own content management system, so someone had to hit me manually.
Because of my alert mechanisms, that post stayed up for about three minutes, but I can safely assume that attacks like this will increase proportionally with the popularity of the blog. That brings us to possible solutions, all of which stink for different reasons.
1. Ban automatic comments. Most really big blogs don't have comments. You can e-mail them, and if you say something sufficiently clever, the blogger may post it. This assures that all comments which finally appear address the topic and have a brain. Stinks because: I enjoy the hell out of the comments and learn a lot from the exchanges. I think comments form an important part of the identity of Artblog.net.
2. Comment moderation. Your comment goes into a holding cell until I inspect it and let it through. Stinks because: it really slows down the discussion, and creates an additional workload for me.
3. Comment registration. You register your e-mail and screen name with Artblog.net, which then sends you a password. You can then comment in real time to your heart's content. I could moderate your first comment à la #2 to make sure you made a legitimate remark and let your posts through thereafter. Stinks because: this requires a lot of hoop jumping just to drop a few thoughts at a website. It will definitely get rid of a lot of the conversation. And you would rather go to the dentist than acquire another password.
4. A combination of #2 and #3. You get a choice - either register, or allow your comment to be moderated. This allows readers to select their preferred level of hoop jumping. Stinks because: This gets pretty complicated and I've never seen any other site do it.
I plan on banning spammers manually like I did yesterday morning for a good while yet, but eventually, maybe sooner, I will have to deal with this. Please express your preference for one of the above or suggest alternatives.
October 27, 2004, 3:59 PM
I don't worry about the corruption of your innocence. I worry about the attacks scaling up, making every third post on this site a hundred-line comment linking to sites selling Viagra. That's really going to mess up the conversation.
We're not there and may never get there, but I've seen ugly things happen to other blogs.
October 27, 2004, 4:06 PM
I've been wrestling with this problem too. My wife's blog has been repeatedly bombarded with comment spam for about a year, so Gallery Hopper (my blog) has been somewhat inoculated by the countermeasures that are already in place, namely the MT-Blacklist extension and IP banning.
I, like you, am loath to throw hurdles in front of visitors who want to leave comments. For the time being, that means I'm taking on a little extra workload to maintain my blacklist. Then again, I don't get the number of comments you do, so moderating comments wouldn't be that big of a problem for me.
October 27, 2004, 4:53 PM
Franklin, 90% of the comments on this blog come from--what?--a dozen people? When the spam gets to be too much, set up a password system. My guess is that those dozen people are already emotionally invested enough that a little old password system isn't going to stop them. A few of them will straggle; send them a personal invitation to join the site. They will eventually get on board.
For the rest of us: we may put off joining for a while, but when we get *really* pissed off about something, joining the site suddenly won't seem like such a problem.
October 27, 2004, 5:52 PM
At this point leave as is. It would have to get much worse for one of the alternatives to be a better choice.
October 27, 2004, 6:56 PM
Cinque: it is true that there are those of us who contribute a lot, and we wouldn't mind passwords so much, but we (at least I) want to do absolutely everything possible to encourage more people to comment. As Gravity says, it should get worse before something needs to be done. But it is a good idea to discuss it now.
October 27, 2004, 6:58 PM
Let's form a caca committee and resolve this once and for all. Oh, by the way, I'm for keeping it the way she be.
October 27, 2004, 7:39 PM
Registration also stinks because some people (like me) post under more than one name. I'm not sure how important anonymity is to everyone, but once you're registered it becomes an issue.
I like the blacklist idea (or maybe some sort of filtering?), but I'm assuming that would require lots of work on Franklin's end to implement.
Another idea (which also would require implementing) is where the site shows you a little graphic and makes you type in the letter/numbers displayed in it. That could happen on the posting end (for anonymous) and in an optional registration phase. That, again, sounds like a major coding project for Franklin.
October 27, 2004, 9:35 PM
Ditto on "registration stinks." I think URL blacklisting is the way to go, if you can manage it. The graphic Turing test gizmo alesh mentions could also be workable. You can probably implement something like this through this site. Downsides include accessibility issues: blind users or others without graphical browsers can't pass the test.
Re: blacklisting... MT-Blacklist checks any URLs from a comment message and website fields against a URL blacklist and rejects the comment if it finds a domain match. It seems to work pretty well as comment spamming is pointless without the ability to include PageRank-boosting links. I had to install it recently when I became a clearinghouse for online casinos. Haven't had to worry about it since.
While this is certainly not my area of expertise, Franklin, you seem to be capable of a little coding. How difficult would it be to hack up a little Perl that checks the contents of link tags and the url field against such a list?
A bit of googling brought me to this pre-MT-Blacklist Laughing Meme post: URL Blacklists in MT: An OO Approach.
In what is admittedly a "quick hack":
"SpamBad looks for a directory called blacklists, and reads the contents of each file in that directory (in case for example you want to use other peoples lists as well as your own), one banned url per line, lines beginning with # are comments. It then compiles those URL into a simple regex which it checks against both the author's url field, and the body of the comment. (My first reaction was this would be slow, but it isn't, important to remember this is the kind of thing computers are good at)"
It's written for Movable Type, but looks simple enough (to my eye) that someone who knows what they're doing could modify it appropriately. Not sure if it handles wild cards--if not, it would probably be an advisable addition.
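For anyone who wants to see the shape of the idea outside Movable Type, here is a minimal sketch of the approach the quote describes, written in Python rather than Perl. It is not the SpamBad code: the function names (`load_blacklist`, `compile_blacklist`, `is_spam`) are made up for illustration, and it assumes each blacklist line is a plain URL fragment to be matched literally, so it does not handle wild cards.

```python
import re
from pathlib import Path


def load_blacklist(directory):
    """Read every file in `directory`: one banned URL per line, '#' lines are comments."""
    entries = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                entries.append(line)
    return entries


def compile_blacklist(entries):
    """Join the entries into one case-insensitive regex.

    Entries are escaped, so each one matches literally (assumption: no wild cards).
    Returns None when the list is empty.
    """
    if not entries:
        return None
    return re.compile("|".join(re.escape(e) for e in entries), re.IGNORECASE)


def is_spam(pattern, author_url, body):
    """True if the author's URL field or the comment body contains a banned entry."""
    if pattern is None:
        return False
    return bool(pattern.search(author_url) or pattern.search(body))
```

Compiling the list once and reusing the pattern for every incoming comment is what keeps the check cheap, which fits the quoted remark that it turned out not to be slow.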
Simon Willison discusses some approaches, including implementing some form of URL blacklisting as well as killing comment link PageRank through redirects.
Here's Jay Allen's original solution (before he created MT-Blacklist)
Don't know if any of this helps; I may be a bit out of my depth here.
October 27, 2004, 9:37 PM
My initial response will be to automate the process by which I delete posts. That ought to hold the wolves at bay for a while and won't affect the readers' lives at all.
If it ever does get out of control, I can go to the "type in what you see in this picture" challenge that Alesh describes, which might not be such a huge mission to implement and would allow everyone to stay anonymous. Unfortunately, people could still spam the site manually.
Another possibility: After you submit a post, it goes into a holding cell and Artblog.net sends you an e-mail saying "click this link to release your post." People could still use whatever screen name they wanted or any number of them, but I would, after a while, figure out which ones were the same unless someone went through the trouble to set up an e-mail for each name. (This, incidentally, is how my spam filter works.)
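The holding-cell idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `HoldingCell` class and its method names are hypothetical, comments live in memory rather than a database, and actually sending the "click this link" e-mail is left out. The returned token is what would go into the release link.

```python
import secrets


class HoldingCell:
    """Hold submitted comments until the commenter follows an e-mailed release link."""

    def __init__(self):
        self.pending = {}    # release token -> comment waiting in the cell
        self.published = []  # comments that have been released

    def submit(self, screen_name, email, body):
        """Store the comment and return the random token to embed in the release link."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = {"name": screen_name, "email": email, "body": body}
        return token

    def release(self, token):
        """Publish the comment for this token; unknown or already-used tokens do nothing."""
        comment = self.pending.pop(token, None)
        if comment is None:
            return False
        self.published.append(comment)
        return True
```

Because the token is removed on first use, a release link works once. Screen names are stored as typed, so nothing here stops someone from using several names with one e-mail address.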
Rest assured that nothing will change for now, and I will keep everything open as long as possible. I just wanted to have this discussion before a crisis came up rather than during.
October 27, 2004, 9:43 PM
Dan: Thanks for the reading list. My understanding is that blacklisting IPs occasionally bans legitimate senders, particularly ones on AOL. Also, do I have access to the MT blacklist if I'm not using MT? (Because I ain't using no MT.) Anyway, I'll look into it. Thanks again.
October 27, 2004, 10:02 PM
i'm all about the password
October 27, 2004, 10:02 PM
I'm not sure how suitable blacklists are (their quality is only as good as the vigilance of the person who maintains them, and it requires testing every post against a long list of potential offenders), but I do know the "type this text" software is extremely easy to program from scratch.
October 27, 2004, 10:22 PM
The method I discussed blacklists URLs, not IPs.
Blacklisting IPs could very well block out legit users. It's also easier for spammers to work around (they just need to change the IP they're sending from).
URL blacklisting (again, what MT-Blacklist uses) only cares about the links a comment contains in its body or the URL field. It's harder for spammers to hide from this as it would involve changing URLs, and the only way a legitimate user would be blocked is if they're trying to link to barnyard.porn.info or something.
The MT-Blacklist blacklist files are just text files. Anyone who uses the MT-Blacklist plugin has the option to share theirs on their site--collective immunization. The blacklist linked to above is Jay Allen's master list. If you were to implement some sort of blacklisting system, I'm sure you could use this list as is. Take a look--it's fairly extensive (though, again, utilizing wild cards would be essential towards minimizing upkeep).
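To show what wild cards might buy, here is a rough Python sketch that pulls the host names out of a comment and tests them against shell-style patterns such as `*.porn.info`. The pattern syntax is an assumption chosen for illustration, not necessarily the format the MT-Blacklist files use, and the names `hosts_in` and `blocked_hosts` are hypothetical.

```python
import re
from fnmatch import fnmatch
from urllib.parse import urlparse

# Rough matcher for http/https URLs in comment text or HTML attributes.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def hosts_in(text):
    """Return the set of lower-cased host names of every URL found in `text`."""
    hosts = set()
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def blocked_hosts(text, patterns):
    """Return the hosts in `text` that match any shell-style wildcard pattern."""
    return {
        host
        for host in hosts_in(text)
        if any(fnmatch(host, pattern.lower()) for pattern in patterns)
    }
```

With this syntax one pattern covers every subdomain a spammer registers under a banned domain, though `*.porn.info` does not match the bare `porn.info`, which would need its own entry.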
Again, check out that Laughing Meme link from above. I don't know Perl (I seem to recall that you do, though I may be mistaken), but the script is object oriented and so ought to be fairly portable (this would assume that your content management system is Perl-based, which it may not be).
October 27, 2004, 3:48 PM
Franklin - Just leave it as is. I don't think all of us innocent cherubs are going to be permanently damaged by some stupid porn spam. The freer and more wide open the blog is, the better.