I have comments, but no spam in my comments. Here’s why.

jrockway · on July 27, 2009

I also don't get spam comments anymore. I use the most trivial captcha imaginable, and I use traditional error messages.

But honestly, I hate blog comments. Being able to read a number from an image doesn't mean that what you're writing is worthy of being published. (I keep them enabled because I like the fact that people are occasionally so upset with my analysis of Java's type system that they want me to "die in a fire". Really? What exactly would that solve?)

DarkShikari · on July 27, 2009

My solution to this problem (in addition to Akismet) is just to manually moderate all comments; it works for smaller blogs like mine ( http://x264dev.multimedia.cx/ ).

This prevents pointless comments and flamewars--of course, it also means I probably won't let you flame me baselessly in my blog comments, but there's no free speech on a private website anyways.

jbrennan · on July 27, 2009

There wouldn't be free speech on the private site, but there's nothing stopping a would-be commenter from writing their own rebuttal article, were comments disabled.

I like this better because there's a much smaller chance of a flamewar happening, and the responses will generally be well thought out.

blasdel · on July 27, 2009

There is free speech, it just belongs to the publisher :)

colins_pride · on July 27, 2009

I agree that filtering out the worst junk doesn't necessarily raise the overall quality to an acceptable level. It seems like there are two issues, which are related: [1] comments are too small-d democratic and [2] there doesn't currently exist a good mechanism for aggregating comments.

With the exception of a few places like HN, the best comments are treated about the same as bad comments. The absence of incentives leads to the absence of quality.

The other issue is related in the sense that if there was a good mechanism for aggregating comments across sites it would almost certainly serve to recognize and reward the best comments. Solve the fragmentation problem, and you get resolution on the first one for free.

modoc · on July 27, 2009

Using Akismet blocks about 99.99% of the spam comments that hit my blogs. I moderate the rest. It's quick and easy. I almost never delete a comment, but I like to have the option.

dmix · on July 27, 2009

For a year I had about 30 spam comments a month on my Wordpress blog. I added Akismet a month ago and I've had 1 this month. It really does work and there's no need to bother users with a captcha.

there · on July 28, 2009

i've been using defensio (http://defensio.com/) for about a year and have had great success with it. over 11,000 spam comments with 9 false positives, and no annoying captchas.

comments flagged as spam by defensio require an email address to go into the database (if it wasn't spam it doesn't require an address and will just post to the site right away). the user is then emailed a link that they have to click on to make their comment show up, which then notifies defensio it was a false positive.

eventually i grew tired of the email bouncebacks from the spam submissions, so i started caching the ips that submitted the spam and automatically blocked access to the entire site from the ip after a number of spammy comments queued up. the comment spammers just flood the site with spam very quickly, so their spam count exceeds that quota and they are forever blocked just as quickly.
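
A minimal sketch of the IP blocking described above, assuming an in-memory store and a made-up threshold. The names `SpamIpBlocker` and `SPAM_QUOTA` are hypothetical; the comment only says "a number" of spammy comments and does not say how the cache is kept.

```python
from collections import defaultdict

# Assumed threshold; the original comment does not give the actual number.
SPAM_QUOTA = 5


class SpamIpBlocker:
    """Counts comments flagged as spam per IP and blocks IPs over the quota."""

    def __init__(self, quota: int = SPAM_QUOTA) -> None:
        self.quota = quota
        self.spam_counts = defaultdict(int)  # ip -> number of flagged comments
        self.blocked = set()                 # ips denied access to the whole site

    def record_spam(self, ip: str) -> None:
        """Call this each time the spam filter flags a comment from `ip`."""
        self.spam_counts[ip] += 1
        # Block once the count exceeds the quota, as in the description above.
        if self.spam_counts[ip] > self.quota:
            self.blocked.add(ip)

    def is_blocked(self, ip: str) -> bool:
        """Check this before serving any request from `ip`."""
        return ip in self.blocked
```

A real site would probably persist the counts and might expire old entries; this version blocks forever, matching the "forever blocked" wording.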

MicahWedemeyer · on July 28, 2009

Akismet + post-moderation is definitely the way to go. If you're using pre-moderation (i.e., it doesn't show until the blog owner clicks "OK"), please say so in the comment form so I can leave before wasting my time.

raghus · on July 27, 2009

This is good! http://www.tbray.org/atompub/ouch (refresh it to see a new message)

m_eiman · on July 28, 2009

Even though my blog is very low traffic, and that's probably part of why my approach is working, it has yet to receive a spambot comment after four years.

I'm guessing this is due to its... non-standard construction. It's pure XML plus an XSLT that transforms it into HTML on the client side.

A more sensible approach might be to use XSLT to transform just the FORM stuff on a "real" blog system, but I haven't tried that. If the bots are using real browsers when trawling the net and not just some regexps it'll provide no protection.

If you're curious it's over at http://eiman.tv/blog/

blasdel · on July 28, 2009

I didn't believe you at first, but lo and behold, you're serving as Content-Type: text/xml, and there's actual client-side xslt!

What do you do for IE? :)

m_eiman · on July 28, 2009

What did you call it, "IE"? Never heard of it... :]

It hasn't been on my list of priorities. Might work in IE7 or IE8, but I haven't tried it.

pmjordan · on July 28, 2009

IE supports XSLT. Yes, even IE6.

rythie · on July 28, 2009

I remember playing with it in 2000 on IE5, though I remember having to download a new version of MSXML for it. It might have even been the first browser to support it.

jacquesm · on July 27, 2009

The biggest reason why a certain blog does not attract spammers is probably because it is very small. Not using standard software helps, and if you've never run with automatic approval that is another show stopper for spammers.

For every target that is even slightly hardened there are half a million soft targets out there. Better to go after those, the ROI is higher.

If you had lots of visitors it would be worth their time to figure out a way to get in there. Think of spammers as a measure of success: if you are anywhere near successful, the spammers will find you. Count on it.

blasdel · on July 27, 2009

As the author of XML (among other things) he has a pile of PageRank to offer, and that attracts automated attention.

jacquesm · on July 28, 2009

Those would have to be pretty stupid spammers then, the rel='nofollow' took care of that.

sfphotoarts · on July 27, 2009

Not exactly, he was involved with (not an inventor of) a descendant technology of IBM's GML called SGML. A fairer attribution would be that he invented a precursor to AJAX.

avibryant · on July 27, 2009

Excuse me? Tim Bray and C. M. Sperberg-McQueen were the editors of the first XML specification, and their names have been on every edition since then. See http://www.w3.org/TR/REC-xml/ all the way back to http://www.w3.org/TR/WD-xml-961114 .

mbrubeck · on July 27, 2009

I'm not sure what you're talking about. Yes, Tim Bray was involved in the development of SGML. But he's also the first-listed editor of the XML 1.0 specification.

sfphotoarts · on July 27, 2009

History is more than being a name listed on the web page... That was a turbulent time back in the mid-nineties with a lot of things going on. You'd have to have been around and involved back then to really know what happened.

XML is a simplified subset of SGML, which as I stated was an invention of IBM.

Moot these days I suppose. A horrendous 'technology' that has seen its day pass and be replaced by considerably improved markup languages.

wglb · on July 28, 2009

Er, you are off on this one. GML was invented by IBM. Tim was involved with SGML, wrote one of the first, if not the first, web crawler, and was a primary force and first listed author on the XML spec.

mbrubeck · on July 29, 2009

Here's Tim Bray's own history of the early days of the XML committee: http://www.tbray.org/ongoing/When/200x/2008/02/10/XML-People

RyanMcGreal · on July 28, 2009

I run a modestly popular (around 10,000 page views a day) civic affairs website that used to be utterly deluged with spam. The site was becoming unusable and unmanageable, and I had to do something about it.

I wanted to avoid image captchas, which I consider to be a usability barrier. Instead, I implemented a really simple, three-pass filtering system for comments by anonymous visitors (registered visitors can log in and bypass the spam filtering system entirely). It included the following three steps:

1. Ask a very simple grade-school math question using plain text.

2. Check comment text against a list of 56 common spam words. (This is actually rather crude: anonymous comments including, say, "socialism" trip the filter for "cialis").

3. Include a hidden form field that is supposed to remain blank.

That was in June 2008. Since then, not one spam comment has gotten past the filters.

I should note that my site, like Tim Bray's, uses a CMS I developed from scratch (starting about five years ago), so it's likewise not subject to the exploits that target Wordpress or Drupal or the other popular CMSes.
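
The three-pass filter described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the site's actual code: the function names are hypothetical, and `SPAM_WORDS` is a three-entry stand-in for the real 56-word list, which is not given.

```python
# Stand-in list; the real site checks 56 words that are not listed in the comment.
SPAM_WORDS = ["cialis", "viagra", "casino"]


def check_math_answer(a: int, b: int, answer: str) -> bool:
    """Pass 1: the visitor answers a plain-text question such as 'What is 3 + 4?'."""
    try:
        return int(answer.strip()) == a + b
    except ValueError:
        return False


def contains_spam_word(text: str) -> bool:
    """Pass 2: crude substring match, so 'socialism' trips the 'cialis' entry."""
    lowered = text.lower()
    return any(word in lowered for word in SPAM_WORDS)


def honeypot_is_blank(value: str) -> bool:
    """Pass 3: a hidden form field that human visitors never see must stay empty."""
    return value == ""


def accept_comment(a: int, b: int, answer: str, text: str, honeypot: str) -> bool:
    """An anonymous comment is accepted only if it clears all three passes."""
    return (
        check_math_answer(a, b, answer)
        and not contains_spam_word(text)
        and honeypot_is_blank(honeypot)
    )
```

For example, `accept_comment(3, 4, "7", "Nice article", "")` passes, while the same call with the text "I like socialism" is rejected by the substring check, which is the false positive mentioned in step 2.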

cesare · on July 28, 2009

Automated comment spam (using scripts) can be easily blocked just by using random form field names.

I'm surprised that this solution has not been adopted yet by the most popular blogging software.

cesare · on July 28, 2009

Please ignore the comment above. Don't upvote it. Now that I think of it, it doesn't make much sense.