I also don't get spam comments anymore. I use the most trivial captcha imaginable, and I use traditional error messages.
But honestly, I hate blog comments. Being able to read a number from an image doesn't mean that what you're writing is worthy of being published. (I keep them enabled because I like the fact that people are occasionally so upset with my analysis of Java's type system that they want me to "die in a fire". Really? What exactly would that solve?)
My solution to this problem (in addition to Akismet) is just to manually moderate all comments; it works for smaller blogs like mine ( http://x264dev.multimedia.cx/ ).
This prevents pointless comments and flamewars--of course, it also means I probably won't let you flame me baselessly in my blog comments, but there's no free speech on a private website anyways.
There wouldn't be free speech on the private site, but there's nothing stopping a would-be commenter from writing their own rebuttal article, were comments disabled.
I like this better because there's a much smaller chance of a flamewar happening, and the responses will generally be well thought out.
I agree that filtering out the worst junk doesn't necessarily raise the overall quality to an acceptable level. It seems like there are two issues, which are related: [1] comments are too small-d democratic and [2] there doesn't currently exist a good mechanism for aggregating comments.
With the exception of a few places like HN, the best comments are treated about the same as bad comments. The absence of incentives leads to the absence of quality.
The other issue is related in the sense that if there were a good mechanism for aggregating comments across sites, it would almost certainly serve to recognize and reward the best comments. Solve the fragmentation problem, and you get resolution on the first one for free.
Using Akismet blocks about 99.99% of the spam comments that hit my blogs. I moderate the rest. It's quick and easy. I almost never delete a comment, but I like to have the option.
For a year I had about 30 spam comments a month on my Wordpress blog. I added Akismet a month ago and I've had 1 this month. It really does work and there's no need to bother users with a captcha.
i've been using defensio (http://defensio.com/) for about a year and have had great success with it. over 11,000 spam comments with 9 false positives, and no annoying captchas.
comments flagged as spam by defensio require an email address to go into the database (if it wasn't spam it doesn't require an address and will just post to the site right away). the user is then emailed a link that they have to click on to make their comment show up, which then notifies defensio it was a false positive.
eventually i grew tired of the email bouncebacks from the spam submissions, so i started caching the ips that submitted the spam and automatically blocked access to the entire site from the ip after a number of spammy comments queued up. the comment spammers just flood the site with spam very quickly, so their spam count exceeds that quota and they are forever blocked just as quickly.
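A rough sketch of the ip-blocking part described above, assuming an in-memory counter. The class name, the quota of 3, and the method names are all made up for illustration; the original setup isn't described in that much detail, and a real site would persist the counts somewhere.

```python
from collections import defaultdict


class SpamIpBlocker:
    """Counts spam-flagged comments per IP and blocks IPs that exceed a quota.

    Hypothetical helper: the quota value and storage are assumptions.
    """

    def __init__(self, quota=3):
        self.quota = quota
        self.spam_counts = defaultdict(int)  # ip -> number of spam-flagged comments
        self.blocked = set()                 # ips denied access to the whole site

    def record_spam(self, ip):
        """Call this each time the spam filter flags a comment from `ip`."""
        self.spam_counts[ip] += 1
        # Block only once the count exceeds the quota, so a commenter with a
        # single false positive is not locked out.
        if self.spam_counts[ip] > self.quota:
            self.blocked.add(ip)

    def is_blocked(self, ip):
        """Check this before serving any page to `ip`."""
        return ip in self.blocked
```

Because the spammers flood the site quickly, their count passes the quota within a few requests, while someone who hits one false positive stays well under it.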
Akismet + post-moderation is definitely the way to go. If you're using pre-moderation (i.e., it doesn't show until the blog owner clicks "OK"), please say so in the comment form so I can leave before wasting my time.
Even though my blog is very low traffic, and that's probably part of why my approach is working, it has yet to receive a spambot comment after four years.
I'm guessing this is due to its... non-standard construction. It's pure XML plus an XSLT that transforms it into HTML on the client side.
A more sensible approach might be to use XSLT to transform just the FORM stuff on a "real" blog system, but I haven't tried that. If the bots are using real browsers when trawling the net and not just some regexps it'll provide no protection.
I remember playing with it in 2000 on IE5, though I remember having to download a new version of MSXML for it. It might have even been the first browser to support it.
The biggest reason why a certain blog does not attract spammers is probably because it is very small. Not using standard software helps, and if you've never run with automatic approval that is another show stopper for spammers.
For every target that is even slightly hardened there are half a million soft targets out there. Better to go after those, the ROI is higher.
If you had lots of visitors it would be worth their time to figure out a way to get in there. Think of spammers as a measure of success: if you are anywhere near successful the spammers will find you, count on it.
Not exactly; he was involved with (not an inventor of) a descendant technology of IBM's GML called SGML. A fairer attribution would be that he invented a precursor to AJAX.
I'm not sure what you're talking about. Yes, Tim Bray was involved in the development of SGML. But he's also the first-listed editor of the XML 1.0 specification.
History is more than being a name listed on the web page... That was a turbulent time back in the mid-nineties, with a lot of things going on. You'd have to have been around and involved back then to really know what happened.
XML is a simplified subset of SGML, which as I stated was an invention of IBM.
Moot these days I suppose. A horrendous 'technology' that has seen its day pass and be replaced by considerably improved markup languages.
Er, you are off on this one. GML was invented by IBM. Tim was involved with SGML, wrote one of the first, if not the first, web crawlers, and was a primary force and first-listed author on the XML spec.
I run a modestly popular (around 10,000 page views a day) civic affairs website that used to be utterly deluged with spam. The site was becoming unusable and unmanageable, and I had to do something about it.
I wanted to avoid image captchas, which I consider to be a usability barrier. Instead, I implemented a really simple, three-pass filtering system for comments by anonymous visitors (registered visitors can log in and bypass the spam filtering system entirely). It included the following three steps:
1. Ask a very simple grade-school math question using plain text.
2. Check comment text against a list of 56 common spam words. (This is actually rather crude: anonymous comments including, say, "socialism" trip the filter for "cialis").
3. Include a hidden form field that is supposed to remain blank.
That was in June 2008. Since then, not one spam comment has gotten past the filters.
I should note that my site, like Tim Bray's, uses a CMS I developed from scratch (starting about five years ago), so it's likewise not subject to the exploits that target Wordpress or Drupal or the other popular CMSes.
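For anyone curious, the three passes can be sketched roughly like this. This is a minimal illustration, not the site's actual code: the form field names (`math_answer`, `comment`, `website_url`), the function names, and the three-word list are all hypothetical, and the real list has 56 entries that aren't given here.

```python
import random

# Hypothetical stand-in for the real 56-word list.
SPAM_WORDS = ["cialis", "viagra", "casino"]


def make_math_question(rng=None):
    """Return a plain-text grade-school question and its expected answer."""
    rng = rng or random.Random()
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} plus {b}?", a + b


def is_spam(form, expected_answer):
    """Run an anonymous comment submission through the three passes."""
    # Pass 1: the math question must be answered correctly.
    if form.get("math_answer", "").strip() != str(expected_answer):
        return True
    # Pass 2: crude substring match against the spam word list.
    # Deliberately naive, so "socialism" trips the "cialis" entry.
    text = form.get("comment", "").lower()
    if any(word in text for word in SPAM_WORDS):
        return True
    # Pass 3: the hidden field must stay blank; bots tend to fill every field.
    if form.get("website_url", ""):
        return True
    return False
```

Matching on whole words instead of substrings would avoid the "socialism" false positive, at the cost of missing spam that glues words together.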