Comment Moderation: Fighting Spam and Trolls

A few tips from a long-time blogger.

As any blogger with even a slightly popular blog can tell you, good comment moderation is an absolute requirement to maintain a good, readable blog.

The way I see it, comment moderation serves two purposes:

  • It prevents your blog from being an advertising platform for people who don’t contribute real content. I’m not just talking about obvious spam here, either.
  • It prevents your blog from being a platform for offensive or abusive people who don’t contribute real content. And yes, I am talking about trolls here.

Let’s take a closer look at each of these two points.

Comments by Spammers

There are two kinds of comment spam.

One type — the most prevalent — is mostly automated spam posted by software commonly referred to as spambots. Once your blog gets on the radar (so to speak), automated spam can be quite significant. This blog, for example, attracts more than 500 automated spam comments a day.

This kind of spam is pretty easy to recognize. One type, for example, includes multiple links for things like online gambling, prescription medication, or pornography. The other type puts its link in the comment form’s URL field and then fills the comment field with text that may or may not make sense but has nothing to do with the content of the original post. Here’s an example from my post titled “Five Tips for Composing a More Effective Social Networking Bio”:

I precisely had to thank you so much all over again. I am not sure the things that I could possibly have accomplished in the absence of the entire tricks contributed by you on my problem. It truly was a very frightening case for me personally, nevertheless viewing your specialized manner you handled the issue forced me to leap over delight. I’m just happy for the assistance and believe you are aware of a great job that you’re getting into training other individuals via a site. More than likely you haven’t encountered any of us.

Huh? I get hundreds of comments like this every day.
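The first type of automated spam, the kind stuffed with links, is simple enough to flag with a few lines of code. Here's a minimal sketch in Python, assuming a hypothetical threshold of two links per comment; the function names and the threshold are illustrative and are not how Akismet or any other tool actually works.

```python
import re

# Matches http(s) URLs; deliberately simple, for illustration only.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def count_links(comment_text: str) -> int:
    """Return the number of URLs found in a comment body."""
    return len(URL_PATTERN.findall(comment_text))


def looks_like_link_spam(comment_text: str, max_links: int = 2) -> bool:
    """Flag a comment whose link count exceeds max_links (an assumed threshold)."""
    return count_links(comment_text) > max_links
```

A check like this won't catch the second type, where the only link sits in the URL field and the comment text is nonsense. That's the kind of thing a real spam prevention tool handles far better.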

It should be noted that a lot of this spam appears on posts that may be quite old. This particular one appeared on a post that was 2-1/2 years old. This is one reason why bloggers use plugins to automatically turn off the commenting feature on older posts.
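The logic behind closing comments on old posts is just a date comparison. Here's a rough sketch, assuming a hypothetical one-year cutoff; the function name and the default are mine, not taken from any particular plugin.

```python
from datetime import date, timedelta


def comments_open(published: date, today: date, max_age_days: int = 365) -> bool:
    """Return True while a post is young enough to accept comments.

    max_age_days is an assumed cutoff; pick whatever suits your blog.
    """
    return today - published <= timedelta(days=max_age_days)
```

With a cutoff like this, a post that's 2-1/2 years old would no longer accept comments, so a spambot would have nothing to submit to.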

Fortunately, spam prevention tools can detect and catch 99% of this kind of spam. I use Akismet on my WordPress site and it does a great job of catching and corralling this garbage so it never has a chance to appear on my blog. If you’re not using a spam prevention tool and are manually going through this crap, what are you waiting for? Don’t you have better things to do with your time?

The other kind of spam is more insidious. It’s posted by a real person and it looks like a legitimate comment. But its sole purpose is to promote a product, service, or Web site — not to engage you or other blog readers in a conversation about the original post’s topic.

In many cases, the spammer doesn’t put any real effort into his comment. It might contain a sentence or two that’s vaguely related to the post. The spam delivery is in the commenter’s name and URL. Rather than being something like “John” or “Mary Smith,” it’ll be something like “John’s Carpet Service” or “Discount Vitamin Shack.” The URL will be the URL for the site John or Mary want to promote. In most cases, the email address will be something that’s likely fake or never checked for incoming mail — usually a Gmail or Yahoo! account — but sometimes a legitimate-looking email account is included.

To me, this is a gray area — is it a legitimate comment or spam? Considering the content and purpose of the comment should guide you. Your site’s comment policy should help; I’ll get to that in a moment.

Trolls

A far worse problem these days is what many people refer to as trolls. Trolls are people who post offensive or controversial commentary on blogs or discussion forums. Their goal is apparently to make themselves look smart or superior at the expense of you or other commenters. By posting comments, they’re “trolling” for an argument, much like a fisherman might go trolling to catch fish.

This is where good comment moderation is vital to your blog.

You see, if you allow offensive commentary — including personal attacks on yourself or blog commenters — you do two things:

  • You discourage legitimate commenters from sharing their thoughts. After all, they could be the victim of the next troll attack.
  • You encourage more trolling activity by current and future trolls. After all, if you let one offensive comment stand, you’re likely to allow others. They see your blog as a good place to troll for new victims.

Is that something you really want?

I have seen too many blogs and forums completely devastated by the comments posted by trolls and the offensive and defensive comments posted in response. Back in the early days of the Internet and newsgroups, we used to refer to this as “flame wars.” There’s nothing useful or productive about the comments by trolls or the resulting flame wars. Why allow them on your blog?

The Freedom of Speech Argument

The most common objection to firm moderation that would prevent trolling is that it’s “censorship” and that you’re violating the commenter’s “freedom of speech.” People who make this argument often use the phrase “First Amendment Rights.”

Let’s look briefly at the First Amendment to the U.S. Constitution:

Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances. [emphasis added]

Where exactly does it say that I have to put up with offensive commentary on my blog? All it says is that the government can’t make a law abridging the freedom of speech. I’m not the government, I’m not making a law.

So I don’t think “free speech” is a valid argument. After all, should anyone have the right to say anything they want — no matter how offensive — on your blog?

If people want to spout hate and offensive commentary, they can do it on their own blog.

Creating a Comment Policy

One way to fight back against spammers and trolls is to create and uphold a site comment policy. This policy should clearly state what is and/or isn’t allowed in the comments on your blog. Linking to this policy in an obvious place — or even placing a short version of it right above or below the comment form — will make it clear that you don’t tolerate spam or bad behavior.

Want some examples of good comment policies? Here are a few to give you ideas:

  • An Eclectic Mind. This is the comment policy for my personal blog. It’s a bit wordy — what do you expect from me? — but it does cover all the bases. You might also be interested in another post on my blog, “I Love Blog Comments Here.”
  • Stonekettle Station. Jim Wright doesn’t put up with crap either. That’s the short version of his comment policy. The long version, which addresses trolls and free speech, can be found here.
  • Whatever. John Scalzi’s comment policy. Simple and to-the-point.
  • Lorelle on WordPress. Lorelle knows more about WordPress blogging than I ever will. Here’s her site’s comment policy. You might also be interested in another post on her blog, “Comments on Comments.”

This topic was also addressed back in 2007 by Lorelle VanFossen in The Blog Herald.

Do you have a site comment policy you want to share with readers here? Post it in the comments for this post.

Maintaining Order

Creating a policy isn’t enough. You also have to maintain it. That means objectively reviewing every comment on your site and deleting the ones that violate the policy.

Yes, deleting them.

My advice is not to edit them, or allow them but reply with a warning, or do anything else. If a comment violates your policy, just delete it.

Don’t even send the commenter an email message telling them that you’ve deleted their message and why. If a commenter lacks the courtesy to be civil and follow your established rules on your blog, does he deserve any courtesy from you?

More important than that is the entire concept of “feeding the trolls.” When you respond in any way to a troll, you encourage more trolling activity. You see, these people just can’t let it go. They see any response as having a victim on the hook and they keep up their trolling behavior.

Ignore them and they will go away. Really.

You need to keep this in mind no matter where you see trolls. If you can’t delete their offensive crap, just ignore it. (Or, if it’s offensive enough, contact the site owner directly and tell him/her what you think and how it makes you feel about their blog/site/forum. A responsible site owner will take care of the problem.)

And if the whole concept of trolls is new to you, I urge you to read the entire “Troll (Internet)” entry on Wikipedia. It’s excellent and it clearly shows how bad these people can be for an Internet community like a blog.

Steps to Take

To sum up, I want to review the steps you might want to take to moderate and control the comments on your blog.

  1. Install and use spam prevention tools. Akismet is the best one (in my opinion) for a WordPress blog. It’s free.
  2. Write and post a site comment policy. Use the ones linked to above to give you ideas.
  3. Set up your blog to require moderation of all comments. On a WordPress blog, you do this in Discussion Settings.
  4. Regularly check for and approve (or delete) new comments. I’ve created a bookmark in my browser to quickly go to the comment moderation panel for each of my sites. I check for comments every morning and sometimes during the day so few comments are ever held in moderation for long.
  5. Resist the urge to respond to trolls on your blog. Don’t respond in comments or in email. You will regret it.
  6. Ignore the comments posted by trolls on other sites and in online forums. Don’t feed the trolls.

Please use the comments for this post to share your thoughts, experiences, and questions about this topic.

Twitter’s Report for Spam Feature

Block and report with one simple click.

Spam has been a problem on Twitter since it became mainstream over a year ago. It’s an extremely frustrating situation for those of us who want to use the service as a social networking tool — to actually meet and interact with other people who we find interesting. We’re the ones who follow up on new followers and actually read incoming @mentions (or @replies) and direct messages.

I’ve urged people to report spammers using the @spam Twitter account. But now there’s a better way: The Report For Spam link on the person’s profile page.

This example shows it quite clearly for a spammer account that began following me today. It’s the last link in the Options area. Clicking the link displays a confirmation dialog to make sure you really do want to block the account and report it for spamming. Click OK and the job is done.

What kind of account activity is considered spamming? The Twitter Support page, “Reporting Spam on Twitter,” lists many examples of what the Twitter folks consider spam. I recommend that you read it if you’re not sure what Twitter spam is.

In this example, the spammer had followed hundreds of Twitter users, likely because they’d tweeted using a keyword the spammer had programmed into a bot. The spammer posted just one tweet, which didn’t make much sense and included a link. I didn’t click the link; it’s never wise to click a link posted by a spam account. (Think candy from a stranger.) The link was likely either going to sell me something or attempt to install some malware on my computer.

I’m thrilled about this new Twitter feature. If used consistently by serious Twitter users and acted upon by the folks at Twitter headquarters, we should see a reduction in spam and perhaps a lot of discouraged spammers. Sadly, with the proliferation of automated Twitter follow and spamming tools, it’s unlikely that the spam problem will ever completely go away.

But I think that if we do our part to report spammers as they follow or interact with us, we’ll make the Twitter experience a bit more enjoyable for everyone.

PLEASE Report and Block Twitter Spammers

It’s getting completely out of control.

This afternoon, I received @ replies from three different Twitter users who do not follow me, all of which contained spammy content. All three messages were obviously automatically generated based on a key word I’d included in a tweet:

  • Spammer 1 invited me to a “Free Procrastination Seminar” after I used the word procrastination in a tweet.
  • Spammer 2 pointed me and a Twitter friend to a site that sells face masks after I suggested that my friend wear a face mask when cleaning out a dusty hay barn.
  • Spammer 3 pointed me and a Twitter friend to a site that sells MacBook Pro batteries after my friend and I had a Twitter exchange about his MBP battery.

It’s bad enough that everyone and his uncle is trying to use Twitter to promote themselves and their businesses. But now they’ve set up empty Twitter accounts and are using automated tools to send out Tweets that promote their products or services based on key word matches. That means they could be sending out hundreds or thousands of advertising tweets per day, clogging up your Twitter timeline with their crap.

I, for one, am sick of it.

There are two things you can do to help stop Twitter spam:

  • Follow @spam on Twitter. This is a special account monitored by the folks at Twitter. Once you follow @spam, it will follow you back. You can then send direct messages to @spam when you want to report a spammer. For example, you might compose a message like this:
    d spam @spamguy123 is sending me unsolicited advertisements.

    The folks at Twitter investigate legitimate spam complaints. In addition, @spam sends out periodic tweets about using Twitter safely, so you might pick up a few useful tips.

  • Block spammers. If you get followed by a spammer or receive an @ reply with spammy content, take a moment to block that Twitter user. The folks at Twitter take blocking into consideration when evaluating spam reports and account activity.
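If you report spammers often, the direct message format shown above is easy to build in code. Here's a small Python sketch; the function name is hypothetical, and the 140-character limit is an assumption about how the whole command is counted, so treat it as a guard rather than a rule.

```python
def build_spam_report(spammer: str, reason: str, limit: int = 140) -> str:
    """Compose a "d spam @user reason" direct message for the @spam account.

    limit is an assumed maximum length for the whole command.
    """
    handle = spammer.lstrip("@")  # accept "spamguy123" or "@spamguy123"
    message = f"d spam @{handle} {reason}"
    if len(message) > limit:
        raise ValueError(f"Report is {len(message)} characters; keep it under {limit}.")
    return message


print(build_spam_report("@spamguy123", "is sending me unsolicited advertisements."))
```

This only builds the text of the message. You'd still paste it into Twitter yourself.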

You can learn more about reporting Spam to Twitter here.

Please don’t just ignore the spammers. Do something to stop them. Only if we all act can we get a better handle on the situation. The folks at Twitter hate spam even more than we do. It clogs their bandwidth and stretches the resources of their servers. If we help them identify spammers, they’ll help us by suspending their accounts.

Spread the word.

Blogging Basics: Comment Spam, Part II

Part II: When Comments Go Wrong

In the first part of this series, I explained what comments and pingbacks are and how they can benefit your blog. If you don’t know this stuff, go back and read that first. In this part of the series, I’ll explain how and why the comments feature can go wrong and list three tools for WordPress that can fight it.

Spam, Spam, Spam, Spam

While your blog’s readers like the comments feature because it enables them to participate in your blog, spammers like it, too. It gives them the ability to share their spammy comments and links on your blog.

Comment spam is a terrible problem for bloggers. If left uncontrolled, it can quickly take over your blog by filling post comments with a lot of garbage (some of it obscene), including links to Web sites you probably don’t want to advertise for. Your blog visitors will have to wade through all this junk to find real comments. If the problem is bad enough, they probably won’t bother looking. If the comment spam is offensive enough, they might not visit your blog again.

Comment spam’s close cousin is pingback spam, which is relatively new to blogging. In pingback spam, someone else’s blog links back to yours, placing a pingback link to that blog in your blog. The purpose may be to get your site visitors to come to that blog, or, if you have nofollow disabled, to improve the site’s Google page rank.

Both comment spam and pingback spam can be automatically generated. For comment spam, spambot programs can automatically find comment forms on a blog, fill in the fields, and submit the spam comments. Pingback spam can be created through the use of feed “scraping” tools that pull parts of posts from your blog and post them to the spammer’s blog, along with a link to yours. Because of automation, there’s no limit to how much spam can be sent to your blog.

Spam Stopping Tools

Fortunately, there’s help. Many WordPress programmers are out there, fighting the same war against spam that you are. They have the skills to write plugins that can identify spam and quarantine or delete it so it doesn’t appear on your blog.

While there are numerous spam prevention tools out there for WordPress users, I have personal experience with three of them:

  • Akismet, which is part of WordPress.com and comes as a plugin with self-hosted WordPress blogs, is created and maintained by the folks at Automattic, makers of WordPress. It’s fully integrated into WordPress and is extremely effective. I tell you more about how to set up and use Akismet in Part IV of this series.
  • Spam Karma, by Dr. Dave, is another powerful spam prevention tool. I used this exclusively for a while and it caught all the spam that appeared on my site. The only reason I stopped using it is because I switched to Akismet.

  • Bad Behavior is a plugin by Michael Hampton. It attempts to head off spam by determining whether a hit to a blog post is by a human or a spambot. Spambots are automatically denied access. One side benefit of this approach is a reduction in MySQL activity due to spambot access — that’s why I initially began using it. I used Bad Behavior in conjunction with one of the other spam prevention tools listed here for some time before trusting Akismet to do the whole job. The reason: Bad Behavior sometimes records false positives, making it impossible for certain real people to post comments. This problem occurs rarely, but since Akismet seems to be doing the job on its own, I prefer not to take the chance. (Note to Michael if you stop by to read this: if I got this wrong, please do comment to set me straight.)
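To give a feel for the "human or spambot?" idea, here's a toy Python sketch that screens a request by its HTTP headers. The two rules (an empty User-Agent, a missing Accept header) are my own assumptions for illustration; they are not Bad Behavior's actual checks, which I haven't examined.

```python
def looks_like_bot(headers: dict[str, str]) -> bool:
    """Guess whether a request came from a bot, using two assumed rules."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Assumed rule 1: real browsers identify themselves.
    if not normalized.get("user-agent", "").strip():
        return True

    # Assumed rule 2: real browsers say what content types they accept.
    if "accept" not in normalized:
        return True

    return False
```

Rules like these also show where false positives come from: a real person behind a proxy or privacy tool that strips headers would be turned away, too.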

I should note here that both Akismet and Spam Karma can “learn” about spam based on how you resolve comments you manually moderate. That’s why it’s important to properly identify any false positives or missed spam.
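Here's a toy illustration of what "learning" from your moderation decisions can mean. This Python sketch just counts the words in comments you've marked as spam or as legitimate, then scores a new comment by how many of its known words lean spammy. It's a deliberately naive stand-in, with a class name I made up, and it is not the method Akismet or Spam Karma uses.

```python
import re
from collections import Counter


class TinySpamLearner:
    """Word-count learner trained by the blogger's own moderation decisions."""

    def __init__(self):
        self.spam_words = Counter()
        self.ham_words = Counter()  # "ham" = legitimate comments

    @staticmethod
    def _words(text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, is_spam):
        """Record one moderation decision."""
        target = self.spam_words if is_spam else self.ham_words
        target.update(self._words(text))

    def spam_score(self, text):
        """Fraction of known words seen more often in spam than in ham."""
        known = [w for w in self._words(text)
                 if w in self.spam_words or w in self.ham_words]
        if not known:
            return 0.0  # nothing learned about these words yet
        spammy = sum(1 for w in known if self.spam_words[w] > self.ham_words[w])
        return spammy / len(known)
```

Even in this toy version, a wrong decision pollutes the counts. Approve a spam comment by mistake and its words start to look legitimate.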

In the next post of this series, I’ll explain how you can identify comment spam — even when it doesn’t look like spam.

Learn More

Get more from your software.

Learn more about working with a self-hosted WordPress installation — or WordPress.com. Check out my WordPress courses on Lynda.com.

Blogging Basics: Comment Spam, Part I

Part I: Understanding Comments and Pingbacks

One of the main things that differentiate a blog from a Web site is the ability of readers to interact with what you post. This is done primarily through the use of comments.

Comment Basics

Most blogging software supports reader commenting. Typically, a comment form appears at the bottom of a post. Readers can enter their comments about the post, along with their name, e-mail address, and Web or blog URL. When the form is submitted, the comment is added to the post.

The screenshot here shows what a post on my blog, An Eclectic Mind, looks like with a few comments added, as well as a comment form.

Most blogging software packages offer the blogger options for handling comments. WordPress, for example, offers several options:

  • Comments can be enabled or disabled by default or set on a post-by-post basis.
  • Commenter e-mail address can be required for a comment to be submitted.
  • Blog registration can be required for a comment to be submitted.
  • Comments can be held for moderation or automatically moderated based on a handful of options, including moderation and blacklist words or phrases.
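The last option, moderation and blacklist words, boils down to a simple triage. Here's a Python sketch of the idea, assuming case-insensitive substring matching; the function name and the return values are hypothetical, and this is not WordPress's own code.

```python
def triage_comment(text, moderation_words, blacklist_words):
    """Return "spam", "hold", or "approve" for a comment body.

    Blacklist words win over moderation words. Matching is a plain
    case-insensitive substring test, which is an assumption made here.
    """
    lowered = text.lower()
    if any(word.lower() in lowered for word in blacklist_words):
        return "spam"
    if any(word.lower() in lowered for word in moderation_words):
        return "hold"
    return "approve"
```

Substring matching is blunt: a word on your list can also match inside a longer, innocent word, so keep the lists short and specific.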

Pingbacks and Trackbacks

Pingbacks (or trackbacks) are part of the commenting arena. A pingback happens when another blogger writes a post in which he links directly back to your post. He may have quoted your post in his and is linking back to the source. Or maybe he just wants to tell his readers how good your post was and send them over to your blog to read it. If his blogging software supports pingbacks or he has manually entered the link as a trackback, a special comment is sent to your blog with a link back to his blog.

Technically, a trackback is different from a pingback. A pingback is automated. The other blogger’s blogging platform must be capable of creating the pingback comment. Before automated pingbacks were widely supported, blogging platforms included a trackback feature that required the blogger to manually enter a linked post’s URL in a field when creating his post. Nowadays, these two terms are often used interchangeably.
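Under the hood, an automated pingback is a small XML-RPC call. The Pingback 1.0 specification defines a method named pingback.ping that takes two arguments: the URL of the post doing the linking and the URL of the post being linked to. Here's a Python sketch that builds that request without sending it. Both URLs are made-up examples.

```python
import xmlrpc.client

# Hypothetical URLs: the other blogger's post and your post.
source_url = "http://otherblog.example/2009/01/great-read/"
target_url = "http://yourblog.example/2009/01/original-post/"

# Build the XML-RPC request body for pingback.ping(sourceURI, targetURI).
payload = xmlrpc.client.dumps((source_url, target_url), methodname="pingback.ping")

print(payload)
```

The blogging platform that receives this call decides what to do with it, which is where the pingback settings in your blog come in.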

In WordPress, you must have pingbacks enabled for your blog posts in order for WordPress to receive them. Pingbacks can appear with comments or, if the blog’s theme separates comments from pingbacks, they can appear separately. For example, my blog’s theme separates comments and pingbacks under different “tabs.”

Pingbacks look different, too. Instead of including a blogger’s name and comment, they include the name of the post that links to your post and a short excerpt surrounded by [...] characters. Here’s what a pingback looks like on a post in this blog:

Pingback Example

Comments, Pingbacks, and Reader Participation

It’s pretty easy to see how comments encourage reader participation. Comments give readers an opportunity to add or respond to your post. If enough readers comment and you respond, a conversation gets started. Sometimes that conversation can have more value than your original post.

For example, one of the most popular posts on this site is about a change in iTunes that affected how podcasts play back on an iPod. I identified the problem and created a workaround. A bunch of readers commented. One of the readers commented by sharing an AppleScript he’d written to automate my workaround. Another reader fine-tuned that script so it ran more efficiently. To this day, I use that script as my workaround. You can see the post and read the comments here.

Pingbacks also encourage reader participation, but in a less direct way. Suppose you read this post and think that your readers might benefit from it. You write a post on your blog that refers to it and adds your own comments. When you link to this post from your blog, a link to your post appears on this post. So readers reading comments here can go to your post to see what you’ve written about this topic.

Unfortunately, not everyone uses comments and pingbacks as they’re intended. The result is comment and pingback spam. I’ll discuss those in the next post of this series.

Learn More

Learn more about working with a self-hosted WordPress 2.7 installation — or WordPress.com. Check out my WordPress courses on Lynda.com.

All Pingbacks Must Die

I’ve had my last pingback spam.

Anyone who has a blog knows that the comment feature is what makes a blog stand out from a plain old Web site. The comment feature is what makes a blog interactive, it’s what gives readers a chance to share their point of view or additional information about a topic. It gives them a chance to ask questions and get answers.

The comment feature works with the pingback feature. Pingbacks (which are often referred to as trackbacks) are machine-generated “comments” that are added to a post when another blogger writes a post that links to it.

Huh?

Okay, think of it this way. You’re blogger A writing post 1. Blogger B writes post 2 that includes a link to post 1. A comment appears on post 1 that links back to post 2. This is all done automatically in WordPress (my blogging platform of choice) if, and this is a big if, you left the Allow Pings option turned on for post 1. You can find the setting for this in the Discussion area of the Write Post administration panel.

Unfortunately, the pingback feature also makes it possible for sploggers to get free links to their sites. A splogger builds content on a blog by stealing it from RSS feeds. Their goal is usually to get hits on their Web sites, which are full of Google AdSense ads, but they sometimes are part of a “link farm” that boosts search engine ranking.

The problem lately is that my sites have been attracting more pingback spam from splogging sites than real pings from legitimate sites and bloggers. These must be manually deleted, since my spam prevention software doesn’t seem able to catch them all. And frankly, I’m a little sick of spending each morning deleting six to twenty of these comments.

So I’m going to stop writing posts with the pingback feature enabled.

And if you’re having this problem on your blog, I recommend that you do the same.

Deleting Spam from Your WordPress Blog

Marking it as spam isn’t enough to get rid of it.

One of the things I like about WordPress is that it’s impossible to know everything about it. And today I learned something new.

I learned that the spam comments that I marked as spam had not been deleted from my WordPress database. They were just marked as spam so they wouldn’t appear in posts.

How did I discover this? I had to export all blog posts from An Eclectic Mind to a special WordPress-compatible XML file that contained all blog posts and comments. I had to weed out all the posts and comments I didn’t want to import into my new Maria’s Guides site. And that’s when I found all the nasty spam I’d marked for the past 4 years.

Now don’t think this was all of the spam. It was only the spam that was marked as spam using WordPress’s comment moderation feature. When the comment spam situation got out of control, I enlisted the help of the Bad Behavior and Spam Karma 2 plugins. Bad Behavior prevents potential spambots from posting comments at all. Spam Karma catches 95% of the spam that gets past Bad Behavior. I’m left with fewer than 10 spam comments a day. Not bad when you consider that Bad Behavior alone caught 17,067 spam attempts in the past seven days. The way I see it, anyone with a relatively well-Googled blog who doesn’t use at least one of these tools is doing a lot more comment moderation than they need to.

So there I was, halfway through the process of deleting non-book-related posts and their comments from an XML file, when I realized that much of the file’s contents was spam that wouldn’t appear when I imported it anyway. And that’s when I started thinking about how much database space was devoted to this spam.

The DB-Manager Plugin

I use Lester ‘GaMerZ’ Chan’s DB-Manager plugin. This plugin puts MySQL database features into the WordPress administration panel. This is a must-use for anyone who needs to get into their database and learn more about it or make changes to it.

So I went into the plugin’s interface and learned that my blog had 1900+ comments. I knew that only 1400+ comments were actually appearing in the blog. That made 500+ spam entries sitting in my database, taking up disk space and making my backups much larger than they needed to be.

(Note: The screenshot here shows the database contents after removing the spam. If I’d known I was going to write about it here, I would have taken more screenshots.)

I wanted them out.

Help on the WordPress Forums

I found help on the WordPress forums. They really can be helpful if you enter the right search phrase.

The topic was Support › deleting over 10,000 spam comments without using moderation page. The story was, this poor soul had left his blog alone for a week and, when he returned, found 10,000 comments on it. He wanted to delete them.

A member named bindanaku came to his rescue with a MySQL query:

DELETE FROM wp_comments WHERE comment_approved='0'

This assumes that you want to delete all comments that haven’t been moderated. This was not the case for me. I wanted to delete all comments that had been moderated as spam. I assumed that the correct query for my situation would be:

DELETE FROM wp_comments WHERE comment_approved='spam'

I was right.
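If you'd like to see what that query does before pointing it at a live database, here's a Python sketch that runs it against a throwaway copy. It uses an in-memory SQLite database as a stand-in for MySQL, with a cut-down wp_comments table and made-up rows, so it's only a rehearsal of the idea. Counting comments by status first shows what you're about to remove.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table is simplified.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_comments ("
    "comment_ID INTEGER PRIMARY KEY, comment_content TEXT, comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO wp_comments (comment_content, comment_approved) VALUES (?, ?)",
    [
        ("Great post!", "1"),             # approved
        ("Waiting for review", "0"),      # not yet moderated
        ("Cheap ringtones here", "spam"),
        ("Online casino links", "spam"),
    ],
)


def count_by_status(connection):
    """Return a {comment_approved: count} mapping."""
    rows = connection.execute(
        "SELECT comment_approved, COUNT(*) FROM wp_comments GROUP BY comment_approved"
    )
    return dict(rows.fetchall())


before = count_by_status(conn)
deleted = conn.execute(
    "DELETE FROM wp_comments WHERE comment_approved = 'spam'"
).rowcount
conn.commit()
after = count_by_status(conn)

print(before, deleted, after)
```

On a real blog, back up the database before running any DELETE query. There's no undo.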

Back to DB-Manager

I went to the DB-Manager administration panel and clicked the Run SQL Query button. That gave me a window where I could enter my query, as shown here. When I clicked Run, I got a message that the query was successful.

Sure enough, when I checked the Database info (see previous screenshot), I could see that 500+ comments had been removed from the database. But the table size was the same.

I used DB-Manager’s Optimize DB feature to optimize the database. That dropped about 400K from the table size.

I should note here that if you’re more familiar with editing a MySQL database, you can do the query with your normal editing tool. I don’t mess with my MySQL database much. I’m always afraid of screwing it up. (Call me a wimp — I don’t care.) That’s why I use DB-Manager.

Conclusion

While all this might seem like a lot of work to get rid of 400K of file size, the situation could be worse on your blog. My blog has about 1500 posts spanning about four years. I’ve been using Bad Behavior and Spam Karma for at least two of those years. So the majority of these old spams were from very old posts. If you don’t use any spam protection software and are manually moderating comments, you could have far more of these spam comments in your database. And since many of them were lengthy listings of porn and ringtone and other URLs, they were quite large in size. If you have a lot of these in your database, it could be taking up a lot of space — perhaps even more than your actual blog posts.

Do I recommend going through this process? It’s up to you.

Why WordPress.com is Virtually Spam Free

A great article on Plagiarism Today.

As those of you who read this site regularly should know, I’ve been pretty POed about the blog spam and splogging situation. I subscribed to the Plagiarism Today feed because of its excellent articles about copyright and the fight against feed scraping by sploggers.

Today’s article about WordPress.com was an especially good read. From Why WordPress.com is Virtually Spam Free on PlagiarismToday:

It seems as if nearly every major free blog hosting service has been either overrun or nearly overrun with spam. However, one services stands alone, a relative oasis of spam cleanliness, Automattic’s WordPress.com. Despite being just as free as its competitors and placing few restrictions on registration, WordPress.com has not endured the spam avalanche that other services have.

The article author, Jonathan Bailey, interviewed WordPress founder Matthew Mullenweg to learn why WordPress.com is so spam-free. The article is enlightening and highly recommended.

More Bad Behavior

I update some software to help keep spammers off the site — and preserve my bandwidth.

Miraz introduced me to the Bad Behavior WordPress plugin some time ago, and after ascertaining that it did indeed work with a GoDaddy.com hosting account (my hosting ISP), I installed it on all of my WordPress-based sites. What I saw was an immediate reduction in the amount of spam that Spam Karma was catching. That wasn’t because it made Spam Karma less effective; it was because less spam was actually accepted by WordPress for moderation. I can verify this by checking the Bad Behavior stats — it catches roughly 7,000 potential spam hits a week on just one of my sites. That means my server doesn’t have to work so hard and, as a result, it can be more responsive to visitors.

One of the drawbacks to hosting multiple sites on a budget is the limit my ISP imposes for my level of hosting. I’m allowed 100 concurrent hits, shared by all sites on my hosting account. I have two very busy sites online and I think they sometimes fight with each other for bandwidth. This wasn’t a problem until the other day, when I started getting Error 503 messages (server busy) while trying to access my sites.

I investigated and discovered that at the time I was trying to view my site, Spam Karma had caught roughly 200 spam messages in the span of 3 minutes. No wonder my site was busy. Spam Karma was fighting off spammers. But what the heck was Bad Behavior doing? Sleeping on the job?

I went to the Bad Behavior Web site and noticed an update that should resolve things. More spammer-stopping power. I downloaded and installed it. If the Error 503 messages become less common and Spam Karma catches less spam, I’ll know it’s doing its job.

The point is this (yes, there is a point): if you have a WordPress blog that allows comments, having spam protection is more than just preventing your site from being filled up with spam comments. It’s protecting your bandwidth. And for that, Bad Behavior seems like a good solution. Just be sure that you have the latest version.

And, if you find your spam prevention software helpful, be sure to send a few euros to the developers to keep them interested in keeping the software up-to-date.

December 14 Update: I just did some more research over at Lunacy Unleashed, the Web site of Michael Hampton, the developer of Bad Behavior. His article, “Spam Surge,” seems to corroborate what I’ve been experiencing. Apparently, the spam surge also affects e-mail accounts. (My e-mail spam increased considerably about a month ago but has since tapered off to manageable levels.)

One Way to Protect Your E-mail Address from Spammers

Don’t put it on a Web site!

Spammers are nasty, sneaky, conniving people. They use every tool at their disposal to gather e-mail addresses to spam.

Among the tools in their arsenal are spambots — programs that crawl the web and gather anything that looks like an e-mail address, whether it’s in text or part of a mailto tag. Like this: me@spamsucks.com. Or this: Get Info. (I just made those addresses up. Let’s hope they’re not used by anyone, because they’re sure to be spammed.)
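To make that concrete, here is a rough Python sketch of the kind of pattern matching a harvester might do. It is only an illustration: the regular expression is deliberately simple, the function name is made up, and the second address (info@spamsucks.com) is a hypothetical stand-in for whatever a mailto link might contain. Real spambots are presumably more thorough.

```python
import re

# Deliberately simple pattern for "anything that looks like an e-mail address".
# This is an assumption for illustration, not the pattern any real spambot uses.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_addresses(html: str) -> list[str]:
    """Return the unique address-like strings in a page, in order of first appearance."""
    found = []
    for match in EMAIL_PATTERN.findall(html):
        if match not in found:
            found.append(match)
    return found


# A made-up page with one address in plain text and one inside a mailto link.
page = """
<p>Write to me@spamsucks.com for details.</p>
<p><a href="mailto:info@spamsucks.com">Get Info</a></p>
"""

print(harvest_addresses(page))
```

Both addresses are picked up, whether they sit in visible text or inside the link, which is why simply hiding the address behind link text like "Get Info" offers no protection.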

So here’s a tip: if you have a Web site, or your company has a Web site, do not put your e-mail address anywhere on it. Doing so will likely get your e-mail address on spam lists. The amount of spam you get will grow exponentially over time, forcing you to spend more time weeding out the spam you receive than actually reading the legitimate messages.

How then, you ask, can people contact you?

My preferred method is with a contact form, like the one used on this site. Here are some painless ways to install a contact form; one of them should work for you:

  • WordPress users can use the WP Contact Form plugin by Ryan Duff to create a quick-and-dirty contact form. Miraz and I discuss this plugin in our book, WordPress 2: Visual QuickStart Guide.
  • Web sites on an Apache-compatible server with PHP installed can use MindPalette’s NateMail, a free contact form that works with PHP. I used this for a while — until I switched to WP Contact Form — and liked it. One of its best features is the ability to use a pop-up menu that lists various people to be contacted. Because the e-mail addresses are not in the Web page, they are protected from spambots.
  • Your ISP may offer a form tool as part of its services. GoDaddy.com, for example, offers a form mail feature as part of its hosting packages. You create an HTML form on your Web page, include the proper POST command, and GoDaddy sends the form content to the e-mail address you specified in the configuration page.

Keep in mind that even contact forms are not capable of keeping out all spam. Some spambots are designed to look for forms and automatically fill them out with spam messages. One way to cut down on this is with a CAPTCHA feature in the form software. This forces a user to enter the text characters that appear in a graphic image as part of the form. Most (but sadly, not all) spambots are foiled by this additional step, since they can’t interpret the graphic.

Another method for fooling spambots is to encode your e-mail address so it can’t easily be read by the spambot but can be read by humans or Web browsers. This can be something as simple as me at spamsucks dot com or as complex as using special obfuscation software to do the encoding. Personally, I prefer the forms. I’m sure there must be a spambot out there that can read encoded e-mail addresses. But if this is the only option, go with it.
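The simple "at" and "dot" approach can be sketched in a few lines of Python. The function name and the test pattern below are my own inventions for illustration. The pattern stands in for a naive harvester, and a smarter one could easily be written to undo this trick.

```python
import re


def obfuscate_address(address: str) -> str:
    """Rewrite user@host.tld as 'user at host dot tld' for display on a page."""
    local, _, domain = address.partition("@")
    # Only the dots in the domain are spelled out; the part before the @ is left alone.
    return f"{local} at {domain.replace('.', ' dot ')}"


# Stand-in for a naive harvester's pattern (an assumption, not a real spambot's).
SIMPLE_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

readable = obfuscate_address("me@spamsucks.com")
print(readable)                          # me at spamsucks dot com
print(SIMPLE_PATTERN.findall(readable))  # []
```

A person can still read the result, but the simple pattern no longer finds anything to collect.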

Just get your e-mail address off your sites now. Every minute you waste can lead to more time wasted sorting through spam.

SpamSieve

A spam filter plugin for Mac users.

I don’t know about you, but I’ve been getting a TON of spam e-mail lately — much of it to my .mac e-mail account. Most of it falls into one of a few categories:

  • Stock “recommendations.”
  • Award announcements.
  • Letters from widows or businessmen in Nigeria.
  • Offers for sex-enhancing drugs and devices.
  • Porn sites.

Neither my e-mail server nor my Mail application (Apple Mail) seems able to weed out this crap. So it ends up in my In box for me to manually delete. What a pain in the butt.

Enter SpamSieve. This Mac OS plugin, which works with most common e-mail clients — Mail, Entourage, Eudora, and others — uses Bayesian filtering to identify and weed out spam. You train it, from within your e-mail program, to know what’s spam and what’s not. It works with your Address Book so it won’t mark a message from your mother (or editor or boss) as spam. And it maintains a database of internal rules that help it identify spam — rules you have easy access to and can change at will.
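For the curious, here is a toy Python sketch of the general idea behind Bayesian filtering: count the words in messages you have marked as spam or good, then score a new message by which set of counts explains it better. This is not SpamSieve’s code or its actual algorithm. The class, its method names, and the sample messages are all invented for illustration.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes classifier over lowercase words (illustration only)."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.message_counts = {"spam": 0, "good": 0}

    def train(self, text: str, label: str) -> None:
        """Record the words of one message marked as 'spam' or 'good'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words: list[str], label: str) -> float:
        # Assumes at least one training message exists for each label.
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        vocabulary = len(set(self.word_counts["spam"]) | set(self.word_counts["good"]))
        # Prior: the share of training messages that carry this label.
        score = math.log(self.message_counts[label] / sum(self.message_counts.values()))
        for word in words:
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + vocabulary))
        return score

    def is_spam(self, text: str) -> bool:
        """Return True when the spam word counts explain the message better."""
        words = text.lower().split()
        return self._log_score(words, "spam") > self._log_score(words, "good")


spam_filter = TinyBayesFilter()
spam_filter.train("hot stock tip buy now", "spam")
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("lunch with your editor tomorrow", "good")
spam_filter.train("chapter draft attached for review", "good")

print(spam_filter.is_spam("buy cheap stock now"))            # True
print(spam_filter.is_spam("draft chapter for your editor"))  # False
```

The more messages you train it on, the better the word counts reflect your own mail, which is why training matters so much with this kind of filter.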

[Screenshot: SpamSieve in Mail’s Message menu]

I installed SpamSieve on my PowerBook yesterday and it immediately began working, even before I had a chance to start training it! Installation isn’t difficult, but you’ll need to follow the instructions in the PDF manual that comes with SpamSieve to get it right. If everything is set up properly, SpamSieve training commands will appear in your application’s menus, as shown here. Then, every time you launch your e-mail client, SpamSieve also opens. It works in the background, moving messages identified as spam to a special Spam mailbox or folder, allowing you to train it to recognize spam or tell it that a message it thinks is spam really isn’t. When you’re finished working with e-mail and Quit, SpamSieve automatically quits, too.

[Screenshot: SpamSieve Statistics]

Interested in seeing how SpamSieve is doing? You can check out its statistics. Here’s what it looks like on my PowerBook; keep in mind that I check a limited number of e-mail accounts from this computer so it hasn’t had much to work with. The percent accuracy should get higher as I continue training SpamSieve; the manual recommends that you train until it has processed 1,000 messages for the best results.

Now, instead of dreading mail collections, I look forward to them. I’m always curious to see how SpamSieve does. So far, I haven’t been disappointed. It’s doing a better job than my ISP and Mail’s junk filter.

SpamSieve is shareware and costs $30. There’s a 30-day trial period and the software is fully functional during this time. You can set it up and give it a good tryout for a few weeks before making the small investment in its purchase.

If you do try it, take a moment to stop back here and share your comments about it. I think other readers might benefit from more opinions than just mine.

Reducing Database Queries

I install Bad Behavior and get another tip from its author.

I never quite understood why I would use an excellent and powerful spam prevention tool like Spam Karma with another spam prevention tool called Bad Behavior. I’d been using Spam Karma on its own since I started moving my sites to WordPress last January and it was doing a kick-butt job of keeping my site comment-spam free — over 50,000 individual comment spams caught in the past eight months. Only about a dozen comment spams slipped through during the same period, and they were easy enough to delete manually. I didn’t think I needed anything else.

But Miraz also uses Bad Behavior on her blogs, so I figured there must be something about it that I was missing out on.

Then I moved all my WordPress-based sites to GoDaddy.com. That’s when I was introduced to GoDaddy’s limitation on concurrent database hits: 50 per MySQL database. My introduction came as a WordPress error screen that said my database could not be accessed when I attempted to load a page on aneclecticmind.com. Another try moments later and it worked. I figured that aneclecticmind.com and wickenburg-az.com, both of which get over 1,000 visits a week, were most likely to hit that limitation. This was a problem.

I use a plugin called WP-UserOnline to monitor current access to the site. The software displays registered users (normally just me), guests, and bots from legitimate systems like Google, Technorati, and MSN (for example). It also tells me the maximum number of users online at once: 50 on aneclecticmind.com just a week or so ago. But it doesn’t tell me about the spam bots — automated programs that go online for the sole purpose of filling my posts with comment spam. Spam Karma stops the spam, but not before the spam bot gets online and tries to post it. With literally hundreds of spam messages caught and discarded every day by Spam Karma, it was likely that spam bots were accessing my database along with users, filling up my 50 access spots.

When my sites were running on my own computer, there was no limit to the number of concurrent hits to my MySQL databases. So I didn’t really care about spam bots. But now I had a limit and was starting to care very much.

So I looked at Bad Behavior again. It claimed to prevent spam bots from even getting online at my site. Surely that would prevent database hits. It was worth a try.

I installed Bad Behavior per the online instructions. Then I read the Read Me file. It said it didn’t work with GoDaddy.com. I wondered about that. This was a new version (2.0.6) and that Read Me file might not be up to date. So I decided to risk it to see what would happen.

What happened is that it worked. In less than 24 hours, in fact, it has stopped more than 200 spam attacks on just one site.

I wrote to the author of the software, Michael Hampton, certain that I was missing something. Perhaps Bad Behavior and GoDaddy’s settings were fighting under the hood, causing database corruption, etc. But he wrote back to say that GoDaddy.com had probably fixed the settings problem that was causing the incompatibility. It worked because the problem was gone.

Great!

I used the Donation link in the Bad Behavior Administration panel to send the author a few bucks. (Lunch on me, is usually how I phrase it.) In the comments area, I mentioned how well it was working and my concerns about database hits. Michael responded with even more useful information:

Bad Behavior runs as soon as possible during WordPress’s page load, which cuts database access to a minimum. There shouldn’t be more than a couple of queries run by the time Bad Behavior loads.

Another thing you can do to reduce database queries is to enable the WordPress object cache, which is disabled by default. You can do this by inserting this line somewhere in the middle of your wp-config.php file:

define('WP_CACHE', true);

Then WP will cache some common database queries in files on disk, where they won’t generate future database accesses (unless the data changes, of course).

Thanks for your support!

Thank you, Michael, for this useful tidbit!

I’ll institute that today and see how it works. In the meantime, I’m pleased at the performance of the Bad Behavior/Spam Karma pair. And now I understand why Miraz also uses both.

Footnote (added September 21, 2006): I added the above code to my wp-config.php file and got all kinds of error messages. I suspect that it’s because permissions aren’t set up correctly in my wordpress folder. If you try it and everything goes bad, simply remove the inserted line and resave the wp-config.php file. I’ll provide an update if I have time to troubleshoot the problem. In the meantime, Bad Behavior has blocked over 500 access attempts in just a few days. Not bad, huh?

E-Mail Addresses on Web Sites

Why you shouldn’t include a link to your email address on your Web site.

Many people — including me! — use their Web sites as a kind of global calling card, a way to share information about themselves or their companies with others all over the world. It’s common to want to share your contact information with site visitors — particularly potential customers — so they can contact you. This is often done through the use of a mailto tag. For example, <a href="mailto:me@domain.com">email me!</a> which appears as a clickable email link.

Unfortunately there are people out there who want your email address, people who want to scam you into sending money to Nigeria, advertise their online casinos, sell you prescription drugs, show you their porn sites — the list goes on and on. If you have your email address on any Web site, you probably already get a lot of this spam. That’s because of computer programs that crawl through Web sites and harvest email addresses that are included in the otherwise innocent mailto tag. Heck, they even harvest addresses that aren’t part of a mailto tag, so just including your email address on a Web page without a link can get you on a bulk email list.

So what’s the solution? There are a few.

One popular and easy-to-implement solution is to turn your email address into a text phrase that a site visitor must see and manually type in to use. For example, me@domain.com becomes me at domain dot com or meATdomainDOTcom. You get the idea. Someone who wanted to send you an email message would be able to figure that out (if he couldn’t, he really shouldn’t be surfing the ‘Net anyway) and manually enter the correct translation in his email program. But email harvesters supposedly can’t figure this out (which I find hard to believe) so the email address isn’t harvested.

Another solution is to use an email obfuscation program. These programs take email addresses and change or insert characters to make them impossible for harvesting programs to read. The email addresses still look okay on the site to a person viewing them and work fine in a mailto link when used from the Web site. WordPress plugins are available to do this. I don’t use any of them, so I can’t comment on how well they work. But they must be at least a little helpful if they’re available. You can find a few here, on the WordPress Codex.
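One common obfuscation technique is to write every character of the address as a numeric HTML entity. A browser decodes the entities, so the link still works, but the page source no longer contains a literal @ sign. The Python sketch below shows the idea. I can’t say whether any particular WordPress plugin works this way, and the function names are my own.

```python
import html


def encode_as_entities(text: str) -> str:
    """Write each character as a decimal HTML entity, for example '@' as '&#64;'."""
    return "".join(f"&#{ord(char)};" for char in text)


def mailto_link(address: str, label: str) -> str:
    """Build a mailto link whose href contains only entity-encoded characters."""
    encoded_href = encode_as_entities("mailto:" + address)
    return f'<a href="{encoded_href}">{html.escape(label)}</a>'


link = mailto_link("me@domain.com", "email me!")
print(link)
print("@" in link)                                          # False
print(html.unescape(encode_as_entities("me@domain.com")))   # me@domain.com
```

The last line shows the weakness: anything that decodes entities the way a browser does gets the address right back, so this only stops harvesters that scan the raw page source.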

The solution I use is form-based email. I created a Contact Form with fields for the site visitor to fill out. When the form is submitted, a program processes it and sends it to my email address. Because that address is not on the Web page that includes the form — or on any other Web page, for that matter — email harvesters cannot see it. As a result, I’m able to provide a means of contacting me via email that keeps my email address safe from spammers.

The program I use is called NateMail from MindPalette Software. It’s a free PHP tool that’s easy to install and configure. But what I like best about it is that you can set it up with multiple email addresses. Use a corresponding drop-down list in your form to allow the site visitor to choose the person the email should go to. NateMail directs the message to the correct person. You can see this in action on my other WordPress-based site, wickenburg-az.com, in its Contact Form. If you want a few more features, such as the ability to attach files to an email message, MindPalette offers ProcessForm for only $15.
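The drop-down approach works because the form only submits a short key, while the real addresses live on the server. Here is a minimal Python sketch of that idea. It is not NateMail’s code (NateMail is written in PHP), and the recipient keys, addresses, and form field names are all hypothetical. It builds the message but leaves out the step of actually sending it.

```python
from email.message import EmailMessage

# Hypothetical recipients. The keys are what the form's drop-down submits;
# the addresses never appear in the Web page.
RECIPIENTS = {
    "webmaster": "webmaster@example.com",
    "sales": "sales@example.com",
}


def build_message(form: dict[str, str]) -> EmailMessage:
    """Turn submitted form fields into an e-mail addressed to the chosen recipient."""
    recipient_key = form.get("recipient", "")
    if recipient_key not in RECIPIENTS:
        # Refuse anything that is not one of the keys offered in the drop-down.
        raise ValueError(f"unknown recipient: {recipient_key!r}")
    message = EmailMessage()
    message["To"] = RECIPIENTS[recipient_key]
    message["From"] = "contact-form@example.com"  # assumed sender address
    message["Reply-To"] = form["email"]
    message["Subject"] = f"Contact form message from {form['name']}"
    message.set_content(form["message"])
    return message


submitted = {
    "recipient": "sales",
    "name": "Pat",
    "email": "pat@example.net",
    "message": "Do you offer tours?",
}
message = build_message(submitted)
print(message["To"])
```

Because the lookup happens on the server, a harvester reading the page sees only the word "sales" in the drop-down, never the address behind it.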

Other WordPress users are likely to have their own favorite methods of protecting their email addresses from spammers. With luck, a few of them who read this will share their thoughts in the Comments for this post.

One more thing…this doesn’t just apply to WordPress-based sites. It applies to all Web sites. And a contact form tool like NateMail will work with any PHP-compatible Web server.

If you’re already getting spam, using one of these methods won’t stop it. It’ll just keep the situation from getting much worse. Your best bet is to change your email address and protect the new one. In my case, that’s a big pain in the butt — so many people I need to be in touch with have my email address and, worse yet, I often use it as a login for Web sites I visit (which does indeed make the spam situation worse). I’m working on a plan to phase out the bad addresses and replace them with ones that I protect. Until then, I have to rely on the spam-catching features of my ISP and my email software to sort out the bad stuff — currently about 20-40 messages a day — so I don’t have to.