Marking it as spam isn’t enough to get rid of it.
One of the things I like about WordPress is that it’s impossible to know everything about it. And today I learned something new.
I learned that the spam comments that I marked as spam had not been deleted from my WordPress database. They were just marked as spam so they wouldn’t appear in posts.
How did I discover this? I had to export all blog posts from An Eclectic Mind to a special WordPress-compatible XML file that contained all blog posts and comments. I had to weed out all the posts and comments I didn’t want to import into my new Maria’s Guides site. And that’s when I found all the nasty spam I’d marked for the past 4 years.
Now don’t think this was all of the spam. It was only the spam that was marked as spam using WordPress’s comment moderation feature. When the comment spam situation got out of control, I enlisted the help of the Bad Behavior and Spam Karma 2 plugins. Bad Behavior prevents potential spambots from posting comments at all. Spam Karma catches 95% of the spam that gets past Bad Behavior. I’m left with fewer than 10 spam comments a day. Not bad when you consider that Bad Behavior alone caught 17,067 spam attempts in the past seven days. The way I see it, anyone with a relatively well-Googled blog who doesn’t use at least one of these tools is doing a lot more comment moderation than they need to.
So there I was, halfway through the process of deleting non-book-related posts and their comments from an XML file, when I realized that much of the file’s contents was spam that wouldn’t appear when I imported it anyway. And that’s when I started thinking about how much database space was devoted to this spam.
The DB-Manager Plugin
I use Lester ‘GaMerZ’ Chan’s DB-Manager plugin. This plugin puts MySQL database features into the WordPress administration panel. This is a must-use for anyone who needs to get into their database and learn more about it or make changes to it.
So I went into the plugin’s interface and learned that my blog had 1900+ comments. I knew that only 1400+ comments were actually appearing in the blog. That made 500+ spam entries sitting in my database, taking up disk space and making my backups much larger than they needed to be.
(Note: The screenshot here shows the database contents after removing the spam. If I’d known I was going to write about it here, I would have taken more screenshots.)
I wanted them out.
Help on the WordPress Forums
I found help on the WordPress forums. They really can be helpful if you enter the right search phrase.
The topic was Support › deleting over 10,000 spam comments without using moderation page. The story was, this poor soul had left his blog alone for a week and, when he returned, found 10,000 comments on it. He wanted to delete them.
A member named bindanaku came to his rescue with a MySQL query:
DELETE FROM wp_comments WHERE comment_approved='0'
This assumes that you want to delete all comments that haven’t been moderated. This was not the case for me. I wanted to delete all comments that had been moderated as spam. I assumed that the correct query for my situation would be:
DELETE FROM wp_comments WHERE comment_approved='spam'
I was right.
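If you'd rather see what a query like this does before running it on a live blog, here's a minimal sketch that tries it on a throwaway table. It makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the table has only three of the real wp_comments columns, the sample rows are made up, and count_by_status is a helper name invented for this example. It counts the rows in each comment_approved status, runs the delete, and counts again.

```python
import sqlite3

# Throwaway in-memory database. SQLite stands in for MySQL here (an assumption);
# nothing in this sketch touches a real WordPress database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_comments ("
    "comment_ID INTEGER PRIMARY KEY, "
    "comment_content TEXT, "
    "comment_approved TEXT)"
)

# Made-up sample rows: '1' = approved, '0' = awaiting moderation, 'spam' = marked as spam.
sample_rows = [
    ("Great post!", "1"),
    ("Thanks for the tip.", "1"),
    ("Is this still current?", "0"),
    ("cheap ringtones http://example.com", "spam"),
    ("more ringtones http://example.com", "spam"),
]
conn.executemany(
    "INSERT INTO wp_comments (comment_content, comment_approved) VALUES (?, ?)",
    sample_rows,
)
conn.commit()


def count_by_status(connection):
    """Return a {status: row count} dict for the wp_comments table."""
    cursor = connection.execute(
        "SELECT comment_approved, COUNT(*) FROM wp_comments GROUP BY comment_approved"
    )
    return dict(cursor.fetchall())


before = count_by_status(conn)

# The same DELETE statement as above, run against the stand-in table.
cursor = conn.execute("DELETE FROM wp_comments WHERE comment_approved='spam'")
deleted = cursor.rowcount  # number of rows the DELETE removed
conn.commit()

after = count_by_status(conn)

print("before:", before)
print("deleted:", deleted)
print("after:", after)
```

Running the GROUP BY query first is the useful habit here: it shows how many rows each status holds, so you know roughly what a delete will remove before you commit to it.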
Back to DB-Manager
I went to the DB-Manager administration panel and clicked the Run SQL Query button. That gave me a window where I could enter my query, as shown here. When I clicked Run, I got a message that the query was successful.
Sure enough, when I checked the Database info (see previous screenshot), I could see that 500+ comments had been removed from the database. But the table size was the same.
I used DB-Manager’s Optimize DB feature to optimize the database. That dropped about 400K from the table size.
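The "rows gone, size unchanged" behavior is common to many database engines, not just MySQL. The sketch below reproduces it with Python's built-in sqlite3 module, which is only an analogy: SQLite's VACUUM command plays the role that the Optimize DB feature plays here, and the file name, table layout, and spam text are all invented for the example.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    db_path = os.path.join(workdir, "blog_copy.db")  # hypothetical file name
    conn = sqlite3.connect(db_path)

    # Make sure freed pages are NOT returned to the filesystem automatically.
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute(
        "CREATE TABLE wp_comments (comment_content TEXT, comment_approved TEXT)"
    )

    # 500 made-up spam comments, each a long run of URLs, plus one real comment.
    spam_body = "http://example.com/ringtones " * 200
    conn.executemany(
        "INSERT INTO wp_comments (comment_content, comment_approved) VALUES (?, ?)",
        [(spam_body, "spam")] * 500,
    )
    conn.execute(
        "INSERT INTO wp_comments (comment_content, comment_approved) VALUES (?, ?)",
        ("Nice post", "1"),
    )
    conn.commit()
    size_full = os.path.getsize(db_path)

    # Delete the spam rows. The pages they used are freed but stay in the file.
    conn.execute("DELETE FROM wp_comments WHERE comment_approved='spam'")
    conn.commit()
    size_after_delete = os.path.getsize(db_path)

    # Rebuild the file so the freed space is actually given back.
    conn.execute("VACUUM")
    size_after_vacuum = os.path.getsize(db_path)

    remaining = conn.execute("SELECT COUNT(*) FROM wp_comments").fetchone()[0]
    conn.close()

print("with spam:     ", size_full, "bytes")
print("after DELETE:  ", size_after_delete, "bytes")
print("after VACUUM:  ", size_after_vacuum, "bytes")
print("rows remaining:", remaining)
```

The delete only marks space as reusable. It takes a separate rebuild step to shrink the file on disk, which is why the table size didn't move until the optimize was run.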
I should note here that if you’re more familiar with editing a MySQL database, you can do the query with your normal editing tool. I don’t mess with my MySQL database much. I’m always afraid of screwing it up. (Call me a wimp — I don’t care.) That’s why I use DB-Manager.
Conclusion
While all this might seem like a lot of work to get rid of 400K of file size, the situation could be worse on your blog. My blog has about 1500 posts spanning about four years. I’ve been using Bad Behavior and Spam Karma for at least two of those years, so the majority of these old spam comments were on very old posts. If you don’t use any spam protection software and are manually moderating comments, you could have far more of these spam comments in your database. And since many of mine were lengthy listings of porn, ringtone, and other URLs, they were quite large. If you have a lot of these in your database, they could be taking up a lot of space, perhaps even more than your actual blog posts.
Do I recommend going through this process? It’s up to you.

1 response so far ↓
1 Rog // Nov 28, 2007 at 1:16 am
Thanks for this. I’ve been planning on migrating my Wordpress blog but I was wondering if there was a bunch of stored spam that would get in the way.
Sure enough: 5702 rows affected. Crazy.