Lunacy Unleashed

Notes from the field in the War on Spam

WordPress 1.6 Bug Hunt

Waiting for WordPress 1.6? I’m afraid you’ll have to wait a little longer, but the wait promises to grow much shorter after this weekend.

This Saturday, November 5, will be the WordPress 1.6 Bug Hunt. To participate, just install a WordPress 1.6 blog somewhere from the current Subversion repository, and then join the #wordpress-bugs channel on IRC (irc.freenode.net).

Skippy says, “All you need to bring is a text editor, and an installation of WordPress 1.6-ALPHA-2-still-dont-use! We’ll provide the snacks, and manage the schedule.”

During the bug hunt, we’ll all go through the outstanding bugs, work up patches, and get this puppy ready. If you know some PHP and want to get more familiar with WordPress internals, now’s your chance.

November 1, 2005 Posted by | WordPress, WordPress 1.6 | 1 Comment

Automattic Spam Stopper

Recently, Matt Mullenweg, creator of WordPress, had a bright idea on how to stop blog spam. He wrote up some code, distributed his new WordPress plugin to a small group of testers, and so was born the so-called Automattic Spam Stopper, or ASS.

I was able to obtain a copy of Automattic Spam Stopper for review and made a quite disturbing discovery, namely, how it works.

Whenever a user posts a comment on your WordPress blog, ASS forwards a copy of the entire comment, its metadata (username, email address and URI), your blog address, and your Web server environment variables to a central server for analysis. The server then returns the response “true” if the comment is judged to be spam.
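To make that exchange concrete, here is a minimal sketch of what the client side of such a check might look like. This is not the plugin's actual code or wire format: the field names, the `env_` prefix, and both function names are my own assumptions for illustration. The only detail taken from the plugin's behavior is that a response of “true” means spam.

```python
from urllib.parse import urlencode


def build_payload(comment, blog_url, server_env):
    """Build a form-encoded request body for a central spam check.

    All field names here are hypothetical, chosen for illustration only.
    """
    fields = {
        "blog": blog_url,
        "comment_author": comment["author"],
        "comment_author_email": comment["email"],
        "comment_author_url": comment["url"],
        "comment_content": comment["content"],
    }
    # The Web server environment variables travel along with the comment.
    for name, value in server_env.items():
        fields["env_" + name] = value
    return urlencode(fields)


def is_spam(response_body):
    """Treat a response of "true" as a spam verdict, anything else as not spam."""
    return response_body.strip().lower() == "true"


payload = build_payload(
    {
        "author": "Alice",
        "email": "alice@example.com",
        "url": "http://example.org/",
        "content": "Great post!",
    },
    "http://example.com/",
    {"REMOTE_ADDR": "192.0.2.1"},
)
```

Note how much ends up in that one payload: the full comment text, the commenter's email address, and details about your server.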

Mullenweg isn’t saying what the “secret sauce” is for the server, so as to frustrate the spammers. “By the time we’re done spammers around the world will quiver in their boots,” said Mullenweg.

So how does the server determine what’s spam? Users of the plugin submit copies of any spam they receive by marking them as spam in the WordPress administration panel. ASS then forwards copies of these to the server for analysis.

The submitted spam, however, remains in your database, hidden from view. Over time this could cause disk space problems and backup/restore problems for many users. WordPress does not automatically remove spam from its database, nor does it provide any way to remove it yourself. A third-party plugin, however, does provide this function.
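For a sense of what such a cleanup involves, here is a rough sketch using an in-memory SQLite database. The `comments` table, its columns, and the `'spam'` marker value are simplified assumptions for illustration, not the actual WordPress schema or the way ASS flags comments, and `purge_spam` is a hypothetical helper.

```python
import sqlite3


def purge_spam(conn):
    """Delete every comment flagged as spam and return how many were removed."""
    cur = conn.execute("DELETE FROM comments WHERE comment_approved = 'spam'")
    conn.commit()
    return cur.rowcount


# Hypothetical, simplified comments table for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_ID INTEGER PRIMARY KEY, "
    "comment_content TEXT, "
    "comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO comments (comment_content, comment_approved) VALUES (?, ?)",
    [("Nice post!", "1"), ("Buy pills", "spam"), ("Cheap watches", "spam")],
)

removed = purge_spam(conn)
remaining = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```

Anyone trying this against a real blog should back up the database first.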

Right now Mullenweg inspects all comments submitted this way manually, before the server considers them to be spam. If he judges them to actually be spam, then they are added to the server’s corpus, or database of submitted spam.

He has not said, however, whether legitimate comments are kept on the server, or whether anyone else looks at the submissions. Thus, ASS may not be a good anti-spam choice for private blogs, or for blogs which frequently use password protection to limit access to their contents. In a very real sense it comes down to whether you trust Matt Mullenweg with your readers’ comments. Some people will, and others won’t.

Mullenweg envisions ASS as a service which is free for personal use, and paid for business use. “I would be more comfortable with something where it was free for regular people, and only businesses or enterprises paid (enough to support everybody),” he said.

“There may be ‘keys’ or accounts at some point to prevent abuse,” he said. “However the plugin and API are designed to be pretty easy to recreate, so if someone wanted to run their own spam [prevention] service they could easily.”

That much is true. I could create a server to do this in fairly short order. And I almost did. The idea has been discussed before among WordPress anti-spam gurus, and ultimately rejected.

To date no one has been able to provide a centralized server solution which ensures the integrity of the database, for instance. Mullenweg ensures the integrity of his database by inspecting all comments manually, but this “solution” doesn’t scale very well, and is untenable once ASS is released to a wider audience. He has proposed that users be registered and receive keys in order to use the service, but even this doesn’t prevent spammers themselves from registering and submitting garbage to the database.

In addition, no one has been able to provide a centralized server solution which ensures the privacy of users whose comments are subject to this sort of analysis, especially with respect to private blogs and password-protected posts, where users expect their comments to be private. I’ve come up with an idea or two on how this might be done, but I’m not sharing until I’m certain it really can be done; if it were really that easy, it seems that someone would have done it already.

Now if Mullenweg can solve the problems of privacy, integrity, scalability, and those gigabytes of spam clogging up his users’ databases, he may be on to something. But everyone else who’s had this idea ultimately scaled it back or dropped it entirely. I fail to see how Matt’s ASS is any different.

In the meantime, if you’re looking to stop spam without compromising your users’ privacy, consider Bad Behavior, which is shockingly effective despite not looking at the content of comments at all. Also consider Spam Karma, which does look at the content, but doesn’t send the whole comment, and much of your server information, off to who knows where.

Update: Some other reviews of Automattic Spam Stopper:

October 10, 2005 Posted by | Bad Behavior, Blog Spam, WordPress, WordPress 1.6, WordPress.com | 13 Comments

You can have any color you want, as long as it’s black

Okay, so I’ve had a chance to play with WordPress 1.6-ALPHA-2-still-dont-use out of SVN, and I’ve had a chance to play with WordPress.com. I think I have a half-baked idea of what’s going on, and I’m going to share it with you. Assuming anyone’s reading this, of course.

First of all, this new version of WordPress is bound to make blogging very nearly idiot-proof. Even an MSN Spaces user should be able to muddle their way through the streamlined, simplified administrative interface. It might still be too tough for AOL users and people trying to find a Wal-Mart job, though.

I suspect your average WordPress.com user is going to get their new blog, click Write, and start blogging, without spending much time, or any time, going through the numerous options. And that’s fine. You can add categories on the fly without even stopping to click Manage. And with the new editor, you can even write posts without knowing a single bit of XHTML.

That covers about 95% of blogging for most people.

But the other 5% turns out to be a real sticking point.

At the moment, WordPress.com offers only a limited selection of themes to choose from, and the themes are not customizable. This gave me a real problem at first, as most of the themes have bugs or omit critical functionality. After testing out the available themes for the better part of an hour, I finally settled on this one, which doesn’t at all make me happy, or even look the way I’d like, but does have all the functionality working properly. As far as I can tell. For now. Even the prize-winning Connections theme omits the comments template on pages. In contrast, my WordPress 1.6 site lets me install any theme I want, customize the theme, and do whatever I need to do in order to have my blog look, feel and act exactly as I want it to.

Nor does WordPress.com allow the installation of plugins. In WordPress 1.6, I can install plugins to extend the functionality of WordPress itself, add new features, change the way things work, and do a wide variety of other things. Indeed, my best-known site has some 20 plugins installed. I think I’ve forgotten why.

Beyond themes and plugins, most of the core functionality of WordPress 1.6 is present in WordPress.com. A few things aren’t here right now. For instance, WordPress.com doesn’t let you set your local time zone offset, or change your permalink structure. The time zone thing is bothersome, but most people aren’t going to complain too loudly; times are (currently) displayed in UTC. And most people probably wouldn’t know that you could change the permalink structure unless you pointed it out to them.

There is one very good reason for WordPress.com not to permit users to install their own themes and plugins: security. Both themes and plugins can contain actual PHP code. This means that, in theory, a WordPress.com blogger could upload a theme or a plugin which lets him obtain unauthorized access to others’ blogs. Or worse.

I don’t think the security problem is insurmountable, though. After all, Web hosts let people run unknown/untrusted code all the time. For instance, my Web host uses the UNIX user security structure. By having the web server run my code under my user ID, rather than the server’s, my code can only access things that I legitimately have access to. Other users’ files are off-limits (assuming the other users haven’t explicitly granted the world access to them).
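The rule at work there is the classic UNIX owner/group/other permission check. Here is a simplified sketch of it, assuming plain permission bits only: it ignores the superuser, ACLs, and directory traversal, and the function name and the user and group IDs are made up for the example.

```python
import stat


def can_read(mode, owner_uid, owner_gid, uid, gids):
    """Simplified UNIX read check for a process with the given uid and groups.

    Exactly one permission class applies: owner, then group, then other.
    Root (uid 0) and ACLs are deliberately left out of this sketch.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# A private file (mode 0600) owned by hypothetical user 1001.
mine = can_read(0o600, 1001, 1001, uid=1001, gids={1001})
theirs = can_read(0o600, 1001, 1001, uid=1002, gids={1002})

# The same file made world-readable (mode 0644) is open to everyone.
world = can_read(0o644, 1001, 1001, uid=1002, gids={1002})
```

Because my code runs under my own user ID, the second case is the one that protects my neighbors: a plugin I upload simply cannot read their private files.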

By incorporating a similar security structure into WordPress.com, it should be possible to allow users to run their own themes and plugins. And that will be the first white car in a world of black ones.

August 24, 2005 Posted by | WordPress, WordPress 1.6, WordPress.com | 9 Comments

   
