Lunacy Unleashed

Notes from the field in the War on Spam

I spotted some Bad Behavior

People ask me about making money from AdSense all the time. While I usually will offer little tips and tricks that I’ve learned along the way, one thing I want to make sure that new AdSense publishers know is what NOT to do.

The number one thing that you should NOT do is STEAL OTHER PEOPLE’S CONTENT. Yes, I know the guy who sold you the video or the eBook said it was okay. Guess what, he has your $97, and you’re about five minutes away from being up shit creek without a paddle, as you lose your web hosting, your domain names, and most importantly, your AdSense account, all because you ripped someone off.

If I catch you stealing my content, your ass is grass. (This obviously doesn’t apply if I gave you permission to use it.)

This content was stolen from Michael Hampton.

Copyright © 2006 Michael Hampton. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.

September 19, 2006 Posted by | AdSense, Advertising, Bad Behavior, Blog Spam, Google, Link Farm, Personal, Spam, Splog | 2 Comments

I got slashdotted!

In case some of you weren’t aware, I was /.ed last night. For almost two hours after the posting, my site was unavailable, intermittently available, or very slow. Now that 24 hours have passed, this post offers a brief analysis of what went wrong and the remedial steps I took, while the oncoming hordes were banging at the gate, to get my server back up and running. I also note some lessons learned for those of you wanting to drive lots of traffic to your sites, get popular fast, or just experience the Slashdot effect for yourself, along with some implications this experience has for Bad Behavior 2 development.

First a few raw numbers: In the first 24 hours, Apache recorded 30,787 visits from slashdot.org, 4,882 more visits from people with their referrer blocked, and 5,582 visits from other places. Of those hits, only 45 came prior to the post going live. (Subscribers to /. can see articles shortly prior to their publication time.) I had 27 minutes from the first referral from /. at 1:24 am to the time the post went live there at 1:51 am. (All times are UTC.)
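
For anyone wanting to pull similar numbers out of their own logs, here is a minimal sketch of a referrer tally. It is not how the figures above were produced; it assumes Apache's Combined Log Format, and the function names (`classify`, `tally_referrers`) and the sample lines are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumes Combined Log Format, which ends with:
#   "request" status bytes "referrer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')


def classify(referrer):
    """Bucket a referrer string as 'blocked', 'slashdot', or 'other'."""
    if referrer in ("", "-"):
        return "blocked"  # no referrer was sent
    host = urlparse(referrer).hostname or ""
    if host == "slashdot.org" or host.endswith(".slashdot.org"):
        return "slashdot"
    return "other"


def tally_referrers(lines):
    """Count log lines per referrer bucket, skipping lines that don't parse."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line.strip())
        if match:
            counts[classify(match.group("referrer"))] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [02/Jun/2006:01:24:00 +0000] "GET / HTTP/1.1" 200 5120 "http://slashdot.org/" "Mozilla/5.0"',
    '192.0.2.2 - - [02/Jun/2006:01:52:10 +0000] "GET / HTTP/1.1" 200 5120 "http://it.slashdot.org/article.pl" "Mozilla/5.0"',
    '192.0.2.3 - - [02/Jun/2006:01:53:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.4 - - [02/Jun/2006:01:54:30 +0000] "GET / HTTP/1.1" 200 5120 "http://example.com/links" "Mozilla/5.0"',
]
counts = tally_referrers(sample)
```

On a real server you would feed it the lines of the access log instead of `sample`, and probably filter out images, CSS and the like first so that only pageviews are counted.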

Of those pageviews, 544 came between 1:51 and 2:01. There were 634 pageviews between 2:01 and 2:11, 658 between 2:11 and 2:21, and so on. Clearly the server should have been able to handle much more than one pageview per second. And this only counts pageviews served successfully; it doesn’t count images, CSS, MP3 files, etc. It also doesn’t count the innumerable requests dropped on the floor, or people who just gave up waiting.
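
To put the "one pageview per second" remark in numbers, here is the arithmetic as a short sketch. The counts are the ones quoted above; each window is ten minutes long, and the variable names are just for illustration.

```python
# Pageviews served successfully in each ten-minute window (times are UTC).
WINDOW_SECONDS = 10 * 60
pageviews = {
    "1:51-2:01": 544,
    "2:01-2:11": 634,
    "2:11-2:21": 658,
}

# Average pageviews per second in each window, rounded to two decimals.
rates = {window: round(count / WINDOW_SECONDS, 2) for window, count in pageviews.items()}

for window, rate in rates.items():
    print(f"{window}: {rate:.2f} pageviews/second")
# prints:
# 1:51-2:01: 0.91 pageviews/second
# 2:01-2:11: 1.06 pageviews/second
# 2:11-2:21: 1.10 pageviews/second
```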

The server started showing signs of trouble fast. By 1:55 am the load average had passed 35. By 2:00 it had passed 50. At one point I saw the load average as high as 112, and I was over 500MB into swap on a box with 1GB of RAM. I noted that accesses were going VERY slowly and realized that neither Apache nor MySQL had had much performance tuning, and could do a lot better than this.

Unfortunately, I am an idiot, and neglected to take any measures BEFORE the barbarians showed up at the gate! I’m doubly an idiot, because I called them by submitting the story to /. in the first place! So, here’s what went wrong, how I made it right, and how I actually managed to get the box serving requests again.

First off, the server platform is Fedora Core 5. It includes Apache 2.2.0, MySQL 5.0.21 and PHP 5.1.4. What can I say, I like the bleeding edge. Anyway, take the distro wars elsewhere. The point is, Apache and MySQL on this platform aren’t particularly well tuned for high performance, high traffic sites. So off to Google I went to try to get things under control.

I quickly determined that MySQL was spending way too much time creating threads to serve incoming requests. It also wasn’t dropping old connections quickly enough. So I set the following two variables:

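-- Keep up to 150 idle threads cached so new connections can reuse them
set global thread_cache_size = 150;
-- Close connections that sit idle for more than 10 seconds
set global wait_timeout = 10;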

That got MySQL behaving mostly okay. Then I turned to Apache. It turned out to be a bit more of a problem, as at first glance, it already looked like it was fairly well tuned:

StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000

So I twiddled the values for a while, with little result, all the while getting hammered with load averages hovering between 50 and 60, but at least I wasn’t swapping anymore. Then after over an hour and a half, I finally realized I had a big problem:

KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15

Wait, OFF? That means Apache’s getting hit about 10 times as hard as it needs to be, as every inline image, the CSS file, etc. needs its own new connection, each one tying up an Apache process. So I changed it fast:

KeepAlive On
MaxKeepAliveRequests 1000
KeepAliveTimeout 10

And when I restarted Apache, now around 3:30 am, the load average quickly dropped from 45 to about 10, and requests started coming through at something approaching tolerable speed again.
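
The "about 10 times" estimate is easy to sanity-check with a back-of-the-envelope count. This is only a sketch under simplifying assumptions: a hypothetical page made of 1 HTML document plus 9 inline files, and a browser that reuses a single connection when keep-alive is on (real browsers open a few in parallel). The function name is made up.

```python
import math


def connections_per_page(requests_per_page, keepalive, max_keepalive_requests=1000):
    """Estimate how many connections one page load costs the server."""
    if not keepalive:
        # KeepAlive Off: every request needs a brand-new connection.
        return requests_per_page
    # KeepAlive On: one connection carries up to MaxKeepAliveRequests requests.
    return math.ceil(requests_per_page / max_keepalive_requests)


# Hypothetical page: 1 HTML document + 9 images/CSS files = 10 requests.
without_keepalive = connections_per_page(10, keepalive=False)
with_keepalive = connections_per_page(10, keepalive=True)
```

With these assumptions the page costs 10 connections with KeepAlive Off and 1 with it On, which is where a figure like "10 times" comes from. The real ratio depends on how many files your pages pull in.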

Then I installed PHP eAccelerator, which has a nice Fedora package (php-eaccelerator) and works out of the box. When I restarted Apache after installing it, the load average dropped by half again, and the site was as fast as it usually is with nobody on it.

I’ve since installed WP-Cache 2 with the required WordPress 2.0/PHP 5 fix and Mark Jaquith’s gzip compression patch. I tested it to make sure it works, but I’m leaving it off until the next time the barbarians show up at the gate, as it screws with some dynamic code I have and I haven’t figured out how to get it to execute the code every time. Yet.

For those of you on shared hosting providers, you won’t be able to make most of these changes yourself. Only WP-Cache 2 is user-installable. If you use a VPS or dedicated server, though, you can do all of this.

There’s plenty of further performance tuning I can do, especially with MySQL, and I plan to do it in the very near future, just in case one of my posts actually gets greenlit on Fark.com or something.

One of the things I did while the server was being hammered was to disable Bad Behavior, to determine if it was putting too much load on the system while it was being hit with 50-100 requests a second. I’ve determined that at those levels it does hit the database pretty hard, and I plan to redesign all of Bad Behavior’s database usage to try to accommodate this sort of situation.

P.S. I can tell you that many /. users really do click on ads. Yesterday’s take on Google AdSense was $40.91, and so far today I’m above $66. Not bad. I get paid to learn fast about tuning my server. 🙂

There’s much more work to be done, though, so I’ll most likely have a follow-up to this.

June 3, 2006 Posted by | AdSense, Apache, Bad Behavior, MySQL, Slashdot, WordPress | 11 Comments

Making money from AdSense?

People ask me about making money from AdSense all the time. While I usually will offer little tips and tricks that I’ve learned along the way, one thing I want to make sure that new AdSense publishers know is what NOT to do.

The number one thing that you should NOT do is STEAL OTHER PEOPLE’S CONTENT. Yes, I know the guy who sold you the video or the eBook said it was okay. Guess what, he has your $97, and you’re about five minutes away from being up shit creek without a paddle, as you lose your web hosting, your domain names, and most importantly, your AdSense account, all because you ripped someone off.

If I catch you stealing my content, your ass is grass. (This obviously doesn’t apply if I gave you permission to use it.)

May 18, 2006 Posted by | AdSense, Advertising, Bad Behavior, Blogging, Google, Spam, Splog, WordPress | 5 Comments

AdSense on WordPress 2.0

If you’re upgrading to WordPress 2.0 and use Google AdSense, there is something very important you need to know to ensure you continue to get well-targeted ads, and that Google doesn’t suspend your account for program violations.

One of the new features in WordPress 2.0 is a live post preview. If you scroll to the bottom of the page while editing a post, you’ll see a live preview of how your page will look once it’s published. This is a very nice addition to WordPress, but for AdSense publishers, and those using other context-targeted ad networks, it presents a serious problem.

When the post preview is rendered, it will try to fetch your Google ads!

And because the post hasn’t been published yet, when Google’s bot tries to crawl the page a few seconds later, it will receive a 404 error.

The best case scenario for this is that some of you (who don’t use permalinks) will receive very poorly targeted ads for up to two weeks after you publish your post.

And the worst case scenario, since Google prohibits displaying ads on 404 pages, is that you could get your account suspended.

WordPress 2.0 does provide a solution, though; it’s the new is_preview template tag. This new tag tells whether the post is being displayed in the post preview section while it’s being edited.

So all you need to do is to add in a check for this into your template code wherever you have placed AdSense, and the problem will be solved. Just add this code around your AdSense code:

<?php if (!is_preview()): ?>
<!-- Paste your AdSense code here -->
<?php endif; ?>

This way, the post preview will not try to show Google ads, and they will only be shown once your post is published. This will keep your AdSense account safe and your ads well-targeted.

Update: It’s come to my attention that is_preview() may be broken. If you find that’s the case, submit a bug report and post the ticket number in the comments below so we can track it.

Update: I’ve tested is_preview() and it seems to be working just fine. Like other template tags, it only works inside the loop, though.

Update: Since people frequently place ads outside the loop, there needs to be a way to test for this outside the loop. The following workaround worked for me:

<?php global $wp_query; if (!$wp_query->is_preview): ?>
<!-- Paste your AdSense code here -->
<?php endif; ?>

Update: Ticket 2188 is open for is_preview() acting strangely.

December 27, 2005 Posted by | AdSense, Advertising, Google, WordPress, WordPress 2.0 | 40 Comments

Want a link farm? How about some spam?

The following bit of spam arrived on my contact form last night. Nothing has been changed, because the guilty don’t need protecting here.

Jill wrote:
I have an offer for your business if you’re interested in increasing revenues each month. I’ll cut right to the chase. We’re looking for 1 of 2 things. Or both:

1. Allow us to place targeted advertising on your existing website. We would share any advertising revenues with you at an agreed upon percentage. We’re masters of online advertising, so we can probably unlock new cash flow for you that might have otherwise never been tapped.

2. Allow us to set up around 10 subdomains or subfolders off of your website–for example: http://www.subdomain.YOURURL.com OR http://www.YOURURL.com/subfolder These would contain sites we control and be on a variety of topics. You would have to switch the DNS info for these new subdomains over to one of our servers so that we can make changes to these sites. For this we would pay you a monthly fee that we both feel is fair.

Anyways, if you could get back to me as soon as possible it would be appreciated. We would like to make this a win/win situation! I Hope to hear from you soon!

Jill

jill@masterlinkservice.com

p.s. I do not want to waste any of your time. If you’re not interested please just delete the message and I will not contact you again. I feel the offer is a win/win however and that we can make lots of money together!

p.p.s. I hope to hear from you soon!

Website:
IP: 142.161.37.169

For the unfamiliar, I’ll explain these two ideas in some depth.

The first one sounds like a typical advertising campaign you might see on a blog, such as AdSense or BlogAds. Only this one is bound to contain ads you don’t want on your site, like online casinos or erectile dysfunction drugs. In the case of this company, I’m going to guess home improvement loans, based on some domains I caught this company involved with.

The second one is absolutely something you should never, ever do if you want to be found in a search engine. Companies which get control of a portion of your domain space in this way will typically do one or both of two things:

  1. They’ll post “free” articles on various topics, which also happen to be rather boilerplate, and appear at various other domains across the Internet. One example I caught this company doing involves http://www.homeloaninfobox.com and http://www.homeinus.com which contain exactly the same content, word-for-word. Search engines catch on to this sort of trick and lower both sites in their results.
  2. The more evil possibility is that of a link farm, pages with dozens or hundreds of links to various other sites, which contain dozens or hundreds of links to the same sites. Spammers want sites outside their own sites to link to them, so as to increase their legitimacy, and decrease the chance that their link farm will be caught. Google delists entire domains that it finds involved in link farms, and this is definitely not something you want to happen to you. It happened to Matt Mullenweg of WordPress. He thought it was a good idea at the time, but it turned out to be anything but.

There are good ways to make money on your blog, and there are bad ways. Those are two very bad ways.

October 18, 2005 Posted by | AdSense, Advertising, Blog Spam, Google, Link Farm, Spam, WordPress | 1 Comment