When I talk about spammers, I’m usually not talking about email spam. Since I started using Gmail, I’ve had very little contact with that sort of spam. The spam that has the power to get my attention is spam in blog comments.
The first time I saw comment spam, I had been blogging for just a few weeks and I took it as a sign of success. Somebody wanted their links on my page and that meant my page was important. I felt a little bit flattered as I deleted the comment, removing the link from my page to some fly-by-night online pharmacy.
Later that year, Akismet came to town. Akismet is a centralized service, free for personal use, that examines new comments as they are submitted and tells your blog software whether each comment is probably spam. It learns about spam from the feedback of bloggers who report false positives and false negatives.
Thanks to Akismet, I only see a small fraction of the spam that is hurled at my blog. Only when the spam is of a new type or was actually entered by a human does it reach my email inbox as a notification that I have a comment awaiting moderation. Though it’s not perfect, Akismet has saved me countless hours of manually filtering tens of thousands of spam comments.
I have seen several comments get past Akismet. These usually involved some sort of “social engineering” designed to trick people into telling Akismet that the comment is not spam. This is done in the hopes that the spammer’s comments will evade Akismet in the future. For example, a comment saying “Great blog, I’ve bookmarked you” is somewhat more likely than one saying “Play poker online” to be approved by an unsuspicious blogger. Flattery strikes again.
I got tired of finding spam comments on my blog—five per year is not much, but it was too much for me—so I turned on moderation for all comments. If you post a comment here, it will be a few minutes or hours before I approve it. This spam-proofs my blog but it does not deter spammers. They don’t care whether their comments hit the mark because it’s cheaper to just keep spamming a vast list than to spend time removing spam-proof blogs from the list. Remember, spam is primarily economically driven.
Rather than stanch the flow of spam into my life, comment moderation secures my blog against publishing unwanted comments while increasing the flow of notifications into my email inbox. I don’t think I have to tell you that the initial feeling of flattery wore off long ago. Now it’s just annoying.
Well, a few weeks ago I started seeing a new sort of spam that Akismet wasn’t flagging as spam. This is a typical comment:
Evening to you all! I came across your blog posting after searching for and your post on Andy Skelton makes an interesting read. Thanks for sharing. I will research more next Friday when I have the day off. Peter
And another one:
Hey! Nice blog posting about Andy Skelton. I would have to agree with you on this one. I am going to look more into . This Friday I have time. Swiss Dude
I wondered. Where did they get the idea that my blog was about Andy Skelton? Is my name a valuable search term now? Are they spamming other people with my name? Then I noticed that the name of my blog, which appears in my RSS feed, was “Andy Skelton” and that’s probably where their spam software found my name. Regardless, Akismet now catches these comments so I don’t have to moderate them.
Today, a few hours after posting to this blog, two WordPress blogs published posts quoting my post and linking back to me. Their blogs automatically notified my blog so that my blog would publish a link back to their articles about my article. This is what their posts look like:
unknown wrote an interesting post today on Buying my first house
Here’s a quick excerpt
There are way too many real estate agents in the world and I have known way too many flaky ones to expect to find a good agent at random. I looked at the online Realtor directory for Austin and it didn’t set any of them apart. …
Read the rest of this great post here
This is not only flattering; it appears to be perfectly selfless and harmless because they link to my post and they don’t seem to be deceiving anybody. Don’t believe it for a second. This spammer is counting on my links to pump up the Google PageRank for their domains, increasing their value in spam systems capitalizing on the word Realtor. I sure won’t be allowing these links on my blog, but I sure appreciate the links to me that they published.
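If you want to see why a single link back from me is worth the trouble, here is a toy version of the PageRank idea in Python. It is only a sketch of the textbook algorithm, not Google's actual system. The page names are made up, and the 0.85 damping factor is the value usually quoted from the original PageRank paper. It compares the spammer's score before and after my blog publishes the trackback link.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by repeated redistribution of rank.

    `links` maps each page name to the list of pages it links to.
    All names here are hypothetical; this is not Google's implementation.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Before: the spam site links to my blog, but nobody links to the spam site.
web_before = {
    "my-blog": ["friend-blog"],
    "friend-blog": ["my-blog"],
    "news-site": ["my-blog"],
    "spam-site": ["my-blog"],
}

# After: my blog publishes the trackback, linking to the spam site.
web_after = dict(web_before)
web_after["my-blog"] = ["friend-blog", "spam-site"]

before = pagerank(web_before)
after = pagerank(web_after)

print("spam-site before:", round(before["spam-site"], 4))
print("spam-site after: ", round(after["spam-site"], 4))
```

In this made-up web, the spam site starts with nothing but the baseline share because nobody links to it. Once my blog links back, it collects half of whatever rank my blog passes along, which is exactly the boost the spammer is fishing for.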
I may take away some tiny PageRank benefit from this relationship. Higher PageRank means my pages appear higher in the search engine result pages and that means more traffic for me. So PageRank is good, right? PageRank is what made Google a household word and made its investors rich. It makes our lives easier because instead of searching through directories (remember Yahoo! in the years before Google?) we enter search queries and get instant results.
It would seem to be all rainbows and unicorns but, as you might have figured out by now, Google is the root of all web spam because PageRank is what makes web spam profitable. This is the unintended, evil consequence of a brilliant invention. An invention which has the laudable basic purpose of serving the public, which is capable of estimating the relevance of billions of things in a few milliseconds, which operates on the largest and fastest-growing dataset ever conceived by humanity, which is fooled by mere flattery.