things are as bad as they seem: Thoughts on Greylisting

While trying to subscribe to a mailing list tonight, I ran into my first greylisting, and I'm not sure how to feel about it.

Greylisting, for those of you who don't spend any time reading up on SMTP and spam-eliminating measures, is a rather controversial method of avoiding spam. Put simply, a greylisting server defers every incoming message once, answering the first delivery attempt with a 4xx (temporary failure) reply code.

Deceptively simple, no? SMTP is designed to be robust. If a mail server doesn't accept your message, you try again a little later. Legitimate mail servers have no problem following these rules, so they'll always try to resend a deferred message. (AOL has the misguided notion that any message it can't deliver within 3 hours should be bounced back to the sender. Fuck you, AOL.) Greylisting mail servers keep a record of which hosts have tried to send mail to them, and if a host tries again later like a good host should, the greylisting server accepts the message.
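To make the bookkeeping concrete, here is a minimal sketch of that decision in Python. It is not any particular server's implementation. The `Greylist` class, the 300-second `min_delay`, the reply strings, and the choice to key the record on the (client IP, sender, recipient) triplet instead of the host alone are all assumptions made for illustration.

```python
import time


class Greylist:
    """Toy greylister: defer a sender's first attempt, accept a later retry."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # assumed: seconds that must pass before a retry counts
        self.clock = clock          # injectable clock so the logic can be tested
        self.first_seen = {}        # triplet -> time of the first attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return the SMTP reply this toy server would give for one attempt."""
        key = (client_ip, mail_from, rcpt_to)
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # Illustrative wording; only the 4xx class matters to the sender.
        return "451 4.7.1 Greylisted, please try again later"
```

A sender that never comes back stays deferred forever. A sender that retries after `min_delay` seconds gets through on that retry.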

People think greylisting works because spam doesn't behave like normal mail. Spammers have a great interest in delivering as many messages as possible, so it just isn't cost-effective for them to retry dead addresses over and over again. Thus, the theory goes, spammers won't retry. Ever. If they can't send a message on the very first go, they close the connection and move on.

I'm not sure I believe that. If I were trying to saturate a list of addresses, I'd set a short connection timeout and retry addresses that return a 4xx once or, maybe, twice. If I did this, I'd completely circumvent most greylisting algorithms I've seen.

So I'm guessing some economist somewhere is doing the cost-benefit analysis between the "one strike and you're out" spamming method and the "wait ten minutes and try one more time" method. Right now greylisting isn't a popular spam-avoiding option, so the one-strike method is probably the more advantageous of the two. If greylisting catches on, that advantage will diminish and you'll see more and more greylist-busting spammers.

But let's consider greylisting as it pertains to the most important users. Namely, me. Greylisting means that the message I send to you at 3 PM won't be delivered until later, possibly much later. The SMTP specs don't mandate a retry schedule (RFC 2821 only offers recommendations), which is an invitation for implementers to go hog wild. AOL can fuck things up by just putting a 3-hour cap on retries. If you write your own mail server software, you can pick your poison. You might want to retry every deferred message every ten minutes for a week. Or five. Or once a day every day for a month. Or you might want to use a sliding scale like qmail's quadratic formula.
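For the curious, a quadratic schedule of the qmail kind can be sketched in a few lines of Python. This is a reconstruction from memory and not qmail's actual source: the function names are made up, and the constant of 20 is an assumption, chosen because it is the value usually cited for remote deliveries and because it reproduces a first retry at the 400-second mark.

```python
import math


def quadratic_next_retry(birth, now, skip=20):
    """Time of the next attempt under a qmail-style quadratic backoff.

    birth: when the message entered the queue (seconds)
    now:   time of the attempt that just failed (seconds)
    skip:  assumed constant; 20 gives retries at 400, 1600, 3600, ... seconds
    """
    n = math.isqrt(max(0, now - birth)) + skip
    return birth + n * n


def retry_offsets(count, skip=20):
    """First `count` retry times, in seconds after the message was queued."""
    offsets, now = [], 0
    for _ in range(count):
        now = quadratic_next_retry(0, now, skip)
        offsets.append(now)
    return offsets
```

Under these assumptions `retry_offsets(4)` comes out as 400, 1600, 3600 and 6400 seconds, so the gaps between attempts keep growing the longer a message sits in the queue.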

One could compare and contrast these different retry schedules, but the fact remains that none of them is in direct violation of the standard, because the standard makes recommendations on the subject rather than hard requirements. In my case, the mailing list server greylisted my subscription confirmation for 300 seconds. No problem there, since qmail's first retry occurs at the 400 second mark. But that's just circumstance. I might have been using Postfix, or (ugh) Sendmail. Or AOL. Since retry schedules are entirely arbitrary, I could have waited any length of time in excess of 300 seconds before getting my message accepted. With qmail it was an extra 100 seconds, but it just as easily could have been 300 more seconds, or an hour.
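The arithmetic is easy to check. The sketch below assumes the greylisting server accepts the first retry that arrives at or after its window and doesn't restart the clock on early retries. The function name and the example schedules are hypothetical, not taken from any real mail server's defaults.

```python
def accepted_at(retry_offsets, greylist_window):
    """Seconds after the first attempt at which the message gets through.

    retry_offsets:   retry times in seconds after the first (deferred) attempt
    greylist_window: how long the server insists on deferring the sender
    Returns None if no retry lands at or after the window.
    """
    for offset in sorted(retry_offsets):
        if offset >= greylist_window:
            return offset
    return None


window = 300
print(accepted_at([400, 1600, 3600], window) - window)  # quadratic schedule: 100 extra seconds
print(accepted_at([600, 1200, 1800], window) - window)  # every ten minutes: 300 extra seconds
print(accepted_at([3600, 7200], window))                # hourly retries: accepted at the one-hour mark
```

A sender whose retries all come too early, or who gives up before the window passes, never gets the message through at all.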

Ultimately, I don't think I mind greylisting much from a technical standpoint, but it's based on a fundamentally flawed idea: that spammers will never change their tactics. If greylisting is only a mild inconvenience to me now, it will still be an inconvenience, with nothing to show for it, once the spammers have taken all necessary measures to bypass the advantage gained by using it.

2006-02-05
