Commentary
Fred Langa
1/7/2004 10:17 PM

Langa Letter: E-Mail--Hideously Unreliable

A recent test by InformationWeek columnist Fred Langa shows that up to 40% of valid E-mails never reach the recipient. Here's what it all means to you.



You're losing E-mails. It's almost certain that some significant percentage of your legitimate outbound E-mails aren't getting to their destinations; or that some significant percentage of your legitimate inbound E-mails are being lost before you ever see them.

When I say "significant," I don't mean a few. I mean something like 40%, or even more in some cases. And I'm not talking about losing junk mail. I'm talking about the loss of totally valid, non-spam/non-junk E-mail.

Think about that for a minute: As many as four out of 10 of your serious E-mails--the sort you might exchange with co-workers, friends, business associates, or customers--may not be making it to their intended destinations.

This alarming statistic is derived from a large test I conducted late last year, involving more than 10,000 participants. I announced the test with a call for volunteers in an issue of my E-mail newsletter last October. It said, in part:

...I'd like to gather a group of volunteers... and send each one a simple non-spam E-mail message, in plain text and with no attachments, from a personal mail account (not a bulk mailer). I'd like to see how many of these simple messages actually make it through the gauntlet of servers, routers, and ISP-based and local mail filters.

I won't tell the volunteers in advance what address the mail will come from or what the subject line will be.... Rather, I propose to simulate a normal, unanticipated, plain text, non-spam E-mail, as if between friends or coworkers, and see what gets through....

I included specific sign-up information, and asked interested readers to indicate their willingness to participate by sending a reply E-mail to a designated mailbox.

I'd hoped for maybe 500 volunteers. But less than a day after my request went out, I was astonished to see that more than 10,000 people had signed up. Clearly, E-mail reliability is a real hot button!

To keep the size of the test manageable, I then stopped accepting additional applications to participate, ending up with 10,979 volunteers on tap.

It's important to note that these test participants were eager, motivated, and E-mail savvy: They had learned of the test via E-mail, and had signed up by E-mail within hours of the call for volunteers going out. Thus, if anything, this body of E-mail-enthusiastic volunteers represents a best case for E-mail success, a fact that puts the dismal test results in an even starker light.

The Test
The basic test concept was simple: I'd send one plain text, attachment-free E-mail to each volunteer. The content of the E-mail would simulate normal, safe, business or interpersonal correspondence. It would contain no deliberate or obvious spam or virus-filter triggers (no spamlike components, such as offers to enlarge this or shrink that; no attachments; no viruses; no HTML; no embedded scripts). The subject line also would be plain and general, neither designed to trigger nor to avoid spam filters.

To count as a successful delivery, the test E-mail had to be seen and responded to by a human; not by a mailbot, responder, or other auto-reply mechanism. The body of the message contained instructions for the volunteer either to send a new, original E-mail, or to forward the test E-mail to a unique reply mailbox. This was similar to the original sign-up process the volunteers had already used to join the test, and so wasn't a high hurdle. But it did ensure that simple automatic reply mechanisms couldn't skew the test results. Instead, only human-generated responses that arrived at the designated reply mailbox would count as a complete, successful communication.
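As a rough illustration (not the actual tooling used for the test), here's a minimal Python sketch of composing and sending one such plain-text, attachment-free message. The SMTP host and both addresses are placeholders, not values from the test:

# A minimal sketch of sending one plain-text, attachment-free test message.
# The host and addresses below are placeholders, not values from the test.
import smtplib
from email.message import EmailMessage

def send_test_message(smtp_host, sender, recipient, subject, body):
    """Send a single plain-text message: no attachments, no HTML, no scripts."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)          # text/plain only

    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)

# Example call, with placeholder values:
# send_test_message("smtp.example.com", "sender@example.com",
#                   "volunteer@example.com", "Hello", "This is the test message...")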

I took steps to ensure that the test E-mails would be treated completely normally; subject to all the standard E-mail processing and filters that might be in place at the volunteers' ISPs, mail servers, or desktops. That meant that I wouldn't send the E-mails from a "langa.com" or other address that the volunteers might already associate with me, or already have in their "whitelists." (Whitelisted addresses are known to be safe; mail from those addresses is preapproved, and bypasses normal filtering.) So, I made arrangements to send the test E-mails from a name and address the volunteers hadn't seen before.

I'd planned the above basic approach when I thought I'd have maybe 500 volunteers. But with more than 10,000 volunteers available, I also was able to add tests for several additional E-mail variables. I did this by splitting the pool of volunteers into large subgroups. Each subgroup got a slightly different test E-mail; or got E-mails that were sent out in slightly different ways. (I'll detail the variables in a moment.) Each subgroup was given its own unique mailbox to reply to, so I could track the response rates separately.
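For concreteness, here's a minimal Python sketch of how a volunteer pool might be split into consecutive subgroups, each paired with its own reply mailbox. The group sizes are the four subgroup sizes detailed below; the volunteer and mailbox addresses are hypothetical placeholders, and the actual assignment and tracking tools aren't described here.

# A minimal sketch of splitting the volunteer pool into four subgroups,
# each tracked through its own reply mailbox. Addresses are placeholders.

def split_into_groups(volunteers, group_sizes):
    """Slice the volunteer list into consecutive subgroups of the given sizes."""
    groups, start = [], 0
    for size in group_sizes:
        groups.append(volunteers[start:start + size])
        start += size
    return groups

volunteers = [f"volunteer{i}@example.com" for i in range(10_979)]      # placeholder pool
group_sizes = [1_500, 2_497, 5_432, 1_550]                             # the four subgroups
reply_mailboxes = [f"group{n}-reply@example.com" for n in range(1, 5)] # hypothetical
groups = split_into_groups(volunteers, group_sizes)

for mailbox, group in zip(reply_mailboxes, groups):
    print(f"{mailbox}: {len(group)} recipients")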

I sent all 10,979 test E-mails from a private E-mail account, using normal desktop E-mail tools, during East Coast business hours on Nov. 17 and Nov. 18, a Monday and a Tuesday. I left the response mailboxes--the mailboxes by which the volunteers could acknowledge that they got the test E-mail--open for a week. I ran no mail or spam filters on the response mailboxes so all acknowledgements would be delivered untouched and intact. Recall that all the volunteers were expecting some kind of test E-mail to arrive, although they didn't know exactly when, where, or how. Recall also that these were motivated, E-mail-savvy test subjects. Given this, the overall response was astonishingly poor: Of the 10,979 test E-mails I sent, I received 6,551 acknowledgements; a gross success rate of just 60%; or more pointedly, a failure rate of 40%.

But not all the subgroups yielded the same performance. To see and understand the differences and to have a basis for analyzing the gross results, we need to go into more detail:

Test Group One
This test group was closest to the small-scale test I'd originally envisioned; intended to simulate an original, unanticipated (i.e. non-whitelisted or preapproved) correspondence from an unknown, but nonhostile personal sender.

This test group comprised 1,500 individually composed, addressed, and mailed messages. (This was by far the most labor-intensive test.) The messages were in plain text with no attachments and sent one at a time. Each message had one recipient--one of the volunteers. (I used the volunteers' addresses in the order in which I'd received them--first volunteer, second volunteer, third volunteer, etc.) The E-mails were sent from a name and E-mail account the volunteers had not seen before: Liam Sugob, or [email protected] (a temporarily valid personal name and E-mail address I set up at my Freetune.Com domain). The E-mails carried a totally generic subject line of "Hello."

This is what the E-mail looked like:

To: [volunteer's address]
From: Liam Sugob <[email protected]>
Subject: hello

Hello. This is actually Fred Langa. You recently volunteered to participate in a worldwide test of E-mail reliability, as described in a recent LangaList newsletter. (See "Let's Test Email Reliability" in the 2003-10-23 issue at Langa.Com.)

This letter is part of the test.

First, some reassurances: This test exists in isolation. Once this test is done, I'll simply delete the list of volunteer E-mail addresses, and that will be that. I'll never use your address for any other purpose--honest!

If you're a subscriber to the LangaList, don't worry; this test is 100% independent of the newsletter mailing operations, and nothing here will affect your subscription. Likewise, if you're not a member of the newsletter, you won't be signed up for it.

Completing the E-mail test requires one more step:

You see, mailbots can auto-reply to E-mails, and I'm trying to see if I can contact a human being--you!--not your mailbots. So, if a live human is reading this, please do the following:

Please do not--repeat do not--simply reply to this E-mail!

Instead, please create (or forward) a NEW E-mail to [target address was placed here].

Again, please do not--repeat do not--hit "reply." That will not work.

In your new E-mail, and as with your original volunteer E-mail, please put your general physical location in the subject line with the COUNTRY FIRST and the CITY/TOWN SECOND. (This will let me sort the replies geographically.)

For example, your subject line might be "USA, Boston," or "England, London" or "New Zealand, Wellington" etc.

If there's more than one city with the same name in your country, please add a further identifier as a THIRD element in the subject line: "USA, Portland, Maine" and "USA, Portland, Oregon" for example.

Again, for clarity, the E-mail should be addressed to [target address was placed here] and the subject line should contain either:

COUNTRY, CITY
or
COUNTRY, CITY, STATE/PROVINCE

You can leave the body of the E-mail blank, or put whatever you like there.

That's it! In a week or so, I'll see how many humans (not mailbots or auto-responders) managed to get this E-mail, and reply. I'll report the results in a future issue. It should be interesting!

Thanks for your help! As stated earlier, your E-mail will not be used for anything else. You have my personal word on that!

"Liam Sugob"
a.k.a. Fred Langa

(PS: "Liam Sugob" is "bogus mail" backwards! I used this name and address because many readers have their mail filters already set to allow anything from a "Langa" address to pass. By sending from an address you've never seen before, we can see what the "raw" delivery rate is. -- Fred)

I sent out 1,500 of these messages and received 1,005 acknowledgements; a success rate of 67%; a failure rate of 33%. (I'll analyze these response rates later on.)

Test Group Two

This group was larger than Group One, with 2,497 recipients, and involved one major difference from the previous test group: These E-mails were sent as replies to the volunteer's original E-mail. The subject line read "follow up," and the E-mail body opened by quoting whatever text the volunteer had originally sent to me. I expected this to increase the response rate, but oddly, that wasn't the case.

Here's an example of what the E-mails looked like; of course, the quoted matter (the material set off with ">>") varied in each message:

To: [volunteer's address]
From: Liam Sugob <[email protected]>
Subject: follow up

At 05:08 AM 10/23/2003 -0700, you wrote:
>>Fred, I'll volunteer if you don't already have someone in Arizona.
>>Glad to help if I can.
>>Joe

Hello. This is actually Fred Langa. You recently volunteered to participate in a worldwide test of E-mail reliability, as described in a recent LangaList newsletter....

[rest of note was the same as in Group 1, above.]

Of the 2,497 mails I sent out in this fashion, I received only 749 acknowledgements, a success rate of only 30%--or a failure rate of an astonishing 70%! (Again, I'll analyze these response rates later on.)

Test Group Three
This test group was the largest, with 5,432 recipients. The mailing was designed to simulate a corporate, divisional, or large departmental-scale mailing, or a mailing that might be sent out by larger clubs or organizations. The E-mails weren't sent in bulk or by mailing-list software (tools known to run afoul of spam filters), but rather were sent by a normal desktop E-mail client. Each E-mail was addressed to "[email protected]," an address none of the volunteers had seen before. The volunteers' real E-mail addresses were entered in a blind-copy (BCC) list. There were 97 BCC recipients per message; a number chosen because many ISPs limit CCs and BCCs to 100 or fewer per message.

Each recipient saw this:

To: [email protected]
From: Liam Sugob <[email protected]>
Subject: follow up

[message body was like that in Group One]

I sent 56 messages like the above, each with 97 BCC recipients, for a total of 5,432 recipients. I received 3,667 acknowledgements; a success rate of 68%; a failure rate of 32%.
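For illustration, here's a minimal Python sketch of that kind of BCC batching, assuming an smtplib-based sender. The visible "To:" address, the SMTP host, and the recipient addresses are all placeholders, not the actual values used in the test:

# A minimal sketch of BCC batching: recipients are chunked into batches of 97
# so each message stays under the ~100-recipient cap many ISPs impose.
# The host and addresses are placeholders.
import smtplib
from email.message import EmailMessage

def send_in_bcc_batches(smtp_host, sender, visible_to, recipients,
                        subject, body, batch_size=97):
    for start in range(0, len(recipients), batch_size):
        batch = recipients[start:start + batch_size]
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = visible_to          # the address each recipient sees in "To:"
        msg["Subject"] = subject
        msg.set_content(body)
        with smtplib.SMTP(smtp_host) as server:
            # BCC recipients are supplied only as SMTP envelope addresses,
            # so they never appear in the headers the recipients see.
            server.send_message(msg, from_addr=sender, to_addrs=batch)

# send_in_bcc_batches("smtp.example.com", "sender@example.com",
#                     "undisclosed@example.com", recipient_list,
#                     "follow up", "Hello. This is actually...")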

Test Group Four
This group simulated a smaller, workgroup-scale mailing, or the mailings that might be done by small clubs or organizations; or even by families and friends. Everything was identical to that in Group Three, except that each E-mail had only 25 BCC recipients per message instead of 97.

I sent out 62 such messages, each with 25 BCCs, for a total of 1,550 recipients. I received 1,130 acknowledgements, for a success rate of 73%; a failure rate of 27%.

What It All Means
Overall, this test yielded only a 60% success rate; fully 40% of the E-mails I sent generated no response, despite the fact that the mails were sent to a motivated, eager group of E-mail-savvy recipients who were expecting some kind of mail from me. This failure rate is astonishingly poor, and bodes ill for anyone who is trying to rely on E-mail as a serious communications tool. (Imagine if 40% of your phone calls failed, or 40% of your paper mail went astray...)
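For anyone who wants to check the arithmetic, here's a short Python tally of the per-group figures reported in this article:

# Tally of the per-group results reported above: (messages sent, acknowledgements).
groups = {
    "Group One":   (1_500, 1_005),
    "Group Two":   (2_497,   749),
    "Group Three": (5_432, 3_667),
    "Group Four":  (1_550, 1_130),
}
total_sent = sum(sent for sent, _ in groups.values())
total_ack = sum(ack for _, ack in groups.values())
for name, (sent, ack) in groups.items():
    print(f"{name}: {ack / sent:.0%} success, {1 - ack / sent:.0%} failure")
print(f"Overall: {total_ack}/{total_sent} = {total_ack / total_sent:.0%} success")
# Overall: 6551/10979 = 60% success -- in other words, a 40% failure rate.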

Even the high end of our test range was atrocious: Three of the test groups (One, Three and Four, comprising 8,482 test mails) all yielded failure rates of around 30%, plus or minus a few percent. Because this rate was consistent across three groups of different sizes and mailing methodologies, it's safe to say that actions on my end--varying the E-mail format and mailing method--didn't appear to have any significant effect on the response. Instead, I think the majority of the effects we're seeing are from actions going on at the other end of the E-mail pipeline. This Group One volunteer's note provides a clue:

Hi, Fred: Well, your 'Liam' e-mail made it through the anti-spam net OK. But only just! MailWasher marked it as 'possible spam,' so it's just as well I've set it for manual operations, otherwise all your good work would have been for nothing! Good luck with the test ... and thanks for the opportunity of taking part.
-- Ian Robinson

In fact, I got many such notes: A large number of readers reported finding their test mail incorrectly consigned to their spam bucket or trash can. This leads me to believe that many or most of the delivery failures were due to hyperactive spam filters at the ISP or desktop level that incorrectly intercepted and trashed the test mail.

My guess is that something similar, but even more extreme, is behind the extremely poor 70% failure rate of Group Two: That group's "To:" addressing was the same as in Group One, and the subject line was the same as Groups Three and Four--test groups that performed far better. It seems logical to me to assume that quoting the volunteers' own message would actually increase the response rate, because the volunteer would see his or her own words--a near-impossibility in spam mail. Since that didn't happen--and in fact, the response rate actually went way down--I have to conclude that many of these messages never made it to human eyeballs. Something ate the E-mails before the recipients ever saw them.

The most likely candidate is a blacklist/blocklist that may have incorrectly picked up the "freetune.com" address, falsely listing it as a spam source for a while. Blacklists are notoriously stupid and cause huge amounts of collateral damage (incorrectly blocking valid E-mails). It only takes a relative handful of false-positive spam reports, relayed to a central blacklister/blocklister, for all valid E-mails from a given domain (such as freetune.com) to be blocked for all users of anti-spam tools that rely on that blacklist. If the block happens at the ISP level, the misidentified, non-spam E-mail may be discarded with no notice to the end user, and with no way to recover the message. I have no way of knowing if freetune.com was blacklisted, but it wouldn't surprise me: Blacklists are crude, evil tools that do far more harm than good. (Don't believe me? See "Real-Life Spam Solutions".)
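To make the mechanism concrete, here's a minimal Python sketch of the DNS-based blocklist (DNSBL) lookup that such filters typically perform; the blocklist zone named here is a placeholder, since I have no way of knowing which lists, if any, were actually involved:

# A minimal sketch of a DNSBL check: the sending server's IP is reversed and
# looked up as a hostname under a blocklist zone; any answer means "listed."
# The zone name below is a placeholder, not a real blocklist.
import socket

def is_listed(ip, dnsbl_zone="dnsbl.example.org"):
    reversed_ip = ".".join(reversed(ip.split(".")))
    query = f"{reversed_ip}.{dnsbl_zone}"       # e.g. 4.3.2.1.dnsbl.example.org
    try:
        socket.gethostbyname(query)             # an A record comes back if listed
        return True
    except socket.gaierror:                     # no record -> not listed
        return False

# A filter that trusts such a list may silently reject or discard mail from
# any listed IP or domain -- valid E-mail included.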

Or, perhaps there's a less draconian explanation: Maybe the addition of quoted matter to the body of the E-mail somehow convinced some spam filters that the mail was less legitimate than identical mails without the quoted matter. That's illogical--but then, so are many spam filters.

What You Can Do About It
Unfortunately, there's no good way around spam filters: They're a necessary evil because spam continues to grow in volume. Late last year, for example, the spam-tracking company Brightmail reported that the volume of spam climbed over the 50% mark for the first time ever: There's now more spam in general circulation than valid E-mail!

The new U.S. "Can Spam" legislation isn't likely to help much, although it may force some of the more obnoxious U.S.-based spammers offshore. But once in a spam-haven, they'll continue much as before.

And spammers are getting smarter, too. You've probably gotten some spam mails with long blocks of nonsense verbiage at the bottom, for example. These spams are intended to overwhelm Bayesian filters by altering the context in which the spam trigger words appear.

But, despite the problems that all spam filters have, Bayesian filters are still the best available choice; blocklist/blacklist-oriented filters are still the worst. We all have to use spam filters, but make sure you're using a good one; keep it up to date with the latest detection rules; and verify its operation by checking what's being discarded from time to time. Odds are, you will find valid messages tagged as spam and thrown away, no matter what filter you use.
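For the curious, here's a toy Python sketch of the per-word probability scoring at the heart of a Bayesian filter. Real filters train on large mail corpora and handle tokens far more carefully; the counts below are invented purely for illustration.

# A toy sketch of Bayesian spam scoring: combine per-token spam likelihoods
# under a naive independence assumption. Training counts here are invented.
import math

spam_counts = {"enlarge": 50, "offer": 40, "meeting": 2}   # token counts in spam
ham_counts = {"enlarge": 1, "offer": 10, "meeting": 60}    # token counts in good mail
n_spam, n_ham = 100, 100                                   # training messages per class

def spam_probability(message):
    """Return an estimated probability that the message is spam."""
    log_spam, log_ham = math.log(0.5), math.log(0.5)       # equal priors
    for token in message.lower().split():
        p_spam = (spam_counts.get(token, 0) + 1) / (n_spam + 2)   # Laplace smoothing
        p_ham = (ham_counts.get(token, 0) + 1) / (n_ham + 2)
        log_spam += math.log(p_spam)
        log_ham += math.log(p_ham)
    return 1 / (1 + math.exp(log_ham - log_spam))

print(spam_probability("special offer enlarge now"))   # high score -> likely spam
print(spam_probability("meeting follow up"))           # low score -> likely legitimate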

There are other steps you can take, too. For example, when you send E-mail, don't assume that your outbound messages will be received and read by your recipient until and unless you've established prior contact and have whitelisted each other's E-mail addresses. For initial contacts (the type of E-mail we tested), it might be best to open communications with a very short message--just a line or two, with no words or phrases likely to trigger a spam filter--to let the recipient know who you are, and that there's more mail on the way from you. That way, they can get your address whitelisted before your real message arrives.

Business mailers might try something similar. Instead of sending out a long E-mail, you might try sending a very brief E-mail with a link to a Web page that contains the real message. Or, in your initial contact, keep your message very brief and as un-spamlike as possible; and include information such as the domain or IP from which all your business E-mail will be sent to assist your recipients in presetting their filters to let your mail in.

The one thing you cannot do either as a business mailer or as a private individual is simply to treat E-mail today the same way you did even as recently as a year ago. The E-mail world has changed: It's now almost certain that some of your legitimate E-mails are getting trashed on their way to or from you.

E-mail has become horribly unreliable, and we all need to adjust our expectations--and actions--accordingly.

What's your take? Have you had valid E-mails filtered incorrectly? What filter do you use, and what's your experience been? Do you have a way to track what's been filtered or discarded, and if so, what percentage of false-positives are you seeing? Join in the discussion!


To discuss this column with other readers, please visit Fred Langa's forum on the Listening Post.

To find out more about Fred Langa, please visit his page on the Listening Post.
