E-mail validation to clean up mailing lists

blakeb_1

Hello, I am interested in creating a program to handle a large mailing list and help eliminate bounced messages by doing e-mail validation. I've looked at articles on this site and elsewhere on the web, and it appears that most validation approaches first check the e-mail address syntax with regular expressions, then check that the domain exists, then connect to the SMTP server and make sure that it will accept mail for a given address. However, I have read that this will not work for some e-mail providers, such as AOL, Hotmail, servers based on MS Outlook, etc., and that the only way to find out whether an e-mail address is really valid for these is to wait and see if the message bounces back. Does anyone know of a sure way to validate all e-mail addresses, or is it just not possible? I guess if there is no 100% correct way, then I can try to combine the two approaches by first checking addresses by contacting the SMTP server, and then also analyzing bounced messages. Thanks for any input. Blake

Kentamanos

I did this for a large email list, and in the end, it was mostly a waste of time. You can forget getting anything from AOL. When I tried it, Hotmail and MSN actually did verify the addresses. I'm not sure if that's still true. You end up creating a fairly complicated program to handle all of the conditions (couldn't find the MX record, couldn't connect to any of the given MXs, try again later? how many times?). You'll get a lot of false positives (SMTP servers that say "sure, I'll send email to fred@..." when fred doesn't exist). Depending on your connection, you'll also run into servers that won't even allow you to connect ("you're on a cable modem? you have no business trying to relay mail..."). Don't get me wrong, it's a pretty fun project. Just don't expect to get too much valuable information from it :).

I, for one, do not think the problem was that the band was down. I think that the problem may have been that there was a Stonehenge monument on the stage that was in danger of being crushed by a dwarf.
-David St. Hubbins
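
The conditions listed in the post above (no connection, try again later, accepted, rejected) could be modelled roughly as follows. This is only a sketch in Python using the standard `smtplib` module; the function names, the outcome labels, the probe sender address and the `smtp_factory` parameter are assumptions made for illustration, and the MX host is expected to have been looked up separately.

```python
import smtplib


def classify_rcpt(code):
    """Map the SMTP reply code for RCPT TO onto a coarse outcome."""
    if 200 <= code < 300:
        return "accepted"      # may still be a false positive (accept-all server)
    if 400 <= code < 500:
        return "retry_later"   # temporary failure; the caller decides how often to retry
    if 500 <= code < 600:
        return "rejected"      # the server refuses mail for this address
    return "unknown"


def probe(address, mx_host, sender="probe@example.com",
          smtp_factory=smtplib.SMTP, timeout=15):
    """Ask mx_host whether it would accept mail for address, without sending any.

    sender is a hypothetical envelope sender; smtp_factory exists only so the
    function can be exercised without a network connection.
    """
    try:
        conn = smtp_factory(mx_host, 25, timeout=timeout)
    except (OSError, smtplib.SMTPException):
        return "no_connection"  # host unreachable, or it refuses our connection
    try:
        conn.helo()
        conn.mail(sender)
        code, _message = conn.rcpt(address)
        return classify_rcpt(code)
    except (OSError, smtplib.SMTPException):
        return "retry_later"    # the conversation broke off part-way
    finally:
        try:
            conn.quit()
        except (OSError, smtplib.SMTPException):
            pass
```

An "accepted" result only means the server did not object at RCPT time, so it cannot be treated as proof that the mailbox exists.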

Heath Stewart

You can validate the actual email address string. There is an RFC that defines valid addresses (RFC 2822, since superseded by RFC 5322). The regex string below is one of the most common approximations of it:

^[\w\.\-]+@[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]{1,})*(\.[a-zA-Z]{2,3}){1,2}$
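
For anyone who wants to see what this pattern does and does not accept, here is a small Python sketch. The pattern is copied unchanged from the line above; the function name `looks_valid` and the sample addresses are made up for illustration.

```python
import re

# The pattern quoted in the post, unchanged.
EMAIL_PATTERN = re.compile(
    r"^[\w\.\-]+@[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]{1,})*(\.[a-zA-Z]{2,3}){1,2}$"
)


def looks_valid(address):
    """Return True if the address matches the pattern (a syntax check only)."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    samples = [
        "fred@example.com",
        "fred@mail.example.co.uk",
        "fred@example.museum",      # TLD longer than three letters
        "fred+tag@example.com",     # '+' is legal in the local part
        "fred@123.123.123.123",     # numeric host
    ]
    for sample in samples:
        print(sample, looks_valid(sample))
```

The first two samples match. The other three do not: the last group of the pattern only allows two or three letters, and the local part only allows word characters, dots and hyphens. So this pattern is stricter than the RFC in some places, which may or may not be what you want for a mailing list.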

As far as validating the MX records goes, the other poster was correct that it's not 100% effective, but it's at least worth a shot: while you may get false positives, the rejections you do get will be correct. See http://www.codeproject.com/aspnet/emailvalidator.asp for a good article on MX validation, although there are others. Just try a search (watch the ratings to filter out the bad ones).

Microsoft MVP, Visual C# My Articles

Kentamanos

From what I understand, the RFC allows a LOT of crap that "nobody" uses. For instance, fred@123.123.123.123 (IP address) is supported in the RFC. There are also a lot of people who try to check things like the length of the TLD (2 - 6 characters, to allow for the museum TLD). I'm currently using the regular expression here. It's a pretty decent one. Also notice how many submissions there are on that site for email regular expressions. Some people stick to the RFC and some people realize the RFC is not quite restrictive enough for "real" usage.


Troy G

Blake, Just my two cents. I had this same problem where I work, with a list of over 2 million. The final solution that I implemented was to send all of the emails with a unique identifier in the return address, like so: member_number@yourdomain.com. Then when a bounce occurs, the email comes back to our server. I use Exchange, so I had to write an event sink to parse the address of the incoming email, get the member number it bounced from, and then update that member's "email valid" flag in our database. Not sure if this will work for you, depending on your setup. It was tricky, but we have it running perfectly now and we now bounce only 2% of our very large lists. One thing we had to keep in mind was that there are different types of bounces: hard, soft, etc. Most mail servers will send this information back to you in the "bounce" email header, and it is up to you to handle it accordingly. For us, we allow 3 soft bounces or one hard. Like I said, it was a little tricky, but we cut our bounces down DRAMATICALLY. If you are interested in the details and some sample code, just reply here and I will get it to you. Thanks, Troy G
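
The scheme described in this post could be sketched in Python along these lines. This is not Troy's code (his ran as an Exchange event sink). The names `return_path_for`, `member_from_return_path`, `Member` and `record_bounce` are invented for illustration, and reading "3 soft bounces or one hard" as "the third soft bounce invalidates the address" is an assumption.

```python
from dataclasses import dataclass

BOUNCE_DOMAIN = "yourdomain.com"   # placeholder domain taken from the post
MAX_SOFT_BOUNCES = 3               # assumption: the third soft bounce invalidates


def return_path_for(member_number):
    """Build the per-member return address, e.g. 12345@yourdomain.com."""
    return f"{member_number}@{BOUNCE_DOMAIN}"


def member_from_return_path(address):
    """Recover the member number from a bounce recipient, or None if it is not ours."""
    local, _, domain = address.partition("@")
    if domain.lower() != BOUNCE_DOMAIN or not local.isdecimal():
        return None
    return int(local)


@dataclass
class Member:
    number: int
    soft_bounces: int = 0
    email_valid: bool = True


def record_bounce(member, bounce_type):
    """Apply the bounce policy and return the member's updated validity flag."""
    if bounce_type == "hard":
        member.email_valid = False
    elif bounce_type == "soft":
        member.soft_bounces += 1
        if member.soft_bounces >= MAX_SOFT_BOUNCES:
            member.email_valid = False
    return member.email_valid
```

In a real system the `Member` record would live in the database, and the soft-bounce counter would probably be reset after a successful delivery or after some period of time.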

blakeb_1

Troy, This is exactly what I have been thinking about doing (I read an article about it on this website), and I thought that I might even combine it with the MX checking that other people had posted about. I would love to see some details and code if you want to send it to me. Maybe I can figure out a way to make it work for our purposes. You can send it to blakeb_1@hotmail.com. Thanks a lot, Blake

blakeb_1

Troy, Actually, I've just realized that this isn't going to work because the mailing list program we are using isn't going to allow us to do any extra programming. I think I might just stick to verifying through the SMTP servers for now, and hopefully it will at least cut down on some of the bounces. Thanks for offering to send some code though. Blake

blakeb_1

Can I use that mx validator component in a C# program?

Heath Stewart

Unmodified, no. A validator in ASP.NET takes the name of a control, which it finds in the control collection of the container. This wouldn't work for Windows Forms. You can use the code, though. There is other code out there as well, including an MX validator I wrote a long time ago that I believe I posted in this C# forum over a year ago. Good luck finding it, though. In any case, DNS queries are a standard, so the only real difference in code is how you do it.
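
To illustrate the point that the DNS query itself is standard, here is a rough Python sketch of an MX lookup done by hand over UDP, following the message layout in RFC 1035. The function names are invented, and the default resolver address (8.8.8.8, a public resolver) is an assumption; in practice you would use your own resolver or a DNS library. The sketch does not check the reply ID and does not handle truncated replies.

```python
import socket
import struct


def build_mx_query(domain, query_id=0x1234):
    """Build a DNS query packet asking for the MX records of domain."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 15, 1)  # QTYPE 15 = MX, QCLASS 1 = IN


def read_name(packet, offset):
    """Decode a (possibly compressed) name; return (name, offset just past it)."""
    labels, end, jumps = [], None, 0
    while True:
        length = packet[offset]
        if length & 0xC0 == 0xC0:          # compression pointer to an earlier name
            if end is None:
                end = offset + 2
            offset = struct.unpack_from("!H", packet, offset)[0] & 0x3FFF
            jumps += 1
            if jumps > 20:
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:
            break
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), (end if end is not None else offset)


def parse_mx_response(packet):
    """Return the MX answers as a sorted list of (preference, host) tuples."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack_from("!HHHHHH", packet, 0)
    offset = 12
    for _ in range(qdcount):               # skip the echoed question section
        _, offset = read_name(packet, offset)
        offset += 4
    records = []
    for _ in range(ancount):
        _, offset = read_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack_from("!HHIH", packet, offset)
        offset += 10
        if rtype == 15:
            preference = struct.unpack_from("!H", packet, offset)[0]
            host, _ = read_name(packet, offset + 2)
            records.append((preference, host))
        offset += rdlength
    return sorted(records)


def lookup_mx(domain, resolver="8.8.8.8", timeout=5.0):
    """Send the query over UDP and parse the reply (needs network access)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_mx_query(domain), (resolver, 53))
        packet, _addr = sock.recvfrom(4096)
    return parse_mx_response(packet)
```

An empty list from `lookup_mx` means no MX records came back in the answer section. A real validator would also need to handle timeouts and decide whether to fall back to the domain's address record.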


User 1869559

Hi Troy G, I am working on a project where I need to track bounced mails and update the database with the type of bounce. I am able to send a unique identifier in the return address. Each time there is a bounce, the mail with the unique identifier arrives in the bounce box. Now, the problem is reading the headers, such as the member_number in member_number@yourdomain.com and the bounce type, and updating the database with them. If you have any idea or code that can do this, it would really be a great help. I am interested in getting the details. Thanking you in advance. With Warm Regards, Arvind
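
One possible way to read those two pieces of information, sketched in Python with the standard `email` package. It assumes the bounce is a standard delivery status notification (a `multipart/report` message with a `message/delivery-status` part, as described in RFC 3464) and that the per-member return address shows up in the `To` header; not every mail server sends bounces in that form, so real code needs a fallback. The function name `parse_bounce` is made up.

```python
import email
from email.utils import parseaddr


def parse_bounce(raw_message):
    """Return (member_number, bounce_type) extracted from a raw bounce message.

    bounce_type is "hard" for a 5.x.x status, "soft" for 4.x.x, and
    "unknown" when no delivery-status part with a Status field is found.
    """
    msg = email.message_from_string(raw_message)

    # The bounce is addressed to member_number@yourdomain.com.
    _name, recipient = parseaddr(msg.get("To", ""))
    member_number = recipient.partition("@")[0] or None

    bounce_type = "unknown"
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        if not part.is_multipart():
            continue  # malformed report; nothing structured to read
        # The parser exposes each header block of the report as a sub-message.
        for block in part.get_payload():
            status = (block.get("Status") or "").strip()
            if status.startswith("5"):
                return member_number, "hard"
            if status.startswith("4"):
                bounce_type = "soft"
    return member_number, bounce_type
```

The member number comes back as a string, so it can be validated before it is used in a database update.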