E-mail Validation to clean up mailing lists
-
Hello, I am interested in creating a program to handle a large mailing list and help eliminate bounced messages by doing e-mail validation. I've looked at articles on this site and elsewhere on the web, and it appears that most validation approaches involve validating the e-mail address syntax with regular expressions, then validating that the domain exists, then connecting to the SMTP server and making sure that it will accept mail for a given address. However, I have read that this will not work for some e-mail providers, such as AOL, Hotmail, servers based on MS Outlook, etc., and that the only way to find out if an e-mail address is really valid for these is to wait and see if the message bounces back. Does anyone know of a sure way to validate all e-mail addresses, or is it just not possible? I guess if there is no 100% correct way, then I can try to combine the two approaches by first checking addresses by contacting the SMTP server, and then also analyzing bounced messages. Thanks for any input. Blake
-
I did this for a large email list, and in the end, it was mostly a waste of time. You can forget getting anything from AOL. When I tried it, Hotmail and MSN actually did verify the addresses. I'm not sure if that's still true. You end up creating a fairly complicated program to handle all of the conditions (couldn't find the MX record, couldn't connect to any of the given MXs, try again later? how many times?). You'll also get a lot of false positives (SMTP servers that say "sure, I'll send email to fred@..." when fred doesn't exist). Depending on your connection, you'll also run into servers that won't even allow you to connect ("you're on a cable modem? you have no business trying to relay mail..."). Don't get me wrong, it's a pretty fun project. Just don't expect to get too much valuable information from it :).
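To make the conditions above concrete, here is a minimal Python sketch of the SMTP check, not a definitive implementation. The names `classify_rcpt_reply` and `probe_address`, the sender address, and the timeout are assumptions made up for illustration. The MX host is passed in because the Python standard library has no MX lookup. Note that an "accepted" result only means the server did not refuse the recipient, which is exactly the false-positive case described above.

```python
import smtplib


def classify_rcpt_reply(code: int) -> str:
    """Map an SMTP reply code for RCPT TO onto a coarse outcome."""
    if 200 <= code < 300:
        return "accepted"   # may still be a false positive (catch-all servers)
    if 400 <= code < 500:
        return "retry"      # temporary failure: try again later
    if 500 <= code < 600:
        return "rejected"   # permanent failure: the server refused the address
    return "unknown"


def probe_address(mx_host: str, address: str,
                  sender: str = "postmaster@yourdomain.com",  # assumed sender
                  timeout: float = 10.0) -> str:
    """Ask one MX host whether it would accept mail for `address`.

    No message is sent: the session stops after RCPT TO.
    """
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as server:
            server.helo()
            mail_code, _ = server.mail(sender)
            if not 200 <= mail_code < 300:
                return "unknown"  # the server refused our sender, not the recipient
            rcpt_code, _ = server.rcpt(address)
    except (OSError, smtplib.SMTPException):
        # Could not connect, timed out, or the server hung up on us.
        return "retry"
    return classify_rcpt_reply(rcpt_code)
```

How many times to retry a "retry" result is a policy decision the sketch leaves to the caller.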
I, for one, do not think the problem was that the band was down. I think that the problem may have been that there was a Stonehenge monument on the stage that was in danger of being crushed by a dwarf.
-David St. Hubbins -
You can validate the actual email address string. There is an RFC that defines valid addresses (RFC 2822, later replaced by RFC 5322). The regex string below is a common one:
^[\w\.\-]+@[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]{1,})*(\.[a-zA-Z]{2,3}){1,2}$
As far as validating the MX records goes, the other poster was correct that it's not 100% effective, but it's at least worth a shot (while you may get false positives, the rejections you do get will be correct). See http://www.codeproject.com/aspnet/emailvalidator.asp for a good article on MX validation, although there are others. Just try this search (watch the ratings to filter out the bad ones).
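As a rough illustration of the syntax step, the regex above can be used to split a list into addresses that pass and addresses that fail. This is a Python sketch; the names `EMAIL_PATTERN` and `split_by_syntax` are made up for the example. Because the pattern only accepts top-level domains of 2 or 3 letters, longer ones such as .info are rejected, so treat a failure as "needs a second look" rather than "definitely invalid".

```python
import re

# The pattern quoted above, unchanged.
EMAIL_PATTERN = re.compile(
    r"^[\w\.\-]+@[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]{1,})*(\.[a-zA-Z]{2,3}){1,2}$"
)


def split_by_syntax(addresses):
    """Return (passed, failed) lists after trimming surrounding whitespace."""
    passed, failed = [], []
    for address in addresses:
        candidate = address.strip()
        if EMAIL_PATTERN.match(candidate):
            passed.append(candidate)
        else:
            failed.append(candidate)
    return passed, failed
```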
Microsoft MVP, Visual C# My Articles
-
From what I understand, the RFC allows a LOT of crap that "nobody" uses. For instance, fred@123.123.123.123 (IP address) is supported in the RFC. There are also a lot of people who try to check things like the length of the TLD (2 to 6 characters, to allow for the museum TLD). I'm currently using the regular expression here. It's a pretty decent one. Also notice how many submissions there are on that site for email regular expressions. Some people stick to the RFC and some people realize the RFC is not quite restrictive enough for "real" usage.
Blake, Just my two cents. I had this same problem where I work, with a list of over 2 million. The final solution that I implemented was to send all of the emails with a unique identifier in the return address, like so: member_number@yourdomain.com. Then when a bounce occurs, the email comes back to our server. I use Exchange, so I had to write an event sink to parse the incoming email address, get the member number it bounced from, and then update that member's email-valid flag in our database. Not sure if this will work for you, depending on your setup. It was tricky, but we have it running perfectly now and we bounce only 2% of our very large lists. One thing we had to keep in mind was that there are different types of bounces: hard, soft, etc. Most mail servers will send this information back to you in the "bounce" email header and it is up to you to handle it accordingly. For us, we allow 3 soft bounces or one hard. Like I said, it was a little tricky but we cut our bounces down DRAMATICALLY. If you are interested in the details and some sample code, just reply here and I will get it to you. Thanks, Troy G
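The scheme described above can be sketched roughly as follows in Python. This is not the Exchange event sink from the post. All names here (`BOUNCE_DOMAIN`, `return_address`, `member_from_return_address`, `BounceRecord`) are hypothetical, and the sketch assumes that the third soft bounce is the one that marks an address invalid.

```python
from dataclasses import dataclass

BOUNCE_DOMAIN = "yourdomain.com"   # placeholder domain from the post
MAX_SOFT_BOUNCES = 3               # assumption: the third soft bounce is fatal


def return_address(member_number: int) -> str:
    """Build the per-member return address used when sending."""
    return f"{member_number}@{BOUNCE_DOMAIN}"


def member_from_return_address(address: str):
    """Recover the member number from the address a bounce was sent to."""
    local, _, domain = address.strip().lower().partition("@")
    if domain != BOUNCE_DOMAIN or not local.isdigit():
        return None  # not one of our tagged return addresses
    return int(local)


@dataclass
class BounceRecord:
    """Per-member bounce state, standing in for a database row."""
    soft_bounces: int = 0
    email_valid: bool = True

    def register(self, kind: str) -> None:
        """Apply one bounce of kind 'hard' or 'soft' to this member."""
        if kind == "hard":
            self.email_valid = False
        elif kind == "soft":
            self.soft_bounces += 1
            if self.soft_bounces >= MAX_SOFT_BOUNCES:
                self.email_valid = False
```

In a real setup the record would be loaded from and saved back to the member table each time a bounce arrives.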
-
Troy, This is exactly what I have been thinking about doing (I read an article about it on this website), and I thought that I might even combine it with the MX checking that other people had posted about. I would love to see some details and code if you want to send it to me. Maybe I can figure out a way to make it work for our purposes. You can send it to blakeb_1@hotmail.com. Thanks a lot, Blake
-
Troy, Actually, I've just realized that this isn't going to work because the mailing list program we are using isn't going to allow us to do any extra programming. I think I might just stick to verifying through the SMTP servers for now, and hopefully it will at least cut down on the bounces a little. Thanks for offering to send some code though. Blake
-
Unmodified, no. A validator in ASP.NET takes the name of a control, which it finds in the control collection of its container. This wouldn't work for Windows Forms. You can use the code, though. There's other code out there as well, including an MX validator I wrote a long time ago that I believe I posted in this C# forum over a year ago. Good luck finding it, though. In any case, DNS queries are standardized, so the only real difference in code is how you issue them.
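To show what "standardized" means in practice, here is a rough Python sketch that builds an MX query and reads the answer directly in the DNS wire format (RFC 1035), with no framework involved. It is not the MX validator mentioned above. The function names and the default resolver address are assumptions for illustration, and truncated responses are not handled.

```python
import socket
import struct

MX_TYPE, IN_CLASS = 15, 1


def build_mx_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Encode a single-question MX query with recursion desired."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = domain.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", MX_TYPE, IN_CLASS)


def read_name(data: bytes, offset: int):
    """Decode a possibly compressed name; return (name, offset after it)."""
    labels, end, hops = [], None, 0
    while True:
        length = data[offset]
        if length & 0xC0 == 0xC0:          # compression pointer
            if end is None:
                end = offset + 2           # the name ends here in the record
            offset = struct.unpack_from("!H", data, offset)[0] & 0x3FFF
            hops += 1
            if hops > 20:
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:
            break
        labels.append(data[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), (offset if end is None else end)


def parse_mx_response(data: bytes):
    """Return the MX answers as a sorted list of (preference, host)."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack_from("!HHHHHH", data, 0)
    offset = 12
    for _ in range(qdcount):               # skip the echoed question section
        _, offset = read_name(data, offset)
        offset += 4
    records = []
    for _ in range(ancount):
        _, offset = read_name(data, offset)
        rtype, _cls, _ttl, rdlength = struct.unpack_from("!HHIH", data, offset)
        offset += 10
        if rtype == MX_TYPE:
            preference = struct.unpack_from("!H", data, offset)[0]
            host, _ = read_name(data, offset + 2)
            records.append((preference, host))
        offset += rdlength
    return sorted(records)


def lookup_mx(domain: str, resolver: str = "8.8.8.8", timeout: float = 5.0):
    """Send the query over UDP to `resolver` (an assumed default)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_mx_query(domain), (resolver, 53))
        data, _ = sock.recvfrom(4096)
    return parse_mx_response(data)
```

An empty list from `parse_mx_response` means the domain published no MX records in that answer. In production code an existing DNS library is usually the safer choice.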
Microsoft MVP, Visual C# My Articles
-
Hi Troy G, I am working on a project where I need to track bounced mails and update the database with the type of bounce. I am able to send a unique identifier in the return address. Each time there is a bounce, the mail with the unique identifier shows up in the bounce box. Now, the problem is reading the headers, such as the member_number in member_number@yourdomain.com and the bounce type, and updating the database with them. If you have any idea or code that can do this, it would really be a great help. I am interested in getting the details. Thanking you in advance. With Warm Regards, Arvind
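One possible way to read those two pieces of information, sketched in Python with the standard `email` package. It assumes the bounce arrives as a standard delivery status notification (RFC 3464), where a `Status:` field starting with 5 indicates a permanent (hard) failure and one starting with 4 a temporary (soft) failure. It also assumes the tagged return address shows up in the bounce's `To` header. The function name `parse_bounce` is made up for the example, and many servers send non-standard bounces that this will report as "unknown".

```python
from email import message_from_string
from email.utils import parseaddr


def parse_bounce(raw_message: str):
    """Return (member_number, kind); kind is 'hard', 'soft', 'other' or 'unknown'."""
    msg = message_from_string(raw_message)

    # The bounce is addressed to member_number@yourdomain.com.
    _name, address = parseaddr(msg.get("To", ""))
    member_number = address.partition("@")[0] or None

    # A delivery-status part is parsed into blocks of header fields;
    # the per-recipient block carries the Status code.
    status = None
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        if not part.is_multipart():
            continue  # malformed report: no field blocks to inspect
        for block in part.get_payload():
            if block.get("Status"):
                status = block["Status"].strip()
                break
        if status:
            break

    if status is None:
        kind = "unknown"
    elif status.startswith("5"):
        kind = "hard"
    elif status.startswith("4"):
        kind = "soft"
    else:
        kind = "other"
    return member_number, kind
```

The returned pair can then be used to look up the member and update the bounce type in the database.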