Wednesday, March 26, 2008

Validating Email Addresses with Ruby

After my post on how to send email thru action mailer I thought of writing how to validate that email address and thought it would be useful for all.

Here it goes

In any application in which a user enters an email address, there is the very real possibility that the user will make a typo and your application will end up with an invalid address. You can have them enter it twice, but this seems clunky. And you can, of course, send an email with an activation link, which provides the only true validation, but there’s no need to bother sending the email if you know the address is no good. Furthermore, once you’re past the page where the user enters their email address, you’ve missed your chance to tell them there’s something wrong and they should correct it.

So you really should do what you can to validate the address when the user enters it. I recently made a simple addition to my applications that helps a lot: verify that the domain name is valid.

It’s surprisingly easy to do—especially with a little help from Peter Cooper’s excellent Beginning Ruby, which has a very useful chapter on network programming. The following code is adapted from his examples:

require 'resolv'
def validate_email_domain(email)
domain = email.match(/\@(.+)/)[1]
Resolv::DNS.open do |dns|
@mx = dns.getresources(domain, Resolv::DNS::Resource::IN::MX)
end
@mx.size > 0 ? true : false
end

This example makes use of the Ruby standard library “resolv”, so you need to require it first.

The first step is to separate the domain name from the rest of the email address. The regular expression captures the part of the string that follows the @ symbol.

Then the code creates a new DNS resolver object and queries the resolver for an MX (mail exchanger) resource at the specified domain. This returns an array, which will be empty if there is no MX record for the domain.

(Note: In a previous version of this article, I used Resolv.getaddress to see if there is a DNS entry for the domain, instead of checking for an MX record. This approach works most of the time, but it rejects any domain for which there is no A record. If a domain is used only for email and not for a web server, there might not be an A record. Also, some domains have an A record only for www.domain.com, which will also fail the simple getaddress test.)

You can use something like the following in the validate method within the appropriate model:

unless email.blank?
unless email =~ /^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$/
errors.add(:email, "Your email address does not appear to be valid")
else
errors.add(:email, "Your email domain name appears to be incorrect") unless validate_email_domain(email)
end
end

I first check to make sure the email address is not blank, because that’s detected by a simple validates_presence_of :email statement that produces a different error message.

Then I make sure that the email address is at least syntactically reasonable, with a rather ugly regular expression, before bothering to check the DNS.

It should be noted that the regex I use here isn’t designed to cover all of the RFC2822 cases, nor with other RFC drafts dealing with non-ASCII addressing.

An even better approach would be to use an observer to validate the address with an Ajax call before the user submits the form.

It is possible to take this a step further by sending the SMTP server referenced in the MX record a “RCPT TO:” command. In theory, this would check that the user name is valid as well as the domain name. This takes additional time, however, and I’ve read that the response from mail servers is often not reliable. If anyone has tried this, I’d appreciate any feedback on how well it worked.

No comments: