And the 'Email Validation' SQL Puzzle Winner Is... - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

IoT
IoT
Software // Information Management
Commentary
9/27/2007
08:59 AM
Joe Celko
Joe Celko
Commentary
50%
50%

And the 'Email Validation' SQL Puzzle Winner Is...

The winner of last week's 'Email Address Validation' SQL puzzle is "Guest" (see comments), because he/she bothered to do the research and come up with an answer that is generic enough to port to any SQL dialect with a SIMILAR TO or a regexp() function. So, "Guest" please email me with your snail mail address (and some attempt to validate your SQL mastery/identity) and I'll send you one of my books. Here's my answer to last week's puzzle...

The winner of last week's 'Email Address Validation' SQL puzzle is "Guest" (see comments), because he/she bothered to do the research and come up with an answer that is generic enough to port to any SQL dialect with a SIMILAR TO or a regexp() function. So, "Guest" please email me with your snail mail address (and some attempt to validate your SQL mastery/identity) and I'll send you one of my books.

My answer to last week's puzzle is as follows:You can find regular expressions at http://regexlib.com/ and copy them into your code, making changes for your dialect. For example, this regular expression checks an email format against the RFC 3696 Standard and was written by David Thompson

^[a-z0-9!$'*+\-_]+ (\.[a-z0-9!$'*+\-_]+)* @ ([a-z0-9]+(-+[a-z0-9]+)*\.)+ ([a-z]{2} |aero|arpa|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|travel)$

If you do not read regular expressions, this says that a valid email address is one or more groups of strings of alphanumeric characters and some limited punctuation marks, optionally separated by a period. Then there is one "little snail" or "at-sign" in the middle. This followed by more groups of strings of alphanumeric characters and a more limited set of punctuation marks separated by periods. The string finally ends with either a two-letter country code or one of several explicit domain codes.

The problem is that the "[a-z]{2}" pattern matches any two letters even when they are not a valid country code.

If you do not have a SIMILAR TO predicate in your SQL, there is another approach. Set up a CREATE TRANSLATION declaration that maps the legal postfixes into a single unique token not used in an email address. This result is then passed on to another TRANSLATE () expression which reduces the alphanumeric and punctuation characters to a second unique token. Eventually, you wind up with a reduced pattern made up of the two tokens and the at-sign, say '#@#?' since neither '#' nor '?' appear in an email address.

A third approach is to use the TRIM( FROM ) function to reduce the suspect email address to a single at-sign or empty string.

The same effect can be had with nested REPLACE expressions in some dialects but the nesting can be pretty deep. If you are a LISP programmer, you will not mind a bit.

Are these good methods to use in place of using an external call to an external procedure in a 3GL language or SQL/PSM? Baroque as these suggestions are, they often run much faster than the external call and they are portable. But they are ugly to maintain.

Joe Celko is an independent consultant in Austin, Texas, and the author of SQL Puzzles and Answers (2006), Joe Celko's SQL for Smarties: Advanced SQL Programming (2005), and Joe Celko's Trees and Hierarchies in SQL for Smarties (2004).The winner of last week's 'Email Address Validation' SQL puzzle is "Guest" (see comments), because he/she bothered to do the research and come up with an answer that is generic enough to port to any SQL dialect with a SIMILAR TO or a regexp() function. So, "Guest" please email me with your snail mail address (and some attempt to validate your SQL mastery/identity) and I'll send you one of my books. Here's my answer to last week's puzzle...

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Comment  | 
Print  | 
More Insights
News
COVID-19: Using Data to Map Infections, Hospital Beds, and More
Jessica Davis, Senior Editor, Enterprise Apps,  3/25/2020
Commentary
Enterprise Guide to Robotic Process Automation
Cathleen Gagne, Managing Editor, InformationWeek,  3/23/2020
Slideshows
How Startup Innovation Can Help Enterprises Face COVID-19
Joao-Pierre S. Ruth, Senior Writer,  3/24/2020
White Papers
Register for InformationWeek Newsletters
Video
Current Issue
IT Careers: Tech Drives Constant Change
Advances in information technology and management concepts mean that IT professionals must update their skill sets, even their career goals on an almost yearly basis. In this IT Trend Report, experts share advice on how IT pros can keep up with this every-changing job market. Read it today!
Slideshows
Flash Poll