Tuesday, June 24, 2008

Part 1. Power of Regular Expressions

Rather than just posting the slides, I decided to do a series of blog posts on the subject. Consider it an intro or tutorial to Regular Expressions.

What are Regular Expressions?

Regular Expressions (RegEx for short) are a compact notation for describing patterns in text, built from ordinary letters and special symbols (metacharacters). Most modern languages support them, so you can use them directly in code. But where they are really useful is data cleanup, extraction, converting legacy data, grabbing data from the internet, etc. Regular Expressions can be considered the SQL for freeform text.

Most people avoid RegExs (even experienced programmers shy away from them), because they either don't understand them, haven't taken the time to learn them, or think they are sediment left over from the Unix era.

At the end of this tutorial you will see that regular expressions can actually be easier and quicker to code. Once you understand how they work, along with some of the tools and techniques, you will see how dramatically they can shorten code and save you time with coding and debugging. Also, maintenance can actually be easier!

Here is a terrific example. Today, I was browsing through the prep book for MCTS Exam 70-528, and came across this example of an ASP.NET custom validator control that validates passwords. The password rules are:

  • 6-14 characters
  • at least one lowercase letter
  • at least one uppercase letter
  • at least one number

Here is the conventional method. (Code copied directly from the book, pg 469, misspelled "argument" and all.)

<script language="javascript" type="text/javascript">
function ValidatePassword(source, arguements)
{
var data = arguements.Value.split('');
//start by setting false
arguements.IsValid=false;
//check length
if(data.length < 6 || data.length > 14) return;
//check for uppercase
var uc = false;
for(var c in data)
{
if(data[c] >= 'A' && data[c] <= 'Z')
{
uc=true; break;
}
}
if(!uc) return;
//check for lowercase
var lc = false;
for(var c in data)
{
if(data[c] >= 'a' && data[c] <= 'z')
{
lc=true; break;
}
}
if(!lc) return;
//check for numeric
var num = false;
for(var c in data)
{
if(data[c] >= '0' && data[c] <= '9')
{
num=true; break;
}
}
if(!num) return;
//must be valid
arguements.IsValid=true;
}
</script>

Now, with the use of regular expressions, we can shorten this into just ONE statement:
<script language="javascript" type="text/javascript">
function ValidatePassword(src, args)
{
args.IsValid =
args.Value.length>=6 && args.Value.length<=14
&& /[a-z]/.test(args.Value) //find a lowercase
&& /[A-Z]/.test(args.Value) //find an uppercase
&& /\d/.test(args.Value); //find a digit
}
</script>

You've got to love the elegance and compactness of this code. I believe it is actually easier to understand and debug: no messy loops or "if" constructs. And it reads exactly like the password rules specification above. Such is the amazing power of regular expressions. Note: "\d" is the character class that matches a single digit. We could just as well have written "[0-9]".
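
If you want to push the compactness further, the four rules can probably be folded into a single pattern using lookaheads. This is only a sketch and not from the book: the names PASSWORD_PATTERN and isValidPassword are made up for illustration, and it assumes the password is a single line of text (the "." metacharacter does not match a line break).

```javascript
// Hypothetical helper (not from the book): all four rules in one pattern.
//   ^             start of the string
//   (?=.*[a-z])   lookahead: a lowercase letter appears somewhere
//   (?=.*[A-Z])   lookahead: an uppercase letter appears somewhere
//   (?=.*\d)      lookahead: a digit appears somewhere
//   .{6,14}$      the whole string is 6 to 14 characters long
var PASSWORD_PATTERN = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{6,14}$/;

function isValidPassword(value) {
  // test() returns true only when every lookahead and the length check succeed
  return PASSWORD_PATTERN.test(value);
}
```

Inside the validator function this would be used as args.IsValid = isValidPassword(args.Value). Whether this beats the multi-test version is a matter of taste: it is shorter, but the separate tests read more like the rule list.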

In my next post we will look at a few more easy examples and examine the metacharacters.

Sunday, June 22, 2008

Regular Expression talk at Lansing's DODN

Last night's party at Jeff's was a lot of fun. We had the Xbox's "Rock Band" connected with all the controllers. I have always wondered what the hell the deal was with games like these (why waste time on pseudo guitars when you can spend that time learning real guitars). But watching my buddies belt out lyrics to Radiohead and Blue Oyster Cult, I can now see what the draw is: they were really having fun. What a fantastic product this is, from a marketing and innovation standpoint. (The credit really goes to the original innovators, Wii and Guitar Hero.)





I thought yesterday's Day of Dot Net was a terrific success. We organized it on a grand scale and pulled it off, with record participation, great publicity from the local press, and a finale featuring Lansing Mayor Bernero's address. Thank you Joe, Jeff, Vivek and others for doing such a bang-up job. For our first event we outdid both Ann Arbor and Grand Rapids in scale and in the level of detailed planning. We managed to get very famous speakers, had an incredible amount of swag and other giveaways, and had great food and beverages. The LCC West campus was a fantastic facility. I felt excited to be part of this group.

I really enjoyed speaking at the event. It was very interactive, with lots of questions. A few attendees requested the code samples, which I will post shortly. Here is the presentation.