The Long and Short of It
Coding Without and With Regular Expressions
Issue: 12.2 (March/April 2014)
Author: Kem Tekinay
Author Bio: Kem Tekinay is a Macintosh consultant and programmer who started with Xojo when it was still REALbasic. He is the author of RegExRX (http://www.mactechnologies.com/index.php:i?page=downloads#regexrx), the popular regular expression editor for Mac and Windows.
Article Length (in bytes): 17,747
Starting Page Number: 76
Article Number: 12215
project12215.zip Updated: 2014-03-03 00:48:20
Excerpt of article text...
I was reminded of a project I had done long ago for a client who needed an easy way to import information from a web site into a spreadsheet without having to retype it all. The site itself offered no solution, so I created an AppleScript to parse the HTML and copy the results to the clipboard. All they had to do was open the page in Safari, run the script, switch to Excel and paste. That made them happy for years.
It was less thrilling for me. This was before I had learned about regular expressions, so I had to write a lot of code to parse the text and pick out the relevant bits. Every time the site changed its format, the script would break and force me to re-identify the proper text markers. Sometimes the difference was just an additional space or return, but the triviality of it didn't matter.
Over time, as the script broke again, I changed parts of the code to use regular expressions to make it more flexible. By writing patterns that anticipated simple modifications like additional whitespace, the time between failures increased measurably. It also made the script easier to read and maintain because I could get the same results with less code.
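To make the whitespace point concrete, here is a minimal sketch, written in Python rather than AppleScript or Xojo so it can be run anywhere. Everything in it is hypothetical: the HTML fragments, the "Price:" label, and both function names are assumptions invented for illustration, not taken from the original script. It compares a search for exact text markers with a pattern that tolerates extra spaces and returns.

```python
import re

# Hypothetical fragments: the same table cell before and after a site
# change that added a return and some extra spaces.
OLD_HTML = "<td>Price: 19.95</td>"
NEW_HTML = "<td>\n  Price:   19.95 </td>"


def extract_with_markers(html):
    """Pick out the value by searching for exact text markers."""
    start_marker = "<td>Price: "
    end_marker = "</td>"
    start = html.find(start_marker)
    if start == -1:
        return None  # the opening marker no longer matches exactly
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end]


# \s* allows any amount of whitespace (including none) at each spot where
# the site might add some; the parentheses form a subgroup that captures
# the value itself.
PRICE_PATTERN = re.compile(r"<td>\s*Price:\s*(\S+)\s*</td>")


def extract_with_regex(html):
    """Pick out the value with a whitespace-tolerant pattern."""
    match = PRICE_PATTERN.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    for label, html in (("old", OLD_HTML), ("new", NEW_HTML)):
        print(label, extract_with_markers(html), extract_with_regex(html))
```

In this sketch the marker-based function finds the value only in the old fragment, while the pattern finds it in both, and does so with less code.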
This taught me the same valuable lesson you'd learn the first time you tried to use pliers to loosen a stubborn bolt: Any tool might work, but the right tool will make your life easier.
To illustrate, I will tackle a simpler version of my client's request three ways: with pure Xojo code, with a regular expression within Xojo, and with a slightly more complex regular expression. Although other options exist, I will use only the tokens that I've reviewed in previous columns. In other words, I'm not going to get fancy about it. (The one exception is subgroups, which are essential to this, and to almost every, regular expression project. I'll be covering those in the future.)
Defining The Problem
...End of Excerpt. Please purchase the magazine to read the full article.