
Issue 1.1

COLUMN

Regular Expressions Overdrive

Issue: 1.1 (August/September 2002)
Author: Didier Barbas
Author Bio: Didier has been a dilettante programmer and linguist for more than 20 years. Unusual for a Frenchman, he speaks 11 languages, including Korean and PowerPC machine-language; he manages the Korean branch of a Dutch company that doesn't do banking, chemicals, or consumer products. Go figure!
Article Description: Advanced regular expressions.
Article Length (in bytes): 18,156
Starting Page Number: 46
Article Number: 1016
Resource File(s):

1016.zip (updated 2013-03-11 19:07:55)

Related Link(s): None

Excerpt of article text...

This article assumes that you have already covered the basics of regular expressions (RegExes) and have at least read Matt Neuburg's article on page ## of this issue. We will focus here on techniques that will make your coding (and your life) easier. These techniques are answers to real-life problems, some of them my own and some raised in questions on the REALbasic discussion lists. I will also show that regular expressions are not always the right tool: some tasks require extra help, and some are simply not a good fit.

Just don't bother.

A discussion some time ago on one of the REALbasic discussion lists concerned how to suppress extra spaces in a text. The pattern that most people think of first is [\t ]+, replaced with a single space. In the discussion, it was argued that the correct pattern should be [\t ][\t ]+, since RB's RegEx engine should start matching only when there are at least two tabs or spaces in a row. It was, however, noted that the speed difference on average-sized texts was negligible, at least from the standpoint of a human being (applied to this article, which has few double spaces, [\t ][\t ]+ is six times faster than [\t ]+). All this discussion, while fascinating, was quite academic, since a) we're talking microseconds or milliseconds, not seconds, and b) another fellow had come up with an example using replaceAll, which was much faster. I tweaked it a little further and made it even faster by changing inStr to inStrB, and by adding a line of code that first removes odd numbers of spaces:
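The author's REALbasic listing is not part of this excerpt. As a rough illustration only, the three approaches discussed above might be sketched as follows in Python (not REALbasic). The function names are hypothetical, the plain-replace loop is an assumption about what the replaceAll example may have looked like, and the author's inStrB change and odd-number-of-spaces step are not reproduced here.

```python
import re

def collapse_regex(text: str) -> str:
    # First pattern from the discussion: any run of tabs/spaces,
    # even a run of length one, becomes a single space.
    return re.sub(r"[\t ]+", " ", text)

def collapse_regex_two_or_more(text: str) -> str:
    # Second pattern: only runs of two or more tabs/spaces are replaced,
    # so a lone tab or space is left untouched.
    return re.sub(r"[\t ][\t ]+", " ", text)

def collapse_replace_loop(text: str) -> str:
    # Assumed non-regex variant: turn tabs into spaces, then keep
    # replacing double spaces with single ones until none remain.
    text = text.replace("\t", " ")
    while "  " in text:
        text = text.replace("  ", " ")
    return text

if __name__ == "__main__":
    sample = "a  b\t\tc   d \t e"
    print(repr(collapse_regex(sample)))
    print(repr(collapse_regex_two_or_more(sample)))
    print(repr(collapse_replace_loop(sample)))
```

Note that the three variants are not strictly equivalent: the second pattern leaves a single tab in place, while the other two turn it into a space. Any timing comparison in another language or regex engine would need to be measured separately and may not match the figures quoted above.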

...End of Excerpt. Please purchase the magazine to read the full article.