Nu Class Reference

NuRegex

A Perl-compatible regular expression class.

Superclass: NSObject
Declared in: objc/regex.h

NuRegex is a slight modification of Aram Greenman's AGRegex, updated to use the latest PCRE and to use UTF-8 character encodings by default. The NuRegex documentation is from the AGRegex distribution and contains only minor modifications for Nu.

In Nu source code, a NuRegex may be created using the regex operator. A NuRegex may also be created with -initWithPattern: or -initWithPattern:options: or the corresponding class methods +regexWithPattern: or +regexWithPattern:options:. These take a regular expression pattern string and the bitwise OR of zero or more option flags. For example:

    NuRegex *regex = [[NuRegex alloc] initWithPattern:@"(paran|andr)oid" options:NuRegexCaseInsensitive];
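
Because the creation methods return nil when the pattern string is invalid, it can be worth checking the result before using it. The following is a minimal sketch, not part of NuRegex: the helper name CompileOrLog is hypothetical, and the import path is an assumption based on the "Declared in" line above.

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: compiles a case-insensitive regex and logs
    // a message when the pattern is rejected (the result is then nil).
    static NuRegex *CompileOrLog(NSString *pattern)
    {
        NuRegex *regex = [NuRegex regexWithPattern:pattern options:NuRegexCaseInsensitive];
        if (regex == nil)
            NSLog(@"could not compile pattern: %@", pattern);
        return regex;
    }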

Matching is done with -findInString: or -findInString:range:, which look for the first occurrence of the pattern in the target string and return a NuRegexMatch, or nil if the pattern was not found.

    NuRegexMatch *match = [regex findInString:@"paranoid android"];

A match object returns a captured subpattern by -group, -groupAtIndex:, or -groupNamed:, or the range of a captured subpattern by -range, -rangeAtIndex:, or -rangeNamed:. The subpatterns are indexed in order of their opening parentheses: 0 is the entire pattern, 1 is the first capturing subpattern, and so on. -count returns the total number of subpatterns, including the pattern itself. The following prints the result of our last match:

    int i;
    for (i = 0; i < [match count]; i++)
        NSLog(@"%d %@ %@", i, NSStringFromRange([match rangeAtIndex:i]), [match groupAtIndex:i]);


    0 {0, 8} paranoid
    1 {0, 5} paran


If any of the subpatterns didn't match, -groupAtIndex: will return nil, and -rangeAtIndex: will return {NSNotFound, 0}. For example, if we change our original pattern to "(?:(paran)|(andr))oid" we will get the following output (NSNotFound prints as 2147483647 on a 32-bit runtime):

    0 {0, 8} paranoid
    1 {0, 5} paran
    2 {2147483647, 0} (null)
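
Since the printed value of NSNotFound depends on the platform, testing -groupAtIndex: for nil is a more portable way to detect a subpattern that took no part in the match. A minimal sketch, assuming the header is imported as "regex.h"; the helper name LogGroups is hypothetical:

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: logs each capturing subpattern of a match,
    // flagging the ones for which -groupAtIndex: returns nil.
    static void LogGroups(NuRegexMatch *match)
    {
        int i;
        int count = (int)[match count];
        for (i = 1; i < count; i++) {
            NSString *group = [match groupAtIndex:i];
            if (group == nil)
                NSLog(@"subpattern %d did not match", i);
            else
                NSLog(@"subpattern %d: %@", i, group);
        }
    }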


-findAllInString: and -findAllInString:range: return an NSArray of all non-overlapping occurrences of the pattern in the target string. -findEnumeratorInString: and -findEnumeratorInString:range: return an NSEnumerator for all non-overlapping occurrences of the pattern in the target string. For example:

    NSArray *all = [regex findAllInString:@"paranoid android"];

The first object in the returned array is the match case for "paranoid" and the second object is the match case for "android".
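
Both forms can be walked in the usual Cocoa ways. The sketch below is illustrative only: the helper name LogAllMatches is hypothetical, the import path is assumed, and fast enumeration requires Objective-C 2.0.

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: logs the text of every match, first through
    // the array form and then through the enumerator form.
    static void LogAllMatches(NuRegex *regex, NSString *target)
    {
        NSArray *all = [regex findAllInString:target];
        for (NuRegexMatch *match in all)
            NSLog(@"array: %@", [match group]);

        NSEnumerator *enumerator = [regex findEnumeratorInString:target];
        NuRegexMatch *next;
        while ((next = [enumerator nextObject]) != nil)
            NSLog(@"enumerator: %@", [next group]);
    }

The array form is convenient when the number of matches is needed up front; the enumerator form suits a single pass over the matches.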

NuRegex provides the methods -replaceWithString:inString: and -replaceWithString:inString:limit: to perform substitution on strings.

    NuRegex *regex = [NuRegex regexWithPattern:@"remote"];
    NSString *result = [regex replaceWithString:@"complete" inString:@"remote control"]; // result is "complete control"
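
The limit: variant caps the number of replacements when the limit is positive. As a rough sketch (the helper name ReplaceOnce is hypothetical and the import path is assumed), a limit of 1 should leave every occurrence after the first replaced one untouched:

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: performs at most one replacement by passing
    // a positive limit of 1.
    static NSString *ReplaceOnce(NuRegex *regex, NSString *replacement, NSString *target)
    {
        return [regex replaceWithString:replacement inString:target limit:1];
    }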


Captured subpatterns can be interpolated into the replacement string using the syntax $x or ${x} where x is the index or name of the subpattern. $0 and $& both refer to the entire pattern. Additionally, the case modifier sequences \U...\E, \L...\E, \u, and \l are allowed in the replacement string. All other escape sequences are handled literally.

    NuRegex *regex = [NuRegex regexWithPattern:@"[usr]"];
    NSString *result = [regex replaceWithString:@"\\u$&." inString:@"Back in the ussr"]; // result is "Back in the U.S.S.R."


Note that you have to escape a backslash to get it into an NSString literal.
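
The span modifiers work the same way as \u above. The following sketch wraps the whole match in \U...\E, which is intended to upper-case each matched substring; the helper name UppercaseMatches is hypothetical and the import path is assumed.

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: replaces every match with itself, wrapped in
    // the \U...\E case modifier. Each backslash is doubled so that it
    // survives the NSString literal.
    static NSString *UppercaseMatches(NuRegex *regex, NSString *target)
    {
        return [regex replaceWithString:@"\\U$&\\E" inString:target];
    }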

Named subpatterns may also be used in the pattern and replacement strings, as in Python.

    NuRegex *regex = [NuRegex regexWithPattern:@"(?P<who>\\w+) is a (?P<what>\\w+)"];
    NSString *result = [regex replaceWithString:@"Jackie is a $what, $who is a runt" inString:@"Judy is a punk"]; // result is "Jackie is a punk, Judy is a runt"
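
Named subpatterns can also be read back from a match object with -groupNamed:. A minimal sketch using the same pattern; the helper name LogWhoAndWhat is hypothetical and the import path is assumed:

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: looks up the "who" and "what" subpatterns by
    // name instead of by index.
    static void LogWhoAndWhat(NSString *sentence)
    {
        NuRegex *regex = [NuRegex regexWithPattern:@"(?P<who>\\w+) is a (?P<what>\\w+)"];
        NuRegexMatch *match = [regex findInString:sentence];
        if (match == nil) {
            NSLog(@"no match in: %@", sentence);
            return;
        }
        NSLog(@"who=%@ what=%@", [match groupNamed:@"who"], [match groupNamed:@"what"]);
    }

Called with @"Judy is a punk", this logs who=Judy what=punk.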


Finally, NuRegex provides -splitString: and -splitString:limit: which return an NSArray created by splitting the target string at each occurrence of the pattern. For example:

    NuRegex *regex = [NuRegex regexWithPattern:@"ea?"];
    NSArray *result = [regex splitString:@"Repeater"]; // result is "R", "p", "t", "r"


If there are captured subpatterns, they are returned in the array.

    NuRegex *regex = [NuRegex regexWithPattern:@"e(a)?"];
    NSArray *result = [regex splitString:@"Repeater"]; // result is "R", "p", "a", "t", "r"


In Perl, this would return "R", undef, "p", "a", "t", undef, "r". Unfortunately, there is no convenient way to represent this in an NSArray. (NSNull could be used in place of undef, but then not every member of the array could be expected to be an NSString.)
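
The limit: variant bounds the number of splits when the limit is positive. A small sketch, with a hypothetical helper name (SplitFields), an illustrative comma pattern that is not from the NuRegex distribution, and an assumed import path:

    #import <Foundation/Foundation.h>
    #import "regex.h"   // assumed import path; adjust for your project

    // Hypothetical helper: splits a line at commas (plus any trailing
    // whitespace), making no more than maxSplits splits when maxSplits
    // is positive.
    static NSArray *SplitFields(NSString *line, int maxSplits)
    {
        NuRegex *comma = [NuRegex regexWithPattern:@",\\s*"];
        return [comma splitString:line limit:maxSplits];
    }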

Methods

+ (id) regexWithPattern: (NSString *) pat
Creates a new regex using the given pattern string. Returns nil if the pattern string is invalid.

in objc/regex.h

+ (id) regexWithPattern: (NSString *) pat
options: (int) opts
Creates a new regex using the given pattern string and option flags. Returns nil if the pattern string is invalid.

in objc/regex.h

- (NSArray *) findAllInString: (NSString *) str
Calls findAllInString:range: using the full range of the target string.

in objc/regex.h

- (NSArray *) findAllInString: (NSString *) str
range: (NSRange) r
Returns an array of all non-overlapping occurrences of the regex in the given range of the target string. The members of the array are NuRegexMatches.

in objc/regex.h

- (NSEnumerator *) findEnumeratorInString: (NSString *) str
Calls findEnumeratorInString:range: using the full range of the target string.

in objc/regex.h

- (NSEnumerator *) findEnumeratorInString: (NSString *) str
range: (NSRange) r
Returns an enumerator for all non-overlapping occurrences of the regex in the given range of the target string. The objects returned by the enumerator are NuRegexMatches.

in objc/regex.h

- (NuRegexMatch *) findInString: (NSString *) str
Calls findInString:range: using the full range of the target string.

in objc/regex.h

- (NuRegexMatch *) findInString: (NSString *) str
range: (NSRange) r
Returns a NuRegexMatch for the first occurrence of the regex in the given range of the target string, or nil if none is found.

in objc/regex.h

- (id) initWithPattern: (NSString *) pat
Initializes the regex using the given pattern string. Returns nil if the pattern string is invalid.

in objc/regex.h

- (id) initWithPattern: (NSString *) pat
options: (int) opts
Initializes the regex using the given pattern string and option flags. Returns nil if the pattern string is invalid.

in objc/regex.h

- (int) options
Returns the options used to create the regex.

in objc/regex.h

- (NSString *) pattern
Returns the pattern used to create the regex.

in objc/regex.h

- (NSString *) replaceWithString: (NSString *) rep
inString: (NSString *) str
limit: (int) limit
Returns the string created by replacing occurrences of the regex in the target string with the replacement string. If the limit is positive, no more than that many replacements will be made.

Captured subpatterns can be interpolated into the replacement string using the syntax $x or ${x} where x is the index or name of the subpattern. $0 and $& both refer to the entire pattern. Additionally, the case modifier sequences \U...\E, \L...\E, \u, and \l are allowed in the replacement string. All other escape sequences are handled literally.

in objc/regex.h

- (NSArray *) splitString: (NSString *) str
Calls splitString:limit: with no limit.

in objc/regex.h

- (NSArray *) splitString: (NSString *) str
limit: (int) lim
Returns an array of strings created by splitting the target string at each occurrence of the pattern. If the limit is positive, no more than that many splits will be made. If there are captured subpatterns, they are returned in the array.

in objc/regex.h