Regular Expression Replace


RegexReplace$(s$,f$,r$)


Returns a copy of the string (s$) with every match of the find string (f$) replaced by (r$).


This command works the same as Replace$() except that the find string (f$) may be a regular expression (regex).


WARNING: It is not possible to fully cover the topic of regular expressions here. For a complete explanation see the book "Mastering Regular Expressions" by Jeffrey E.F. Friedl. A brief overview of regular expressions is given below.


A regular expression is a string of characters that describes a pattern of text to match. For example, the sequence "tom", considered as a regular expression, would match any occurrence of "tom" inside a string. Regular expressions (sometimes referred to as 'regex' for short) contain both literal characters and metacharacters. In "tom", all three characters are literal, so all occurrences of the literal string "tom" would be replaced.

We might also have the regular expression "^tom". In this case, the '^' is a metacharacter: it does not match the character '^' itself, but instead indicates the beginning of the string.

For example:

Print RegexReplace$("tom went to the park with tom", "^tom", "bill")

bill went to the park with tom


Notice that only the "tom" at the beginning of the string was replaced.
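The difference between the literal pattern and the anchored pattern can be tried out in any regex-capable language. The sketch below uses Python's re module purely as a stand-in; it is not RegexReplace$ itself, and the regex engine behind RegexReplace$ may differ in details.

```python
import re

text = "tom went to the park with tom"

# Literal pattern: every occurrence of "tom" is replaced.
literal = re.sub(r"tom", "bill", text)

# Anchored pattern: "^" pins the match to the start of the string,
# so only the leading "tom" is replaced.
anchored = re.sub(r"^tom", "bill", text)

print(literal)   # bill went to the park with bill
print(anchored)  # bill went to the park with tom
```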


Here are a few common metacharacters to get you started:

Position Metacharacters

^      beginning of string
$      end of string
\\b    word boundary
\\B    a non-word boundary

Single Character Metacharacters

.      any one character
\\d    any digit from 0 to 9
\\w    any word character (a-z, A-Z, 0-9; most regex engines also include the underscore)
\\W    any non-word character
\\s    any whitespace character
\\S    any non-whitespace character
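A few of the position and single character metacharacters listed above are shown in action in the following sketch. It uses Python's re module as a stand-in, on the assumption that RegexReplace$ follows the same common regex conventions. Note that Python raw strings write a single backslash (r"\d") where this page writes "\\d".

```python
import re

# "\d" matches any single digit.
digits = re.sub(r"\d", "#", "room 42")

# "\s" matches any single whitespace character (space, tab, ...).
spaces = re.sub(r"\s", "_", "a b\tc")

# "." matches any one character (in Python, anything except a newline).
dot = re.sub(r"t.m", "X", "tom tim team")

# "\b" is a word boundary: "ing" is replaced only at the end of a word.
ing = re.sub(r"ing\b", "ed", "singing ringtone")

# "$" is the end of the string: only the trailing "tom" is replaced.
end = re.sub(r"tom$", "bill", "tom went to the park with tom")

print(digits)  # room ##
print(spaces)  # a_b_c
print(dot)     # X X team
print(ing)     # singed ringtone
print(end)     # tom went to the park with bill
```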

Quantifiers (each applies to the character that precedes it)

?            appearing once or not at all
*            appearing zero or more times
+            appearing one or more times
{min,max}    appearing within the specified range
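The quantifiers above can be compared side by side. The following is a minimal sketch using Python's re module as a stand-in for RegexReplace$; the sample strings are made up for illustration, and the engine behind RegexReplace$ may behave differently in edge cases.

```python
import re

# "?" makes the preceding character optional: matches "color" and "colour".
optional = re.sub(r"colou?r", "X", "color colour")

# "*" allows zero or more of the preceding character, so a lone "a" matches.
star = re.sub(r"ab*", "X", "a ab abbb")

# "+" requires at least one of the preceding character, so a lone "a" does not match.
plus = re.sub(r"ab+", "X", "a ab abbb")

# "{2,3}" matches runs of two to three digits; in "1234" the first
# three digits match and the "4" is left over.
ranged = re.sub(r"\d{2,3}", "#", "1 12 123 1234")

print(optional)  # X X
print(star)      # X X X
print(plus)      # a X X
print(ranged)    # 1 # # #4
```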

For example:

"^$" - matches the beginning of a line immediately followed by the end of a line, i.e. it matches any blank line.

"ing\\b" - matches 'ing' followed by a word boundary, i.e. any time 'ing' appears at the end of a word.

Character Classes

Character classes let you match any one character from a set and are written as characters enclosed in square brackets. [aeiou] means match any vowel. A "^" as the first character inside the brackets negates the class: [^aeiou] means match any character which is not a vowel. This is not limited to letters; it matches anything at all that is not an a, e, i, o, or u, including digits, spaces and punctuation. A hyphen indicates a range of characters, such as [0-9] or [a-z].
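As a rough illustration of character classes, here is a sketch using Python's re module as a stand-in for RegexReplace$. The bracket syntax shown is common to most regex engines, but treat the exact behavior of RegexReplace$ as an assumption.

```python
import re

# [aeiou] matches any single lowercase vowel.
vowels = re.sub(r"[aeiou]", "*", "regular expression")

# [^aeiou] matches any single character that is NOT a lowercase vowel,
# including consonants, spaces, digits and punctuation.
non_vowels = re.sub(r"[^aeiou]", "-", "tom 42!")

# [0-9] uses a hyphen to express a range of characters.
digits = re.sub(r"[0-9]", "#", "call 555-1234")

print(vowels)      # r*g*l*r *xpr*ss**n
print(non_vowels)  # -o-----
print(digits)      # call ###-####
```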

Another key metacharacter is "|", meaning "or". This is known as alternation.

For example:

"John|Jon" - match "John" or "Jon"

Note: this regex could also be written as "Joh?n", which matches "Jon" with an optional "h" between the "o" and the "n".
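The equivalence of the two patterns can be checked with a short sketch. It uses Python's re module as a stand-in for RegexReplace$, and the sample sentence is made up for illustration.

```python
import re

names = "John, Jon and Joan"

# Alternation: match either "John" or "Jon".
alternation = re.sub(r"John|Jon", "J.", names)

# Optional "h": the same two names expressed with the "?" quantifier.
optional_h = re.sub(r"Joh?n", "J.", names)

print(alternation)  # J., J. and Joan
print(optional_h)   # J., J. and Joan
```

"Joan" is left alone by both patterns, since neither allows an "a" between the "o" and the "n".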





