Last modified on 13 November 2013, at 05:07

JavaScript/Regular Expressions


JavaScript supports regular expressions (regex for short) for finding matches within a string. In the following example, we replace the word "be" with the word "exist" in a text:

    var shakespeareText = "To be or not to be? That is the question.";

    // The "g" option replaces every occurrence, not only the first one.
    var regularExpression = new RegExp("be", "g");
    var spoiltShakespeareText = shakespeareText.replace(regularExpression, "exist");

    alert(spoiltShakespeareText);

It will display the following alert:

To exist or not to exist? That is the question.

As in other scripting languages, regular expressions allow searches that go beyond a simple letter-by-letter match, and they can even be used to parse strings in a given format. They most commonly appear in conjunction with the string.match() and string.replace() methods. A regular expression object can be created with the RegExp constructor. The first parameter is the pattern and the second is the options:

    var regularExpression = new RegExp("be", "g");

Alternatively, a regular expression can be written as a literal: the pattern is delimited by slash (/) characters, and options may be appended after the closing slash:

    var regularExpression = /be/g;
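
The two notations can be compared side by side. The following is a minimal sketch, not a definitive reference: the variable names and the sample word are illustrative assumptions. It also shows that a backslash must be doubled in the constructor's string form, because the string literal consumes one level of escaping.

```javascript
// Both notations build an equivalent regular expression object.
var fromConstructor = new RegExp("be", "g");
var fromLiteral = /be/g;

var text = "To be or not to be?";
console.log(text.replace(fromConstructor, "exist")); // To exist or not to exist?
console.log(text.replace(fromLiteral, "exist"));     // To exist or not to exist?

// In the string form a backslash has to be escaped itself:
// the string "\\d+" holds the pattern \d+.
var digitsFromString = new RegExp("\\d+");
var digitsFromLiteral = /\d+/;
console.log(digitsFromString.source === digitsFromLiteral.source); // true

// The constructor is handy when the pattern is only known at run time.
var word = "not"; // hypothetical value, e.g. taken from user input
var dynamic = new RegExp(word, "g");
console.log(text.replace(dynamic, "NOT")); // To be or NOT to be?
```

The literal form is usually shorter to read for fixed patterns, while the constructor form is the one to reach for when the pattern is assembled from variables.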
    

Compatibility

JavaScript's regular expression syntax follows the extended set. A regex pattern copied from JavaScript to another program will often work unchanged, but some older programs expect a different syntax and may not interpret the pattern as intended.

  • In the search term, \1 is used to back reference a matched group, as in other implementations.
  • In the replacement string, $1 is substituted with a matched group in the search, instead of \1.
    • Example: "abbc".replace(/(.)\1/g, "$1") => "abc"
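  • | is a metacharacter (alternation); \| is a literal vertical bar.
  • ( is a metacharacter (it opens a group); \( is a literal parenthesis.
  • The lookahead syntaxes (?=...) and (?!...) are available. The lookbehind syntaxes (?<=...) and (?<!...) were only standardized in ECMAScript 2018 and are not available in older engines.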
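
The back reference and escaping rules listed above can be tried out directly. The snippet below is a small sketch; the sample strings and variable names are assumptions chosen for illustration.

```javascript
// \1 inside the pattern refers back to the first captured group,
// so (.)\1 matches any character that is immediately repeated.
var collapsed = "abbc".replace(/(.)\1/g, "$1");
console.log(collapsed); // abc

// $1 and $2 in the replacement string insert the captured groups.
var swapped = "John Smith".replace(/(\w+) (\w+)/, "$2, $1");
console.log(swapped); // Smith, John

// Unescaped, | means "or" and ( opens a group.
var alternation = "cat dog".replace(/cat|dog/g, "pet");
console.log(alternation); // pet pet

// Escaped, both characters are matched literally.
var literal = "a|b(c".replace(/\|/g, "-").replace(/\(/g, "-");
console.log(literal); // a-b-c
```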

Matching

    var result = "Hello world!".match(/world/);    // Array whose first member is "world", or null if nothing matches
    var stringArray = "Hello world!".match(/l/g);  // With g, all matched strings are returned in a string array
    var subgroup = "abc".match(/a(b)c/)[1];        // "b": the matched subgroup is the second member (index 1) of the resulting array
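
What match() returns depends on whether the g modifier is present and on whether anything matched at all. The sketch below, with illustrative variable names, walks through the three cases.

```javascript
// Without g, match() returns the first match together with its
// captured groups, or null when nothing matches.
var first = "Hello world!".match(/w(or)ld/);
console.log(first[0]);    // world
console.log(first[1]);    // or
console.log(first.index); // 6

// With g, match() returns every matched string and no groups.
var all = "Hello world!".match(/l/g);
console.log(all.length);  // 3

// Guard against null before indexing into the result.
var none = "Hello world!".match(/xyz/);
var count = none ? none.length : 0;
console.log(count);       // 0
```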
    

Replacement

    string = string.replace(/expression without quotation marks/g, "replacement");
    string = string.replace(/escape the slash in this\/way/g, "replacement");
    string = string.replace( ... ).replace( ... ).replace( ... ); // Calls can be chained
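
A concrete version of the chained form might look as follows. This is only a sketch: the input path and the three substitutions are hypothetical and were picked to show the escaped slash and the effect of the g modifier.

```javascript
var path = "C:/temp/my file.txt"; // hypothetical input

// A literal slash inside a /.../ pattern must be written as \/
var cleaned = path
  .replace(/\//g, "\\")        // every slash becomes a backslash
  .replace(/\s+/g, "_")        // each run of whitespace becomes one underscore
  .replace(/\.txt$/, ".bak");  // only a trailing ".txt" is replaced

console.log(cleaned); // C:\temp\my_file.bak

// Without g, only the first occurrence is replaced.
var firstOnly = "a-b-c".replace(/-/, "+");
console.log(firstOnly); // a+b-c
```

Each replace() call returns a new string and leaves the original untouched, which is what makes the chaining possible.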
    

Test

    if (string.match(/regexp without quotation marks/)) {
      // Executed when the regular expression matches the string
    }
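
When only a yes/no answer is needed, the test() method of the regular expression object is an alternative to match(). The following sketch assumes a hypothetical date pattern purely for illustration.

```javascript
var datePattern = /^\d{4}-\d{2}-\d{2}$/; // hypothetical pattern: YYYY-MM-DD

console.log(datePattern.test("2013-11-13")); // true
console.log(datePattern.test("13/11/2013")); // false

// match() used as a condition behaves the same way, because it returns
// null (falsy) when there is no match and an array (truthy) otherwise.
var viaMatch = "2013-11-13".match(datePattern) ? true : false;
console.log(viaMatch); // true
```

A regular expression carrying the g modifier remembers its position between test() calls, so this sketch deliberately leaves g out.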
    

Modifiers

Modifier  Note
g         Global. Every match is processed instead of only the first one; match() returns the list of matches in an array.
i         Case-insensitive search.
m         Multiline. If the operand string has multiple lines, ^ and $ match the beginning and end of each line within the string, instead of matching the beginning and end of the whole string only.

  • "a\nb\nc".replace(/^b$/g,"d") => "a\nb\nc"
  • "a\nb\nc".replace(/^b$/gm,"d") => "a\nd\nc"
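
The modifiers can be combined. The sketch below uses an assumed three-line sample string to show the effect of each one.

```javascript
var text = "Apple\napple\nBanana";

// Without modifiers only the first, case-sensitive match is found.
console.log(text.match(/apple/)[0]);        // apple
console.log(text.match(/apple/).index);     // 6

// g finds every match; i ignores case.
console.log(text.match(/apple/gi).length);  // 2

// Without m, ^ and $ only match at the very start and end of the string.
console.log(text.match(/^\w+$/g));          // null

// With m, they match at each line break as well.
console.log(text.match(/^\w+$/gm).length);  // 3
```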

Operators

Operator  Effect
\b        Matches the boundary of a word.
\w        Matches an alphanumeric character, including "_".
\W        Negation of \w.
\s        Matches a whitespace character (such as space, tab, newline, form feed).
\S        Negation of \s.
\d        Matches a digit.
\D        Negation of \d.
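
The operators can be seen at work on a short string. This is an illustrative sketch; the sample sentence is made up.

```javascript
var text = "Order 66 shipped to dock_7 on day 3";

// \d+ : runs of digits
console.log(text.match(/\d+/g).join(","));      // 66,7,3

// \b\d+\b : digits that form a whole word. The "7" in "dock_7" is
// excluded, because "_" counts as a word character.
console.log(text.match(/\b\d+\b/g).join(","));  // 66,3

// \w+ : runs of word characters
console.log(text.match(/\w+/g).length);         // 8

// \s+ : runs of whitespace, used here as a separator
console.log(text.split(/\s+/).length);          // 8

// \D : anything that is not a digit
console.log("a1b2".replace(/\D/g, ""));         // 12

// \W : anything that is not a word character
console.log("to be?".replace(/\W/g, "_"));      // to_be_
```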

Function call

For complex operations, a function can process the matched substrings. In the following code, we capitalize every word. This cannot be done with a simple replacement string, because the letter to capitalize differs from word to word:

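    // Called once per match; matchobj is the matched substring,
    // for example " be" (a non-word character followed by a word).
    var capitalize = function(matchobj) {
      var group1 = matchobj.replace(/^(\W)[a-zA-Z]+$/g, "$1");          // Leading non-word character
      // * rather than + in the next two patterns, so that one-letter words also work
      var group2 = matchobj.replace(/^\W([a-zA-Z])[a-zA-Z]*$/g, "$1");  // First letter
      var group3 = matchobj.replace(/^\W[a-zA-Z]([a-zA-Z]*)$/g, "$1");  // Remaining letters
      return group1 + group2.toUpperCase() + group3;
    };

    var shakespeareText = "To be or not to be? That is the question.";

    // The pattern requires a preceding non-word character, so the first word
    // of the text is never matched; here it is already capitalized.
    var spoiltShakespeareText = shakespeareText.replace(/\W[a-zA-Z]+/g, capitalize);

    alert(spoiltShakespeareText);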

It will display the following alert:

To Be Or Not To Be? That Is The Question.

The function is called once for each matching substring. Here is the signature of the function:

    function (<matchedSubstring>[, <capture1>, ...<captureN>, <indexInText>, <entireText>]) {
      ...
      return <stringThatWillReplaceInText>;
    }

  • The first parameter is the substring that matches the pattern,
  • The next parameters are the captures in the substring. There are as many parameters as there are captures,
  • The next parameter is the index at which the substring begins, counted from the beginning of the text,
  • The last parameter is the entire text being searched,
  • The return value will be put in the text instead of the matching substring.
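
Because the captures are passed in as parameters, the capitalization example can also be written without the inner replace() calls. The following is one possible sketch, not the only way to do it; the parameter names (match, first, rest, index, whole) and the word-boundary pattern are assumptions made for illustration.

```javascript
var text = "To be or not to be? That is the question.";

var offsets = [];          // start index of every matched word
var sawWholeText = true;   // stays true if the last parameter is always the full text

// first and rest receive the two captures; index and whole are the
// two trailing parameters of the signature.
var capitalized = text.replace(/\b([a-zA-Z])([a-zA-Z]*)/g,
  function (match, first, rest, index, whole) {
    offsets.push(index);
    sawWholeText = sawWholeText && whole === text;
    return first.toUpperCase() + rest;
  });

console.log(capitalized);                    // To Be Or Not To Be? That Is The Question.
console.log(offsets.slice(0, 3).join(","));  // 0,3,6
console.log(sawWholeText);                   // true
```

Using \b instead of a leading \W means the first word of the text is matched as well, and one-letter words need no special handling because the second capture may be empty.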
