.NET Regex Reference

About Regular Expressions

Regular expressions (often abbreviated "regex") are written in a formal language and provide a powerful and concise way to find complex patterns inside text. Although regular expressions can seem cryptic and confusing at first, they can also save you hours of writing procedural code to perform the same task.

Regular Expressions in the Microsoft .NET Framework

.NET provides a powerful regular expression engine modeled on the often-imitated Perl 5 implementation, so most Perl regular expressions work unchanged in .NET. The .NET engine is nonetheless a separate implementation with features of its own, such as the balancing group definitions listed under Grouping.

Syntax Reference for .NET Regular Expressions

The reference below is based on material provided by MSDN. For more help, see Microsoft's Developer Guide for Regular Expressions.

Characters

The following expressions will match single characters. For more information see Microsoft's article on Character Classes.

Ordinary characters Characters other than . $ ^ { [ ( | ) * + ? \ match themselves.
a matches a and b matches b
. Matches any single character except the line feed (\n). In single-line mode it also matches the line feed.
. matches a or 1 or almost anything else
[abc] A bracket expression (may contain more than one character). Matches any single character listed within the brackets; the order of the listed characters does not matter.
[abc] matches a, b, or c
[^abc] The opposite of [abc]. Matches any single character not listed within the brackets.
[^abc] matches anything except a, b, or c
[first-last] Character range: Matches any single character in the range from first to last.
[a-z] matches a, m, or z
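\w Matches a word character. By default this covers Unicode letters, decimal digits, and the underscore (plus other connector punctuation); with the ECMAScript option it is limited to a-z, A-Z, 0-9, and underscore.
\w matches a or b
\W The opposite of \w. Matches any non-word character.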
\W matches - but does not match a
\d Matches a decimal digit (0-9 and, by default, decimal digits from other Unicode scripts).
\d matches 1 or 2
\D The opposite of \d. Matches any character that is not a decimal digit.
\D matches a or b
\s Matches a whitespace character (space, tab, carriage return, line feed, and so on).
a\sb matches a b
\S The opposite of \s. Matches any non-whitespace character.
a\Sb matches a-b
\r Matches a carriage return.
a\rb matches a and b separated by a carriage return
\n Matches a new line (line feed).
a\nb matches a and b separated by a line feed
\f Matches a form feed.
\t Matches a tab.
a\tb matches a    b
\v Matches a vertical tab.
\a Matches a bell character.
\b In a character class, matches a backspace.
\e Matches an escape.
\040 Uses octal representation to specify a character (octal consists of up to three digits).
\x20 Uses hexadecimal representation to specify a character (hex consists of exactly two digits).
\cC Matches the ASCII control character specified by the letter that follows \c; for example, \cC matches Ctrl-C (\u0003).
\u0020 Matches a Unicode character by using hexadecimal representation (exactly four digits).
\p{name} Matches any single character in the Unicode general category or named block specified by name.
\P{name} Matches any single character that is not in the Unicode general category or named block specified by name.
\ In front of any of the special characters (. $ ^ { [ ( | ) * + ? \), this will match the character itself.
\$5 matches $5 and \\ matches \
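
The character expressions above can be tried with a short C# sketch. It assumes C# top-level statements (C# 9 or later) and the System.Text.RegularExpressions namespace; the sample strings and variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Each pattern below is tested against a short sample string.
bool inSet = Regex.IsMatch("b", "[abc]");            // true: b is listed
bool notInSet = Regex.IsMatch("d", "[^abc]");        // true: d is not listed
bool inRange = Regex.IsMatch("m", "[a-z]");          // true: m lies between a and z
bool space = Regex.IsMatch("a b", @"a\sb");          // true: \s matches the space
bool nonSpace = Regex.IsMatch("a-b", @"a\Sb");       // true: \S matches the hyphen
bool literalDollar = Regex.IsMatch("$5", @"\$5");    // true: escaped $ is literal
bool upper = Regex.IsMatch("A", @"\p{Lu}");          // true: Lu is the uppercase-letter category

// The dot skips the line feed unless RegexOptions.Singleline is set.
bool dotDefault = Regex.IsMatch("a\nb", "a.b");                               // false
bool dotSingleline = Regex.IsMatch("a\nb", "a.b", RegexOptions.Singleline);   // true

Console.WriteLine($"{inSet} {notInSet} {inRange} {space} {nonSpace} {literalDollar} {upper}");
Console.WriteLine($"{dotDefault} {dotSingleline}");
```

Note that Regex.IsMatch takes the input string first and the pattern second.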

Assertions

The following expressions assert a condition at the current position in the input but do not consume any characters themselves.

^ The match must start at the beginning of the string (or beginning of the line in multiline mode).
^cat matches cat but does not match bobcat
$ The match must occur at the end of the string or before \n at the end of the string (or end of the line in multiline mode).
dog$ matches dog but does not match dogfight
\A The match must occur at the start of the string.
\Z The match must occur at the end of the string or before \n at the end of the string.
\z The match must occur at the end of the string.
\G The match must occur at the point where the previous match ended.
\b Asserts a boundary between word and non-word characters.
grape\b matches grape, cherry but does not match grapefruit
\B The opposite of \b. Asserts a location that is not a boundary between word and non-word characters.
grape\B matches grapefruit but does not match grape, cherry
(?=pattern) Asserts that the specified pattern exists immediately after this location. Known as a positive lookahead.
too many(?= secrets) matches too many secrets but does not match too many
(?!pattern) Asserts that the specified pattern does not exist immediately after this location. Known as a negative lookahead.
too many(?! secrets) matches too many but does not match too many secrets
(?<=pattern) Asserts that the specified pattern exists immediately before this location. Known as a positive lookbehind.
(?<=too )many secrets matches too many secrets but does not match many secrets
(?<!pattern) Asserts that the specified pattern does not exist immediately before this location. Known as a negative lookbehind.
(?<!too )many secrets matches many secrets but does not match too many secrets
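
As a rough illustration of how assertions constrain a match without becoming part of it, here is a small C# sketch. It assumes C# top-level statements (C# 9 or later); the variable names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// \b requires a word boundary right after "grape".
bool boundaryHit = Regex.IsMatch("grape, cherry", @"grape\b");   // true
bool boundaryMiss = Regex.IsMatch("grapefruit", @"grape\b");     // false

// A lookahead checks the following text but leaves it out of the match.
Match ahead = Regex.Match("too many secrets", "too many(?= secrets)");
Console.WriteLine(ahead.Value);    // too many

// A lookbehind checks the preceding text but leaves it out of the match.
Match behind = Regex.Match("too many secrets", "(?<=too )many secrets");
Console.WriteLine(behind.Value);   // many secrets

// $ tolerates one trailing \n at the end of the string; \z does not.
bool dollarEnd = Regex.IsMatch("dog\n", "dog$");     // true
bool strictEnd = Regex.IsMatch("dog\n", @"dog\z");   // false
```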

Quantifiers

The following expressions will indicate a repetition of the previous character or group.

? Repeat 0 or 1 time matching as many times as possible.
abc.? matches abc or abcd
* Repeat 0 or more times matching as many times as possible.
abc.* matches abc or abcd or abcde
+ Repeat 1 or more times matching as many times as possible.
abc.+ matches abcd or abcde
?? Repeat 0 times or 1 time matching 0 times if possible.
abc?? matches ab in abc (the optional c is left unmatched)
*? Repeat 0 or more times matching as few times as possible.
ab.*?c matches abc or ab c
+? Repeat 1 or more times matching as few times as possible.
ab.+?c matches abbc or ab c but does not match abc
{n} Repeat exactly n times.
\d{1} matches 5
{n,} Repeat at least n times, matching as many times as possible.
\d{1,} matches 5 or 555
{n,}? Repeat at least n times, matching as few times as possible.
\d{1,}? matches each 5 in 555 separately
{n,m} Repeat at least n times, but no more than m times.
\d{1,2} matches 5 or 55
{n,m}? Repeat at least n times, but no more than m times while matching as few as possible.
\d{1,2}? matches each 5 in 55 separately
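
The difference between greedy ("as many as possible") and lazy ("as few as possible") quantifiers is easiest to see side by side. The following C# sketch assumes top-level statements (C# 9 or later); the inputs are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string digits = "555";

// Greedy: takes all three digits in one match.
string greedy = Regex.Match(digits, @"\d{1,}").Value;        // "555"

// Lazy: stops after the minimum of one digit, so the input yields three matches.
string lazy = Regex.Match(digits, @"\d{1,}?").Value;         // "5"
int lazyCount = Regex.Matches(digits, @"\d{1,}?").Count;     // 3

// Greedy .* runs to the last c; lazy .*? stops at the first c.
string greedySpan = Regex.Match("abcabc", "ab.*c").Value;    // "abcabc"
string lazySpan = Regex.Match("abcabc", "ab.*?c").Value;     // "abc"

// Lazy ?? prefers zero repetitions, so the c is not consumed.
string lazyOptional = Regex.Match("abc", "abc??").Value;     // "ab"

Console.WriteLine($"{greedy} {lazy} {lazyCount} {greedySpan} {lazySpan} {lazyOptional}");
```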

Grouping

The following expressions allow grouped matching.

(pattern) Captures the specified pattern as a group. Each group is numbered automatically starting from 1. Group 0 is actually not a group at all but refers to the text matched by the entire regular expression.
(?<name>pattern) Captures the specified pattern into the specified group name. The string used for the name must not contain any punctuation and cannot begin with a number.
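(?<name1-name2>pattern) Balancing group definition. Deletes the most recent capture of group name2 and stores in group name1 the text between that capture and the current position. Useful for matching nested constructs such as balanced parentheses.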
(?:pattern) Does not capture the substring matched by this pattern. Known as a noncapturing group.
(?imnsx-imnsx:pattern) Applies or disables the specified options within subexpression. For more information, see .NET's Regular Expression Options.
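(?>pattern) Nonbacktracking (or "greedy") subexpression, also known as an atomic group. Once the subexpression has matched, the engine does not backtrack into it to try shorter alternatives.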
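
A short C# sketch of several grouping constructs follows. It assumes top-level statements (C# 9 or later); the date string and the group name year are hypothetical examples.

```csharp
using System;
using System.Text.RegularExpressions;

// One named group, one numbered group, and one noncapturing group.
Match m = Regex.Match("2024-05-17", @"(?<year>\d{4})-(\d{2})-(?:\d{2})");
string whole = m.Groups[0].Value;       // "2024-05-17": group 0 is the entire match
string year = m.Groups["year"].Value;   // "2024"
string month = m.Groups[1].Value;       // "05": unnamed groups are numbered before named ones
int groupCount = m.Groups.Count;        // 3: group 0, group 1, and "year"

// Inline option: i turns on case-insensitive matching inside the group only.
bool inlineOption = Regex.IsMatch("ABC", "(?i:abc)");   // true

// A nonbacktracking group never gives back the a's it consumed,
// so the "ab" that follows it cannot be satisfied.
bool backtracking = Regex.IsMatch("aaab", "a+ab");      // true
bool atomic = Regex.IsMatch("aaab", "(?>a+)ab");        // false

Console.WriteLine($"{whole} {year} {month} {groupCount} {inlineOption} {backtracking} {atomic}");
```

The numbering detail is worth noting when mixing named and unnamed groups: by default .NET numbers the unnamed groups first, then assigns the following numbers to the named ones.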

Backreferences

A backreference allows the text captured by an earlier group to be matched again later in the same regular expression.

\number Backreference. Matches the value of a numbered subexpression.
\k<name> Named backreference. Matches the value of a named expression.
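
A common use of backreferences is finding repeated text. The C# sketch below assumes top-level statements (C# 9 or later); the group name ch is an arbitrary choice for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// \1 must match the same text that group 1 captured: a doubled word.
Match doubledWord = Regex.Match("this is is a test", @"\b(\w+)\s+\1\b");
Console.WriteLine(doubledWord.Value);             // is is
Console.WriteLine(doubledWord.Groups[1].Value);   // is

// \k<ch> does the same for a named group: a doubled character.
Match doubledChar = Regex.Match("hello", @"(?<ch>\w)\k<ch>");
Console.WriteLine(doubledChar.Value);             // ll
```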

Substitutions

Substitutions are allowed only within replacement patterns.

$number Substitutes the last substring matched by the specified group number. The numbering scheme for groups starts at 1 (0 represents the entire match).
${name} Substitutes the last substring matched by a named group.
$& Substitutes a copy of the entire match itself.
$` Substitutes all the text of the input string before the match.
$' Substitutes all the text of the input string after the match.
$+ Substitutes the last group captured.
$_ Substitutes the entire input string.
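
Substitutions are used in the replacement argument of Regex.Replace. The following C# sketch assumes top-level statements (C# 9 or later); the inputs and the group names y, m, and d are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// $1 and $2 refer to numbered groups.
string swapped = Regex.Replace("John Smith", @"(\w+) (\w+)", "$2, $1");   // "Smith, John"

// ${name} refers to named groups.
string reordered = Regex.Replace(
    "2024-05-17",
    @"(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})",
    "${d}/${m}/${y}");                                                    // "17/05/2024"

// $& is the entire match.
string wrapped = Regex.Replace("a cat sat", "cat", "[$&]");               // "a [cat] sat"

// $` is the text before the match, $' the text after it, $_ the whole input.
string before = Regex.Replace("abc", "b", "$`");                          // "aac"
string after = Regex.Replace("abc", "b", "$'");                           // "acc"
string entire = Regex.Replace("abc", "b", "$_");                          // "aabcc"

Console.WriteLine($"{swapped} | {reordered} | {wrapped} | {before} {after} {entire}");
```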

Alternation

The following expressions allow either/or matching.

| Acts as a logical OR. When between two characters or groups, matches one or the other.
(?(pattern)yes|no) Matches the yes pattern if the specified pattern matches at this point (the pattern is treated as a zero-width assertion). Otherwise, matches the optional no pattern.
(?(name)yes|no) Matches the yes pattern if the specified named group has captured a match. Otherwise, matches the optional no pattern.
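
The sketch below shows plain alternation and a conditional that tests a named group, written in C# with top-level statements (C# 9 or later). The group name q and the quoted-word scenario are invented for illustration; the no branch is omitted, which is allowed.

```csharp
using System;
using System.Text.RegularExpressions;

// | matches either alternative.
bool either = Regex.IsMatch("hotdog", "cat|dog");               // true
string spelling = Regex.Match("a grey sky", "gr(a|e)y").Value;  // "grey"

// Conditional on a named group: a closing quote is required
// only if group q captured an opening quote.
string quotedWord = @"^(?<q>"")?\w+(?(q)"")$";
bool bareWord = Regex.IsMatch("word", quotedWord);        // true: no quotes at all
bool fullQuote = Regex.IsMatch("\"word\"", quotedWord);   // true: both quotes present
bool halfQuote = Regex.IsMatch("\"word", quotedWord);     // false: closing quote missing

Console.WriteLine($"{either} {spelling} {bareWord} {fullQuote} {halfQuote}");
```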

Comments

The following expressions allow comments to be inserted in your regular expression.

(?#comment) Inline comment. Everything from the pound sign (#) to the closing parenthesis is a comment and will be ignored.
#comment X-mode comment (requires the x option, RegexOptions.IgnorePatternWhitespace). The comment starts at an unescaped # and continues to the end of the line.
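
Both comment styles can be sketched in C# as follows, assuming top-level statements (C# 9 or later). The phone-number-like pattern is a made-up example.

```csharp
using System;
using System.Text.RegularExpressions;

// X-mode: unescaped whitespace in the pattern is ignored and # starts a comment.
string commented = @"
    \d{3}   # first block of digits
    -       # separator
    \d{4}   # second block of digits
";
bool xMode = Regex.IsMatch("555-1234", commented, RegexOptions.IgnorePatternWhitespace);   // true

// Inline comment: works without any option.
bool inlineComment = Regex.IsMatch("555-1234", @"\d{3}(?#first block)-\d{4}");             // true

Console.WriteLine($"{xMode} {inlineComment}");
```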

Example

To put some of this to use let's take a classic example:
The quick brown fox jumps over the lazy dog

Let's say we didn't care that the fox is brown, or that the dog is lazy but we wanted to match the whole string regardless of these details.

So the regular expression would be something like this:
^The quick\s?\w* fox jumps over the\s?\w* dog$

This would match the string even if the fox was red and the dog was hungry.
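
To check the expression above in code, a minimal C# sketch might look like this. It assumes top-level statements (C# 9 or later); the extra test sentences are illustrative variations on the original.

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = @"^The quick\s?\w* fox jumps over the\s?\w* dog$";

// The original sentence and a variant with different adjectives both match.
bool original = Regex.IsMatch("The quick brown fox jumps over the lazy dog", pattern);   // true
bool variant = Regex.IsMatch("The quick red fox jumps over the hungry dog", pattern);    // true

// \s? and \w* may both match nothing, so the adjectives can be dropped entirely.
bool noAdjectives = Regex.IsMatch("The quick fox jumps over the dog", pattern);          // true

// Changing a fixed word breaks the match.
bool wrongAnimal = Regex.IsMatch("The quick brown cat jumps over the lazy dog", pattern); // false

Console.WriteLine($"{original} {variant} {noAdjectives} {wrongAnimal}");
```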

Try it in the Regex Hero Tester to see for yourself.


Also check out the public library to see practical examples.