Assertions allow a regular expression to match only under certain controlled conditions.
An assertion does not need a character to match; rather, it investigates the surroundings of a possible match before acknowledging it. For example, the word boundary assertion does not try to find a non-word character next to a word character at its position; instead, it makes sure that there is no word character on that side. This means that the assertion can match where there is no character at all, that is, at the ends of the searched string.
Some assertions do have a pattern to match, but the part of the string matching that pattern will not be part of the result of the match of the full expression.
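The zero-width behaviour described above can be observed directly. The following is a minimal sketch that uses Python's `re` module as a stand-in for the engine documented here; that choice is an assumption, and engines can differ in detail. It lists the positions where the word boundary assertion matches, then shows that text examined by a lookahead is left out of the result.

```python
import re

# \b matches positions, not characters: every match is an empty string.
boundaries = [m.span() for m in re.finditer(r"\b", "in it")]
# Boundaries sit at both ends of the string and on either side of the space.
print(boundaries)  # [(0, 0), (2, 2), (3, 3), (5, 5)]

# A lookahead has a pattern of its own, but the text that pattern
# matches is not part of the overall match.
m = re.search(r"handy(?=\w)", "handyman")
print(m.group(0))  # handy
```

Note that the boundary matches at positions 0 and 5 lie at the very ends of the string, where there is no character on one side.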
Regular expressions as documented here support the following assertions:
^
(caret: beginning of string) Matches the beginning of the searched string.
The expression
^Peter
will match at “Peter” in the string “Peter, hey!” but not in “Hey, Peter!”.
$
(end of string) Matches the end of the searched string.
The expression
you\?$
will match at the last “you” in the string “You didn't do that, did you?” but nowhere in “You didn't do that, right?”.
\b
(word boundary) Matches if there is a word character on one side and no word character on the other.
This is useful for finding word ends; for example, use it at both ends of a pattern to find a whole word. The expression
\bin\b
will match at the separate “in” in the string “He came in through the window”, but not at the “in” in “window”.
\B
(non-word boundary) Matches wherever “\b” does not.
That means that it will match, for example, within words. The expression
\Bin\B
will match at “in” in “window” but not in “integer” or “I'm in love”.
(?=PATTERN)
(Positive lookahead) A lookahead assertion looks at the part of the string following a possible match. The positive lookahead prevents the string from matching if the text following the possible match does not match the PATTERN of the assertion, but the text matched by the PATTERN is not included in the result.
The expression
handy(?=\w)
will match at “handy” in “handyman” but not in “That came in handy!”.
(?!PATTERN)
(Negative lookahead) The negative lookahead prevents a possible match from being acknowledged if the following part of the searched string matches its PATTERN.
The expression
const \w+\b(?!\s*&)
will match at “const char” in the string “const char* foo”, while it cannot match “const QString” in “const QString& bar” because the “&” matches the negative lookahead assertion pattern.
(?<=PATTERN)
(Positive lookbehind) Lookbehind has the same effect as lookahead, but works backwards: it looks at the part of the string preceding a possible match. The positive lookbehind will match a string only if it is preceded by text matching the PATTERN of the assertion, but that text is not included in the result.
The expression
(?<=cup)cake
will match at “cake” if it is preceded by “cup” (in “cupcake” but not in “cheesecake” or in “cake” alone).
(?<!PATTERN)
(Negative lookbehind) The negative lookbehind prevents a possible match from being acknowledged if the preceding part of the searched string matches its PATTERN.
The expression
(?<![\w\.])[0-9]+
will match at “123” in the strings “=123” and “-123”, while it cannot match “123” in “.123” or “word123”.
(PATTERN)
(Capturing group) The sub pattern within the parentheses is captured and remembered, so that it can be used in back references. For example, the expression
("+)[^"]*\1
matches """"text"""" and "text".
See the section Capturing matching text (back references) for more information.
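As a hedged illustration of the back reference in this entry, the sketch below uses Python's `re` module, which is an assumption and not necessarily the engine documented here. It uses `fullmatch` so that the whole string has to be consumed, which makes the effect of `\1` visible: the closing run of quotes must be exactly as long as the captured opening run.

```python
import re

pattern = re.compile(r'("+)[^"]*\1')

# Four quotes on each side: group 1 captures the opening run of four.
quad = pattern.fullmatch('""""text""""')
print(quad.group(1))

# One quote on each side: group 1 captures a single quote.
single = pattern.fullmatch('"text"')
print(single.group(1))

# Two opening quotes but only one closing quote: \1 cannot be
# satisfied for the whole string, so there is no full match.
uneven = pattern.fullmatch('""text"')
print(uneven)  # None
```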
(?:PATTERN)
(Non-capturing group) The sub pattern within the parentheses is not captured and is not remembered. It is preferable to always use non-capturing groups if the captures will not be used.
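The examples in the list above can be checked with a short script. This is only a sketch: it assumes Python's `re` module behaves like the engine documented here for these particular patterns, and the helper `find` is a hypothetical name introduced for illustration. The last part contrasts a capturing group with a non-capturing one.

```python
import re

def find(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# (pattern, searched string) pairs taken from the entries above.
cases = [
    (r"^Peter", "Peter, hey!"),
    (r"^Peter", "Hey, Peter!"),
    (r"you\?$", "You didn't do that, did you?"),
    (r"you\?$", "You didn't do that, right?"),
    (r"\bin\b", "He came in through the window"),
    (r"\Bin\B", "window"),
    (r"\Bin\B", "integer"),
    (r"\Bin\B", "I'm in love"),
    (r"handy(?=\w)", "handyman"),
    (r"handy(?=\w)", "That came in handy!"),
    (r"const \w+\b(?!\s*&)", "const char* foo"),
    (r"const \w+\b(?!\s*&)", "const QString& bar"),
    (r"(?<=cup)cake", "cupcake"),
    (r"(?<=cup)cake", "cheesecake"),
    (r"(?<![\w\.])[0-9]+", "=123"),
    (r"(?<![\w\.])[0-9]+", "-123"),
    (r"(?<![\w\.])[0-9]+", ".123"),
    (r"(?<![\w\.])[0-9]+", "word123"),
]

results = [find(p, t) for p, t in cases]
for (p, t), r in zip(cases, results):
    print(p, repr(t), "->", repr(r))

# A capturing group is remembered; a non-capturing group is not.
capturing = re.search(r"(ab)+(c)", "ababc").groups()
non_capturing = re.search(r"(?:ab)+(c)", "ababc").groups()
print(capturing)      # ('ab', 'c')
print(non_capturing)  # ('c',)
```

In every case the matched text excludes whatever a lookahead or lookbehind inspected; for example, the lookahead case on “handyman” yields only “handy”.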