Comprehension on Regex in ServiceNow - Support and Troubleshooting

Summary:

Regex comes up in many ServiceNow requirements, including Email Reply Separators, Entities, and server-side scripts. This article explains how regex works.

Regex is short for regular expression. Think of it as a powerful search pattern that you can use to find and manipulate text. Instead of just looking for exact words, you can define rules that describe the kind of text you're interested in.

Imagine you have a big document and you want to find all the email addresses in it. You could read through the whole thing, but that would take forever! With regex, you can create a pattern that says "look for a sequence of characters, then an '@' symbol, then another sequence of characters, then a dot, and finally another sequence of characters." A regex engine can then quickly scan the document and pick out all the strings that match that pattern.
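The email scenario above can be sketched in JavaScript roughly as follows. The pattern, the sample text, and the variable names are illustrative assumptions; the pattern is deliberately loose and is not a complete email validator.

```javascript
// Loose email-like pattern: some characters, '@', more characters, a dot, more characters.
// Illustrative only; production-grade address validation needs stricter rules.
const emailPattern = /[^\s@]+@[^\s@]+\.[^\s@]+/g;

// Hypothetical sample text.
const sampleText = 'Contact alice@example.com or bob@example.org for help';

// With the g flag, match() returns every matching substring (or null when there are none).
const foundEmails = sampleText.match(emailPattern) || [];

console.log(foundEmails); // [ 'alice@example.com', 'bob@example.org' ]
```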

Here are some key things to understand about regex:

Patterns: At its heart, regex is about creating patterns using special characters and symbols. These symbols have specific meanings that allow you to define flexible search criteria.
Matching: The process of applying a regex pattern to a piece of text is called "matching." The regex engine tries to find parts of the text that conform to the rules you've defined in your pattern.
Versatility: Regex is used in a huge range of applications, from validating user input on websites (like checking if an email address is in the correct format) to powerful text processing in programming languages and text editors. It's also used in network analysis, security, and data mining.

Here's a super simple example. Let's say you want to find the word "cat" in a sentence. The regex pattern for that would just be:

cat
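As a minimal sketch, this literal pattern could be tried in JavaScript like so. The sentences and variable names are made up for illustration.

```javascript
// A literal pattern matches the exact characters c, a, t in that order.
const catPattern = /cat/;

const hasCat = catPattern.test('The cat sat on the mat.');   // true
const noCat = catPattern.test('The dog sat on the mat.');    // false

// A literal pattern also matches inside longer words.
const insideWord = catPattern.test('concatenate');           // true
```

The last line shows that a plain literal pattern is not limited to whole words.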

Special Characters:

But regex gets much more interesting when you start using special characters. For instance:

. (dot): Matches any single character (except a newline by default). So, c.t would match "cat", "cot", "cut", "c!t", etc.
* (asterisk): Matches the preceding character zero or more times. So, ca*t would match "ct", "cat", "caat", "caaaat", etc.
+ (plus): Matches the preceding character one or more times. So, ca+t would match "cat", "caat", "caaaat", but not "ct".
? (question mark): Matches the preceding character zero or one time. So, colou?r would match both "color" and "colour".
[] (square brackets): Defines a character set. For example, [aeiou] would match any vowel. [a-z] would match any lowercase letter.
() (parentheses): Groups parts of the pattern and can also be used for capturing matched text.

Quantifiers (Beyond *, +, ?):
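A small JavaScript sketch that compares the basic quantifiers *, + and ? with the counted forms {n}, {n,} and {n,m}. The helper name matchesWhole and the test strings are assumptions made for this illustration; the helper wraps each pattern in ^ and $ so that the whole input has to match.

```javascript
// Hypothetical helper: true only when the entire input matches the pattern body.
function matchesWhole(patternBody, input) {
  return new RegExp('^(?:' + patternBody + ')$').test(input);
}

const results = {
  star: ['ct', 'cat', 'caaat'].map((s) => matchesWhole('ca*t', s)),      // [true, true, true]
  plus: ['ct', 'cat', 'caaat'].map((s) => matchesWhole('ca+t', s)),      // [false, true, true]
  optional: ['color', 'colour'].map((s) => matchesWhole('colou?r', s)),  // [true, true]
  exactly3: ['aa', 'aaa', 'aaaa'].map((s) => matchesWhole('a{3}', s)),   // [false, true, false]
  atLeast2: ['a', 'aa', 'aaaaa'].map((s) => matchesWhole('a{2,}', s)),   // [false, true, true]
  from2to4: ['a', 'aa', 'aaaa', 'aaaaa'].map((s) => matchesWhole('a{2,4}', s)), // [false, true, true, false]
};

console.log(results);
```

Without the ^ and $ wrapper, a{3} would also be found inside "aaaa", because an unanchored pattern may match any part of the input.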

{n}: Matches the preceding character or group exactly n times. For example, a{3} matches "aaa".
{n,}: Matches the preceding character or group n or more times. For example, a{2,} matches "aa", "aaa", "aaaa", and so on.
{n,m}: Matches the preceding character or group between n and m times (inclusive). For example, a{2,4} matches "aa", "aaa", and "aaaa".

Anchors:
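A minimal JavaScript sketch of the anchors ^, $, \b and \B. The sample strings and variable names are illustrative assumptions.

```javascript
// ^ anchors the match to the start of the string, $ to the end.
const startsWithHello = /^Hello/.test('Hello World');  // true
const helloNotAtStart = /^Hello/.test('Say Hello');    // false
const endsWithWorld = /World$/.test('Hello World');    // true

// \b requires a word boundary on each side of "cat".
const wholeWordCat = /\bcat\b/;
const standalone = wholeWordCat.test('the cat sat');   // true
const inCategory = wholeWordCat.test('category');      // false
const inTomcat = wholeWordCat.test('tomcat');          // false

// \B requires that there is NO word boundary at that position.
const catInsideWord = /\Bcat/.test('tomcat');          // true
const catAtWordStart = /\Bcat/.test('cat');            // false
```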

^: Matches the beginning of the string or the beginning of a line (depending on multiline mode). For example, ^Hello matches a string that starts with "Hello".
$: Matches the end of the string or the end of a line (depending on multiline mode). For example, World$ matches a string that ends with "World".
\b: Matches a word boundary. This is a zero-width assertion that matches the position between a word character (alphanumeric and underscore) and a non-word character (or the beginning/end of the string). For example, \bcat\b would match "cat" but not "category" or "tomcat".
\B: Matches a non-word boundary. This matches any position that is not a word boundary.

Character Classes (Beyond . and []):
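A short JavaScript sketch of the shorthand character classes \d, \D, \s and \w. The sample strings, including the record-number style value, are hypothetical.

```javascript
// Hypothetical sample text.
const sample = 'Order 42 shipped on 2024-01-15';

// \d+ finds runs of digits; \w+ finds runs of word characters.
const digitRuns = sample.match(/\d+/g);  // ['42', '2024', '01', '15']
const words = sample.match(/\w+/g);      // ['Order', '42', 'shipped', 'on', '2024', '01', '15']

// \s+ matches any run of whitespace (spaces, tabs, newlines).
const collapsed = 'a \t b\n c'.replace(/\s+/g, ' ');   // 'a b c'

// \D is the negation of \d: strip everything that is not a digit.
const digitsOnly = 'INC0012345'.replace(/\D/g, '');    // '0012345'
```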

\d: Matches any digit character (0-9). Equivalent to [0-9].
\D: Matches any non-digit character. Equivalent to [^0-9].
\s: Matches any whitespace character (space, tab, newline, carriage return, form feed).
\S: Matches any non-whitespace character.
\w: Matches any word character (alphanumeric characters and underscore: a-z, A-Z, 0-9, _).
\W: Matches any non-word character.

Grouping and Capturing:
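A minimal JavaScript sketch of capturing groups, non-capturing groups, and a backreference. The date format, the record-number prefixes, and all variable names are assumptions chosen for illustration.

```javascript
// Three capturing groups: year, month, day.
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const dateMatch = datePattern.exec('Created on 2024-01-15');
// dateMatch[0] is the whole match; dateMatch[1..3] are the captured groups.
const year = dateMatch[1];   // '2024'
const month = dateMatch[2];  // '01'
const day = dateMatch[3];    // '15'

// Captured groups can be reused in a replacement string as $1, $2, $3.
const reordered = '2024-01-15'.replace(datePattern, '$3/$2/$1');  // '15/01/2024'

// (?: ) groups without capturing, so only the digits end up in group 1.
const numberOnly = /(?:INC|CHG)(\d+)/.exec('CHG0030001');
// numberOnly[1] === '0030001'

// \1 is a backreference to whatever group 1 matched (here: a repeated word).
const hasDoubledWord = /\b(\w+) \1\b/.test('this is is a test');  // true
```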

( ): Creates a capturing group. The text matched by the expression inside the parentheses is captured and can be referred to later (e.g., for backreferences or extracting specific parts of the match).
(?: ): Creates a non-capturing group. This groups parts of the pattern together but does not capture the matched text. This can be useful for applying quantifiers or alternations to a group without needing to refer back to it.

Alternation:
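A brief JavaScript sketch of alternation with |, including why grouping matters. The sample strings and variable names are illustrative assumptions.

```javascript
// cat|dog matches either alternative wherever it occurs.
const pets = 'cat, dog, bird, catfish'.match(/cat|dog/g);  // ['cat', 'dog', 'cat']

// Grouping limits the alternation to just the part that varies.
const greyOrGray = /^gr(?:a|e)y$/;
const matchesGrey = greyOrGray.test('grey');    // true
const matchesGray = greyOrGray.test('gray');    // true
const matchesGroy = greyOrGray.test('groy');    // false

// Without a group, | splits the WHOLE pattern:
// ^gra|ey$ means "starts with gra" OR "ends with ey".
const ungrouped = /^gra|ey$/.test('grape');     // true
```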

|: Acts as an "OR" operator. It matches either the expression before the | or the expression after it. For example, cat|dog matches either "cat" or "dog".

Special Characters and Escaping:
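A minimal JavaScript sketch of escaping. The helper name escapeRegex is an assumption for this example (it is not a built-in function), and the sample strings are made up.

```javascript
// An unescaped dot matches any character; an escaped dot matches only a literal dot.
const looseDot = /1.5/.test('1x5');       // true
const literalDot = /1\.5/.test('1x5');    // false
const literalDotOk = /1\.5/.test('1.5');  // true

// When the pattern is built from a string, the backslash itself must be doubled.
const fromString = new RegExp('1\\.5');
const fromStringOk = fromString.test('1.5');    // true
const fromStringMiss = fromString.test('1x5');  // false

// Hypothetical helper: escape every special character in arbitrary text
// so the text can be matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const literalStar = new RegExp(escapeRegex('a*b'));
const starOk = literalStar.test('a*b');    // true
const starMiss = literalStar.test('aab');  // false
```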

\: The backslash is used to escape special characters so they are treated literally. For example, to match a literal dot, you would use \. and to match a literal asterisk, you would use \*. You also saw the backslash used for character classes like \d, \s, \w, and anchors like \b.

Lookarounds (Advanced):

These are zero-width assertions that check for a pattern ahead or behind the current position without including that pattern in the actual match.
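A small JavaScript sketch of lookahead and lookbehind. The sample strings and variable names are assumptions. Lookbehind was added to JavaScript in ES2018, so an older engine may reject those two patterns.

```javascript
const text = 'WindowsXP and WindowsNT';

// Positive lookahead: "Windows" only when followed by 95, 98 or NT.
const followedByNt = /Windows(?=95|98|NT)/.exec(text);
// followedByNt[0] === 'Windows' (the lookahead text is not part of the match)
// followedByNt.index === 14 (the second "Windows")

// Negative lookahead: "Windows" only when NOT followed by 95, 98 or NT.
const notFollowedByNt = /Windows(?!95|98|NT)/.exec(text);
// notFollowedByNt.index === 0 (the first "Windows")

const prices = 'USD100 EUR200';

// Positive lookbehind (ES2018+): digits preceded by "USD".
const usdAmounts = prices.match(/(?<=USD)\d+/g);       // ['100']

// Negative lookbehind: the naive form still matches the tail "00" of "100",
// because those digits are preceded by "1", not by "USD".
const naiveNonUsd = prices.match(/(?<!USD)\d+/g);      // ['00', '200']

// Also excluding a preceding digit keeps only complete non-USD numbers.
const nonUsdAmounts = prices.match(/(?<!USD|\d)\d+/g); // ['200']
```

The naive negative lookbehind result is a common surprise: the assertion is checked at each candidate starting position, not once per number.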

(?=pattern): Positive lookahead. Asserts that the pattern must match ahead of the current position. For example, Windows(?=95|98|NT) matches "Windows" only if it is followed by "95", "98", or "NT".
(?!pattern): Negative lookahead. Asserts that the pattern must not match ahead of the current position. For example, Windows(?!95|98|NT) matches "Windows" only if it is not followed by "95", "98", or "NT".
(?<=pattern): Positive lookbehind. Asserts that the pattern must match behind the current position. For example, (?<=USD)\d+ matches one or more digits only if they are preceded by "USD".
(?<!pattern): Negative lookbehind. Asserts that the pattern must not match behind the current position. For example, (?<!USD)\d+ matches one or more digits only if they are not preceded by "USD".

Modifiers (Flags):

These are usually applied outside the main regex pattern and affect how the regex engine performs the matching. Common modifiers include:

i: Case-insensitive matching.
g: Global matching (find all matches, not just the first).
m: Multiline mode (^ and $ match the start and end of each line, not just the start and end of the entire string).
s: Dotall mode (the . metacharacter matches any character, including newline).
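The effect of these flags can be sketched in JavaScript as follows. The sample text and variable names are illustrative assumptions. The s flag was added to JavaScript in ES2018, so an older engine may not accept it.

```javascript
// Three lines that differ only in letter case.
const lines = 'Cat\ncat\nCAT';

// g: return all matches instead of only the first; matching stays case-sensitive.
const caseSensitive = lines.match(/cat/g);       // ['cat']

// i: ignore case, combined here with g.
const anyCase = lines.match(/cat/gi);            // ['Cat', 'cat', 'CAT']

// Without m, ^ and $ refer to the whole string, so this does not match.
const wholeStringOnly = /^cat$/i.test(lines);    // false

// m: ^ and $ match at the start and end of every line.
const perLine = lines.match(/^cat$/gim);         // ['Cat', 'cat', 'CAT']

// By default the dot does not match a newline; the s flag (ES2018+) changes that.
const dotStopsAtNewline = /Cat.cat/.test(lines); // false
const dotAll = /Cat.cat/s.test(lines);           // true
```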

CONCLUSION:

This article gives an overview of the common characters and constructs you'll encounter in regular expressions. Support for these features can vary slightly depending on the regex engine being used (for example, JavaScript), but the core concepts remain largely the same.

As you delve deeper into regex, you'll find that combining these characters in various ways allows you to create very powerful and precise patterns for text manipulation and searching!

You may also refer to our ServiceNow Product Documentation for further insights.