RegEx Series: (Part 3) Special Characters
This is the final installment of my special characters posts within my regular expressions series. In this post, I will cover the optional character, the wildcard character, and the zero or more repetitions character.
Let’s start with the optional character as that one is my favorite. I will be using regex101.com again to create the examples. If you want to follow along, you can use that website as it is an excellent tool for testing your regex.
The optional character is written as ? in the regular expression. If you want to match a literal question mark for punctuation purposes, you will have to escape it with a backslash, like all special characters: \?
The interesting thing about the optional character is that it applies to the character that immediately precedes it.
For instance, I will create two test strings so that we can see what is going on. They will be “Teste” and “Tester”, and the pattern will be Tester?
Since the ‘r’ precedes our optional character, our regular expression is looking to match a pattern that at the very least has the string “Teste”. What this is saying is that the ‘r’ is optional, and because it is, the pattern matches “Teste”. As the ‘r’ is present within “Tester”, it also matches the string “Tester”. In this example, the ‘r’ can occur zero or one time.
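If you would like to try this outside of regex101, here is a minimal sketch using Python's built-in re module. Python is my choice for illustration only, and the variable names and the extra “Testerr” and “Ready? Set? Go!” strings are made up for this example.

```python
import re

# Pattern from the walkthrough: the trailing 'r' may appear zero or one time.
optional_r = re.compile(r"Tester?")

test_strings = ["Teste", "Tester"]
full_matches = [s for s in test_strings if optional_r.fullmatch(s)]
print(full_matches)  # ['Teste', 'Tester']

# '?' allows at most one 'r', so a second 'r' is left out of the match.
double_r = optional_r.match("Testerr").group()
print(double_r)  # Tester

# Escaped with a backslash, the question mark is plain punctuation.
literal_marks = re.findall(r"\?", "Ready? Set? Go!")
print(literal_marks)  # ['?', '?']
```

The “Testerr” case is the quickest way to see that the optional character stops at one occurrence.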
The wildcard character is another interesting special character because it essentially will tell your pattern to match any character as long as it is not a newline character.
The wildcard character is written as . in the regular expression. If you want to match a literal period for punctuation purposes, you will have to escape it with a backslash before the period: \.
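To see the difference between the escaped and unescaped period, here is a small sketch with Python's re module. The sample sentence and variable names are my own and are only there for illustration.

```python
import re

sentence = "Cats nap. Dogs bark."

# Unescaped: '.' is the wildcard, so every character (spaces included) matches.
wildcard_hits = re.findall(r".", sentence)
print(len(wildcard_hits) == len(sentence))  # True

# Escaped: '\.' matches only the literal periods.
period_hits = re.findall(r"\.", sentence)
print(period_hits)  # ['.', '.']
```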
The wildcard character, as I said, will match any single character. So I will type in the string ‘cat’ for our test string.
If I use the wildcard character (in this case with the global flag), it will match each individual character. It matches the ‘c’, it matches the ‘a’, and it matches the ‘t’. My test string happens to be alphabetical, but the wildcard is not limited to letters; it would match digits, punctuation, and spaces just as well.
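In Python, re.findall plays roughly the same role as the global flag on regex101: it returns every match rather than only the first. A rough sketch, where the second test string is one I added to show the newline exception:

```python
import re

# Each character of 'cat' is its own match.
single_chars = re.findall(r".", "cat")
print(single_chars)  # ['c', 'a', 't']

# The newline is skipped, because '.' does not match it by default.
without_newline = re.findall(r".", "c\nt")
print(without_newline)  # ['c', 't']
```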
If you would like to match the whole instance of the string ‘cat’, you can add the + (otherwise known as the one or more repetitions character, which I mentioned in my post about quantifiers). It tells your wildcard character to match any character (with the exception of a newline), and that it must occur ONE or more times.
And as you can see, we now match the whole string of ‘cat’.
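The same result can be reproduced in Python. This is only a sketch, and the two-line “cat\ndog” string is an extra example of mine to show where the match stops:

```python
import re

# '.+' grabs one or more characters in a single match.
whole_word = re.findall(r".+", "cat")
print(whole_word)  # ['cat']

# Because '.' stops at a newline, each line becomes its own match.
per_line = re.findall(r".+", "cat\ndog")
print(per_line)  # ['cat', 'dog']
```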
Zero Or More Repetition Character
The zero or more repetition character is another useful special character, and it is written as *. It follows a character, character set, or group and applies to whatever immediately precedes it. It makes the preceding item optional, but it also lets that item repeat an unlimited number of times.
I decided to make our test string have a required literal string of ‘cat’ but it could be preceded by a character that occurs zero or more times. In this case, I chose the letter ‘b’.
The ‘b’ is followed by the *, so it can occur zero or more times. That gives us the pattern b*cat.
While my regular expression will only match a string that has the word ‘cat’ in it, the match will also include any run of the letter ‘b’ that comes directly before it.
1. On the first line, I have six b’s that precede the string ‘cat’ and it matches.
2. The second line has one ‘b’ that precedes the string ‘cat’ and that gets matched as well.
3. On the last line, however, only the ‘cat’ is matched as the preceding character is an ‘a’ and not a ‘b’. ALL IS NOT LOST THOUGH! The great thing about this is that we will still match our required string rather than not match anything at all.
This is great because it will match your required string (cat) as well as the required string plus your optional characters (bcat or bbbbbbcat). This lets me know that a single pattern covers ‘cat’ and any other instance where it is preceded by one or more b’s.
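Here is how the three lines behave in a quick Python sketch. I am assuming the pattern is b*cat, and since the post only says the third line has an ‘a’ in front of ‘cat’, the string ‘acat’ is my guess at that line.

```python
import re

pattern = re.compile(r"b*cat")

# Assumed test lines; 'acat' is a stand-in for the third line described above.
lines = ["bbbbbbcat", "bcat", "acat"]

# search() finds the first place in each line where the pattern matches.
matches = [pattern.search(line).group() for line in lines]
print(matches)  # ['bbbbbbcat', 'bcat', 'cat']
```

On the third line the match simply starts after the ‘a’, which is why only ‘cat’ comes back.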
This is the end of my special characters section. In the next post, I will go over matching characters in different parts of the string, such as at the beginning or the end. See you next time!