Regex for validating emails | explained

☝️ Introduction

Validating an email address seems like an easy task at first glance, but deciding what qualifies as the name section of an address, or what should be considered a valid TLD, can quickly lead to complex implementations.

To sidestep these issues, software developers often use a simple implementation to validate the structure of the email and then send a confirmation email to the user to further validate the address.

Today we are going to look at a simple regex implementation to validate the structure of an email address.

👉 A StackOverflow example

For this tutorial, we will be using an answer I picked from StackOverflow:

// Returns true when the string has the basic shape text@text.text
function validateEmail(email) {
    var re = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    return re.test(email);
}

console.log(validateEmail('anystring@anystring.anystring')); // true

What we are interested in is the regex used in the above code: /^[^\s@]+@[^\s@]+\.[^\s@]+$/. We will also assume that our goal is to accept myname@email.com as a valid email and to reject mo man@@email.tld as an invalid one.
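Before breaking the pattern down, here is a quick sketch of how it behaves on a few inputs. The first two strings are the ones from our goal above; the last two are extra samples I made up to show that the check is purely structural.

```javascript
// Same pattern as in validateEmail; the variable names here are illustrative.
const emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

const samples = [
  'myname@email.com',  // the address we want to accept
  'mo man@@email.tld', // the address we want to reject
  'myname@email',      // made-up sample: no dot after the @
  'a@b.c',             // made-up sample: very short, but structurally fine
];

const results = samples.map((s) => emailPattern.test(s));

samples.forEach((s, i) => {
  console.log(JSON.stringify(s), '->', results[i]);
});
```

Note that a@b.c passes: the regex only checks the shape text@text.text, which is why a confirmation email is still needed afterwards.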

👉 Let's break the syntax down

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

The leading / character marks the beginning of our regex literal, and the trailing / marks its end. This way JavaScript knows that what sits between them is our regex.

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

^ signifies that we would like to match or validate the email from the very beginning. For example, the verification of mo man@@email.tld will start from the m at the beginning of the email. Do note that this interpretation only holds as long as the caret is not within []. We will see an example where the meaning of the caret symbol is different.

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

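$ at the end of the regex, on the other hand, signifies that the match must run all the way to the end of the email address. Together, ^ and $ make sure the whole string is validated rather than just a part of it.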
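To see why both anchors matter, here is a small sketch that compares the pattern with and without ^ and $. The input is a variation of our invalid example with a single @ (my own assumption, chosen so that it contains a valid-looking fragment).

```javascript
const anchored = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const unanchored = /[^\s@]+@[^\s@]+\.[^\s@]+/; // same pattern, no ^ or $

const input = 'mo man@email.tld'; // note the space after "mo"

// Without anchors the regex is happy to find a match somewhere inside the string.
const fragment = unanchored.exec(input);
console.log(fragment[0]);            // the part that matched: man@email.tld

// With anchors the whole string has to fit the pattern, so the space makes it fail.
console.log(anchored.test(input));   // false
console.log(unanchored.test(input)); // true
```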

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

[^\s@] appears in three places in our regex. That is because it's used to match the text sections of our emails. For example, myname@email.com can be broken into three parts, [myname]@[email].[com], and each of these three parts can be any form of text.

Inside [], a leading caret ^ negates the set: the expression matches any single character except the ones listed, which here are \s (any whitespace character) and the @ symbol. So those three parts can never contain whitespace or the @ symbol.
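The two meanings of the caret can be checked side by side. This is a minimal sketch; the variable names and test strings are just for illustration.

```javascript
// Inside []: ^ negates the set, so this matches one character
// that is neither whitespace nor @.
const notSpaceOrAt = /^[^\s@]$/;

const classChecks = {
  letter: notSpaceOrAt.test('m'),  // true
  space: notSpaceOrAt.test(' '),   // false
  tab: notSpaceOrAt.test('\t'),    // false, \s covers tabs too
  at: notSpaceOrAt.test('@'),      // false
};
console.log(classChecks);

// Outside []: ^ anchors the match to the start of the string.
const startsWithM = /^m/;
console.log(startsWithM.test('myname')); // true
console.log(startsWithM.test('amy'));    // false, there is an m but not at the start
```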

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

Without the +, [^\s@] will only match a single character, so instead of matching myname it will only match the m. Adding the +, which means "one or more of the preceding item", ensures that the entire myname section is matched.
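A quick way to see the effect of + is to ask for the first match with and without it. This is only a sketch using String.prototype.match on our example address.

```javascript
const address = 'myname@email.com';

// No quantifier: exactly one character is consumed.
const single = address.match(/[^\s@]/)[0];

// With +: as many allowed characters as possible, stopping at the @.
const whole = address.match(/[^\s@]+/)[0];

console.log(single); // m
console.log(whole);  // myname
```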

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

The @ symbol matches the literal @ in the email address, i.e. myname@email.com.

/^[^\s@]+@[^\s@]+\.[^\s@]+$/:

The \. sequence matches the literal . in the email address, i.e. myname@email.com. However, unlike the @ symbol above, this one starts with a backslash \ because an unescaped . is regex syntax for matching any single character except line breaks.

In order to match the literal ., we use the backslash to escape it.
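To illustrate what would go wrong without the backslash, the sketch below compares our pattern with a hypothetical variant where the dot is left unescaped. The input myname@emailcom is a made-up address with no dot at all.

```javascript
const escaped = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;  // our pattern
const unescaped = /^[^\s@]+@[^\s@]+.[^\s@]+$/; // hypothetical variant: bare dot

const noDot = 'myname@emailcom';

// The bare . accepts any character, so a letter can stand in for the dot.
console.log(unescaped.test(noDot)); // true
// The escaped \. insists on a real dot, so the same input is rejected.
console.log(escaped.test(noDot));   // false

// The address from our goal still passes with the escaped version.
console.log(escaped.test('myname@email.com')); // true
```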