Matching Passwords | REGEX DEMYSTIFIED

  • 00:00:01 Welcome back to this video. Welcome back
  • 00:00:04 to some more regular expression
  • 00:00:06 demystifying. In the last video, we had a
  • 00:00:09 look at groups, capturing groups, and
  • 00:00:12 positive lookaheads. It can be hard to
  • 00:00:15 grasp the concept if you don't apply it
  • 00:00:17 to a real use case, so in this video
  • 00:00:21 we're going to do that. We'll match a
  • 00:00:23 password which should include a special
  • 00:00:25 character, a number, an uppercase and a
  • 00:00:27 lowercase character, and maybe it should
  • 00:00:29 even be at least 8 characters long. Let's
  • 00:00:32 do this together in this video.
  • 00:00:37 So the goal is to match a password. The
  • 00:00:39 following password should be treated as
  • 00:00:42 correct, so we should completely match it.
  • 00:00:45 Let's say we have h, g, uppercase T, one, h,
  • 00:00:51 one, exclamation mark, question mark, s.
  • 00:00:55 This should be a valid, correct password.
  • 00:00:58 This or this or this should all not be
  • 00:01:04 treated as valid. The first one has eight
  • 00:01:07 characters, it has an uppercase and a
  • 00:01:09 lowercase character, it has a number, and it
  • 00:01:11 has a special character. It may have
  • 00:01:13 more than one of each of these
  • 00:01:14 categories, but it has at least one. Now,
  • 00:01:17 how do we start with matching this? How
  • 00:01:19 do we start with this regular expression?
  • 00:01:21 Well, we want to allow lowercase,
  • 00:01:23 uppercase, special characters, and digits, so we
  • 00:01:26 can create a range with lowercase,
  • 00:01:28 uppercase, digits, and
  • 00:01:33 some special characters like the
  • 00:01:34 exclamation mark and the question mark,
  • 00:01:36 and you can add more. Now we get a
  • 00:01:38 lot of matches. Now if we add the plus
  • 00:01:40 sign, we're also saying this should
  • 00:01:42 occur at least once or more often. We can
  • 00:01:45 also say it should be at least eight
  • 00:01:47 characters long by using curly braces.
  • 00:01:49 A one, comma, nothing ({1,}) is just a
  • 00:01:53 replacement for a plus: it also means
  • 00:01:56 from at least one match to as many as
  • 00:01:58 you want.
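
The range and the quantifiers described up to this point can be sketched in Python roughly as follows. This is only an illustration: the variable names and sample strings are assumptions, not the ones typed in the video, and `re.fullmatch` is used so that the whole string has to match.

```python
import re

# Character class from the transcript: lowercase, uppercase, digits,
# and two special characters (more could be added).
ALLOWED = r"[a-zA-Z0-9!?]"

one_or_more_plus = re.compile(ALLOWED + r"+")       # "+"    : one or more
one_or_more_braces = re.compile(ALLOWED + r"{1,}")  # "{1,}" : same meaning as "+"
at_least_eight = re.compile(ALLOWED + r"{8,}")      # "{8,}" : eight or more

# Hypothetical sample strings, chosen only for this sketch.
samples = ["abc", "abcdefgh", "12345678", "Abc1!?xyz"]

for s in samples:
    plus_ok = one_or_more_plus.fullmatch(s) is not None
    braces_ok = one_or_more_braces.fullmatch(s) is not None
    eight_ok = at_least_eight.fullmatch(s) is not None
    print(f"{s!r}: +={plus_ok} {{1,}}={braces_ok} {{8,}}={eight_ok}")
```

Note that the digits-only sample passes the `{8,}` check: the range alone says nothing about which kinds of characters must be present.
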
  • 00:01:59 Therefore eight, comma ({8,}) means only match
  • 00:02:02 patterns where we have at least eight
  • 00:02:04 consecutive characters from that range
  • 00:02:06 and then as many as you want. So now all
  • 00:02:09 options except for this second one are
  • 00:02:10 matched. Well, that's the exact issue we
  • 00:02:12 have here, though: the third and fourth
  • 00:02:15 strings are also matched because they
  • 00:02:17 are long enough and they use a lowercase
  • 00:02:19 character or a digit, but we're not
  • 00:02:21 enforcing that there should be at least
  • 00:02:23 one lowercase and at least one uppercase
  • 00:02:25 character, and we can't enforce it with
  • 00:02:28 the range alone. Now obviously you could
  • 00:02:32 refactor this and say a to z at least
  • 00:02:36 once, followed by uppercase A to Z at
  • 00:02:40 least once, and so on, and then you could
  • 00:02:43 kind of group that. But the problem is,
  • 00:02:46 due to the pattern here being parsed
  • 00:02:49 from left to right,
  • 00:02:51 passwords which start with lowercase
  • 00:02:53 characters and are then followed by
  • 00:02:56 uppercase characters are marked as
  • 00:02:59 correct. So lowercase a's then uppercase A's (and I should remove
  • 00:03:05 that quantifier, because otherwise you would have to
  • 00:03:07 repeat that pattern eight times), that would
  • 00:03:09 be matched. But the problem is, this is
  • 00:03:13 not what we want to do. We also want to
  • 00:03:15 allow lowercase and uppercase characters
  • 00:03:19 mixed in any order, so this is not what we can use.
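
The ordering problem with the sequential approach can be seen in a small Python sketch. The pattern below is a reduced version with only two ranges, and the sample strings are assumptions made for illustration.

```python
import re

# Sequential approach: one or more lowercase letters, THEN one or more
# uppercase letters. The order is fixed by the pattern itself.
ordered = re.compile(r"[a-z]+[A-Z]+")

# Hypothetical samples, not the strings used in the video.
print(ordered.fullmatch("aaAA") is not None)  # True: lowercase first, then uppercase
print(ordered.fullmatch("aAaA") is not None)  # False: the cases are interleaved
print(ordered.fullmatch("AAaa") is not None)  # False: uppercase comes first
```

Every additional category (digits, special characters) would add another fixed position, so this style of pattern cannot accept the categories in arbitrary order.
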
  • 00:03:22 We should go back to our previous
  • 00:03:24 solution of having one big range,
  • 00:03:26 where any character
  • 00:03:30 of the range should
  • 00:03:32 occur at least eight times. But we need
  • 00:03:35 to fine-tune the pattern to
  • 00:03:39 make sure that we also have some minimum
  • 00:03:42 rules. We can do that with lookaheads,
  • 00:03:44 so let's add some lookaheads to our
  • 00:03:47 rule here. We're now saying any character
  • 00:03:50 from this range and at least eight of
  • 00:03:53 them. Let's now add a positive lookahead
  • 00:03:55 after that, with question mark, equals sign (?=)
  • 00:03:57 at the beginning of the group, and now
  • 00:04:00 let's say it should have at least a
  • 00:04:03 lowercase character, so a to z,
  • 00:04:05 like this. Now we see something
  • 00:04:07 interesting: only the first and the
  • 00:04:10 third string are matched. Well, almost:
  • 00:04:13 the last character is omitted. The
  • 00:04:15 reason for that is that with the lookahead
  • 00:04:17 we're saying: match any string
  • 00:04:20 where this first rule is fulfilled,
  • 00:04:24 or rather, where the
  • 00:04:28 result of this first rule is followed by
  • 00:04:30 a single lowercase character. And that's
  • 00:04:34 the case for the first and the third
  • 00:04:36 string here. Now I don't want to have it
  • 00:04:40 being followed by a lowercase character.
  • 00:04:43 I just want my whole string to include
  • 00:04:46 at least one lowercase character, but
  • 00:04:48 that should be matched in this first
  • 00:04:50 part here. For that we can put
  • 00:04:54 the positive lookahead at the beginning
  • 00:04:56 of the pattern. This kind of makes it like a
  • 00:04:58 minimum requirement. Now we're matching
  • 00:05:02 anything which has a lowercase character
  • 00:05:04 (at least one) and fulfills this rule too,
  • 00:05:08 with the 8 characters.
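
The difference between placing the lookahead after the range and placing it before the range can be sketched in Python as below. The names `trailing` and `leading` and the sample strings are assumptions for illustration. `search` is used for the trailing variant and `match` (anchored at the start of the string) for the leading one.

```python
import re

RANGE = r"[a-zA-Z0-9!?]{8,}"

# Lookahead AFTER the range: the matched text must be followed by a
# lowercase letter, so that letter itself is not part of the match.
trailing = re.compile(RANGE + r"(?=[a-z])")

# Lookahead BEFORE the range: the character at the match start must be a
# lowercase letter; the range then consumes it as usual.
leading = re.compile(r"(?=[a-z])" + RANGE)

sample = "hello123A!x"  # hypothetical sample, not from the video

print(trailing.search(sample).group())  # hello123A!   (final "x" is left out)
print(leading.match(sample).group())    # hello123A!x  (whole string)
print(leading.match("1234567890"))      # None: no lowercase letter at the start
```

At this stage the leading lookahead contains only `[a-z]`, so it inspects just the single character at the current position, which is the limitation the transcript turns to next.
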
  • 00:05:09 Therefore the digits-only string is no
  • 00:05:12 longer matched because it has no
  • 00:05:14 lowercase character. Now we can add
  • 00:05:16 multiple lookaheads. I can add another
  • 00:05:18 lookahead in front of it for digits,
  • 00:05:21 let's say, so 0 to 9. We might have
  • 00:05:26 expected that now all the previous
  • 00:05:28 strings and the digit string are matched.
  • 00:05:30 Instead, none of them is matched anymore,
  • 00:05:33 because with that it's basically looking
  • 00:05:35 at the start of the strings we want to
  • 00:05:38 match and checking whether it starts with a
  • 00:05:41 lowercase character and a digit. Of
  • 00:05:44 course it can't start with both. We have
  • 00:05:46 to be more flexible. We have to add
  • 00:05:48 something to our positive lookahead. We
  • 00:05:50 should say: yes, we want to check whether there
  • 00:05:52 is a lowercase character, but we don't
  • 00:05:54 care about the position that character is at,
  • 00:05:56 so there may be any character in front
  • 00:05:58 of the lowercase character. And any
  • 00:06:01 character, that's the dot. And this dot
  • 00:06:04 (so this "any character",
  • 00:06:07 not a literal dot, but any character)
  • 00:06:09 doesn't have to appear, but it may appear.
  • 00:06:12 So the a or the b or whatever is
  • 00:06:15 included in this range may be at the
  • 00:06:17 beginning of the string, but it doesn't
  • 00:06:19 have to be. So let's add a star after the
  • 00:06:21 dot to say: yes, there may be any
  • 00:06:24 characters in front of this rule. And the
  • 00:06:26 same for the digits: there may be any
  • 00:06:28 characters in front of the digit. And now we
  • 00:06:30 get a match for this mixed string,
  • 00:06:32 because now we're saying only a string
  • 00:06:35 which includes lowercase characters and
  • 00:06:37 numbers is treated as valid, and that's
  • 00:06:40 only true for the first string. If I add
  • 00:06:42 a lowercase character to the fourth
  • 00:06:43 string, this is also treated as valid. And
  • 00:06:46 now we can simply repeat this positive
  • 00:06:50 lookahead one more time for uppercase
  • 00:06:52 characters, so that now only strings
  • 00:06:55 which also include an uppercase
  • 00:06:57 character are treated as valid. Therefore
  • 00:06:59 the fourth string is not treated as valid
  • 00:07:01 if I only add a lowercase character; I
  • 00:07:03 also have to add an uppercase one to make
  • 00:07:05 it valid. And now we can repeat this one
  • 00:07:08 more time for special characters, for the
  • 00:07:11 ones we then also accept in the range after it. So
  • 00:07:14 here I want to allow an exclamation mark
  • 00:07:16 and a question mark,
  • 00:07:18 and therefore only the first string is
  • 00:07:20 matched again because it includes one of
  • 00:07:22 them.
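
Putting the four lookaheads in front of the range gives a pattern along the following lines. This is a minimal Python sketch, not the exact expression from the video: the function name `is_valid_password`, the constant names, and the sample strings are assumptions, and `re.fullmatch` stands in for matching the complete string.

```python
import re

# Each lookahead starts at the beginning of the string: ".*" allows any
# characters first, then one character of the required kind must appear.
# Lookaheads consume nothing, so the range afterwards sees the whole string.
PASSWORD = re.compile(
    r"(?=.*[a-z])"        # at least one lowercase letter, at any position
    r"(?=.*[A-Z])"        # at least one uppercase letter
    r"(?=.*[0-9])"        # at least one digit
    r"(?=.*[!?])"         # at least one allowed special character
    r"[a-zA-Z0-9!?]{8,}"  # only allowed characters, eight or more of them
)

# Without ".*", both lookaheads test the very same character, which cannot
# be a lowercase letter and a digit at once, so this never matches.
NO_DOT_STAR = re.compile(r"(?=[a-z])(?=[0-9])[a-zA-Z0-9!?]{8,}")


def is_valid_password(candidate: str) -> bool:
    """Return True if the complete candidate satisfies every rule."""
    return PASSWORD.fullmatch(candidate) is not None


# Hypothetical samples, not the strings used in the video.
for candidate in ["Abcdef1!", "abcdef1!", "Abcdefg1", "Ab1!"]:
    print(candidate, is_valid_password(candidate))

print(NO_DOT_STAR.search("abc12345"))  # None
```

Only the first sample passes: the second lacks an uppercase letter, the third lacks a special character, and the fourth is shorter than eight characters.
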
  • 00:07:22 If I remove both, it's not treated as
  • 00:07:25 valid anymore. This is how you construct
  • 00:07:28 such a password validator. Lookaheads
  • 00:07:32 allow you to ensure that a certain kind of
  • 00:07:34 character occurs at least once, but that
  • 00:07:37 the position doesn't matter. And then you
  • 00:07:41 have the set of allowed characters at the
  • 00:07:43 end, where you define how
  • 00:07:45 many of them should be included, so that
  • 00:07:47 you can also check the quantity. This is
  • 00:07:50 password validation, and this is where
  • 00:07:52 positive lookaheads can be very useful:
  • 00:07:55 if you want to look at the complete
  • 00:07:56 string and see if it includes something
  • 00:07:59 at any position, with the dot star (.*) here,
  • 00:08:02 and then you follow this check with
  • 00:08:07 your rule of how often something should
  • 00:08:09 occur. This is how we match passwords.
  • 00:08:12 This is a typical pattern you would use,
  • 00:08:13 and I hope that with that, positive
  • 00:08:16 lookaheads became a bit more useful and
  • 00:08:19 clearer. They are great to define minimum
  • 00:08:22 requirements, so characters which should
  • 00:08:25 occur at least once, for example.