Matching Full Words | REGEX DEMYSTIFIED

  • 00:00:01 welcome back to this regex series in
  • 00:00:04 the last video we had a look at ranges
  • 00:00:07 and quantifiers now I want to build a
  • 00:00:10 pattern with that I want to match a word
  • 00:00:14 which looks like that
  • 00:00:16 any amount of lowercase characters then
  • 00:00:20 digits and then uppercase and then one
  • 00:00:23 and only one exclamation mark how can we
  • 00:00:27 match this exact pattern the following
  • 00:00:30 word shouldn't match for example so here
  • 00:00:34 I also have the same elements in general
  • 00:00:36 digits lowercase characters special
  • 00:00:38 character one exclamation mark to be
  • 00:00:40 precise still the first should match
  • 00:00:43 this one shouldn't let's find out how to
  • 00:00:47 create such a pattern in this video
  • 00:00:52 so I want to match exactly that pattern
  • 00:00:54 where we have lowercase then digits then
  • 00:00:56 uppercase then exclamation mark we can
  • 00:00:59 create this match with ranges and
  • 00:01:01 quantifiers actually the simplest way to
  • 00:01:04 match this word is to copy it and put it
  • 00:01:08 in there but yeah
  • 00:01:09 that's probably not that helpful so
  • 00:01:11 let's find a more regular expression way
  • 00:01:14 to do that I want to match any word
  • 00:01:18 which starts with lowercase characters
  • 00:01:20 that's the first part we can agree on so
  • 00:01:23 let's add a range with any lowercase
  • 00:01:25 character let's now add a plus to say
  • 00:01:29 there should be at least one lowercase
  • 00:01:31 character but it may be followed by any
  • 00:01:34 amount of lowercase characters humanly
  • 00:01:36 possible so let's add a plus you see
  • 00:01:40 I match this part I also match this part
  • 00:01:43 here on the right so let's now build on
  • 00:01:46 this expression now I want to have a
  • 00:01:50 word which starts with that obviously it
  • 00:01:52 should be followed by another set of
  • 00:01:55 digits so 0 to 9 and that also may occur
  • 00:02:00 as often as possible or as wanted but at
  • 00:02:03 least one time now let's say I then want
  • 00:02:07 to have uppercase characters at a
  • 00:02:11 minimum of one character but then any
  • 00:02:13 amount of uppercase characters and
  • 00:02:15 finally it should end with an
  • 00:02:19 exclamation mark now if I add more to
  • 00:02:23 this first word it is met or matched up
  • 00:02:26 to this exclamation mark the second word
  • 00:02:28 isn't matched though because it doesn't
  • 00:02:31 fulfill this pattern it does contain a
  • 00:02:34 portion which is lowercase characters
  • 00:02:37 followed by a digit so this part up
  • 00:02:40 until the plus after the digit range
  • 00:02:42 would be met but thereafter we require
  • 00:02:46 a range of uppercase characters and
  • 00:02:48 an exclamation mark and both are missing
  • 00:02:51 here it starts with that but that
  • 00:02:53 doesn't matter because we're not looking
  • 00:02:55 at anywhere in this string no this order
  • 00:02:59 is the right order we expect to have and
  • 00:03:03 this order is a must it's not optional
  • 00:03:05 regular expressions are read from left
  • 00:03:09 to right and hence the order matters
  • 00:03:11 there are ways to kind of still write
  • 00:03:15 expressions where the order doesn't
  • 00:03:17 matter but for now let's stick to this
  • 00:03:19 simple the truth is simpler expression
  • 00:03:21 here
  • 00:03:21 now one thing or one issue we have is if
  • 00:03:25 I add more characters to the first word
  • 00:03:28 this doesn't match the full word and it
  • 00:03:31 shouldn't because the word should end
  • 00:03:33 with an exclamation mark but if I add
  • 00:03:36 more characters I don't want to have the
  • 00:03:39 word match this at all I don't want to
  • 00:03:42 have the pattern match this word at all I
  • 00:03:44 should say it shouldn't match it up
  • 00:03:46 until that point it should find no
  • 00:03:49 match and kind of similarly if I remove
  • 00:03:52 that uppercase and exclamation mark
  • 00:03:54 part here it also matches the lowercase
  • 00:03:59 and number part in the second word now I
  • 00:04:02 said the order matters but that simply
  • 00:04:04 means it looks for this order anywhere in
  • 00:04:07 the string I also want to tell it the
  • 00:04:10 word should actually also start with
  • 00:04:13 that and it should end with what comes
  • 00:04:16 at the end of the regular expression to
  • 00:04:19 do this there are two special characters
  • 00:04:21 at the beginning of the regular
  • 00:04:23 expression we can add this caret symbol
  • 00:04:27 or caret what it is called caret or
  • 00:04:29 circumflex I don't know so that's
  • 00:04:32 simply the character we need and it
  • 00:04:35 should end with a dollar sign
  • 00:04:38 now dollar sign here has a special
  • 00:04:40 meaning it doesn't mean the word should
  • 00:04:42 end with a dollar sign
  • 00:04:42 it simply means what's contained between
  • 00:04:45 this caret and this dollar sign
  • 00:04:48 what's contained between them that
  • 00:04:51 actually is the full word this is why we
  • 00:04:54 don't have any match here anymore if I
  • 00:04:56 add my previous rule set of A to Z
  • 00:05:01 and then an exclamation mark we still
  • 00:05:03 have no match if I remove that extra
  • 00:05:07 part at the end of the first word though
  • 00:05:09 and I remove the second word too now we
  • 00:05:13 have a match though because now the full
  • 00:05:16 string which you pass is exactly
  • 00:05:18 a word with that order of characters
  • 00:05:21 anything thereafter and even if it's
  • 00:05:23 only a blank space will break it and
  • 00:05:26 this of course is often needed behavior
  • 00:05:29 think about passwords which you want to
  • 00:05:31 validate or email addresses
  • 00:05:34 especially in email addresses you want
  • 00:05:36 to have some text at sign some text and
  • 00:05:39 a domain ending you can't mess up the
  • 00:05:43 order a word has to contain all these
  • 00:05:46 things in exactly this order and this is
  • 00:05:48 what we enforce with this strange
  • 00:05:51 character and the dollar sign everything
  • 00:05:54 in between has to be included and
  • 00:05:56 included in this order this is what
  • 00:05:58 we're saying not more not less so this
  • 00:06:02 is how we build such patterns from left
  • 00:06:04 to right potentially with these special
  • 00:06:07 characters at start and end to contain
  • 00:06:10 it in one word and respecting the order
  • 00:06:14 here in the matched result too if you omit
  • 00:06:17 the dollar sign at the end and the
  • 00:06:20 caret or whatever at the start then it
  • 00:06:23 also respects the order but it matches
  • 00:06:27 anything which fulfills this criteria no
  • 00:06:30 matter if something comes thereafter or
  • 00:06:33 if something comes in front of it this
  • 00:06:35 is a key takeaway here this is how you
  • 00:06:37 build such patterns with the tools you
  • 00:06:39 have in the next videos we'll have
  • 00:06:42 a look at matching email addresses and
  • 00:06:44 how to build such patterns because for
  • 00:06:48 regular expressions a lot of
  • 00:06:50 understanding really comes by practicing
  • 00:06:53 it and seeing how things work
  • 00:06:55 together so let's have a look at this in
  • 00:06:57 the next videos
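
The pattern built up in this video can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the video works in an interactive regex tester rather than in Python, the sample strings are hypothetical (the actual test words are not in the transcript), and the helper names found_anywhere and is_full_word are made up for this sketch.

```python
import re

# Lowercase letters, then digits, then uppercase letters, then one "!".
# Each "+" means "at least one, then as many as you like".
UNANCHORED = re.compile(r"[a-z]+[0-9]+[A-Z]+!")

# Same pattern, wrapped in ^ (start) and $ (end) so that the whole
# string has to be exactly this word.
ANCHORED = re.compile(r"^[a-z]+[0-9]+[A-Z]+!$")


def found_anywhere(text: str) -> bool:
    """True if the ordered pattern occurs somewhere inside text."""
    return UNANCHORED.search(text) is not None


def is_full_word(text: str) -> bool:
    """True only if text as a whole is the pattern, nothing before or after."""
    return ANCHORED.search(text) is not None


# Hypothetical sample strings, not the ones shown in the video.
samples = ["abc123XYZ!", "abc123XYZ!extra", "123abc!", "abc123XYZ! "]

for s in samples:
    print(repr(s), found_anywhere(s), is_full_word(s))
```

Without the anchors, a string with extra characters after the exclamation mark is still matched up to the exclamation mark; with the anchors, it is not matched at all, and even a trailing blank space breaks the match. One Python-specific detail: `$` also matches just before a single trailing newline, so `re.fullmatch` with the unanchored pattern is the stricter choice when that matters.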