#20 Using Dataloader | Build a Complete App with GraphQL, Node.js, MongoDB and React.js

  • 00:00:01 hi and welcome back to this series the
  • 00:00:04 application we're building is taking
  • 00:00:06 shape doesn't look too shabby and we now
  • 00:00:08 have a front-end that can interact with
  • 00:00:10 all our back-end API endpoints we built
  • 00:00:10 with GraphQL but one problem we're
  • 00:00:15 facing thus far is the performance if I
  • 00:00:19 reload this page fetching these events
  • 00:00:22 here is okayish regarding the speed but
  • 00:00:26 we still will have the problem here that
  • 00:00:29 we have a lot of unnecessary round trips
  • 00:00:31 don't know why well we'll take a look at
  • 00:00:34 the reason and then also at the remedy
  • 00:00:35 in this video
  • 00:00:40 so this is the app we built and when I'm
  • 00:00:42 fetching all my events what in the end
  • 00:00:44 happens if we have a look at our GraphQL
  • 00:00:46 schema is that this events query
  • 00:00:49 here gets executed now this returns us
  • 00:00:52 an array of events nice the problem with
  • 00:00:56 that just is an event if we have a look
  • 00:00:59 at its definition also has a creator
  • 00:01:02 field the creator field in turn gives
  • 00:01:05 us a user and now if we have a look at
  • 00:01:07 our resolver we see that for events we
  • 00:01:10 solve that problem if we have a look at
  • 00:01:12 transform event where the magic
  • 00:01:14 happens by simply executing this user
  • 00:01:21 function which happens to be this
  • 00:01:23 function which makes another database
  • 00:01:25 request the problem with this request
  • 00:01:28 just is it's made on every event we're
  • 00:01:31 returning so if we are returning one
  • 00:01:33 event and we also want to get the name
  • 00:01:35 of the creator of that event so of the
  • 00:01:37 user who created the event then we have
  • 00:01:40 two requests right one request to get
  • 00:01:42 that event or all events actually and we
  • 00:01:44 only have one let's say and then one
  • 00:01:46 extra request for that one event we
  • 00:01:49 found to get its user data now the
  • 00:01:51 problem is if we have two events in our
  • 00:01:54 database then we make one request get
  • 00:01:57 both events and then for every event we
  • 00:02:00 make an extra request to get its user
  • 00:02:02 data so now we have three requests
  • 00:02:04 instead of just two and that problem
  • 00:02:07 becomes worse and worse the deeper we
  • 00:02:10 drill into our data structure so if we
  • 00:02:12 then for that user also want to have the
  • 00:02:15 users events or anything like that or
  • 00:02:17 the more events and so on we have in the
  • 00:02:19 database this is certainly not something
  • 00:02:22 we can leave like this because it's not
  • 00:02:25 scalable it's not a solution you would
  • 00:02:27 want in a production ready application
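The request counting described above can be reproduced with a small in-memory stand-in for the database. Everything in this sketch (the `db` object, `findAllEvents`, `findUserById` and the sample records) is hypothetical and only mirrors the shape of the app's data. It illustrates the problem and is not the app's resolver code.

```javascript
// Hypothetical in-memory "database"; the real app queries MongoDB instead.
const db = {
  users: [{ _id: 'u1', email: 'test@test.com' }],
  events: [
    { _id: 'e1', title: 'First event', creator: 'u1' },
    { _id: 'e2', title: 'Second event', creator: 'u1' }
  ]
};

let requestCount = 0; // counts simulated database round trips

function findAllEvents() {
  requestCount += 1;
  return db.events;
}

function findUserById(userId) {
  requestCount += 1;
  return db.users.find(user => user._id === userId);
}

// Naive resolution: one request for the list, then one more per event.
function eventsWithCreators() {
  return findAllEvents().map(event => ({
    ...event,
    creator: findUserById(event.creator)
  }));
}

const resolvedEvents = eventsWithCreators();
console.log(requestCount); // 1 request for the events plus 1 per event
```

With two events this makes three simulated requests, and the count keeps growing with every additional event or nesting level.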
  • 00:02:29 now a tool that can help us with that is
  • 00:02:32 data loader if you google for data
  • 00:02:36 loader you should find this GitHub
  • 00:02:37 repository on Facebook's
  • 00:02:39 GitHub account basically and this is a
  • 00:02:42 tool that helps you with batching
  • 00:02:45 your requests so here you find detailed
  • 00:02:49 instructions in case you want to dive
  • 00:02:50 deeper in the end it's an npm library we
  • 00:02:52 can install
  • 00:02:53 where we can define certain queries and
  • 00:02:56 then the data loader will detect when we
  • 00:02:58 make a query that well fits one of our
  • 00:03:02 predefined queries so to say and it will
  • 00:03:05 then look if it already made that
  • 00:03:06 request in the past and will take the
  • 00:03:08 response from there or otherwise make
  • 00:03:11 the request here but automatically batch
  • 00:03:13 it with all requests that need the same
  • 00:03:15 query so a concrete example would be
  • 00:03:17 that we set up a batch query for the
  • 00:03:20 user where we are able to accept
  • 00:03:22 multiple user IDs and then we return all
  • 00:03:25 users that match these IDs and with such
  • 00:03:28 a batching function set up we can
  • 00:03:31 request individual users and data loader
  • 00:03:33 will group them before making one
  • 00:03:35 request and therefore it reduces the
  • 00:03:37 amount of requests made so to make this
  • 00:03:40 work
  • 00:03:41 I will install data loader here with the
  • 00:03:47 command you also see here on the github
  • 00:03:49 repository and for that I quit my server
  • 00:03:53 my back-end server not the one serving
  • 00:03:56 the React app thereafter we can restart
  • 00:03:59 it now this is installed and now we can
  • 00:04:01 start implementing data loader and I
  • 00:04:03 will implement it here in my merge.js
  • 00:04:05 file which is in the end where I have
  • 00:04:07 all the logic for drilling deeper into
  • 00:04:11 my resolved data so to say so
  • 00:04:15 into my models which kind of data
  • 00:04:19 loaders do we need well we certainly
  • 00:04:21 need one for events and one for users
  • 00:04:24 and then probably also one for bookings
  • 00:04:26 but let's start with the events one for
  • 00:04:29 that I'll create a new constant and I'll
  • 00:04:31 name it event loader the name is up to
  • 00:04:33 you though and I want to use the data
  • 00:04:35 loader package for this we need to
  • 00:04:37 import it and I'll store it in a
  • 00:04:40 constant named data loader and we
  • 00:04:42 import it by requiring data loader just
  • 00:04:45 as you import any libraries into a
  • 00:04:47 node.js application and here I will
  • 00:04:50 create a new data loader object and this
  • 00:04:54 now takes a function a batching function
  • 00:04:56 that it can execute for all kinds of in
  • 00:04:59 this case events so here I want to add a
  • 00:05:03 function that is able to handle
  • 00:05:07 all the different requests I might have
  • 00:05:09 in my application and now in this
  • 00:05:11 application I need to be able to get a
  • 00:05:14 list of events and a single event and
  • 00:05:16 therefore here I will get my event IDs
  • 00:05:20 let's say as an input now in here I want
  • 00:05:24 to execute a function that can give me
  • 00:05:26 these event IDs and I will return the
  • 00:05:29 result of that function so here I will
  • 00:05:31 actually execute events so this function
  • 00:05:34 which does just that it takes an array
  • 00:05:36 of event IDs and then returns me the
  • 00:05:38 events it found with that ID so here
  • 00:05:41 event IDs is what I pass on to the
  • 00:05:46 events function because that is what
  • 00:05:47 this function requires so now that is
  • 00:05:50 our first simple data loader which is
  • 00:05:53 capable of loading our events and we can
  • 00:05:55 pass in any IDs also just one ID if we
  • 00:05:58 want to and it will fetch it but not
  • 00:06:01 immediately
  • 00:06:01 instead in one tick of our node event
  • 00:06:04 loop on the back end it will gather all
  • 00:06:07 requests it finds that want to get one
  • 00:06:10 or multiple events identified by their
  • 00:06:13 IDs and it will then group them together
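To make the "gather everything within one tick, then group it" idea more concrete, here is a deliberately simplified stand-in. `TinyLoader` is a hypothetical class written for this illustration. It is not the dataloader package's implementation, which also caches results and covers more edge cases. The fake batch function and the `e1`..`e3` IDs are assumptions as well.

```javascript
// Simplified, hypothetical stand-in that only mimics per-tick batching.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // receives an array of keys, returns a promise of values
    this.queue = [];        // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule a single flush when the first key of a batch arrives.
      if (this.queue.length === 1) {
        process.nextTick(() => this.flush());
      }
    });
  }

  flush() {
    const batch = this.queue;
    this.queue = [];
    this.batchFn(batch.map(entry => entry.key))
      // Split the combined result up again: value i belongs to key i.
      .then(values => batch.forEach((entry, i) => entry.resolve(values[i])))
      .catch(error => batch.forEach(entry => entry.reject(error)));
  }
}

// Usage: three separate load calls made in the same tick.
const batches = []; // records every key array the batch function receives
const eventLoader = new TinyLoader(eventIds => {
  batches.push(eventIds);
  return Promise.resolve(eventIds.map(id => ({ _id: id, title: `Event ${id}` })));
});

const loaded = Promise.all([
  eventLoader.load('e1'),
  eventLoader.load('e2'),
  eventLoader.load('e3')
]);

loaded.then(events => {
  console.log(batches.length); // number of batch calls that were made
  console.log(events.map(event => event.title));
});
```

All three `load` calls run before the scheduled flush, so the batch function is invoked once with all three IDs and each caller still receives only its own event.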
  • 00:06:17 now where do we want to use that event
  • 00:06:20 loader well in the events resolver we
  • 00:06:24 are fetching all events and I'll
  • 00:06:26 actually leave this like this because
  • 00:06:27 the data loader works when it receives
  • 00:06:30 keys for data we want to fetch be that
  • 00:06:33 an ID or a first name but it needs some
  • 00:06:36 identifier otherwise it can't tell
  • 00:06:38 whether it already fetched that data and
  • 00:06:40 should use that or make a new request
  • 00:06:42 basically and it can't merge the
  • 00:06:44 requests together into one big one so
  • 00:06:47 I'll leave this here as event find but
  • 00:06:50 I can use my event loader here when I'm
  • 00:06:53 fetching a single event for example
  • 00:06:56 instead of awaiting event find by
  • 00:06:58 ID I can use the event loader and there
  • 00:07:02 we now have a load method I can call and
  • 00:07:04 to load I can pass that ID that event ID
  • 00:07:08 just like that and event loader will do
  • 00:07:11 or data loader will do the rest will
  • 00:07:14 basically register this single event ID
  • 00:07:16 then see if in the same tick of the
  • 00:07:19 node.js event loop
  • 00:07:20 other requests to event loader with
  • 00:07:23 different IDs be that one or multiple
  • 00:07:25 ones are sent and it will then merge all
  • 00:07:27 these IDs together send the request to
  • 00:07:30 the database with the logic we defined
  • 00:07:32 here so by calling this events function
  • 00:07:35 in our case here and then the results that
  • 00:07:37 are returned are basically split up by
  • 00:07:39 event loader again so that it knows ok
  • 00:07:42 you wanted that single ID here's your
  • 00:07:44 chunk you wanted these three IDs here is
  • 00:07:47 your chunk that is how it works and that
  • 00:07:50 is why it makes sense to use it for
  • 00:07:51 single event now for user where I also
  • 00:07:56 call events here and use the bind
  • 00:08:00 method to basically use my user doc
  • 00:08:04 created events which is an array of IDs
  • 00:08:06 there we can also use event loader load
  • 00:08:11 and bind this function now so that this
  • 00:08:14 function now receives a user doc created
  • 00:08:16 events now in transform event there we
  • 00:08:21 can later do the same for the user with
  • 00:08:22 the user loader right now I don't need
  • 00:08:24 it for booking where I get an event well
  • 00:08:27 there I do already use single event this
  • 00:08:30 function from up there and there I
  • 00:08:32 already use the event loader so
  • 00:08:33 indirectly I use the event loader here
  • 00:08:35 already and therefore if I now save this
  • 00:08:38 in our application all events are still
  • 00:08:43 fetched like this if I have a look at my
  • 00:08:47 bookings and for that let me quickly
  • 00:08:51 book an event here then I do actually
  • 00:08:54 get an error here that I cannot read
  • 00:08:56 property date of undefined the reason
  • 00:08:59 for that is that my event loader uses
  • 00:09:01 this events function where I already
  • 00:09:03 returned a transformed event so all the
  • 00:09:06 events I fetch are already transformed
  • 00:09:08 and in transform event I do already
  • 00:09:11 adjust the date and so on the problem is
  • 00:09:14 in my bookings when I fetch these
  • 00:09:16 bookings I call transform booking on
  • 00:09:19 that and there I call single event for
  • 00:09:23 every event I'm fetching and now if we
  • 00:09:25 have a look at that single event
  • 00:09:26 function I transform this again now the
  • 00:09:29 way we restructured our logic single
  • 00:09:32 event always uses event loader
  • 00:09:34 which in turn always will use events
  • 00:09:37 function which already transforms all
  • 00:09:40 events so which basically adjusts them
  • 00:09:42 as we need it already hence here there
  • 00:09:44 is no need to transform it again and we
  • 00:09:47 can just return it like that and if we
  • 00:09:49 do this if in single event we return
  • 00:09:52 just the event now if we reload and log
  • 00:09:55 back in and go to the bookings page we
  • 00:09:59 do get our booking again now let's also
  • 00:10:02 add a user loader also to practice this
  • 00:10:05 so here I'll add my user loader a new
  • 00:10:08 data loader that is and just as before
  • 00:10:11 the data loader always needs an array of
  • 00:10:14 identifiers because it will then merge
  • 00:10:16 all identifiers together make a batch
  • 00:10:18 request then split the result up so to
  • 00:10:20 say so here I expect to get my user IDs
  • 00:10:24 and now the function I want to execute
  • 00:10:26 there well previously we had events here
  • 00:10:30 we have no users function I only have a
  • 00:10:32 function for a single user now turns out
  • 00:10:35 in this app I never needed to fetch more
  • 00:10:38 than one user but now we do that's the
  • 00:10:40 idea behind batching because if we
  • 00:10:42 retrieve multiple events and for every
  • 00:10:44 event we want to have the Creator then
  • 00:10:46 now we actually want to merge this
  • 00:10:47 together into one request to the user
  • 00:10:50 database which fetches users for all IDs
  • 00:10:53 of all the users of the events we're
  • 00:10:55 trying to access and that is exactly
  • 00:10:56 what data loader will do for us so
  • 00:10:59 therefore here I will define my own
  • 00:11:02 logic where I return user and this is my
  • 00:11:09 user model find and I want to find all
  • 00:11:13 users where the ID is in and that's the
  • 00:11:17 same logic as with events here is in my
  • 00:11:22 user IDs array here and that
  • 00:11:25 automatically returns a promise which is
  • 00:11:28 exactly what data loader needs we need
  • 00:11:30 to return a promise here user find
  • 00:11:32 returns such a promise with our array of
  • 00:11:34 users now we have that user loader and
  • 00:11:37 in places where I need it for example
  • 00:11:42 here
  • 00:11:43 transform event I can now call user
  • 00:11:48 loader load and bind this so that the
  • 00:11:52 event creator which is the ID of the
  • 00:11:54 user that created that event is passed
  • 00:11:57 into the load function here when this is
  • 00:11:59 called which in turn happens when we
  • 00:12:01 request that extra creator data for
  • 00:12:06 bookings it's the same here instead of
  • 00:12:09 accessing my user function here I
  • 00:12:13 instead want to use my user loader but
  • 00:12:16 actually now that I think about this let
  • 00:12:18 me reverse this here for transform event
  • 00:12:21 I will use this user function we used
  • 00:12:23 before so the exact same code I used
  • 00:12:24 before and also for booking here I'll
  • 00:12:27 stick to user bind so I'll stick to this
  • 00:12:29 function and in this function here I now
  • 00:12:32 want to use my user loader simply
  • 00:12:35 because here I then do the additional
  • 00:12:37 setup the link to my created events and
  • 00:12:40 so on and don't want to rewrite that
  • 00:12:41 code we would have to rewrite it here in
  • 00:12:44 our user loader otherwise I'll not do
  • 00:12:47 this so instead here I'll simply call
  • 00:12:49 user loader load for that given user ID
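The user loader and the `user` function described above might look roughly like the sketch below. `createUserHelpers` is a hypothetical wrapper: the DataLoader class and the `User` model are passed in as parameters so that the sketch does not depend on how the packages are imported. The `_doc` and `id` properties are assumed to follow the shape of Mongoose documents. Two details go beyond what has been said up to this point and are marked in the comments: the results are re-ordered to match the keys, and the ID is converted to a string before it is used as a key.

```javascript
// Hypothetical factory; `DataLoader` is the class exported by the dataloader
// package and `User` a Mongoose-style model, both supplied by the caller.
function createUserHelpers(DataLoader, User) {
  const userLoader = new DataLoader(async userIds => {
    // One combined query for all requested user IDs.
    const users = await User.find({ _id: { $in: userIds } });

    // Addition beyond the transcript: an $in query does not promise to return
    // documents in key order, so map each key back to its document.
    const usersById = new Map(users.map(userDoc => [userDoc.id, userDoc]));
    return userIds.map(
      id => usersById.get(id) || new Error(`No user found for ID ${id}`)
    );
  });

  // The single-user function keeps its extra setup and only swaps the fetch.
  const user = async userId => {
    // String keys (assumption made up front) let equal IDs share one key.
    const userDoc = await userLoader.load(userId.toString());
    return { ...userDoc._doc, _id: userDoc.id };
    // The app would also attach the link to the created events here.
  };

  return { userLoader, user };
}
```

With the real packages this would presumably be wired up as `createUserHelpers(require('dataloader'), User)`, where `User` is the app's own model.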
  • 00:12:53 now the effect of that should be that if
  • 00:12:56 I save this let's have a look at our
  • 00:13:00 frontend events page here events.js
  • 00:13:03 in the pages folder the function where I
  • 00:13:06 fetch all events which is this one here
  • 00:13:11 there I do get my creator ID and email
  • 00:13:14 and this should now not lead to
  • 00:13:16 duplicate requests anymore instead it
  • 00:13:21 failed so let's quickly have a look at
  • 00:13:23 the error data loader must be
  • 00:13:25 constructed with a function the function
  • 00:13:27 did not return an array of the same length so
  • 00:13:30 we're essentially returning wrong data
  • 00:13:33 here in the user loader we're not
  • 00:13:36 returning the array of data it expected
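The rule behind this error is that a batch function has to resolve to exactly one entry per requested key. The check below is a hypothetical helper (`checkBatchResult`) written only to illustrate that rule. It is not the library's own code and its message is not the library's wording.

```javascript
// Hypothetical check mirroring the contract of a batch function:
// it must resolve to an array with exactly one entry per requested key.
function checkBatchResult(keys, values) {
  if (!Array.isArray(values) || values.length !== keys.length) {
    throw new TypeError(
      `Batch function returned ${Array.isArray(values) ? values.length : 'no'} ` +
      `values for ${keys.length} keys`
    );
  }
  return values;
}

let mismatchMessage = null;
try {
  // Five requested keys, but the query only found two documents.
  checkBatchResult(['k1', 'k2', 'k3', 'k4', 'k5'], [{ _id: 'k1' }, { _id: 'k5' }]);
} catch (error) {
  mismatchMessage = error.message;
}
console.log(mismatchMessage);
```

Any query that returns fewer documents than the number of keys it was given runs into this kind of mismatch.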
  • 00:13:40 and that can actually be a hard error to
  • 00:13:42 debug it gets easier if we dump a
  • 00:13:45 console log into our user loader up
  • 00:13:47 there and log the user IDs we're
  • 00:13:49 getting if we do that and we reload our
  • 00:13:52 page with the events which fails we see
  • 00:13:55 that we get an array with these
  • 00:13:57 data fields now we're trying to fetch
  • 00:14:00 five different users but if we look
  • 00:14:02 closer we see that all but one have the
  • 00:14:04 same ID now normally data loader should
  • 00:14:08 intelligently merge this together it
  • 00:14:10 should not try to make that you
  • 00:14:12 basically get five different data pieces
  • 00:14:15 here if four of them are for the same
  • 00:14:16 key the problem and that's really hard
  • 00:14:19 to spot is here when we use user loader
  • 00:14:23 load I pass in my user ID which I get
  • 00:14:26 here and in turn the user ID I'm
  • 00:14:28 getting here in the user function for
  • 00:14:30 example comes from transform event here
  • 00:14:32 I do pass in my event creator here and
  • 00:14:35 that is an ID but what is an ID in
  • 00:14:38 MongoDB it's not a string it's this
  • 00:14:41 object ID thing instead and it is an
  • 00:14:43 object therefore in JavaScript objects
  • 00:14:47 are not primitives so the check data
  • 00:14:50 loader performs where it checks if a
  • 00:14:52 given key is already included in the
  • 00:14:55 array of keys it constructed will fail
  • 00:14:57 here because two objects even if they
  • 00:15:01 hold the same value are not equal in
  • 00:15:03 JavaScript that can be tricky to wrap
  • 00:15:05 your head around but that is how
  • 00:15:06 JavaScript works and therefore in the
  • 00:15:09 end all these IDs are treated separately
  • 00:15:11 because they aren't just strings they
  • 00:15:14 are objects here even though we don't
  • 00:15:15 see that here so this fix is simple in
  • 00:15:19 the user function where we call user
  • 00:15:21 loader load we should call to string
  • 00:15:25 here so to convert this object user ID
  • 00:15:29 this object ID thing here to a string
  • 00:15:32 and thereafter this will work now by the
  • 00:15:36 way it of course is the same for the
  • 00:15:37 event loader otherwise we'll have the
  • 00:15:40 same inefficiency here so event loader
  • 00:15:42 load to string with that fixed now if we
  • 00:15:46 save that if we reload events now we're
  • 00:15:49 fetching all events and this will now
  • 00:15:51 make fewer requests because it will not
  • 00:15:54 fetch the user for every single event
  • 00:15:56 here and make an extra request instead
  • 00:15:58 data loader intelligently merges this
  • 00:16:01 together on the back end and then makes
  • 00:16:03 one combined request to the users
  • 00:16:05 collection gets all the user IDs and
  • 00:16:07 then returns them back to the functions
  • 00:16:10 that originally
  • 00:16:11 wanted them which were our event object
  • 00:16:14 resolvers in the end and there we
  • 00:16:17 therefore in the end have the user data but
  • 00:16:19 one combined request was sent now last
  • 00:16:23 but not least let's also see in our
  • 00:16:27 schema what else makes sense for a user
  • 00:16:31 we got no nested data here or we got the
  • 00:16:34 events actually but we already handled
  • 00:16:36 this with the event loader for an event
  • 00:16:38 we got the Creator which we're handling
  • 00:16:40 with the user loader and we need no
  • 00:16:42 booking loader because in this API I
  • 00:16:45 have no other data where I would get the
  • 00:16:48 related bookings I only get related
  • 00:16:51 events and users and that is handled
  • 00:16:53 with these two loaders so wrapping your
  • 00:16:56 head around data loader can be complex
  • 00:16:58 in the end it's a batching mechanism it
  • 00:17:00 makes sure that multiple requests are
  • 00:17:03 batched together database requests I mean
  • 00:17:06 are batched or merged together so that
  • 00:17:09 one bigger request is sent for all the
  • 00:17:11 keys you needed and then this is
  • 00:17:13 returned and this is then split up back
  • 00:17:16 such that your app and the different
  • 00:17:18 parts of your app that requested the
  • 00:17:20 different keys get their data this magic
  • 00:17:23 so to say is done by data loader and it
  • 00:17:26 speeds up our API and simply prevents
  • 00:17:28 duplicate requests and therefore indeed
  • 00:17:31 if we log in this is now actually faster
  • 00:17:34 because of our improved setup here the
  • 00:17:39 rest of course still works as before
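The key comparison pitfall from the debugging part of this video (IDs that are objects rather than strings) can be reproduced without a database. `FakeObjectId` below is a hypothetical stand-in for MongoDB's object ID type, and the hex value is made up. Only the comparison behaviour matters here.

```javascript
// Hypothetical stand-in for an object-based ID with a string representation.
class FakeObjectId {
  constructor(hex) {
    this.hex = hex;
  }

  toString() {
    return this.hex;
  }
}

// Two separate objects that hold the same value.
const first = new FakeObjectId('5c0f6e1a9b1d');
const second = new FakeObjectId('5c0f6e1a9b1d');

console.log(first === second);                       // false: different objects
console.log(first.toString() === second.toString()); // true: equal strings

// Collections keyed by identity therefore see two keys, not one.
const objectKeys = new Set([first, second]);
const stringKeys = new Set([first.toString(), second.toString()]);
console.log(objectKeys.size); // 2
console.log(stringKeys.size); // 1
```

This is why converting the ID to a string before passing it to `load` lets requests for the same user collapse into a single key.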