MongoDB Aggregate Query: A simple example
Let's take an example MongoDB Aggregate Query to understand certain fundamentals of aggregate queries.
The speaker presents an example of a MongoDB aggregate query designed to find hotels associated with a specific restaurant in a particular city. Starting with matching the provided restaurant ID, the query proceeds with lookups and unwinding of arrays to navigate through the relationships between restaurants, hotels, and cities. It carefully matches IDs and attributes at each stage to filter the results effectively. Finally, the query sorts the output by hotel ID and city ID. This example illustrates the use of MongoDB’s aggregate framework to handle complex data relationships and retrieve targeted information efficiently.
Summary
Introduction to Aggregate Queries:
Provides a brief overview of aggregate queries, setting the stage for the demonstration.
Use Case Introduction:
Describes a hypothetical scenario to be used for demonstration purposes.
Introduces the collections involved: restaurant, hotel, and city.
Requirement Analysis:
Explains the specific requirement of finding a restaurant tied to certain hotels in a particular city.
Explanation of Aggregate Query Components:
Breaks down the aggregate query step by step:
Starts with the driving collection (restaurant collection).
Matches with the restaurant ID.
Performs a lookup to find hotels associated with the restaurant.
Unwinds arrays in the hotel collection.
Matches hotels with linked restaurants.
Unwinds arrays in the city collection.
Performs a lookup to find cities associated with hotels.
Filters cities based on the desired city name.
Optionally unwinds arrays within the city collection.
Sorts the results by hotel ID and city ID.
Podcast
Transcript
0:00
Hope you’re doing well. In this video, let’s take a quick look at an aggregate query. We’ll examine a MongoDB aggregate query and learn a little bit about it. I’ve written a query that’s not super complex. It’s got a couple of joins, so to speak.
0:16
So let’s dig into it. I’ve come up with a use case here. It’s again a figment of my imagination. I’m not working on anything related to this, but hopefully, this example makes sense. So here we have a restaurant collection.
0:35
We have three collections here: a restaurant collection, a hotel collection, and then we also have a city collection, right? So let’s say our need here. The requirement here is to find all the… Now we have a restaurant ID.
0:52
So let’s say I’m looking for a particular restaurant that’s tied to a certain… that’s tied to a number of hotels which happen to be located in a particular city that’s of interest to me, right? So I have a restaurant that I have in mind and I have a city that I have in mind.
1:11
The hotels are in between, right? So I don’t care whether the restaurant is in a Marriott or Hilton, for instance, as long as it’s in a hotel and it’s not standalone. I’m interested in that restaurant, but I want it to be a certain restaurant, a particular one that I’m looking for.
1:30
And I’m interested in a city. So if the hotel happens to be located in a different city, then I’m not interested, right? So that’s kind of high-level, that’s the high-level requirement. So here’s how an aggregate query might look like. So you would start, the driving collection is important, the driver table or the driver collection in this case, because a lot of it is, you know, you want to reduce your results as quickly as possible.
1:57
So you want to start off with your collection of interest, right? Let’s call it the driver collection. So we start with the restaurant and the first thing we do is we match with the restaurant ID. So remember I mentioned that I’m interested in a particular restaurant and let’s in this case I happen to have the ID of the restaurant.
2:18
It could have been a name, in which case this would be underscore name possibly and then we would do a restaurant name, right? So doesn’t matter. So we look for that restaurant. The name may or may not be unique, so ID is a little bit more appropriate in this case.
2:35
So let’s go with the ID. Then we do a lookup. So we look up all the hotels, the hotel collection that happens to have this restaurant, right? So we do a lookup from that hotel collection, Let essentially is a variable assignment, right?
2:54
So we are saying assign the ID of the restaurant to a variable called restaurant ID. That’s what we’re doing here. And then we have pipelines, right? MongoDB aggregates works by virtue of pipelines. You can go through sort of a funnel, imagine, think of it as a funnel.
3:11
You keep filtering and you end up with what you’re looking for eventually. So the first pipeline, we have something that says unwind link to restaurants. So let’s understand this better. Link to restaurants is essentially an attribute on the hotel collection.
3:32
It has an array of IDs that this hotel is linked to. So it’s an array of restaurant IDs that this hotel is linked to. So we are interested in this particular restaurant ID, say 123.
3:48
But before we do the actual before we execute this part of the query, we need to do an unwind because it’s an array. So in MongoDB aggregates, if you want to do a query on arrays, then you need to make sure that you unwind that.
4:06
So essentially you’re able to get to the attributes inside of those arrays, right? So it’s essentially an array of hashes. But to get to the attribute inside the array, you need to do an unwind, and the preserved null and empty arrays is essentially what it sounds like, right?
4:23
So if it is no match, do you want the hotel or do you not want the hotel, right? That’s what this determines. The default is false, so if it’s a default you don’t have to include it. If it’s false, you don’t have to include it. And you could return this in a bit more succinct manner where you just do unwind link to restaurants.
4:44
You can replace the whole path, but in this case, I am interested in the hotel even if there is no match. So I just have a preserved null and true. Then we have a match here that essentially says link to restaurants ID.
4:59
So remember I mentioned that there is an ID attribute which is essentially why we are doing an unwind. We go do a match against that particular attribute and we use the variable that we assigned up here on line #7, the restaurant ID.
5:16
So when this part of the and then the as is an alias. So when this part of the aggregate executes, you now have all the hotels that this particular restaurant is linked to. And then we keep going, we unwind the hotels because now the hotel has another array and we preserve the null and empty arrays and that attribute, that array attribute that we interested in, in the hotels collection is linked to cities, right.
5:42
So we first unwind hotels, then we unwind linked to cities because that also is an array and that has nested attributes. Now we do a lookup in the city further down. We are sort of at Level 3 right now.
5:58
We’re going to do something similar. We’re going to do hotels dot link to hotels dot ID and we’re going to assign it to the hotel ID attribute. I don’t know why I had the hotels here. I’m not sure if Oh yeah, I think I will need it because this we’ve used that particular alias.
6:18
So let’s keep going. So we have assigned the hotel ID. So all the hotels that have this restaurant linked to it, we’re going to take each of those hotels and look for that ID. And now we compare that hotel ID, again, it’s assigned here, we compare it to the city ID essentially, right.
6:37
So we have another pipeline that’s inside the city collection. We’re going to make a comparison and then we have another equality check that says if the city name equals something, right. So just to recap, we picked a restaurant ID and then we went said give me all the hotels that have the restaurant linked to it and then we say give me all the cities that this hotel is linked to and if that city is the one that I’m interested in then make this part of the result set right.
7:05
And then finally we unwind cities as well because we want some more attributes from city like last modify. We only need to do this unwind if we need to extract other attributes from arrays within the city collection.
7:22
I think earlier I had some of the attributes, but this version of the query does not have it. So we may not actually have to do this unwind here possibly. But if you did want to go deeper and extract some more attributes from within the city’s collection that happened to be an array, then you definitely do have to do this unwind.
7:41
And finally we do a sort. We just sort by hotel ID and city ID. The one says you’re doing an ascending order sort. So that’s an example of a MongoDB aggregate query. Hopefully it gives you an idea. Thanks.
Snowpal Products
Backends as Services on AWS Marketplace
Mobile Apps on App Store and Play Store
Education Platform for Learners and Course Creators