I’m going to show you how to easily query data across multiple collections in MongoDB shell. Which is almost equal to SQL’s JOIN but not ;)
Assume we have two collections users
> db.users.insert({_id: "usr-1", name: "Tom"})
> db.users.insert({_id: "usr-2", name: "John"})
> db.users.insert({_id: "usr-3", name: "Kate"})
> db.users.find()
{ "_id" : "usr-1", "name" : "Tom" }
{ "_id" : "usr-2", "name" : "John" }
{ "_id" : "usr-3", "name" : "Kate" }
and posts
> db.posts.insert({_id: "pst-1", user_id: "usr-1", body: "body 1", tags: ["apples", "oranges"]})
> db.posts.insert({_id: "pst-2", user_id: "usr-1", body: "body 2", tags: ["apples"]})
> db.posts.insert({_id: "pst-3", user_id: "usr-2", body: "body 3", tags: ["oranges"]})
> db.posts.insert({_id: "pst-4", user_id: "usr-1", body: "body 4", tags: []})
> db.posts.insert({_id: "pst-5", user_id: "usr-3", body: "body 5", tags: ["apples"]})
> db.posts.find()
{ "_id" : "pst-1", "user_id" : "usr-1", "body" : "body 1", "tags" : [ "apples", "oranges" ] }
{ "_id" : "pst-2", "user_id" : "usr-1", "body" : "body 2", "tags" : [ "apples" ] }
{ "_id" : "pst-3", "user_id" : "usr-2", "body" : "body 3", "tags" : [ "oranges" ] }
{ "_id" : "pst-4", "user_id" : "usr-1", "body" : "body 4", "tags" : [ ] }
{ "_id" : "pst-5", "user_id" : "usr-3", "body" : "body 5", "tags" : [ "apples" ] }
…and let’s say we need to find all the users who have at least one post tagged with oranges.
There are quite a few options to do it in half manual way or with MongoDB v3.2 and new $lookup operator from aggregation framework, but here is a pretty simple solution which is easy to remember and it’s a one liner:
db.users.find({_id: {$in: db.posts.find({tags: {$in: ["oranges"]}}).map(function(post){ return post.user_id })}})
{ "_id" : "usr-1", "name" : "Tom" }
{ "_id" : "usr-2", "name" : "John" }
Let’s break it down:
-
First part is just a standard finding users with
_id
matching values$in
a given arraydb.users.find({_id: {$in:...})
-
Second part is a sub query which uses JavaScript map() function on the query cursor and maps results into array of users’
_id
(as we know mongodb query does not return an array of objects, but cursor) -
Inside the
map()
functions we specify a callback functions and what field we want to return from every post object. If we run just the second query we would get an array of users’_id
> db.posts.find({tags: {$in: ["oranges"]}}).map(function(post){ return post.user_id })
[ "usr-1", "usr-2" ]
- So basically under the hood our original query becomes this
> db.users.find({_id: {$in: [ "usr-1", "usr-2" ]}})
Another trick is if we have a case when second query users .FindOne()
which returns a singular object instead of cursor, we do not need to use map()
function and our query becomes a bit simpler
> db.users.find({_id: db.posts.findOne({_id: "pst-5"}).user_id})
{ "_id" : "usr-3", "name" : "Kate" }
Hope you enjoyed it! And keep hacking!