I am trying a fairly complex aggregate command on two collections involving a $lookup pipeline. This normally works fine for a simple aggregation, as long as there is an index on the foreignField.

But my $lookup is more complex, because the indexed field is not a plain Int64 field but an array of Int64. With a simple find(), it is easy to verify using explain() that the index is being used. But explaining the aggregate pipeline does not show whether the index is used inside the $lookup pipeline. All my timing tests seem to indicate that it is not. The MongoDB version is 3.6.2, and database compatibility is set to 3.6.

As I said earlier, I am not using a simple foreignField lookup but the 3.6-specific pipeline + $match + $expr...

Could using a pipeline be a showstopper for the index? Does anyone have deep experience with the new $lookup pipeline syntax and/or an index on an array field?

Examples

Either of the following works fine and, when explained, shows that the index on followers is being used.

db.col1.find({followers: {$eq : 823778}})
db.col1.find({followers: {$in : [823778]}})
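
For the find() case, the check can also be done programmatically. The following is a minimal sketch, assuming the classic explain() output shape in which each plan node has a `stage` name and optional `inputStage` / `inputStages` children. The helper name `usesIndexScan` and the sample plan are hypothetical and only illustrate the idea; they are not real output from this database.

```javascript
// Hypothetical helper: recursively checks whether any stage in an
// explain() plan tree is an index scan (IXSCAN).
function usesIndexScan(stage) {
    if (!stage || typeof stage !== "object") {
        return false;
    }
    if (stage.stage === "IXSCAN") {
        return true;
    }
    // Single-child stages expose inputStage; multi-child stages
    // (for example OR) expose an inputStages array.
    var children = [];
    if (stage.inputStage) {
        children.push(stage.inputStage);
    }
    if (Array.isArray(stage.inputStages)) {
        children = children.concat(stage.inputStages);
    }
    return children.some(function (child) {
        return usesIndexScan(child);
    });
}

// Hand-written sample shaped like queryPlanner.winningPlan
// (illustrative only; the index name is an assumption).
var samplePlan = {
    stage: "FETCH",
    inputStage: { stage: "IXSCAN", indexName: "followers_1", isMultiKey: true }
};

// In the mongo shell, one might call it like this:
// usesIndexScan(db.col1.find({followers: 823778}).explain().queryPlanner.winningPlan)
```

This only inspects the plan of a plain query. As the question notes, explaining the aggregate does not reveal what happens inside the $lookup sub-pipeline.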

But the following does not seem to use the index on followers (there are more steps in the pipeline, stripped for readability).

db.col2.aggregate([
    {$match: {field: "123"}},
    {$lookup: {
        from: "col1",
        let: {follower: "$follower"},
        pipeline: [{
            $match: {
                $expr: {
                    $or: [
                        {$eq: ["$follower", "$$follower"]},
                        {$in: ["$$follower", "$followers"]}
                    ]
                }
            }
        }],
        as: "followers_all"
    }}
])
  • Did you find anything relevant? I've found the JIRA issue fixed in 3.7.1 (dev version). Can you update and see if it works? This may be relevant as well. Commented Jan 18, 2018 at 16:28
  • Well, no. I ended up changing my db structure so that I don't have to use this complex lookup. I am very reluctant to use any sort of developer build on a production database. I guess I will have to wait for the next update cycle. Commented Jan 18, 2018 at 16:48
  • Oh no, I never meant for you to use a dev build in prod. I was just wondering whether you would verify it and use the production version when it is available. Commented Jan 18, 2018 at 16:50
  • I'd absolutely use it. But it is going to be a while before the next update. BTW, you seem to have given an answer to my question considering this is probably a bug or a missing feature. If you formulate it into an answer, I'd gladly mark it. Thanks! Commented Jan 18, 2018 at 17:02

1 Answer

This is a missing feature that is going to be part of version 3.8.

Currently, $eq matches in a $lookup sub-pipeline are optimised to use indexes. Refer to the JIRA issue fixed in 3.7.1 (a dev version).

This may also be relevant for non-multikey indexes.
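
Until then, one possible workaround (an assumption on my part, not something from the linked issues) is to avoid $expr altogether and split the $or into two classic localField/foreignField lookups. A classic $lookup is a plain equality match, which also matches individual elements when the foreign field is an array, so it should be able to use the index on followers. The intermediate field names `by_follower` and `by_followers` below are hypothetical.

```javascript
// Sketch only: two classic equality lookups, merged afterwards.
var pipeline = [
    { $match: { field: "123" } },
    // Equality on the scalar field col1.follower.
    { $lookup: {
        from: "col1",
        localField: "follower",
        foreignField: "follower",
        as: "by_follower"
    } },
    // Equality against the array field col1.followers (matches any element).
    { $lookup: {
        from: "col1",
        localField: "follower",
        foreignField: "followers",
        as: "by_followers"
    } },
    // Merge both result sets; $setUnion drops documents matched by both lookups.
    { $addFields: { followers_all: { $setUnion: ["$by_follower", "$by_followers"] } } },
    // Remove the intermediate arrays.
    { $project: { by_follower: 0, by_followers: 0 } }
];

// Run it only where a mongo shell `db` handle exists.
if (typeof db !== "undefined") {
    printjson(db.col2.aggregate(pipeline).toArray());
}
```

Two caveats: $setUnion does not guarantee the order of the merged array, and a classic $lookup treats a missing localField as null, so documents in col2 without a follower field can match documents in col1 where the foreign field is null or missing. The $expr version behaves differently in that case.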

3 Comments

Hi, can you please provide a link to where it is stated that this will be part of 3.8? I can't seem to find any information on this improvement.
@medewit The linked issue has been backported to 3.6 (3.6.3) as well, so it should be available in the latest MongoDB version.
Thanks for the response. I'm currently experiencing the same issue when using $gte in the lookup pipeline stage (versus just using $eq). I've posted my (similar) question on the MongoDB mailing list.