4

I want to filter an array inside an array in Mongo DB. For example, given these 2 documents:

{ 
"_id" : 1.0, 
"pages" : [
    {
        "texts" : [
            {
                "text" : "foo"
            }, 
            {
                "text" : "foo1"
            }, 
            {
                "text" : "foo2"
            }
        ]
    }
]
}
{ 
"_id" : 2.0, 
"pages" : [
    {
        "texts" : [
            {
                "text" : "bar"
            }, 
            {
                "text" : "bar1"
            }, 
            {
                "text" : "bar2"
            }
        ]
    }
]
}

I want a query that finds the _id and text given the text as input, for example, for the keyword "bar1" the output would be something like this:

{ 
"_id" : 2.0, 
"text": "bar1"
}

I can achieve this if the text that I am looking for is in a single array at the top level of the document, but I cannot do it with an array inside another array.

So far my query looks like this:

db.collection.aggregate([{$match:{"pages.texts.text":"bar1"}},
                { $project:
                    { text: 
                        { $filter: 
                            { input: '$pages.texts.text', 
                              as: 'text', 
                              cond: { $eq: [ "$$text" , "bar1"]}
                            }
                        },
                      _id:1
                    }
                }
                ])

Which gives as output this:

{ 
"_id" : 2.0, 
"text" : [

]
}

It shows the correct _id but no text.

If I change the operator $eq for $ne in the filter the output is:

{ 
"_id" : 2.0, 
"text" : [
    [
        "bar", 
        "bar1", 
        "bar2"
    ]
]
}

It seems that text is an array of arrays and I cannot compare directly the keyword with this field. I have tried different approaches to solve this but no luck so far.

Any help would be greatly appreciated.

Thank you!

1 Answer 1

3

Since you only want the (double) nested element that matches your search, you can $unwind both arrays first, then just use $match to get to the element you want:

[
    {
        $unwind: 
            "$pages"

    },
    {
        $unwind: 
            "$pages.texts"

    },
    {
        $match:{
            "pages.texts.text":"bar1"
        }
    },
    {
        $project: {
            _id:true,
            text: "$pages.texts.text"
        }
    }
]               
Sign up to request clarification or add additional context in comments.

2 Comments

I would add {$project:{_id:1, pages:1} as the first stage to optimise it a bit.
Unfortunately I cannot do this because $pages.texts.text has a text index (a compount text-ascending index with another field) and mongo db does not allow me to match in that way. I didn't mention that too keep things simple, but it seems it affects.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.