3

I have a Table which stores data in a JSONB column.

Now, what i want to do is, query that table, and fetch records, which have specific values for a key.

This works fine:

SELECT "documents".*
FROM "documents"
WHERE (data @> '{"type": "foo"}')

But what i want to do is, fetch all the rows in the table, which have types foo OR bar.

I tried this:

SELECT "documents".*
FROM "documents"
WHERE (data @> '{"type": ["foo", "bar"]}')

But this doesn't seem to work.

I also tried this:

SELECT "documents".*
FROM "documents"
WHERE (data->'type' ?| array['foo', 'bar'])

Which works, but if I specify a key like so data->'type' it takes away the dynamicity of the query.

BTW, I am using Ruby on Rails with Postgres, so all the queries are going through ActiveRecord. This is how:

Document.where("data @> ?", query)

1 Answer 1

3
+50

if I specify a key like so data->'type' it takes away the dynamicity of the query.

I understand you have a gin index on the column data defined like this:

CREATE INDEX ON documents USING GIN (data);

The index works for this query:

EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE data @> '{"type": "foo"}';

                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on documents  (cost=30.32..857.00 rows=300 width=25) (actual time=0.639..0.640 rows=1 loops=1)
   Recheck Cond: (data @> '{"type": "foo"}'::jsonb)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on documents_data_idx  (cost=0.00..30.25 rows=300 width=0) (actual time=0.581..0.581 rows=1 loops=1)
         Index Cond: (data @> '{"type": "foo"}'::jsonb)
 Planning time: 7.928 ms
 Execution time: 0.841 ms

but not for this one:

EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE (data->'type' ?| array['foo', 'bar']);

                                                QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Seq Scan on documents  (cost=0.00..6702.98 rows=300 width=25) (actual time=31.895..92.813 rows=2 loops=1)
   Filter: ((data -> 'type'::text) ?| '{foo,bar}'::text[])
   Rows Removed by Filter: 299997
 Planning time: 1.836 ms
 Execution time: 92.839 ms

Solution 1. Use the operator @> twice, the index will be used for both conditions:

EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE data @> '{"type": "foo"}'
OR data @> '{"type": "bar"}';

                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on documents  (cost=60.80..1408.13 rows=600 width=25) (actual time=0.222..0.233 rows=2 loops=1)
   Recheck Cond: ((data @> '{"type": "foo"}'::jsonb) OR (data @> '{"type": "bar"}'::jsonb))
   Heap Blocks: exact=2
   ->  BitmapOr  (cost=60.80..60.80 rows=600 width=0) (actual time=0.204..0.204 rows=0 loops=1)
         ->  Bitmap Index Scan on documents_data_idx  (cost=0.00..30.25 rows=300 width=0) (actual time=0.144..0.144 rows=1 loops=1)
               Index Cond: (data @> '{"type": "foo"}'::jsonb)
         ->  Bitmap Index Scan on documents_data_idx  (cost=0.00..30.25 rows=300 width=0) (actual time=0.059..0.059 rows=1 loops=1)
               Index Cond: (data @> '{"type": "bar"}'::jsonb)
 Planning time: 3.170 ms
 Execution time: 0.289 ms

Solution 2. Create an additional index on (data->'type'):

CREATE INDEX ON documents USING GIN ((data->'type'));

EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE (data->'type' ?| array['foo', 'bar']);

                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on documents  (cost=30.32..857.75 rows=300 width=25) (actual time=0.056..0.067 rows=2 loops=1)
   Recheck Cond: ((data -> 'type'::text) ?| '{foo,bar}'::text[])
   Heap Blocks: exact=2
   ->  Bitmap Index Scan on documents_expr_idx  (cost=0.00..30.25 rows=300 width=0) (actual time=0.035..0.035 rows=2 loops=1)
         Index Cond: ((data -> 'type'::text) ?| '{foo,bar}'::text[])
 Planning time: 2.951 ms
 Execution time: 0.108 ms   

Solution 3. In fact this is a variant of the solution 1, with a different format of the condition which may be more convenient to use by the client program:

EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE data @> any(array['{"type": "foo"}', '{"type": "bar"}']::jsonb[]);

                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on documents  (cost=60.65..1544.20 rows=600 width=26) (actual time=0.803..0.819 rows=2 loops=1)
   Recheck Cond: (data @> ANY ('{"{\"type\": \"foo\"}","{\"type\": \"bar\"}"}'::jsonb[]))
   Heap Blocks: exact=2
   ->  Bitmap Index Scan on documents_data_idx  (cost=0.00..60.50 rows=600 width=0) (actual time=0.778..0.778 rows=2 loops=1)
         Index Cond: (data @> ANY ('{"{\"type\": \"foo\"}","{\"type\": \"bar\"}"}'::jsonb[]))
 Planning time: 2.080 ms
 Execution time: 0.304 ms
(7 rows)

Read more in the documentation.

Sign up to request clarification or add additional context in comments.

4 Comments

Thank you for the response @klin. The Solution 1 would work, but I was thinking of some other way, coz Rails is the API part, and there can be request of any kind, and i don't want to get into parsing params. So was thinking of some cleaner way, which i feel should be there, as this is just an elementary operation. What do you think?
I know what you mean and that's why I wrote about the second solution. There is no cleaner one.
Yeah, but the attribute can change, it will not always be type, so again, the parsing would come into picture.
Thanks @klin. Solution 3 looks promising.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.