
I'm using SQLAlchemy's ORM, and I find that when I return a single column I get the results like so:

[(result,), (result_2,)] # etc...

With a set like this I find that I have to do this often:

results = [r[0] for r in results] # So that I just have a list of result values

This isn't that "bad", because my result sets are usually small, but if they weren't, it could add significant overhead. The bigger issue is that I feel it clutters the source, and missing this step is a pretty common error I run into.

Is there any way to avoid this extra step?

A related aside: this behaviour of the ORM seems inconvenient in this case, but in another case, where my result set was [(id, value)], it ends up like this:

[(result_1_id, result_1_val), (result_2_id, result_2_val)]

I then can just do:

results = dict(results) # so I have a map of id to value

This one has the advantage of making sense as a useful step after returning the results.

Is this really a problem, or am I just nitpicking, and does the post-processing after getting the result set make sense for both cases? I'm sure we can think of some other common post-processing operations to make the result set more usable in the application code. Are there high-performance and convenient solutions across the board, or is post-processing unavoidable, merely varying with how the application uses the results?

When my application can actually take advantage of the objects returned by SQLAlchemy's ORM it seems extremely helpful, but in cases where I can't or don't, not so much. Is this just a common problem of ORMs in general? Am I better off not using the ORM layer in cases like this?

I suppose I should show an example of the actual ORM queries I'm talking about:

session.query(OrmObj.column_name).all()

or

session.query(OrmObj.id_column_name, OrmObj.value_column_name).all()

Of course, in a real query there'd normally be some filters, etc.
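For concreteness, the two post-processing steps described above can be sketched on plain tuples shaped like the ORM's return values:

```python
# Sample data shaped like the ORM's return values.
single = [("result",), ("result_2",)]   # one column selected
pairs = [(1, "val_1"), (2, "val_2")]    # (id, value) columns selected

# The two post-processing steps from the question:
values = [r[0] for r in single]   # flatten to a plain list of values
id_to_value = dict(pairs)         # build a map of id -> value

print(values)       # ['result', 'result_2']
print(id_to_value)  # {1: 'val_1', 2: 'val_2'}
```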



One way to decrease the clutter in the source is to iterate like this:

results = [r for (r, ) in results]

Although this solution is one character longer than using the [] operator, I think it's easier on the eyes.

For even less clutter, remove the parentheses. This makes it harder to notice, when reading the code, that you're actually handling tuples, though:

results = [r for r, in results]
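All three spellings, explicit indexing, parenthesized unpacking, and bare unpacking, produce the same list; a quick check on sample data:

```python
rows = [("result",), ("result_2",)]

by_index = [r[0] for r in rows]    # explicit indexing
by_unpack = [r for (r,) in rows]   # parenthesized unpacking
by_bare = [r for r, in rows]       # bare unpacking

assert by_index == by_unpack == by_bare == ["result", "result_2"]
```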
  • To do this internally in SQLAlchemy 1.4+, check out @b3000's answer lower down. Commented May 3, 2023 at 4:36

Python's zip combined with the * argument-unpacking operator is a pretty handy solution to this:

>>> results = [('result',), ('result_2',), ('result_3',)]
>>> list(zip(*results))  # the list() call is needed on Python 3, where zip returns an iterator
[('result', 'result_2', 'result_3')]

Then you only have to index in with [0] once. For such a short list, your comprehension is faster:

>>> timeit('result = zip(*[("result",), ("result_2",), ("result_3",)])', number=10000)
0.010490894317626953
>>> timeit('result = [ result[0] for result in [("result",), ("result_2",), ("result_3",)] ]', number=10000)
0.0028390884399414062

However, for longer lists, zip should be faster:

>>> timeit('result = zip(*[(1,)]*100)', number=10000)
0.049577951431274414
>>> timeit('result = [ result[0] for result in [(1,)]*100 ]', number=10000)
0.11178708076477051

So it's up to you to determine which is better for your situation.
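Worth noting if you adopt the zip approach: on Python 3 the result must be materialized with list(), and an empty result set needs a guard, since zip(*[]) yields nothing and indexing [0] would raise IndexError. A defensive sketch:

```python
rows = [("result",), ("result_2",), ("result_3",)]

# list() is needed on Python 3, where zip returns an iterator;
# the `if rows` guard covers an empty result set, where
# list(zip(*rows)) is [] and [0] would raise IndexError.
flat = list(zip(*rows))[0] if rows else ()
print(list(flat))  # ['result', 'result_2', 'result_3']
```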

  • The assumption here is that there might be a million elements in the list, and suddenly this doesn't look very clever. Commented May 10, 2020 at 12:23

Starting from version 1.4, SQLAlchemy provides a method to retrieve results for a single column as a list of values:

# ORM
>>> session.scalars(select(User.name)).all()
['ed', 'wendy', 'mary', 'fred']
# or
>>> query = session.query(User.name)
>>> session.scalars(query).all()
['ed', 'wendy', 'mary', 'fred']

# Core
>>> with engine.connect() as connection:
...     result = connection.execute(text("select name from users"))
...     result.scalars().all()
... 
['ed', 'wendy', 'mary', 'fred']

See the SQLAlchemy documentation.

  • This method is not present in a query object, only after running a literal SQL statement.
    – josemfc
    Commented May 13, 2022 at 12:08
  • @josemfc you can do session.scalars(session.query(MyModel.id)) or session.scalars(select(MyModel.id)) if you are using the ORM. Commented Sep 4, 2022 at 9:00
  • @b3000 it would be great if you could update your answer to cover ORM models and Core tables. Commented Sep 4, 2022 at 9:03
  • Note that the snippet submitted by @snakecharmerb raises a deprecation warning: SADeprecationWarning: Object <flask_sqlalchemy.BaseQuery ...> should not be used directly in a SQL statement context, such as passing to methods such as session.execute(). This usage will be disallowed in a future release...
    – edg
    Commented Oct 6, 2022 at 9:08
  • Good answer. Just a side note 🐉🔥: If you select multiple columns, .scalars().all() returns the first column only. So select(User.name, User.id) has the same ScalarResult as select(User.name). However, when selecting whole objects with select(User), the ScalarResult is helpful again, as it returns a flattened list of User objects instead of the Row tuples.
    – Kim
    Commented Nov 15, 2022 at 22:09

I struggled with this too until I realized it's just like any other query:

for result in results:
    print(result.column_name)
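Those rows behave like named tuples, which is why attribute access works; a rough plain-Python analogy (the Result name and its field are hypothetical):

```python
from collections import namedtuple

# Rows returned by session.query(Model.column_name) act much like this.
Result = namedtuple("Result", ["column_name"])
results = [Result("a"), Result("b")]

# Attribute access, as in the answer above...
by_attr = [r.column_name for r in results]
# ...and positional access both work:
by_index = [r[0] for r in results]

assert by_attr == by_index == ["a", "b"]
```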
  • Yeah, I agree, the NamedTuples are not immediately obvious. My most common post-processing is still having to create a dictionary of some sort. Most of my post-processing has been eliminated by better database architecture and SQLAlchemy usage, so I have the values I need attached to ORM objects, or available through methods.
    – Derek Litz
    Commented Apr 25, 2013 at 21:45

I found the following more readable, also includes the answer for the dict (in Python 2.7):

d = {id_: name for id_, name in session.query(Customer.id, Customer.name).all()}
l = [r.id for r in session.query(Customer).all()]

For the single value, borrowing from another answer:

l = [name for (name, ) in session.query(Customer.name).all()]

Compare with the built-in zip solution, adapted to produce a list (on Python 3, zip returns an iterator, so you would need something like list(next(zip(*...)))):

l = list(zip(*session.query(Customer.id).all())[0])

which in my timeits provides only about a 4% speed improvement.
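A quick check on plain tuples that the dict comprehension and dict() on the pairs agree:

```python
pairs = [(1, "a"), (2, "b")]  # stand-in for (id, name) rows

d1 = {id_: name for id_, name in pairs}  # the comprehension form
d2 = dict(pairs)                         # the direct constructor

assert d1 == d2 == {1: "a", 2: "b"}
```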


My solution looks like this ;)

def column(self):
    for column, *_ in Model.query.with_entities(Model.column).all():
        yield column

NOTE: Python 3 only (it relies on starred unpacking in the for-loop target).
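The generator above hinges on that starred unpacking; a standalone sketch over stand-in rows (the rows list is sample data):

```python
def column(rows):
    # `value, *_` binds the first element and discards the rest,
    # so this also tolerates multi-column rows.
    for value, *_ in rows:
        yield value

rows = [("a", 1), ("b", 2)]
print(list(column(rows)))  # ['a', 'b']
```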


Wow, why strain? There's a cooler way, faster and more elegant:

>>> results = [('result',), ('result_2',), ('result_3',)]
>>> sum(results, tuple())
('result', 'result_2', 'result_3')

Speed:

>>> timeit('result = zip(*[("result",), ("result_2",), ("result_3",)])', number=10000)
0.004222994000883773
>>> timeit('result = sum([("result",), ("result_2",), ("result_3",)], ())', number=10000)
0.0038205889868550003

But with more elements in the list, use zip instead: zip is faster.

  • This takes quadratic time, as you create a new tuple for every sum step. For 1000 elements, you create an empty tuple, then a tuple with 1 element, then a tuple with 2 elements, etc. Each new tuple has to copy all references from the previous tuple first, so in the end you have made on the order of N² copies of the references. Do not use sum() to concatenate sequences. Commented Aug 16, 2018 at 15:12
  • That the 3-element sum is faster is luck; you should really time this with much, much larger inputs. Commented Aug 16, 2018 at 15:13
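As the comments note, sum() concatenation is quadratic; the usual linear-time flattening is itertools.chain.from_iterable:

```python
from itertools import chain

rows = [("result",), ("result_2",), ("result_3",)]

# chain.from_iterable walks each tuple once: linear time, no
# intermediate tuples, unlike the sum() approach above.
flat = list(chain.from_iterable(rows))
print(flat)  # ['result', 'result_2', 'result_3']
```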
