121

I'd like to know if it's possible to generate a SELECT COUNT(*) FROM TABLE statement in SQLAlchemy without explicitly asking for it with execute(). If I use:

session.query(table).count()

then it generates something like:

SELECT count(*) AS count_1 FROM
    (SELECT table.col1 as col1, table.col2 as col2, ... from table)

which is significantly slower in MySQL with InnoDB. I am looking for a solution that doesn't require the table to have a known primary key, as suggested in Get the number of rows in table using SQLAlchemy.

3

10 Answers 10

119

I managed to render the following SELECT with SQLAlchemy on both layers.

SELECT count(*) AS count_1
FROM "table"

Usage from the SQL Expression layer

from sqlalchemy import select, func, Integer, Table, Column, MetaData

metadata = MetaData()

table = Table("table", metadata,
              Column('primary_key', Integer),
              Column('other_column', Integer)  # just to illustrate
             )   

print select([func.count()]).select_from(table)

Usage from the ORM layer

You just subclass Query (you have probably anyway) and provide a specialized count() method, like this one.

from sqlalchemy.sql.expression import func

class BaseQuery(Query):
    def count_star(self):
        count_query = (self.statement.with_only_columns([func.count()])
                       .order_by(None))
        return self.session.execute(count_query).scalar()

Please note that order_by(None) resets the ordering of the query, which is irrelevant to the counting.

Using this method you can have a count(*) on any ORM Query, that will honor all the filter andjoin conditions already specified.

9
  • Thanks a lot! Seems a bit convoluted, but it works :) I'll wait a bit to see if someone comes up with a simpler syntax before accepting.
    – tiho
    Commented Oct 17, 2012 at 20:18
  • 1
    Oops, sorry I was wrong, it doesn't actually work. I mean, it does what you say, but it still doesn't yield SELECT(*) (which is about twice as fast as your suggestion in my real use case).
    – tiho
    Commented Oct 19, 2012 at 15:43
  • 1
    Use sqlalchemy.sql.func.count() instead of sqlalchemy.sql.text('count(*)')
    – ddaa
    Commented Nov 26, 2012 at 10:46
  • 4
    This is likely to drop the table from the query for simple select ... from table queries. Commented Apr 17, 2021 at 11:34
  • 1
    I currently get a deprecation warning with select([func.count()]).select_from(table) and use func.count().select().select_from(table) instead
    – tofarr
    Commented Oct 11, 2023 at 14:21
117

Query for just a single known column:

session.query(MyTable.col1).count()
3
  • 8
    This will construct the count as a wrapped select. The query optimizer of MySQL should take care of it though.
    – pi.
    Commented Oct 17, 2012 at 20:15
  • 2
    Thanks: although it doesn't exactly answer my question, it is indeed faster. I was afraid it would ignore NULL entries but that's not the case. However, it remains slower than an explicit SELECT(*).
    – tiho
    Commented Oct 17, 2012 at 20:15
  • 1
    @Nathan Villaescusa Can you please provide an example for multiple columns? example: session.query(MyTable.col1, MyTable.col2).count(). sql: SELECT col1, COUNT(col1) FROM "table" Commented Jan 11, 2021 at 12:18
33

I needed to do a count of a very complex query with many joins. I was using the joins as filters, so I only wanted to know the count of the actual objects. count() was insufficient, but I found the answer in the docs here:

http://docs.sqlalchemy.org/en/latest/orm/tutorial.html

The code would look something like this (to count user objects):

from sqlalchemy import func

session.query(func.count(User.id)).scalar() 
2
  • 6
    Nowhere in your example is there any query being referenced… Commented Feb 18, 2019 at 16:21
  • 2
    does anyone know how the speed of this compares with session.query(User).count()? it's been asked here as well: stackoverflow.com/questions/14754994/…
    – Matt
    Commented Nov 13, 2022 at 19:03
33

This is not answer to the original question posted more than 10 years ago but general answer for the problem "how to make SELECT COUNT(*)". Also note that many answers here are very outdated and suitable only for old SQLAlchemy versions.

Here is solution for SQLAlchemy 1.4.x and 2.0.x:

from sqlalchemy import func, select

class MyModel(Base):
    ...

statement = select(func.count()).select_from(MyModel)
count: int = session.execute(statement).scalar()
19

Addition to the Usage from the ORM layer in the accepted answer: count(*) can be done for ORM using the query.with_entities(func.count()), like this:

session.query(MyModel).with_entities(func.count()).scalar()

It can also be used in more complex cases, when we have joins and filters - the important thing here is to place with_entities after joins, otherwise SQLAlchemy could raise the Don't know how to join error.

For example:

  • we have User model (id, name) and Song model (id, title, genre)
  • we have user-song data - the UserSong model (user_id, song_id, is_liked) where user_id + song_id is a primary key)

We want to get a number of user's liked rock songs:

SELECT count(*) 
  FROM user_song
  JOIN song ON user_song.song_id = song.id 
 WHERE user_song.user_id = %(user_id)
   AND user_song.is_liked IS 1
   AND song.genre = 'rock'

This query can be generated in a following way:

user_id = 1

query = session.query(UserSong)
query = query.join(Song, Song.id == UserSong.song_id)
query = query.filter(
    and_(
        UserSong.user_id == user_id, 
        UserSong.is_liked.is_(True),
        Song.genre == 'rock'
    )
)
# Note: important to place `with_entities` after the join
query = query.with_entities(func.count())
liked_count = query.scalar()

Complete example is here.

5
  • 4
    this can only work for queries using .join(). Otherwise the query is "optimized" and the WHERE clause is simply removed. for example session.query(Model).with_entities(func.count()).scalar() generates SELECT count(*) AS count_1 which I doubt anyone is searching the result of ^^. Commented May 30, 2020 at 14:11
  • @RomainVincent for me the session.query(MyModel).filter(MyModel.rank < 100).with_entities(func.count()) generates SELECT count(*) AS count_1 FROM my_table WHERE my_table.rank < %(rank_1)s, so you may hit some edge case or a bug. Commented Jun 24, 2020 at 10:58
  • yes I over generalized my reply. SO a better way to say it would be "This will not work if you don't explicitly use the columns in the query() arguments. I must admit I don't know the full extent of what SQLA will do or not, but from my point of view, this method seems dangerous to use since it can lead to unexpected results. Commented Jun 25, 2020 at 13:28
  • 3
    Query.with_entities() essentially does the same thing as Query.statement.with_only_columns() (as in the accepted answer), and has the same problem in that for a plain session.query(Model) this produces SELECT COUNT(*) (no FROM). Until gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/2973 lands, there is no good way to do this at the ORM level. Commented Apr 17, 2021 at 11:48
  • I haven't checked since when is it available, but query.with_only_columns(func.count(), maintain_column_froms=True) seems to work. Commented Feb 18, 2023 at 18:29
1

If you are using the SQL Expression Style approach there is another way to construct the count statement if you already have your table object.

Preparations to get the table object. There are also different ways.

import sqlalchemy

database_engine = sqlalchemy.create_engine("connection string")

# Populate existing database via reflection into sqlalchemy objects
database_metadata = sqlalchemy.MetaData()
database_metadata.reflect(bind=database_engine)

table_object = database_metadata.tables.get("table_name") # This is just for illustration how to get the table_object                    

Issuing the count query on the table_object

query = table_object.count()
# This will produce something like, where id is a primary key column in "table_name" automatically selected by sqlalchemy
# 'SELECT count(table_name.id) AS tbl_row_count FROM table_name'

count_result = database_engine.scalar(query)
0
query = session.query(table.column).filter(<filters>).with_entities(func.count(table.column.distinct()))
count = query.scalar()

this worked for me.

Gives the query:

SELECT count(DISTINCT table.column) AS count_1 
FROM table where ...
-1

Below is the way to find the count of any query.

aliased_query = alias(query)
db.session.query(func.count('*')).select_from(aliased_query).scalar()

Here is the link to the reference document if you want to explore more options or read details.

1
  • 3
    This still wraps the aliased query into a subselect. An alias produces without a name is just an anonymous subselect. The SQL your answer produces is exactly the same as what the question asker wanted to avoid. Commented Jul 23, 2021 at 20:37
-1

I'm not clear on what you mean by "without explicitly asking for it with execute()" So this might be exactly what you are not asking for. OTOH, this might help others.

You can just run the textual SQL:

your_query="""
SELECT count(*) from table
"""
the_count = session.execute(text(your_query)).scalar()
1
  • 6
    Yes this is indeed exactly what I wanted not to do ;)
    – tiho
    Commented Feb 11, 2022 at 3:09
-2
def test_query(val: str):    
   query = f"select count(*) from table where col1='{val}'"
   rtn = database_engine.query(query)
   cnt = rtn.one().count

but you can find the way if you checked debug watch

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.