Description
Feature or enhancement
Proposal:
I think that the sqlite3
module is a hidden gem in the Python standard library: the functionality it provides is like a mini-Pandas, similar to the array
module, which is a mini-NumPy. The only problem is that it's more programmatic than interactive, in that it's hard to look at the results of a query without either writing a loop to iterate over sqlite3.Cursor
results or passing the query to Pandas (which instantiates all of the results in memory).
This came up in a problem I was working on and I didn't find any standard solutions for visualizing an SQLite query, so I wrote one on StackOverflow. Just it's just complex enough that variants of code like this shouldn't be floating around the web; a single version should be built into the sqlite3.Cursor
object.
I'd like a method that you can use like this:
db.execute("SELECT * FROM compleat_works").print()
that prints a table like this:
title | type | characters | year_low | year_high
----------------+-----------+------------+----------+----------
'The Sonnets' | 'poetry' | None | 1609 | 1609
'All’s Well tha | 'comedy' | 23 | 1604 | 1605
'The Tragedy of | 'tragedy' | 42 | 1606 | 1606
'As You Like It | 'comedy' | 27 | 1599 | 1600
'The Comedy of | 'comedy' | 18 | 1594 | 1594
... | ... | ... | ... | ...
'A Lover’s Comp | 'poetry' | None | 1609 | 1609
'The Passionate | 'poetry' | None | 1599 | 1599
'The Phoenix an | 'poetry' | None | 1601 | 1601
'The Rape of Lu | 'poetry' | 2 | 1594 | 1594
'Venus and Adon | 'poetry' | 2 | 1593 | 1593
--- 44 rows ----+-----------+------------+----------+----------
Here's my sample implementation:
def print(
self,
*,
first: int = 5,
last: int = 5,
show_header: bool = True,
show_count: bool = True,
total_width: int = 80,
max_width: int = 15,
stream = sys.stdout,
):
# name of each column
header = [x[0] for x in self.description]
# width of the repr of each column
widths = [min(len(x), max_width) for x in header]
# iterate over the whole Cursor, but only keep `first + 2*last` rows in memory
first_rows = []
last_rows = []
row_count = 0
for row in self:
if len(first_rows) < first + last:
first_rows.append(row)
last_rows.append(row)
if len(last_rows) > last:
last_rows = last_rows[1:]
row_count += 1
if row_count <= first + last:
# if the number of rows <= `first + last`, show them all
rows = first_rows
else:
# otherwise, show the `first`, an ellipsis row, and then `last`
rows = first_rows[:first] + [[...] * len(header)] + last_rows
# represent rows with mutable lists so that we can replace them with reprs
rows = [list(x) for x in rows]
align = [">"] * len(header)
for row in rows:
assert len(row) == len(header), f"{len(row)} columns != {len(header)} columns"
for i, cell in enumerate(row):
# if all values are str or bytes (ignoring None), left-align
if cell != ... and isinstance(cell, (str, bytes)) and cell is not None:
align[i] = "<"
# replace data with their repr strings (except ellipsis)
row[i] = "..." if cell == ... else repr(cell)
# identify the maximum (string) width of each column, up to max_width
widths[i] = min(max(widths[i], len(row[i])), max_width)
# if the table is too wide, replace the last column with ellipsis
if sum(widths) + (len(widths) - 1) * 2 > total_width:
header[-1] = "..."
widths[-1] = 3
for row in rows:
row[-1] = "..."
# if the table is still too wide, remove columns
while sum(widths) + (len(widths) - 1) * 2 > total_width and len(header) > 1:
del header[-2]
del widths[-2]
for row in rows:
del row[-2]
# prepare a format string for each line of text
formatter = " | ".join(f"{{:{a}{w}s}}" for a, w in zip(align, widths))
# prepare the horizontal line between header and data
header_separator = "-+-".join("-" * w for w in widths)
if show_header:
# print the table column names and a horizontal line under it
stream.write(formatter.format(*[x[:w] for x, w in zip(header, widths)]) + "\n")
stream.write(header_separator + "\n")
for row in rows:
# print each table row
stream.write(formatter.format(*[x[:w] for x, w in zip(row, widths)]) + "\n")
if show_count:
# print the number of rows in another horizontal line
count = f"--- {row_count} rows ---"
stream.write(count + header_separator[len(count) :] + "\n")
And/or maybe a head
method that only shows the beginning, rather than both the beginning and end (to avoid iterating over millions of rows). Maybe also a _repr_html_
method for IPython/Jupyter, but that's probably more specific than the standard model should be (only useful in a third-party library, albeit a major one).
What do you think? Is this going against the direction you want sqlite3
to go? (E.g. are you trying to reduce its presence, rather than increase it?) Is the idea sound, but the method names (print
, head
) are bad? Is it a problem that it "eats" the Cursor, and should instead be a method on sqlite3.Connection
that takes a query: str
?
I'm willing to write the PR; I just want to start by checking to see if such a PR would be welcome.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response