
I would like to generate a list or CSV and store it in GCS, where the list contains all the tables and their last modified dates.

I have more than 130 datasets with 10-20 tables in each. I will run this query every day via scheduled GitHub Actions to flag tables that have gone unmodified for a two-month gap and send a daily alert notification to my team, so I am looking for a simple way to fetch this information from BigQuery.

I have the query for a single dataset:

SELECT 
   table_id,
   DATE(TIMESTAMP_MILLIS(last_modified_time)) AS last_modified
FROM
   `project_id.dataset_name.__TABLES__`
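
To surface only the tables that have gone without changes for the two-month gap, the same query could take a staleness filter. This is a sketch; the 60-day cutoff is my assumption for "two months":

-- Sketch: restrict the per-dataset query to tables not modified
-- in the last 60 days (assumed cutoff for the two-month gap)
SELECT
   table_id,
   DATE(TIMESTAMP_MILLIS(last_modified_time)) AS last_modified
FROM
   `project_id.dataset_name.__TABLES__`
WHERE
   TIMESTAMP_MILLIS(last_modified_time)
     < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 60 DAY)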

and to list all datasets I can use this query:

SELECT
    schema_name
FROM
    `project_id`.`region-europe-west3`.INFORMATION_SCHEMA.SCHEMATA;
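
For the GCS part, once the combined result is available it can be written out as CSV directly from BigQuery with an EXPORT DATA statement. A sketch over the single-dataset query, with a hypothetical bucket and path (EXPORT DATA requires a wildcard in the URI):

-- Sketch: export query results as CSV to GCS (bucket/path hypothetical)
EXPORT DATA OPTIONS (
  uri = 'gs://my-bucket/table-audit/*.csv',
  format = 'CSV',
  overwrite = true,
  header = true
) AS
SELECT
   table_id,
   DATE(TIMESTAMP_MILLIS(last_modified_time)) AS last_modified
FROM
   `project_id.dataset_name.__TABLES__`;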

1 Answer


In general, this is called dynamic SQL, and in BigQuery it can be done with the EXECUTE IMMEDIATE command.

execute immediate (
  select
    string_agg(
      concat(
        'select ',
        chr(39),
        schema_name,
        chr(39),
        -- backticks around the table path so hyphenated project ids parse
        ' as schema_name, table_id, date(timestamp_millis(last_modified_time)) as last_modified from `project-id.',
        schema_name,
        '.__TABLES__`'
      ),
      ' union all '
    ) as query
  -- keep the regional qualifier; without it SCHEMATA defaults to the US region
  from `project-id`.`region-europe-west3`.INFORMATION_SCHEMA.SCHEMATA
)
;
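
To make the technique concrete: for two hypothetical datasets named sales and marketing, the STRING_AGG above assembles (and EXECUTE IMMEDIATE then runs) a statement of roughly this shape:

-- Hypothetical expansion for datasets sales and marketing
select 'sales' as schema_name, table_id,
       date(timestamp_millis(last_modified_time)) as last_modified
from `project-id.sales.__TABLES__`
union all
select 'marketing' as schema_name, table_id,
       date(timestamp_millis(last_modified_time)) as last_modified
from `project-id.marketing.__TABLES__`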
  • I am getting an error: "Execute Immediate sql string cannot be Null at [1:19]"
    – Shasha
    Commented Apr 29 at 8:40
  • If you are using my query, I think the inner query is returning no rows because you are in Europe and leaving off the regional qualifier defaults to the US region. Can you add back the regional qualifier to that query and see if that works? That's my bad; all my tables are in the US.
    – gmw
    Commented yesterday
