This page describes how to integrate BigQuery tables into Earth Engine workflows
as ee.FeatureCollection
objects, using the
ee.FeatureCollection.loadBigQueryTable()
and ee.FeatureCollection.runBigQuery()
methods.
Load data from BigQuery
The ee.FeatureCollection.loadBigQueryTable()
function seamlessly reads a BigQuery table into an ee.FeatureCollection
object. It connects to a specified table, converts all data types, applies the
necessary filters and selectors, and adds indexing to the
collection if necessary. The function uses Earth Engine's interactive
environment,
returning results directly to the client to be viewed or used as a component of
a larger analysis.
JavaScript
// Load the BigQuery table with a specified geometry column. var features = ee.FeatureCollection.loadBigQueryTable({ table: 'my_project.my_dataset.my_table', geometryColumn: 'geo' }); // Display features on the map. Map.addLayer(features);
Python
# Load the BigQuery table with a specified geometry column. features = ee.FeatureCollection.loadBigQueryTable( table='my_project.my_dataset.my_table', geometryColumn='geo') # Display the first feature. display(features.first().getInfo())
Billing
The cost of EECU-hours used during processing of the request is billed to the caller like for any other Earth Engine method (see EECUs overview).
There are no additional BigQuery costs associated with transferring the data to Earth Engine. Corresponding BigQuery usage will be visible in used project's Google Cloud API Dashboard (see Monitoring API usage), but no cost will be incurred for reading BigQuery data this way.
Query data from BigQuery
The ee.FeatureCollection.runBigQuery()
method runs a BigQuery SQL query and returns the results as an
ee.FeatureCollection
object (see Run a query doc
to learn more about queries).
JavaScript
// Construct a BigQuery query. var query = 'SELECT * FROM my_project.my_dataset.my_table WHERE column > 1000'; // Run the query and return the results as a FeatureCollection. var features = ee.FeatureCollection.runBigQuery(query); // Print the first feature. print(features.first());
Python
# Construct a BigQuery query. query = 'SELECT * FROM my_project.my_dataset.my_table WHERE column > 1000' # Run the query and retrieve the results as a FeatureCollection. features = ee.FeatureCollection.runBigQuery(query) # Print the first feature. print(features.first().getInfo())
BigQuery queries
Each call to ee.FeatureCollection.runBigQuery()
initiates a separate BigQuery query job (see more about queries in Run a query
documentation),
allowing you to use key BigQuery capabilities:
- Job history: Access a six-month history of your project's query executions (see more at List jobs).
- Query caching: BigQuery automatically caches query results when possible. Subsequent identical queries retrieve data from the cache, preventing redundant charges (see more at Using cached query results)
To learn about queries or how to use them in BigQuery see BigQuery documentation.
Billing
Cost of EECUs used during processing of the request is billed to the caller like for any other Earth Engine method (see EECUs overview). Additionally running a query is billed to the caller according to the BigQuery billing model.
There are no additional BigQuery costs associated with transferring the data to Earth Engine. Corresponding BigQuery usage will be visible in used project's Google Cloud API Dashboard (see Monitoring API usage), but no cost will be incurred for reading BigQuery data this way.
To control potential costs associated with ee.FeatureCollection.runBigQuery()
,
the maxBytesBilled
parameter acts as a safeguard. Any BigQuery job that
exceeds this limit will fail and won't be billed. The default value of
maxBytesBilled
is 100 GB. If your call is blocked by exceeding this limit, you
can specify a different value in your script.
Prerequisites and permissions
To use this feature, the caller's Cloud project needs to have BigQuery API and BigQuery Storage API enabled. Follow instructions on the Enable API page to enable appropriate APIs.
In addition to the standard Earth Engine roles and permissions, you need to have read access on the referenced BigQuery table, permission to create read sessions and jobs in the target project. The specific BigQuery permissions required are:
bigquery.tables.get
(on any accessed table)bigquery.tables.getData
(on any accessed table)bigquery.readSession.create
bigquery.jobs.create
Refer to the BigQuery access control documentation for detailed information on managing permissions.
Data filtering
Every ee.FeatureCollection
can be filtered using the
.filter(Filter)
method. To allow Google Earth Engine users to benefit from highly parallelized
BigQuery tabular data processing, we translate Earth Engine filters to language
understandable by BigQuery and send them together with a read table
request. This approach indeed moves filter processing to BigQuery stack, but it
also is subject to two limitations:
Like every other query in BigQuery (see BigQuery quotas) this request is limited to 10 MB in size. This means that passed filters cannot be overly complicated. Hitting the 10 MB limit results in the following error:
Filter sent to BigQuery is too long. This error may be caused by too complicated geometry in geometry filters. Consider simplifying the filter and used values.
Filtering by geometries that contain many vertices is a common cause for this error. To solve this issue consider using ee.Geometry.simplify() on the problematic object.
Some more complicated Earth Engine filters can't be converted to their BigQuery equivalents. For example, BigQuery does not support ARRAY equality checks. In such cases, we don't translate the filter and instead apply it in Earth Engine after reading the data.
Data indexing
Earth Engine collections rely on internal indexing, whereas BigQuery discourages keeping tables indexed. To make those two systems work together we create collection indexes in the following way:
If the BigQuery table contains a column named
system:index
, we use it for indexing FeatureCollection.In such cases it is on the caller to make sure indexes are unique. Otherwise the collection may misbehave in an unexpected manner. Feature index must be a non-empty string, so loading BigQuery table with a non-string or
null
value for asystem:index
column will fail.If the BigQuery table does not contain the
system:index
column, it is generated automatically.Indexes in between two read requests are stable, but only if the requests are exactly the same, taking filters into account. Otherwise we cannot rely on indexes to correspond to the same features. Therefore if precisely unique data indexing is important for the caller, we recommend adding the
system:index
column in BigQuery manually.
Limitations
The size of all selected columns of the table referenced in a
ee.FeatureCollection.loadBigQueryTable()
call is limited to 400 GB. Hitting this limit will result in the following error:Failed to read table from BigQuery: Requested data size is too large to read. Consider using selectors to specify only required columns.
In such cases consider choosing more restrictive selectors to read-only necessary columns or consider using
ee.FeatureCollection.runBigQuery()
to preprocess the table in BigQuery and lower the amount of fetched data.The
ee.FeatureCollection.runBigQuery()
method imposes a 10 GB limit on query result sizes. Although source tables can be of arbitrary size, processing larger data volumes will increase query costs.Translated filter size is limited to 10 MB. See the Data filtering section for details.
Caveats
The
ee.FeatureCollection.loadBigQueryTable()
does not support resources from Linked Datasets. Trying to load data from such table results in "table not found" error.As a workaround, consider running
ee.FeatureCollection.runBigQuery()
with a query specifying the requested table from the linked dataset. For example:JavaScript
var features = ee.FeatureCollection.runBigQuery({ query: 'SELECT * FROM my_project.my_linked_dataset.my_table', geometryColumn: 'geo' });
Python
features = ee.FeatureCollection.runBigQuery( query='SELECT * FROM my_project.my_linked_dataset.my_table', geometryColumn='geo')
Joining on
system:index
for BigQuery tables with auto-generated IDs may lead to unexpected behaviours. To prevent this from happening, consider addingsystem:index
to BigQuery table manually or joining the table on a different property. Read more about indexing in the Data indexing section.The
ee.FeatureCollection.randomColumn()
method does not work with BigQuery auto-generated IDs. Consider specifying an alternative key using therowKeys
parameter in theee.FeatureCollection.randomColumn()
method. You can also manually addrandom
orsystem:index
columns to the BigQuery source table. Read more about indexing in the Data indexing section.