All Questions

1 vote · 2 answers · 2k views

Pulling data from an API for Google Dataflow (Python) to ingest and load into BigQuery

I am new to data ingestion but have worked through some examples of using Google Dataflow in batch and stream mode. I am now ready to build the actual project. However, I need to choose one of ...
Dylan Solms
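
A minimal sketch of the usual shape of such a job: fetch the API payload and hand it to a Beam pipeline that writes to BigQuery. The endpoint, table, and schema below are placeholders, and it assumes the API returns a JSON list of records matching the schema.

```python
# Minimal batch sketch: pull records from a (hypothetical) REST endpoint at pipeline
# construction time, then load them into BigQuery with the Beam Python SDK.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

API_URL = "https://example.com/api/records"  # placeholder endpoint


def fetch_records(url):
    # For small payloads it is simplest to fetch once here; for large or paginated
    # feeds, move the HTTP call into a DoFn so the workers do the fetching.
    import requests
    return requests.get(url, timeout=30).json()


def run():
    # Pass --runner=DataflowRunner, --project, --region, --temp_location on the CLI.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        (
            p
            | "CreateRecords" >> beam.Create(fetch_records(API_URL))
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "my-project:my_dataset.my_table",  # placeholder table
                schema="id:INTEGER,name:STRING,ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )


if __name__ == "__main__":
    run()
```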

0 votes · 0 answers · 500 views

Is it possible to pause or schedule my GCP Dataflow streaming pipeline to limit costs?

I've set up a GCP Dataflow streaming pipeline (using Python for the template), and I am looking to limit the cost of the pipeline. I'm aware that streaming pipelines need a minimum of 1 worker, but is ...
Peter4137
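
Dataflow has no built-in pause for streaming jobs, so a common workaround is to drain the job on a schedule (for example from a Cloud Scheduler-invoked Cloud Function) and re-launch it when needed. A rough sketch with placeholder project, region, and job values; check the exact requested-state semantics against the Dataflow v1b3 JobState reference.

```python
# Sketch of one cost-limiting approach: drain the streaming job on a schedule,
# then redeploy it later from a template or CI job.
from googleapiclient.discovery import build  # pip install google-api-python-client

PROJECT = "my-project"  # placeholder values
REGION = "europe-west1"
JOB_ID = "2024-01-01_00_00_00-1234567890123456789"


def drain_job(request=None):
    dataflow = build("dataflow", "v1b3")
    # Requesting the drained state asks Dataflow to finish in-flight work and stop,
    # which also stops worker billing once the job reaches a terminal state.
    body = {"requestedState": "JOB_STATE_DRAINED"}
    return (
        dataflow.projects()
        .locations()
        .jobs()
        .update(projectId=PROJECT, location=REGION, jobId=JOB_ID, body=body)
        .execute()
    )
```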

1 vote · 1 answer · 52 views

Is there a recommended method for running user defined code against IoT data streams?

I'm sending IoT data from an OPC UA server to a Pub/Sub topic. Each message to the topic includes 15 minutes of minute-by-minute data for roughly 100 sensors. Dataflow reads from this Pub/Sub ...
Chris G. (3,981)
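
One way to run user-defined code against such a stream is a Beam streaming pipeline that fans each Pub/Sub message out into per-sensor readings and applies the user logic in a DoFn. A rough sketch; the subscription name and message layout are assumptions.

```python
# Streaming sketch: read the batched JSON messages from Pub/Sub, flatten them into
# individual sensor readings, and run user-defined logic in a DoFn.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

SUBSCRIPTION = "projects/my-project/subscriptions/iot-readings"  # placeholder


def explode(message_bytes):
    # Assumed layout: {"readings": [{"sensor_id": ..., "ts": ..., "value": ...}, ...]}
    payload = json.loads(message_bytes.decode("utf-8"))
    for reading in payload.get("readings", []):
        yield reading


class ApplyUserRule(beam.DoFn):
    def process(self, reading):
        # Placeholder for the user-defined code; here, keep readings above a threshold.
        if reading.get("value", 0) > 100:
            yield reading


def run():
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadPubSub" >> beam.io.ReadFromPubSub(subscription=SUBSCRIPTION)
            | "Explode" >> beam.FlatMap(explode)
            | "UserRule" >> beam.ParDo(ApplyUserRule())
            | "Log" >> beam.Map(print)
        )


if __name__ == "__main__":
    run()
```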

0 votes · 1 answer · 124 views

How to handle multiple deferred tasks as one?

There is one type of problem I'm facing in multiple projects that I haven't yet found a good solution for within the Google Cloud services. I would like to queue/defer some work originating from user ...
Thijs Koerselman

0 votes · 0 answers · 145 views

How to activate a Cloud Function after a successful Dataflow job?

The code for my Apache Beam pipeline is below. My Cloud Function (topic name: projects/wisdomcircle-350611/topics/uat-timestamp-job-trigger) must only be activated when the Dataflow job below is ...
Shubham Belwal
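
The usual pattern here is to have whatever launches the pipeline block until the job finishes, then publish to the function's trigger topic only on success. A sketch that reuses the topic name from the question; everything else is a placeholder.

```python
# Sketch: block on the Dataflow job from the launcher, and publish to the Cloud
# Function's trigger topic only if the job finishes in the DONE state.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.runners.runner import PipelineState
from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

TOPIC = "projects/wisdomcircle-350611/topics/uat-timestamp-job-trigger"


def run():
    options = PipelineOptions()  # plus the usual Dataflow options
    pipeline = beam.Pipeline(options=options)
    # ... build the actual Beam pipeline here ...
    result = pipeline.run()
    state = result.wait_until_finish()  # blocks until the Dataflow job completes

    if state == PipelineState.DONE:
        publisher = pubsub_v1.PublisherClient()
        publisher.publish(TOPIC, b"dataflow-job-succeeded").result()


if __name__ == "__main__":
    run()
```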

1 vote · 1 answer · 198 views

Cloud Function to trigger a Dataflow job in Go

I have a Dataflow job written using the Python SDK, but I want to trigger this Dataflow job using a Cloud Function that should be written in Go. I found a thread similar to this which has the function ...
Ashok KS (701)

0 votes · 1 answer · 107 views

Dataflow job is failing within 40 seconds

I have a simple Google Cloud HTTP-triggered function which is responsible for triggering a Dataflow runner job that loads data from a CSV on Cloud Storage to a BigQuery table. My code is given below: ...
Vibhor Gupta
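
A job that dies this quickly often fails during validation or worker startup, so the Dataflow job logs are the first place to look. For comparison, a minimal sketch of an HTTP-triggered Cloud Function that launches a classic template for a CSV-to-BigQuery load; the project, bucket, template path, and parameter names are all placeholders that must match whatever the template actually declares.

```python
# Sketch of an HTTP-triggered Cloud Function (Python) that launches a classic
# Dataflow template loading a CSV from Cloud Storage into BigQuery.
from googleapiclient.discovery import build

PROJECT = "my-project"                                    # placeholder values
REGION = "us-central1"
TEMPLATE_GCS_PATH = "gs://my-bucket/templates/csv_to_bq"  # hypothetical custom template


def launch_csv_load(request):
    dataflow = build("dataflow", "v1b3")
    body = {
        "jobName": "csv-to-bq-load",
        "parameters": {
            # Parameter names must match the ValueProviders declared by the template.
            "input_file": "gs://my-bucket/input/data.csv",
            "output_table": "my-project:my_dataset.my_table",
        },
        "environment": {"tempLocation": "gs://my-bucket/temp"},
    }
    response = (
        dataflow.projects()
        .locations()
        .templates()
        .launch(projectId=PROJECT, location=REGION, gcsPath=TEMPLATE_GCS_PATH, body=body)
        .execute()
    )
    return f"Launched job: {response.get('job', {}).get('id', 'unknown')}"
```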

3 votes · 2 answers · 760 views

GCP Dataflow vs Cloud Functions for small and infrequent updates

I have a CSV which is used to update entries in a SQL database. The file size is at most 50 KB and the update frequency is twice a week. I also have a requirement to do some automated sanity testing. What ...
anupam (357)

1 vote · 1 answer · 492 views

Cloud Function trigger that monitors Cloud Storage and triggers Dataflow

I created a Cloud Function in Python that monitors whether any file has been created or modified in Cloud Storage and, if so, triggers a Dataflow job using a template I created in Apache Beam ...
Paulo Enrique

0 votes · 0 answers · 774 views

GCP Cloud Function - 9-minute quota. What other options do I have?

I have Python code which reads data from an API and creates a JSON file (it's not just a simple read; there are some transformations as well). I need to get the data into GCP (specifically Cloud Storage) ...
K_python2022

0 votes · 1 answer · 754 views

GCS Bucket file path as input for Dataflow pipeline triggered by Cloud Function - GCP

I am trying to use a Cloud Function (create/finalize) trigger on a GCS bucket to start a Dataflow pipeline. I am trying to figure out how to pass the CSV file path in the GCS bucket to the custom Dataflow ...
Krishna Potharaju
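
For a classic template, the usual approach is to expose the file path as a ValueProvider on the pipeline side and let the Cloud Function pass it as a launch parameter. A sketch; the option name `--input_file` is illustrative and must match on both sides.

```python
# Pipeline-side sketch: declare the CSV path as a runtime parameter so the Cloud
# Function can pass "gs://{event['bucket']}/{event['name']}" when launching the template.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


class CsvPathOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        # add_value_provider_argument makes --input_file resolvable at template launch time.
        parser.add_value_provider_argument("--input_file", type=str)


def run():
    options = PipelineOptions()
    csv_opts = options.view_as(CsvPathOptions)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadCsv" >> beam.io.ReadFromText(csv_opts.input_file, skip_header_lines=1)
            | "Print" >> beam.Map(print)
        )


if __name__ == "__main__":
    run()
```

On the function side, a finalize-triggered background function can then build the path from the event's `bucket` and `name` fields and launch the template with `"parameters": {"input_file": "gs://<bucket>/<object>"}`.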

0 votes · 0 answers · 89 views

Why can't I insert rows longer than 65521 from Cloud Functions into a BigQuery table via Pub/Sub and Dataflow?

To learn more about Cloud Functions, I decided to implement a scraping script. The function loads https://www.bbc.com/, encodes it using base64, then publishes the result to a Pub/Sub topic. A ...
zabop (7,960)

2 votes · 0 answers · 507 views

Unzipping files with Google Dataflow Bulk Decompress template?

I am trying to unzip zip files uploaded to Cloud Storage, which contain only image files without any other folders inside. I was able to do that with Cloud Functions, but it seems like I get memory-...
Onur (21)

0 votes · 1 answer · 624 views

How to launch a Cloud Dataflow pipeline from a Google Cloud Function when a particular set of files reaches Cloud Storage

I have a requirement to create a Cloud Function which should check for a set of files in a GCS bucket, and only if all of those files have arrived in the GCS bucket should it launch the Dataflow templates ...
srinidhi
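
A common shape for this is to let every object-finalize event invoke the function, but only launch the template once the full set of expected files is present. A sketch with placeholder names; the actual launch call can follow the template-launch sketch shown earlier on this page.

```python
# Sketch: a finalize-triggered Cloud Function that only proceeds once every file in a
# required set exists in the bucket, then launches the Dataflow template.
from google.cloud import storage  # pip install google-cloud-storage

BUCKET = "my-input-bucket"  # placeholder bucket and file set
REQUIRED_FILES = {"landing/customers.csv", "landing/orders.csv", "landing/products.csv"}


def maybe_launch(event, context):
    bucket = storage.Client().bucket(BUCKET)
    missing = [name for name in REQUIRED_FILES if not bucket.blob(name).exists()]
    if missing:
        print(f"Still waiting for: {missing}")
        return
    # All required files are present: launch the Dataflow template here, e.g. with the
    # templates().launch(...) call from the earlier sketch, or publish to a topic that does.
    print("All files arrived; launching the Dataflow template.")
```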

1 vote · 2 answers · 753 views

Advice/Guidance - Composer/Beam/Dataflow on GCP

I am trying to learn/try out Cloud Composer/Beam/Dataflow on GCP. I have written functions to do some basic cleaning of data in Python, and used a DAG in Cloud Composer to run this function to ...
user3095083
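
A typical starting point is a small Composer DAG whose task hands the Python pipeline file to Dataflow. A hedged sketch; the bucket paths are placeholders, and the operator shown is the older google-provider operator (newer provider versions prefer BeamRunPythonPipelineOperator).

```python
# Sketch of a Composer (Airflow) DAG that submits the cleaning pipeline to Dataflow.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowCreatePythonJobOperator,
)

with DAG(
    dag_id="clean_data_on_dataflow",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_cleaning = DataflowCreatePythonJobOperator(
        task_id="run_cleaning_pipeline",
        py_file="gs://my-bucket/pipelines/clean_data.py",  # placeholder pipeline file
        job_name="clean-data",
        location="us-central1",
        options={"temp_location": "gs://my-bucket/temp"},
    )
```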
