All Questions
54 questions
1 vote · 2 answers · 2k views
Pulling data from an API for Google DataFlow (Python) to ingest and to load onto BigQuery
I am new to data ingestion but have worked through some examples of using Google Dataflow in batch and streaming mode. I am now ready to build the actual project. However, I need to choose one of ...
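A minimal sketch of one of the options this kind of question usually weighs: a batch Beam pipeline that fetches from a REST API inside the pipeline and writes to BigQuery. The endpoint, table, and schema below are hypothetical placeholders, not values from the question.

    import requests
    import apache_beam as beam

    API_URL = "https://example.com/api/records"    # hypothetical endpoint
    TABLE = "my-project:my_dataset.api_records"    # hypothetical table

    def fetch_records(url):
        # Runs on a worker; yields one dict per record returned by the API.
        for record in requests.get(url, timeout=60).json():
            yield {"id": record["id"], "value": record["value"]}

    with beam.Pipeline() as p:
        (p
         | "Urls" >> beam.Create([API_URL])
         | "Fetch" >> beam.FlatMap(fetch_records)
         | "Write" >> beam.io.WriteToBigQuery(
             TABLE,
             schema="id:STRING,value:FLOAT",
             write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
             create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))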
0 votes · 0 answers · 500 views
Is it possible to pause or schedule my GCP Dataflow streaming pipeline to limit costs?
I've set up a GCP Dataflow streaming pipeline (using python for the template), and I am looking to limit the cost of the pipeline. I'm aware that streaming pipelines need a minimum of 1 worker, but is ...
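Dataflow does not offer a pause operation for streaming jobs, so one common cost-limiting workaround (a sketch, not a confirmed answer) is a Cloud Scheduler-invoked HTTP function that drains the job via the Dataflow API, paired with another schedule that relaunches it. Project, region, and the job-id handling are hypothetical.

    from googleapiclient.discovery import build

    def drain_streaming_job(request):
        """HTTP Cloud Function, invoked by Cloud Scheduler, that drains a streaming job."""
        dataflow = build("dataflow", "v1b3", cache_discovery=False)
        dataflow.projects().locations().jobs().update(
            projectId="my-project",            # hypothetical
            location="us-central1",            # hypothetical
            jobId=request.args["job_id"],      # pass the running job's id in the request
            body={"requestedState": "JOB_STATE_DRAINED"},
        ).execute()
        return "drain requested"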
1 vote · 1 answer · 52 views
Is there a recommended method for running user defined code against IoT data streams?
I'm sending IoT data from an OPC UA server to a Pub/Sub topic. Each message to the topic includes 15 minutes of minute-by-minute data for about 100 sensors. Dataflow reads from this pub/sub ...
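A sketch of the Dataflow side under an assumed message layout (a JSON object carrying a list of per-minute, per-sensor readings); the subscription, table, schema, and field names are guesses, not taken from the question.

    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def explode(message_bytes):
        # Assumed layout: {"readings": [{"sensor": "...", "ts": "...", "value": 1.0}, ...]}
        payload = json.loads(message_bytes.decode("utf-8"))
        for reading in payload["readings"]:
            yield reading

    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (p
         | "Read" >> beam.io.ReadFromPubSub(
             subscription="projects/my-project/subscriptions/iot-sub")  # hypothetical
         | "Explode" >> beam.FlatMap(explode)
         | "Write" >> beam.io.WriteToBigQuery(
             "my-project:iot.readings",                                 # hypothetical
             schema="sensor:STRING,ts:TIMESTAMP,value:FLOAT"))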
0 votes · 1 answer · 124 views
How to handle multiple deferred tasks as one?
There is one type of problem I'm facing in multiple projects that I haven't yet found a good solution for within the Google Cloud services.
I would like to queue/defer some work originating from user ...
0 votes · 0 answers · 145 views
How to activate a Cloud Function after a successful Dataflow job?
The code for my Apache Beam pipeline is below. My Cloud Function (topic name: projects/wisdomcircle-350611/topics/uat-timestamp-job-trigger) must only be activated when the Dataflow task below is ...
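One way this is commonly wired up (a sketch, assuming `pipeline` is the Beam pipeline object from the question and the Cloud Function is subscribed to the topic named above): block until the Dataflow job finishes, then publish only on success.

    from apache_beam.runners.runner import PipelineState
    from google.cloud import pubsub_v1

    TOPIC = "projects/wisdomcircle-350611/topics/uat-timestamp-job-trigger"

    result = pipeline.run()        # `pipeline` is the Beam pipeline defined in the question
    result.wait_until_finish()     # blocks until the Dataflow job completes
    if result.state == PipelineState.DONE:
        # Publishing here fires the Pub/Sub-triggered Cloud Function only after success.
        pubsub_v1.PublisherClient().publish(TOPIC, data=b"dataflow-job-succeeded").result()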
1 vote · 1 answer · 198 views
Cloud Function to trigger a Dataflow job in Go
I have a Dataflow job written using the Python SDK. But I want to trigger this Dataflow job using a Cloud Function, which should be written in Go. I found a thread similar to this which has the function ...
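Whatever client library the function uses, triggering a templated job comes down to the same templates.launch REST call. Below is a Python sketch of that raw request for reference (template path, region, job name, and parameters are hypothetical); it should translate fairly directly to Go's net/http or the generated Go Dataflow client.

    import google.auth
    from google.auth.transport.requests import AuthorizedSession

    credentials, project = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"])
    session = AuthorizedSession(credentials)

    region = "us-central1"                                             # hypothetical
    url = (f"https://dataflow.googleapis.com/v1b3/projects/{project}"
           f"/locations/{region}/templates:launch")
    response = session.post(
        url,
        params={"gcsPath": "gs://my-bucket/templates/my_template"},    # hypothetical template
        json={"jobName": "launched-from-function",
              "parameters": {"input": "gs://my-bucket/input.csv"}},    # hypothetical parameters
    )
    response.raise_for_status()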
0 votes · 1 answer · 107 views
Dataflow job is failing within 40 seconds
I have a simple Google Cloud HTTP trigger function which is responsible for triggering a Dataflow runner job that loads data from a CSV on Cloud Storage to a BigQuery table.
My code is given below:
...
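The question's code is truncated, so the sketch below only shows the general shape: a CSV-to-BigQuery Beam pipeline launched with explicit DataflowRunner options (project, region, temp_location), the settings that are easiest to omit when kicking the job off from a function. All names and the row format are placeholders.

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-project",                     # hypothetical
        region="us-central1",                     # hypothetical
        temp_location="gs://my-bucket/temp",      # hypothetical
        job_name="csv-to-bq",
    )

    def parse_row(line):
        name, amount = line.split(",")
        return {"name": name, "amount": float(amount)}

    with beam.Pipeline(options=options) as p:
        (p
         | beam.io.ReadFromText("gs://my-bucket/input.csv", skip_header_lines=1)
         | beam.Map(parse_row)
         | beam.io.WriteToBigQuery(
             "my-project:my_dataset.my_table",
             schema="name:STRING,amount:FLOAT",
             write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))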
3 votes · 2 answers · 760 views
GCP Dataflow vs Cloud Functions for small file size and infrequent updates
I have a CSV which is used to update entries in a SQL database. The file size is at most 50 KB and the update frequency is twice a week. I also have a requirement to do some automated sanity testing.
What ...
1 vote · 1 answer · 492 views
Cloud Function trigger that monitors Cloud Storage and triggers Dataflow
I created a Cloud Function in Python that monitors whether any file has been created or modified in Cloud Storage and, if so, triggers a Dataflow job with a template I created in Apache Beam ...
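A sketch of the shape this setup usually takes, assuming a first-generation (background) Python Cloud Function on the google.storage.object.finalize event; the project, region, template path, and parameter name are placeholders, not values from the question.

    from googleapiclient.discovery import build

    def on_object_finalize(event, context):
        """Background Cloud Function fired when an object is created or overwritten in the bucket."""
        dataflow = build("dataflow", "v1b3", cache_discovery=False)
        dataflow.projects().locations().templates().launch(
            projectId="my-project",                                # hypothetical
            location="us-central1",                                # hypothetical
            gcsPath="gs://my-bucket/templates/my_beam_template",   # hypothetical staged template
            body={
                "jobName": "gcs-triggered-job",
                "parameters": {"input": f"gs://{event['bucket']}/{event['name']}"},
            },
        ).execute()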
0 votes · 0 answers · 774 views
GCP Cloud Function 9-minute quota: what other options do I have?
I have Python code which reads data from an API and creates a JSON file (it's not just a simple read; there are some transformations as well).
I need to get the data into GCP (specifically Cloud Storage) ...
0 votes · 1 answer · 754 views
GCS bucket file path as input for a Dataflow pipeline triggered by a Cloud Function - GCP
I am trying to use a Cloud Function (create/finalize) trigger on a GCS bucket to start a Dataflow pipeline. I am trying to figure out how to pass the CSV file path in the GCS bucket to the custom Dataflow ...
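A sketch of the piece this question asks about, assuming a classic (non-Flex) template: the pipeline declares the path as a runtime ValueProvider option, and the launching Cloud Function passes the finalized object's path in the launch parameters. The option name is hypothetical.

    from apache_beam.options.pipeline_options import PipelineOptions

    class IngestOptions(PipelineOptions):
        @classmethod
        def _add_argparse_args(cls, parser):
            # Declared as a ValueProvider so the staged template can take a
            # different CSV path on every launch.
            parser.add_value_provider_argument("--input_path", type=str)

    # In the Cloud Function that launches the template (event is the GCS trigger payload):
    #   "parameters": {"input_path": f"gs://{event['bucket']}/{event['name']}"}

The pipeline would then read the option via options.view_as(IngestOptions).input_path, which sources such as ReadFromText accept as a ValueProvider.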
0 votes · 0 answers · 89 views
Why can't I insert rows longer than 65521 from Cloud Functions into a BigQuery table via Pub/Sub and Dataflow?
To learn more about Cloud Functions, I decided to implement a scraping script. The function loads https://www.bbc.com/, encodes it using base64, then publishes the result to a Pub/Sub topic. A ...
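For context, a minimal sketch of the publishing side as described (project and topic names are hypothetical). Pub/Sub itself accepts messages up to 10 MB, so the cut-off in the title is unlikely to be the topic's message-size limit.

    import base64
    import requests
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic = publisher.topic_path("my-project", "scraped-pages")   # hypothetical names

    def scrape(request):
        """HTTP Cloud Function: fetch the page, base64-encode it, publish it."""
        html = requests.get("https://www.bbc.com/", timeout=30).text
        payload = base64.b64encode(html.encode("utf-8"))
        message_id = publisher.publish(topic, data=payload).result()
        return message_id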
2 votes · 0 answers · 507 views
Unzipping files with Google Dataflow Bulk Decompress template?
I am trying to unzip zip files uploaded to Cloud Storage, which contain only image files without any other folders inside.
I was able to do that with Cloud Functions, but it seems like I get memory-...
0 votes · 1 answer · 624 views
How to launch a Cloud Dataflow pipeline from a Google Cloud Function when a particular set of files reaches Cloud Storage
I have a requirement to create a Cloud Function which should check for a set of files in a GCS bucket, and only when all of those files have arrived in the GCS bucket should it launch the Dataflow template ...
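A sketch of the gating logic, assuming a finalize-triggered background function, a hypothetical fixed list of expected files, and a hypothetical prefix; the template launch itself is left as a comment.

    from google.cloud import storage

    REQUIRED_FILES = {"customers.csv", "orders.csv", "products.csv"}   # hypothetical set

    def on_file_arrival(event, context):
        """Background Cloud Function fired for every object finalized in the bucket."""
        client = storage.Client()
        arrived = {blob.name.rsplit("/", 1)[-1]
                   for blob in client.list_blobs(event["bucket"], prefix="incoming/")}  # hypothetical prefix
        missing = REQUIRED_FILES - arrived
        if missing:
            print(f"Still waiting for: {sorted(missing)}")
            return
        # All expected files are present: launch the Dataflow template here
        # (templates.launch call omitted in this sketch).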
1 vote · 2 answers · 753 views
Advice/Guidance - Composer/Beam/Dataflow on GCP
I am trying to learn/try out Cloud Composer/Beam/Dataflow on GCP.
I have written functions to do some basic cleaning of data in Python, and used a DAG in Cloud Composer to run this function to ...
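A minimal sketch of the Composer piece, assuming an Airflow 2 environment; the DAG id, schedule, and cleaning function are placeholders standing in for the question's own code.

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def clean_data():
        # The basic cleaning logic from the question would go here.
        pass

    with DAG(
        dag_id="basic_cleaning",          # hypothetical
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="clean", python_callable=clean_data)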