
I have built a custom Python image for a streaming Dataflow job and tested locally that it works. When deployed to GCP, the job pulls the image, the job logs show the minimum of 1 worker being initialized, and the job is in state RUNNING. I also see the log lines I print when the pipeline is initialized. However, when I publish Pub/Sub messages to my topic nothing happens: the data freshness metric just increases, and the workers report strange errors (marked with info severity) about flags that I do not control:

flag provided but not defined: -logging_endpoint
Usage of /opt/google/dataflow/python_template_launcher:
...

And when looking at the dataflow_step logs in GCP (not shown in the Dataflow worker console) I constantly see the following (not sure if the two are related):

"Error syncing pod, skipping" err="failed to \"StartContainer\" for \"sdk-0-0\" with CrashLoopBackOff: \"back-off 10s restarting failed container=sdk-0-0 pod=df-myjob-she-09170618-0m8p-harness-vmzx_default(ce159539391472095ac26571c8a6ceb2)\"" pod="default/df-myjob-she-09170618-0m8p-harness-vmzx" podUID=ce159539391472095ac26571c8a6ceb2

Here is the Dockerfile (everything commented out was there before, taken from the Google docs, and has been tried):

FROM gcr.io/dataflow-templates-base/python311-template-launcher-base:latest

WORKDIR /template

COPY . /template/

ENV FLEX_TEMPLATE_PYTHON_PY_FILE="src/dataflow.py"


RUN apt-get update \
    && apt-get install -y libffi-dev git \
    && rm -rf /var/lib/apt/lists/* \
    # Upgrade pip and install the requirements.
    # && pip install --upgrade pip \
    # && pip install -e .
    # && pip install --no-cache-dir -r "/template/requirements-pypi.txt" \
    # Download the requirements to speed up launching the Dataflow job.
    # && pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r "/template/requirements-pypi.txt"


ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]

Here is how I launch the job:

gcloud dataflow flex-template run "myjob" \
  --project="xxx" \
  --region="europe-north1" \
  --template-file-gcs-location="gs://xxx/templates/myjob-template.json" \
  --parameters=^::^pubsub_topic=projects/xxx/topics/senior-cash-transactions::staging_location=gs://xxx/temp::sdk_container_image=europe-north1-docker.pkg.dev/xxx/xxx/xxx:latest \
  --temp-location="gs://xxx/temp" \
  --additional-experiments="enable_secure_boot" \
  --disable-public-ips \
  --network="xxx" \
  --subnetwork="https://www.googleapis.com/compute/v1/projects/xxx/regions/europe-north1/subnetworks/xxx-europe-north1" \
  --worker-machine-type="e2-standard-8" \
  --service-account-email="[email protected]"
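
For reference, the template JSON passed to --template-file-gcs-location is created with the standard build command, roughly like this (a sketch; the metadata file name is a placeholder):

# writes the Flex Template spec to GCS, pointing it at the launcher image
gcloud dataflow flex-template build "gs://xxx/templates/myjob-template.json" \
  --project="xxx" \
  --image="europe-north1-docker.pkg.dev/xxx/xxx/xxx:latest" \
  --sdk-language="PYTHON" \
  --metadata-file="metadata.json"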

The service account has Pub/Sub Admin, Dataflow Worker, Dataflow Admin, Storage Admin, Compute Network User, Logs Writer, and anything else I could think of while debugging.

Also, here are some of the logs I see (the filter is resource type dataflow_step plus the job_id): [screenshots: dataflow_step_logs1, dataflow_step_logs2]
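
Those are pulled with a filter roughly equivalent to this (a sketch; the job ID is a placeholder):

# read the worker/pod logs for the job from Cloud Logging
gcloud logging read \
  'resource.type="dataflow_step" AND resource.labels.job_id="JOB_ID"' \
  --project="xxx" \
  --limit=50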

I have reviewed several related topics, such as "Why did I encounter an 'Error syncing pod' with Dataflow pipeline?" and the entry for the related error message in the Google Cloud docs (https://cloud.google.com/dataflow/docs/guides/common-errors#error-syncing-pod). So far I have tried:

  • Changing Apache Beam versions
  • Removing requirements.txt and setting up the dependencies with a setup.py file instead (as suggested in another thread), then rebuilding the image
  • Removing requirements and setup from the Dockerfile entirely, since those dependencies appear to be in the base image already (to avoid conflicts), and rebuilding the image
  • Checking the firewall settings in my VPC and the GCP firewall rules; nothing suspicious there
  • Checking all IAM permissions for the service account
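
One check related to the sdk_container_image parameter that I can still run is to dump the full job description and confirm which SDK container image the workers were actually given (a sketch; JOB_ID is a placeholder):

# the environment/worker pool section of the output lists the container image(s) the workers use
gcloud dataflow jobs describe JOB_ID \
  --project="xxx" \
  --region="europe-north1" \
  --full \
  --format="json"
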
  • Hi @bozhyte, it appears that this issue has to be investigated further, so if you have a support plan please create a new GCP support case. Otherwise, you can open a new issue on the issue tracker describing your issue. (Commented Sep 17, 2024 at 15:40)
  • Please check github.com/GoogleCloudPlatform/python-docs-samples/tree/main/…. In particular, check what SDK_CONTAINER_IMAGE is used. The syncing pod errors are usually fine as long as it eventually downloads your SDK container successfully. – XQ Hu (Commented Sep 21, 2024 at 19:54)
