All Questions
Tagged with apache-flink, apache-kafka (811 questions)
1 vote · 1 answer · 39 views
Early(?) Triggering of Flink's Event Time Windows and Non-Deterministic Results
I'm reading a small sample of data stored in Kafka, but instead of applying watermarks directly at the source, I do some processing on the data and then extract the event timestamps.
I then apply event ...
0 votes · 0 answers · 44 views
Failed to create stage bundle factory, PyFlink and Kafka error
I'm trying the PyFlink - Kafka connection.
I'm using Python 3.11 on PyCharm, with flink-sql-connector-kafka-3.3.0-1.20.jar and apache-flink 1.20. I'm running Kafka on Docker, and I've tested the ...
1 vote · 0 answers · 28 views
Watermark strategy passed to env.from_source not being used
Setup:
Flink 1.19.1
Python 3.10
Apache Flink 1.19.1 Python package
Java 11
Kafka Flink connector 3.4.0
I am creating a KafkaSource that's reading from a Kafka topic where messages have ...
0 votes · 0 answers · 41 views
Flink partition consumption behavior with watermark alignment
I am running a Flink job with watermark alignment enabled and a max drift of 12 hours.
My input is a Kafka stream with 128 partitions.
When running the job with 3 task managers, each with 4 slots, I ...
0 votes · 0 answers · 35 views
The transaction.id created by Flink is wrong
I am using Flink 1.20 with kafka-client 3.4.0.
I am using KafkaSink with Exactly Once and checkpoints.
I set the transactionIdPrefix and noticed there is a problem at the beginning of using my ...
0 votes · 0 answers · 29 views
Hudi Compaction is very slow via Apache Flink
I have written a pipeline that sinks data from Kafka to Hudi on S3. It works, but compaction is very slow.
It is a batch job that runs every hour and sinks the last hour's data to ...
0 votes · 1 answer · 56 views
Flink Connector vs Confluent Connector
What are the pros and cons of Confluent connectors vs the Flink connector APIs?
Are the Flink connector APIs supported on Confluent Flink? These would allow more customization but are not managed like the ...
0 votes · 1 answer · 52 views
Stream processing with Flink using BroadcastStream is providing inconsistent results
// Define the incoming raw data stream sourcing from a Kafka topic
DataStream<RawMessage> mainStream = env.addSource(...);
// Define the reference data stream sourcing from a Kafka topic
DataStream<...
1 vote · 1 answer · 58 views
Problem with Kafka committing offsets when stopping a Flink job (AT_LEAST_ONCE and EXACTLY_ONCE)
We are using Apache Flink with Kafka and the AT_LEAST_ONCE strategy. The following problem occurs: when stopping a job, Flink commits offsets for messages that have not yet been processed, resulting ...
-1 votes · 1 answer · 69 views
What data is stored in Flink checkpoints?
I have a case where I need to handle restarting a Flink job. I need to use checkpointing and use its metadata (the state of the KafkaSource input) for processing. Currently, checkpointing automatically uses the metadata for recovery, but I want ...
0 votes · 1 answer · 62 views
Apache Flink and State Store from Kafka Data Source
I'm a bit confused about how Apache Flink handles offset commits when Kafka is the data source.
I have my consumer group configured and consuming messages normally, and committing it (because I ...
0 votes · 1 answer · 114 views
'org.apache.flink.formats.avro.AvroDeserializationSchema org.apache.flink.formats.avro.AvroDeserializationSchema.forGeneric(org.apache.avro.Schema)'
Hey, I'm stuck with the same problem here: getting Flink to read from Kafka with Avro and Schema Registry. I can see in the schema-registry logs that Flink is trying to read from the server, but I ...
0 votes · 1 answer · 40 views
Flink job (64 tasks) routes records to 1 task
I have an AWS Managed Flink job writing to Kafka. I can't figure out why most of my records get routed to a single task despite a parallelism of 64.
The downstream Kafka topic has 16 ...
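One common cause of the single-task routing described above (not confirmed by the truncated question) is key skew: when records are partitioned by a hash of the key and the keys have very low cardinality, nearly everything lands on one subtask. A minimal sketch of that effect in plain Python, independent of Flink's actual key-group hashing; the key names are made up for illustration:

```python
# Illustrates how low-cardinality keys collapse onto few partitions/subtasks
# under generic "hash(key) % n" routing. This is a toy model, not Flink's
# exact key-group hashing.
from collections import Counter

def route(key: str, num_tasks: int) -> int:
    """Toy router: deterministic polynomial hash of the key, modulo task count."""
    # Python's built-in hash() is salted per process, so roll a stable one.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0x7FFFFFFF
    return h % num_tasks

# 10,000 records sharing one key -> every record routes to a single task.
skewed = Counter(route("tenant-42", 64) for _ in range(10_000))
print(len(skewed))   # 1 task receives all 10,000 records

# The same volume with diverse keys spreads across (almost) all 64 tasks.
diverse = Counter(route(f"tenant-{i}", 64) for i in range(10_000))
print(len(diverse))  # most of the 64 tasks receive records
```

The same reasoning applies downstream: a Kafka topic with 16 partitions keyed on a constant message key concentrates writes in one partition regardless of the producer's parallelism.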
0 votes · 1 answer · 241 views
Data Loss in Flink Job with Iceberg Sink After Restart: How to Ensure Consistent Writes?
I am running a Flink job that reads data from Kafka, processes it into a Flink Row object, and writes it to an Iceberg sink. To deploy new code changes, I restart the job from the latest savepoint, ...
1 vote · 0 answers · 49 views
Flink Kafka connector timeout error: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listTopics
We are using the Flink Kafka connector (v3.3.0-1.20) with Flink 1.20. We have two jobs running in our Flink Kubernetes cluster; one of them is able to connect to Kafka, but the other fails with ...