Ctrl+K
Logo image Logo image

Site Navigation

  • API Reference
  • Examples

Site Navigation

  • API Reference
  • Examples

Section Navigation

  • PyFlink Table
  • PyFlink DataStream
    • StreamExecutionEnvironment
    • DataStream
    • Functions
    • State
    • Timer
    • Window
    • Checkpoint
    • Side Outputs
    • Connectors
    • Formats
  • PyFlink Common

pyflink.datastream.connectors.kafka.Semantic#

class Semantic(value)[source]#

Semantics that can be chosen.

Data

EXACTLY_ONCE:

The Flink producer will write all messages in a Kafka transaction that will be committed to the Kafka on a checkpoint. In this mode FlinkKafkaProducer sets up a pool of FlinkKafkaProducer. Between each checkpoint there is created new Kafka transaction, which is being committed on FlinkKafkaProducer.notifyCheckpointComplete(long). If checkpoint complete notifications are running late, FlinkKafkaProducer can run out of FlinkKafkaProducers in the pool. In that case any subsequent FlinkKafkaProducer.snapshot- State() requests will fail and the FlinkKafkaProducer will keep using the FlinkKafkaProducer from previous checkpoint. To decrease chances of failing checkpoints there are four options:

  1. decrease number of max concurrent checkpoints

  2. make checkpoints mre reliable (so that they complete faster)

  3. increase delay between checkpoints

  4. increase size of FlinkKafkaProducers pool

Data

AT_LEAST_ONCE:

The Flink producer will wait for all outstanding messages in the Kafka buffers to be acknowledged by the Kafka producer on a checkpoint.

Data

NONE:

Means that nothing will be guaranteed. Messages can be lost and/or duplicated in case of failure.

Attributes

EXACTLY_ONCE

AT_LEAST_ONCE

NONE

previous

pyflink.datastream.connectors.kafka.FlinkKafkaProducer

next

pyflink.datastream.connectors.kafka.KafkaSource

On this page
  • Semantic
Show Source

Created using Sphinx 5.3.0.