pyflink.datastream.state_backend.PredefinedOptions

class PredefinedOptions(value)

PredefinedOptions are pre-defined configuration profiles for the RocksDBStateBackend. Each profile is a set of options that has been empirically determined to benefit performance in a particular setting.

Some of these settings are based on experiments by the Flink community; others follow guides from the RocksDB project.

DEFAULT:

Default options for all settings, except that writes are not forced to the disk.

Note

Because Flink does not rely on RocksDB data on disk for recovery, there is no need to sync data to stable storage.

SPINNING_DISK_OPTIMIZED:

Pre-defined options for regular spinning hard disks.

This constant configures RocksDB with options that empirically lead to better performance when the machines executing the system use regular spinning hard disks.

The following options are set:

  • setCompactionStyle(CompactionStyle.LEVEL)

  • setLevelCompactionDynamicLevelBytes(true)

  • setIncreaseParallelism(4)

  • setUseFsync(false)

  • setDisableDataSync(true)

  • setMaxOpenFiles(-1)

Note

Because Flink does not rely on RocksDB data on disk for recovery, there is no need to sync data to stable storage.

SPINNING_DISK_OPTIMIZED_HIGH_MEM:

Pre-defined options for better performance on regular spinning hard disks, at the cost of higher memory consumption.

Note

These settings will cause RocksDB to consume a lot of memory for block caching and compactions. If you experience out-of-memory problems related to RocksDB, consider switching back to SPINNING_DISK_OPTIMIZED.

The following options are set:

  • setLevelCompactionDynamicLevelBytes(true)

  • setTargetFileSizeBase(256 MBytes)

  • setMaxBytesForLevelBase(1 GByte)

  • setWriteBufferSize(64 MBytes)

  • setIncreaseParallelism(4)

  • setMinWriteBufferNumberToMerge(3)

  • setMaxWriteBufferNumber(4)

  • setUseFsync(false)

  • setMaxOpenFiles(-1)

  • BlockBasedTableConfig.setBlockCacheSize(256 MBytes)

  • BlockBasedTableConfig.setBlockSize(128 KBytes)

Note

Because Flink does not rely on RocksDB data on disk for recovery, there is no need to sync data to stable storage.
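
To give a feel for why SPINNING_DISK_OPTIMIZED_HIGH_MEM is described as memory-hungry, the following back-of-the-envelope sketch adds up the write-buffer and block-cache values listed above. It rests on several assumptions not stated in the source: that "MBytes" means MiB, that the figures apply per RocksDB column family, and that memtables may fill up to the configured maximum. It ignores index and filter blocks, compaction buffers, and any memory limits Flink itself may impose, so treat the result as a rough lower-end estimate rather than a measurement.

```python
# Rough memory estimate for SPINNING_DISK_OPTIMIZED_HIGH_MEM.
# Assumption: "MBytes" in the option list is interpreted as MiB.
MIB = 1024 * 1024

# Values copied from the option list above.
write_buffer_size = 64 * MIB       # setWriteBufferSize(64 MBytes)
max_write_buffer_number = 4        # setMaxWriteBufferNumber(4)
block_cache_size = 256 * MIB       # BlockBasedTableConfig.setBlockCacheSize(256 MBytes)

# Upper bound on memtable memory if all write buffers are in use.
memtable_upper_bound = write_buffer_size * max_write_buffer_number

# Memtables plus block cache; other RocksDB allocations are not counted.
estimate = memtable_upper_bound + block_cache_size

print(f"memtables:   up to {memtable_upper_bound // MIB} MiB")
print(f"block cache: {block_cache_size // MIB} MiB")
print(f"rough total: {estimate // MIB} MiB")
```

Under these assumptions the estimate comes to about 512 MiB. If the figures do apply per column family, a job with many registered states could multiply that amount accordingly, which is consistent with the out-of-memory warning above.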

FLASH_SSD_OPTIMIZED:

Pre-defined options for Flash SSDs.

This constant configures RocksDB with options that empirically lead to better performance when the machines executing the system use SSDs.

The following options are set:

  • setIncreaseParallelism(4)

  • setUseFsync(false)

  • setDisableDataSync(true)

  • setMaxOpenFiles(-1)

Note

Because Flink does not rely on RocksDB data on disk for recovery, there is no need to sync data to stable storage.

Attributes

DEFAULT

SPINNING_DISK_OPTIMIZED

SPINNING_DISK_OPTIMIZED_HIGH_MEM

FLASH_SSD_OPTIMIZED
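
As an illustration of how one of these attributes might be selected and applied, here is a minimal sketch. The helper names choose_profile_name and build_backend, and the "ssd"/"hdd" labels, are hypothetical and not part of PyFlink. The sketch assumes a PyFlink release that provides EmbeddedRocksDBStateBackend with a set_predefined_options method; older releases expose RocksDBStateBackend, which may need to be substituted.

```python
def choose_profile_name(disk: str, high_memory: bool = False) -> str:
    """Map a hypothetical hardware description to a PredefinedOptions member name."""
    if disk == "ssd":
        return "FLASH_SSD_OPTIMIZED"
    if disk == "hdd":
        if high_memory:
            return "SPINNING_DISK_OPTIMIZED_HIGH_MEM"
        return "SPINNING_DISK_OPTIMIZED"
    # Unknown hardware: fall back to the default profile.
    return "DEFAULT"


def build_backend(profile_name: str):
    """Create a RocksDB state backend with the named profile applied.

    PyFlink is imported lazily so that choose_profile_name can be used
    without a Flink installation. Calling this function requires PyFlink
    and a working Java environment.
    """
    from pyflink.datastream.state_backend import (
        EmbeddedRocksDBStateBackend,
        PredefinedOptions,
    )

    backend = EmbeddedRocksDBStateBackend()
    # PredefinedOptions is an enum, so members can be looked up by name.
    backend.set_predefined_options(PredefinedOptions[profile_name])
    return backend


if __name__ == "__main__":
    # Only the name selection runs here; build_backend needs PyFlink.
    print(choose_profile_name("hdd", high_memory=True))
```

The resulting backend would then typically be passed to the execution environment, for example through StreamExecutionEnvironment.set_state_backend, before the job is built. Keeping the selection logic separate from the PyFlink call makes it possible to test the choice of profile without starting a JVM.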
