

pyflink.datastream.connectors.file_system.BulkFormat

class BulkFormat(j_bulk_format)

The BulkFormat reads and decodes records in batches. Examples of bulk formats are ORC and Parquet.

Internally in the file source, the readers pass batches of records from the reading threads (which perform the typically blocking I/O operations) to the async mailbox threads that do the streaming and batch data processing. Passing records in batches (rather than one at a time) greatly reduces the thread-to-thread handover overhead.

For the BulkFormat, each batch is handed over as a single unit.

New in version 1.16.0.
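
The handover cost described above can be illustrated with a small standard-library sketch. This is not Flink's implementation and uses no PyFlink API: the `hand_over` function, the queue, and the sentinel are hypothetical names chosen for illustration. It only shows, under those assumptions, how grouping records into batches lowers the number of thread-to-thread handovers.

```python
import queue
import threading


def hand_over(records, batch_size):
    """Pass records from a reader thread to a consumer, batch_size at a time.

    Returns a tuple of (records received, number of queue handovers).
    Illustrative only; this is not how Flink implements its file source.
    """
    handover = queue.Queue()
    done = object()  # sentinel marking the end of input

    def reader():
        # Stand-in for the reading thread that performs blocking I/O.
        for start in range(0, len(records), batch_size):
            handover.put(records[start:start + batch_size])
        handover.put(done)

    thread = threading.Thread(target=reader)
    thread.start()

    # Stand-in for the processing thread: one queue.get() per batch.
    received, handovers = [], 0
    while True:
        batch = handover.get()
        if batch is done:
            break
        handovers += 1
        received.extend(batch)
    thread.join()
    return received, handovers


records = list(range(10))
one_at_a_time = hand_over(records, batch_size=1)
batched = hand_over(records, batch_size=4)
print(one_at_a_time[1], batched[1])  # prints "10 3"
```

Both calls deliver the same ten records in the same order. The batched call needs three handovers instead of ten, which is the saving the description refers to.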
