Data Stream

Overview

The Data Stream metadata type provides a unified, high-performance way to stream rows between Hop pipelines and external systems (especially Python via PyHop) without first having to write the data to disk.

It is designed to be pluggable so that different streaming technologies can be added over time.

Goals

  • Enable fast, low-latency data exchange between Hop and Python (and other Arrow-compatible tools)

  • Provide backpressure and proper streaming semantics

  • Keep the user experience consistent (select a Data Stream by name in the Data Stream Input and Data Stream Output transforms)
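
To make the backpressure goal concrete, here is a minimal, generic sketch using only the Python standard library. It is not how Hop or PyHop implement Data Streams, and it does not use any Hop or PyHop API; the names `producer`, `consumer`, `stream_rows` and `capacity` are made up for illustration. It only shows the general idea: a bounded channel makes a fast writer wait until a slower reader has caught up, so rows are never buffered without limit.

```python
import queue
import threading

# Unique end-of-stream marker (hypothetical; not part of any Hop API).
_END = object()


def producer(rows, channel):
    """Push rows into the bounded channel, then signal end of stream."""
    for row in rows:
        # put() blocks while the channel is full: this is the backpressure.
        channel.put(row)
    channel.put(_END)


def consumer(channel, sink):
    """Pull rows from the channel until the end-of-stream marker arrives."""
    while True:
        row = channel.get()
        if row is _END:
            break
        sink.append(row)


def stream_rows(rows, capacity=4):
    """Stream rows from a writer thread to a reader thread.

    At most `capacity` rows are held in the channel at any time.
    """
    channel = queue.Queue(maxsize=capacity)
    received = []
    writer = threading.Thread(target=producer, args=(rows, channel))
    reader = threading.Thread(target=consumer, args=(channel, received))
    writer.start()
    reader.start()
    writer.join()
    reader.join()
    return received


if __name__ == "__main__":
    rows = [{"id": i, "name": f"row-{i}"} for i in range(10)]
    result = stream_rows(rows, capacity=2)
    print(len(result), "rows received in order:", result == rows)
```

A smaller `capacity` keeps memory use low at the cost of more frequent waiting by the writer; a larger one smooths out bursts. A real streaming implementation would apply the same trade-off to batches of rows rather than to single rows.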

Usage

  1. Create a new Data Stream in the Metadata perspective

  2. Choose the desired implementation

  3. Use the Data Stream Output or Data Stream Input transform in your pipelines and select the stream by name