If You Know Kettle (Pentaho Data Integration)
Why Hop?
Hop aims to enable data engineers to deliver high-quality work, deliver it fast, and integrate it with bleeding-edge technology.
We want Hop to be completely open source. We are eager to hear your feedback on our chat, and just as eager to see your bug tickets and feature requests in GitHub Issues.
As an open-source-first project, we started the Apache Software Foundation incubation process in September 2020 and graduated as an ASF top-level project in December 2021.
Check our Q&A for more information on why Hop was created and what the project is all about.
Concepts
A couple of things have been renamed to align Apache Hop with modern data processing platforms.
A lot has changed behind the scenes, but don’t worry: if you’re familiar with Kettle/PDI, you’ll feel right at home immediately.
Kettle | Hop | Difference |
---|---|---|
Spoon | Hop Gui | Spoon has been abandoned. Hop Gui was written from scratch. Check the Getting Started guide or the Hop Gui docs to find out more. |
Transformation | Pipeline | No conceptual changes. You’ll develop a pipeline just like you would develop a transformation, but a pipeline in Hop can run on different runtimes. |
Job | Workflow | No conceptual changes. You’ll develop a workflow just like you would develop a job, but a workflow in Hop can run on different runtimes. |
Step | Transform | No conceptual changes. The underlying code has changed and the dialogs have been updated, but you’ll feel right at home. |
Job Entry | Action | No conceptual changes. The underlying code has changed and the dialogs have been updated, but you’ll feel right at home. |
Metastore | Metadata | All metadata objects in Hop are stored as metadata. This happens behind the scenes. Except for increased usability, as a Hop developer, you’ll hardly notice. |
Carte | Hop Server | Again, smooth sailing. A lot has changed behind the scenes, but you’ll hardly notice. Check the docs. |
Pan/Kitchen/(Maitre) | Hop Run | Kitchen and Pan depended on the Spoon GUI code. With the rewrite of Spoon to Hop Gui, we’ve recreated the command line tools. We believe the result is more consistent, offers more options, and is easier to use. Check the docs. |
JNDI | gone | JNDI in Kettle/PDI is based on an open source project that hasn’t been updated in about a decade. As there was no reason to keep this functionality in Hop, it was abandoned. |
Repositories | gone | Code repositories belong in a VCS these days. We’ve abandoned the file, database, and PDI EE repositories, and implemented Git integration instead. |
- | Projects, Environments, Run Config | The Kettle Environments Plugin has been integrated and significantly extended. Hop now has integrated functionality to support your projects, environments and run configurations. Check the docs. |
- | Hop Config | This is a new command line tool to configure your projects, environments and run configurations. |
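
When updating existing Kettle documentation or team notes, it can help to have the renames from the table above in machine-readable form. The following is a minimal sketch in Python, not part of Hop itself: the dictionary simply restates the table, and the `hop_term` helper is a hypothetical name introduced here for illustration.

```python
# Kettle/PDI term -> Apache Hop term, restating the table above.
# None marks features the table lists as "gone" in Hop.
KETTLE_TO_HOP = {
    "Spoon": "Hop Gui",
    "Transformation": "Pipeline",
    "Job": "Workflow",
    "Step": "Transform",
    "Job Entry": "Action",
    "Metastore": "Metadata",
    "Carte": "Hop Server",
    "Pan": "Hop Run",
    "Kitchen": "Hop Run",
    "Maitre": "Hop Run",
    "JNDI": None,
    "Repositories": None,
}

# Lower-cased copy of the mapping so lookups are case-insensitive.
_LOOKUP = {key.lower(): value for key, value in KETTLE_TO_HOP.items()}


def hop_term(kettle_term: str) -> str:
    """Return the Hop name for a Kettle term (hypothetical helper)."""
    key = kettle_term.strip().lower()
    if key not in _LOOKUP:
        raise KeyError(f"unknown Kettle term: {kettle_term!r}")
    hop = _LOOKUP[key]
    # Features dropped in Hop have no replacement name.
    return hop if hop is not None else "(no equivalent in Hop)"


if __name__ == "__main__":
    for term in ("Transformation", "job entry", "JNDI"):
        print(f"{term} -> {hop_term(term)}")
```

The Hop-only additions in the table (Projects, Environments, Run Config, and Hop Config) are left out on purpose, since they have no Kettle term to look up.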
Apache Beam
Apache Beam has been deeply integrated in Hop. Beam allows us to run pipelines directly on