Metadata Types
Metadata is one of the cornerstones of Hop. It covers workflows, pipelines and every other type of metadata object.
Hop Gui has a Metadata Perspective to manage all types of metadata: run configurations, database (relational and NoSQL) connections, logging and pipeline probes, to name just a few.
Metadata is typically stored as a set of JSON files in a project’s metadata folder, in subfolders per metadata type. The only exceptions to this rule are workflows and pipelines, which are defined as XML (for now, because of historical reasons). Since workflows and pipelines are what Hop is all about, these are typically stored in your project folder, not in your project’s metadata folder.
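The layout described above (one subfolder per metadata type, one JSON file per metadata object) can be inspected with a few lines of script. The following is only a minimal sketch, not part of Hop: the function name `list_metadata` is hypothetical, and it assumes that a metadata file may carry a top-level `name` field, falling back to the file name when it does not.

```python
import json
from pathlib import Path


def list_metadata(metadata_dir):
    """Group metadata object names by type, assuming one subfolder per type."""
    result = {}
    root = Path(metadata_dir)
    if not root.is_dir():
        return result
    # Each direct subfolder is treated as one metadata type.
    for type_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        names = []
        for json_file in sorted(type_dir.glob("*.json")):
            try:
                data = json.loads(json_file.read_text(encoding="utf-8"))
            except (OSError, json.JSONDecodeError):
                continue  # skip unreadable or malformed files
            # Assumption: use a top-level "name" field when present,
            # otherwise fall back to the file name without extension.
            if isinstance(data, dict) and "name" in data:
                names.append(data["name"])
            else:
                names.append(json_file.stem)
        if names:
            result[type_dir.name] = names
    return result
```

Calling `list_metadata("/path/to/project/metadata")` (a placeholder path) returns a dictionary that maps each subfolder name to the metadata objects found in it. Workflows and pipelines will not show up, since they live in the project folder as XML.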
We’ve made it as easy as possible to add or remove plugins in Hop. Since metadata types are plugin types too, the available metadata types in your Hop installation may not match this list entirely.
By default, Hop contains the following metadata types:
- Asynchronous Web Service: Executes and queries a workflow asynchronously through a web service
- Beam File Definition: Describes a file layout in a Beam Pipeline
- Cassandra Connection: Describes a connection to a Cassandra cluster
- Data Set: Defines a data set, a static, pre-defined collection of rows
- Hop Server: Defines a Hop Server
- MongoDB Connection: Describes a MongoDB connection
- Neo4j Connection: A shared connection to a Neo4j server
- Neo4j Graph Model: Description of the nodes, relationships, indexes, … of a Neo4j graph
- Partition Schema: Describes a partition schema
- Pipeline Log: Logs the activity of a pipeline with another pipeline
- Pipeline Probe: Streams the output rows of a pipeline to another pipeline
- Pipeline Run Configuration: Describes how and with which engine a pipeline is to be executed
- Pipeline Unit Test: Describes a test for a pipeline, with alternative data sets as input for certain transforms and output tested against golden data
- Relational Database Connection: Describes all the metadata needed to connect to a relational database
- Splunk Connection: Describes a Splunk connection
- Web Service: Runs a pipeline to generate output for a servlet on Hop Server
- Workflow Log: Logs the activity of a workflow with a pipeline
- Workflow Run Configuration: Describes how to run a workflow