Docker container
Introduction
This is the documentation of the official Apache Hop docker container published on:
It’s a Hop Docker image supporting both short-lived and long-lived setups. A short-lived setup executes a pipeline or workflow and stops right after. A long-lived setup starts a Hop server and waits for work.
Operating system
The docker container runs a minimal Linux system called Alpine. OpenJDK version 11 is then used to execute Apache Hop. The Linux user used to execute in the container is hop
and the group is hop
as well.
Container Folder Structure
Folder | Description |
---|---|
| The installation location of the hop client package. |
| This volume has read-write permissions for Linux user |
| The initial working directory location of the docker container. |
Environment Variables
You can provide values for the following environment variables:
Environment Variable | Default | Description |
---|---|---|
|
| The log level. Use one of: |
|
| The file path to the Hop log file. |
| Name of the Hop project to create in the container. You also need to specify the | |
| Path to the home of the Hop project. | |
|
| Name of the project config file. |
| The name of the Hop environment to create in the container. If you do not set this variable, no environment will be created. When using an environment a project has to be created too | |
| This is a comma separated list of paths to environment config files (including filename and file extension). | |
|
| Any JRE options you want to set. The |
| Path to a folder where your JDBC drivers are located. | |
| Optional path to a custom entrypoint extension script file. You can use this for example to fetch Hop project files from S3 or gitlab. | |
| The system properties that should be set during execution. You can specify the properties as a comma separated list, e.g. |
Here are the variables for short-lived containers, to execute a workflow or pipeline:
Environment Variable | Required | Description |
---|---|---|
| Yes | The path to the Hop workflow or pipeline to execute. If you configured a project you can use a relative path to the project home folder. |
| Yes | The name of the Hop run configuration to use to execute your pipeline or workflow. |
| No | Optionally, you can specify the parameters that should be passed to the pipeline or workflow you are executing. You can specify them as a comma separated list, e.g. |
| No | You can specify a metadata export file here which will contain all required metadata elements. See also the metadata export option in the Hop Conf tool. |
Below are the variables you can use for a long-lived container, running Hop Server:
Environment Variable | Default value | Description |
---|---|---|
|
| The IP address the server will listen to. |
|
| The port the server will listen to. |
|
| The port the server shutdown listener will listen to. |
|
| The username to log into the Hop server. |
|
| The password to log into the Hop server |
| (optional) | You can point to a folder containing metadata JSON files which are then available to the server. |
|
| The maximum number of log lines kept in memory by the server. |
|
| The time (in minutes) it takes for a log line to be cleaned up in memory. |
|
| The time (in minutes) it takes for a pipeline or workflow execution to be removed from the server status. |
| (optional) | The path to the Java keystore file you want to use to run the Hop server with SSL enabled to support https. |
| (optional) | The password of the Java keystore file you want to use to run the Hop server with SSL enabled to support https |
| (optional) | The password of the key if you want to use to run the Hop server with SSL enabled. If both passwords are the same you can omit setting this variable. |
Updating the Hop docker container image
Make sure to get the latest updates for the Hop image by pulling them:
docker pull apache/hop:<tag>
If you do not specify a value for :<tag>
the value latest
will be taken. Latest will contain the last officially released version of Apache Hop. You can also specify Development
as a tag. That image will contain the last built Development snapshot of Apache Hop. rxq7777
How to run the Container
The most common use case will be that you run a short-lived container to just complete one Hop workflow or pipeline.
The first example below runs the sample switch-case-basic.hpl
pipeline from the samples project.
Replace <tag>
with latest
, Development
or a release tag, and replace <HOP_SAMPLE_PROJECT_PATH>
with the path to the config/projects/samples
folder in your Hop installation to mount that folder as a volume. This will make the samples project folder available as the /files
folder in the container.
docker run -it --rm \
--env HOP_LOG_LEVEL=Basic \
--env HOP_FILE_PATH='${PROJECT_HOME}/transforms/switch-case-basic.hpl' \
--env HOP_PROJECT_FOLDER=/files \
--env HOP_PROJECT_NAME=samples \
--env HOP_RUN_CONFIG=local \
--name hop-test-container \
-v <HOP_SAMPLE_PROJECT_PATH>:/files \
apache/hop:<tag>
The second example below runs a workflow.
In addition to the most basic example below, this example adds the environment, based on HOP_ENVIRONMENT_NAME
, HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS
and run parameters with HOP_RUN_PARAMETERS
docker run -it --rm \
--env HOP_LOG_LEVEL=Basic \
--env HOP_FILE_PATH='${PROJECT_HOME}/pipelines-and-workflows/main.hwf' \
--env HOP_PROJECT_FOLDER=/files/project \
--env HOP_PROJECT_NAME=project-a \
--env HOP_ENVIRONMENT_NAME=project-a-test \
--env HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS=/files/config/project-a-test.json \
--env HOP_RUN_CONFIG=local \
--env HOP_RUN_PARAMETERS=PARAM_LOG_MESSAGE=Hello,PARAM_WAIT_FOR_X_MINUTES=1 \
-v /path/to/local/dir:/files \
--name my-simple-hop-container \
apache/hop:<tag>
If you need a long-lived container, this option is also available.
For more information on the long-lived container please also see the Hop Server documentation as it describes what can be done using the Hop Server
Run this command to start a Hop Server in a docker container:
docker run -it --rm \
--env HOP_SERVER_USER=admin \
--env HOP_SERVER_PASS=admin \
--env HOP_SERVER_SHUTDOWNPORT=8080 \
--env HOP_SERVER_PORT=8181 \
--env HOP_SERVER_PORT=8180 \
--env HOP_SERVER_HOSTNAME=0.0.0.0 \
-p 8080:8080 \
-p 8181:8181 \
-p 8180:8180 \
--name my-hop-server-container \
apache/hop:<tag>
localhost is a loopback to your machine, which may be the container but not the host (your laptop or server where your run the container). Use 0.0.0.0 instead to listen on all available interfaces. |
Hop Server is designed to receive all variables and metadata from executing clients. This means it needs little to no configuration to run.
If you want to use the web-services functionality additional information on how to configure your webserver can be found on the user manual Web Service page. For this to work properly the HOP_SERVER_METADATA_FOLDER
variable has to be set too.
When started can then access the hop-server UI from your host at http://0.0.0.0:8181
or http://localhost:8181
Custom Entrypoint Extension Shell Script
To make the Hop Docker image even more flexible, we added a
variable that accepts a path to a custom shell script (that you provide).This shell script will run when you start the container before your Hop project is registered with the container’s Hop config and before your Hop workflow or pipeline gets kicked off. This feature might come in handy when you want to run some custom logic upfront, e.g. source Hop project files from S3 or clone them from GitHub.HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH
The custom shell file can be provided in several ways (this is not a full list):
-
via the mount point (
)/files
-
You create your own Dockerfile, define this image as the base and then use the
instruction to copy your custom shell file in your Docker image.COPY
For the last scenario mentioned, it could be something like this:
We create a simple bash script called
in a sub-folder called clone-git-repo.sh
:resources
#!/bin/bash
cd /home/hop
git clone ${GIT_REPO_URI}
chown -R hop:hop /home/hop/${GIT_REPO_NAME}
We also make it parameter-driven, so it any other team can use it.We create our custom Dockerfile like so:
FROM apache/hop:Development
ENV GIT_REPO_URI=https://...
# example value: https://github.com/diethardsteiner/apache-hop-minimal-project.git
ENV GIT_REPO_NAME=repo-name
# example value: apache-hop-minimal-project
USER root
RUN apk update \
&& apk add --no-cache git
# copy custom entrypoint extension shell script
COPY --chown=hop:hop ./resources/clone-git-repo.sh /home/hop/clone-git-repo.sh
USER hop
Note that apart from defining the new environment variables (that go in line with the parameters we defined in the
earlier on ), we also clone-git-repo.sh
the COPY
file to user hop’s home folder.clone-git-repo.sh
Next let’s build a small script which builds our custom image and then tests it by spinning up a container and running a workflow:
#!/bin/zsh
DOCKER_IMG_CHECK=$(docker images | grep ds/custom-hop)
if [ ! -z "${DOCKER_IMG_CHECK}" ]; then
echo "removing existing ds/custom-hop image"
docker rmi ds/custom-hop:latest
fi
docker build . -f custom.Dockerfile -t ds/custom-hop:latest
echo " ==== TESTING ====="
HOP_DOCKER_IMAGE=ds/custom-hop:latest
PROJECT_DEPLOYMENT_DIR=/home/hop/apache-hop-minimal-project
docker run -it --rm \
--env HOP_LOG_LEVEL=Basic \
--env HOP_FILE_PATH='${PROJECT_HOME}/main.hwf' \
--env HOP_PROJECT_FOLDER=${PROJECT_DEPLOYMENT_DIR} \
--env HOP_PROJECT_NAME=apache-hop-minimum-project \
--env HOP_ENVIRONMENT_NAME=dev \
--env HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS=${PROJECT_DEPLOYMENT_DIR}/dev-config.json \
--env HOP_RUN_CONFIG=local \
--env HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH=/home/hop/clone-git-repo.sh \
--env GIT_REPO_URI=https://github.com/diethardsteiner/apache-hop-minimal-project.git \
--env GIT_REPO_NAME=apache-hop-minimal-project \
--name my-simple-hop-container \
${HOP_DOCKER_IMAGE}