Enhanced JSON Output transform Icon Enhanced JSON Output

Description

The Enhanced JSON Output transform allows you to generate JSON blocks based on input transform values.

Output JSON will be available as a Javascript array or Javascript object depending on transform settings.

Because this transform loops over the fields defined as Group Key and serializes JSON output accordingly, it is extremely important to sort the input data by the group key. Failing to do so may return incorrect or unexpected data.

Supported Engines

Hop Engine

Supported

Spark

Maybe Supported

Flink

Maybe Supported

Dataflow

Maybe Supported

Options

General Tab

General tab allows to specify the type of transform operation (output JSON for the next transform and/or in a file), as well as the transform behaviour.

Option Description

Transform name

Name of the transform; this name has to be unique in a single pipeline.

Operation

Specify transform operation type. Currently three types of operation are available:

  1. Output value - only pass output JSON as a transform output field, do not dump to output file

  2. Write to file - only write to file, do not pass to output field

  3. Output value and write to file - dump to file and pass generated JSON as a transform output field

JSON block name

If specified, the output of the transform will always be a JSON object with a single first-level node, whose name will be this value.

If empty, the transform can output a JSON array or object, depending on the settings in the other tabs.

Output value

The name of the output field (if enabled) for the generated JSON block.

Force arrays in JSON

If checked, the output will be an array even when the transform result is a single JSON (object) fragment.

Force single grouped Item

If checked, values will be grouped by column and all the values will be enclosed in an array. If unchecked, a JSON object fragment will be created for each input row, and then they will be grouped into an array.

Pretty Print JSON

If checked, JSON output will be pretty printed.

Output File

Option Description

Filename

full path to output file

Append

If not checked - new file will be created every time transform is running. If file with specified name exists already, it will be replaced by a new one. If checked - new JSON output will be appended to the end of existing file. Or if file does not exist, it will be created as in previous case.

TIP: If you want to create a ND-JSON (Newline-Delimited JSON) file, you may get better results by outputting the JSON rows and then printing them with a Text file output transform

Split JSON after n rows

If this number N is larger than zero, split the resulting JSON file into multiple parts of N rows.

Create Parent folder

Check this option to create the folders structure, if some of them are missing in the provided path. If this option is not checked and the full path cannot be found, the transform will fail.

Do not open create at start

If not checked - file (and in some cases parent folder) will be created/opened to write during pipeline initialization. If checked - file and parent folder will be created only after transform gets any first input data.

Extension

Output file extension. Default value is 'js'

Encoding

Output file encoding

Include date in filename?

If checked - output file name will contain File name value + current date. This may help to generate unique output files.

Include time in filename

If checked - output file name will contain file creation time. Same as for 'Include date in filename' option

Show filename(s) button

Can be useful to test full output file path

Add file to result filenames?

If checked - created output file path will be accessible from transform result files

Group Key Tab

This tab is used to define the key fields used for grouping the rows in JSON fragments.

Rows with the same values in the key fields allow you to generate JSON fragments from the row data, and group them in a single JSON array. The key fields defined here will also be forwarded to the next transform as they are.

If no group field is defined, all the rows will be grouped in a JSON array and the transform output will be a single row and a single column.

Option Description

Fieldname

Input transform field name that will contribute to define the input transform fields key.

Element name

JSON element name. For example "A":"B" - A is a element name, B is actual input value mapped for this Element name.

Fields Tab

This tab is used to map input transform fields to output JSON fragments.

The selected fields will be converted in a JSON fragment (usually a JSON object) that can also include the fields used for grouping (those will assume the same values in all fragments).

The fragments will then be grouped in JSON arrays, based on the rules defined in the Group Key tab above.

Option Description

Fieldname

Input transform field name. Use 'Get Fields' button to discover available input fields

Element name

The key name to use in JSON for this field (it can be different from the actual field name).

For example "A":"B" - A is an element name, B is actual input value mapped for this Element name.

JSON Fragment

If the value is set to Y the value contained in the filed is a JSON chunk and will be treated accordingly. You can use this option (and a chain of transforms, see the example pipeline in the Notes below) to generate complex JSON structures.

Remove Element name

If the value is set to Y it will ignore the Element name and insert the JSON Fragment without wrapping it. Only works with JSON Fragment = Y

Remove if Blank

If the value is set to Y and value in incoming field is null, the related attribute will be omitted from JSON output

Notes

Look at the sample provided json-output-generate-nested-structure.hpl for a better understanding about how the transform works