Join

../_images/joinnodeicon.png

Join Node icon

The Clario Join node combines two data streams on one or more key attributes using one of four join types: inner, left, right, or full. The key attributes must be sorted in the same sequence on each input data stream. The result of the join can be written out to a file or streamed to another node. The functionality of this node is similar to an SQL join. The input node connector of the Join node must be connected to two separate Read File nodes.
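The sorting requirement is easier to see with a small example. The sketch below is not Clario's implementation, whose internals are not described here. It is a generic sort-merge inner join in plain Python, shown because it is a common reason a join asks for inputs sorted on the key: both streams can be consumed in a single forward pass. The function names `is_sorted_on` and `merge_inner_join` are hypothetical.

```python
def is_sorted_on(rows, key):
    """Return True if rows are in non-decreasing order of row[key]."""
    return all(a[key] <= b[key] for a, b in zip(rows, rows[1:]))


def merge_inner_join(top, bottom, key):
    """Inner-join two lists of dicts that are both sorted ascending on key.

    Illustration only: a generic sort-merge join, not Clario's own code.
    """
    if not (is_sorted_on(top, key) and is_sorted_on(bottom, key)):
        raise ValueError("both inputs must be sorted on the join key")

    out, i, j = [], 0, 0
    while i < len(top) and j < len(bottom):
        if top[i][key] < bottom[j][key]:
            i += 1  # top key has no partner yet, advance top
        elif top[i][key] > bottom[j][key]:
            j += 1  # bottom key has no partner yet, advance bottom
        else:
            value = top[i][key]
            # Find the run of bottom rows sharing this key value.
            j_end = j
            while j_end < len(bottom) and bottom[j_end][key] == value:
                j_end += 1
            # Pair every top row in the run with every bottom row in the run.
            while i < len(top) and top[i][key] == value:
                for bottom_row in bottom[j:j_end]:
                    out.append({**top[i], **bottom_row})
                i += 1
            j = j_end
    return out
```

If either input were out of order, the forward pass could move past a row that still has a match, which is why this sketch rejects unsorted input up front.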

Configuration

The Join node has two tabs: Configuration and Summary.

Configuration Tab

../_images/joinconfigure.png

Configuration Tab

The Top Connector list box displays the data stream connected to the top node connector while the Bottom Connector list box displays data from the bottom node connector. The Joins list box displays the attribute pairings that define the join.

The first step within the Configuration tab is to select the attributes on which you would like to join: one from the Top Connector list box and one from the Bottom Connector list box. The selected attributes become highlighted, and a dotted line appears connecting the two, with a green plus sign in the middle (see the Join Top and Bottom Connectors figure). Clicking the plus sign confirms the connection between the two selections, which removes the attributes from the connector list boxes and moves the pairing to the Joins list box. At least one pair must be created by joining one attribute from the Top Connector list to one attribute from the Bottom Connector list.

To remove the connection between two attributes, click the connection in the Joins list box and then click the [-]. The attributes then reappear in the connector list boxes. See tips on Finding and Selecting Attributes. More than one connection can be made between attributes. However, an attribute in the top node connector can be connected to one and only one attribute in the bottom node connector.
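The pairing rules above (at least one pair, and each attribute used in at most one pair) can be sketched as a small check. This is an illustration only, assuming join pairs are represented as `(top_attribute, bottom_attribute)` tuples; the function name `validate_pairs` and that representation are hypothetical and not part of Clario.

```python
def validate_pairs(pairs, top_attrs, bottom_attrs):
    """Check a list of (top_attribute, bottom_attribute) join pairs.

    Hypothetical helper mirroring the rules described in the text:
    at least one pair, and each attribute appears in at most one pair.
    """
    if not pairs:
        raise ValueError("at least one join pair is required")

    seen_top, seen_bottom = set(), set()
    for top_attr, bottom_attr in pairs:
        if top_attr not in top_attrs:
            raise ValueError(f"unknown top attribute: {top_attr}")
        if bottom_attr not in bottom_attrs:
            raise ValueError(f"unknown bottom attribute: {bottom_attr}")
        if top_attr in seen_top or bottom_attr in seen_bottom:
            raise ValueError("each attribute may appear in only one pair")
        seen_top.add(top_attr)
        seen_bottom.add(bottom_attr)
    return True
```

For example, `validate_pairs([("cust_id", "customer_id")], ["cust_id", "name"], ["customer_id", "total"])` passes, while reusing `cust_id` in a second pair raises an error.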

../_images/joincloseup.png

Join Top and Bottom Connectors

Within the Settings, select the desired Join Type. The default Join Type is Inner. Descriptions of each Join Type are as follows:

  • Inner: Returns only the rows from the left and right (top and bottom node connector) attribute lists that have exact matches on the join key attributes.
  • Left: Returns all rows from the left (top node connector) attribute list along with just the rows from the right (bottom node connector) attribute list containing exact matches on the join key attributes.
  • Right: Returns all rows from the right (bottom node connector) attribute list along with just the rows from the left (top node connector) attribute list containing exact matches on the join key attributes.
  • Full: Returns all rows from the left (top node connector) and the right (bottom node connector) attribute lists where the join key attributes match, as well as the rows from each data set that do not match. For rows without a match, the attributes from the other data set have a value of null.
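The four join types can be compared side by side with a small example. The sketch below is plain Python written for illustration, not Clario's implementation. The function `join_rows`, the sample data, and the use of `None` to stand in for null are all assumptions made for the example. The attribute names are deliberately different on each side so that no columns collide.

```python
def join_rows(top, bottom, pairs, how="inner"):
    """Join two lists of dicts on (top_attribute, bottom_attribute) pairs.

    Illustration only. `how` is "inner", "left", "right", or "full".
    Unmatched attributes are filled with None (standing in for null).
    """
    if how not in {"inner", "left", "right", "full"}:
        raise ValueError(f"unknown join type: {how}")

    top_keys = [t for t, _ in pairs]
    bottom_keys = [b for _, b in pairs]
    null_top = {col: None for col in (top[0] if top else {})}
    null_bottom = {col: None for col in (bottom[0] if bottom else {})}

    # Index bottom rows by their key values for quick lookup.
    index = {}
    for pos, row in enumerate(bottom):
        index.setdefault(tuple(row[k] for k in bottom_keys), []).append(pos)

    out, matched_bottom = [], set()
    for row in top:
        positions = index.get(tuple(row[k] for k in top_keys), [])
        for pos in positions:
            out.append({**row, **bottom[pos]})
            matched_bottom.add(pos)
        # Left and full joins keep top rows that found no match.
        if not positions and how in ("left", "full"):
            out.append({**row, **null_bottom})

    # Right and full joins keep bottom rows that found no match.
    if how in ("right", "full"):
        for pos, row in enumerate(bottom):
            if pos not in matched_bottom:
                out.append({**null_top, **row})
    return out


# Hypothetical sample data: top and bottom connector streams.
top = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
bottom = [{"customer_id": 2, "total": 50}, {"customer_id": 3, "total": 75}]
pairs = [("cust_id", "customer_id")]

inner = join_rows(top, bottom, pairs, "inner")
left = join_rows(top, bottom, pairs, "left")
right = join_rows(top, bottom, pairs, "right")
full = join_rows(top, bottom, pairs, "full")
```

With this sample data, the inner join yields one row (Bob), the left and right joins yield two rows each, and the full join yields three rows, with `None` in the attributes of whichever side had no match.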

To merge the two (or more) join key attributes, check the Merge Keys option. If it is left unchecked, the Join node outputs all join key attributes as is.
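The effect of merging keys can be pictured with a short sketch. The text does not say how Clario names or fills the merged attribute, so the behavior here is an assumption made purely for illustration: the merged attribute keeps the top attribute's name, and takes the bottom value only when the top value is null (`None`). The function name `merge_keys` is hypothetical.

```python
def merge_keys(row, pairs):
    """Collapse each (top_attribute, bottom_attribute) pair into one attribute.

    Assumption (not confirmed by the documentation): the merged attribute
    keeps the top attribute's name and falls back to the bottom value when
    the top value is None.
    """
    merged = dict(row)  # copy so the input row is left untouched
    for top_attr, bottom_attr in pairs:
        bottom_value = merged.pop(bottom_attr)
        if merged[top_attr] is None:
            merged[top_attr] = bottom_value
    return merged


# A row as a full join might emit it when only the bottom stream had key 3.
unmatched = {"cust_id": None, "name": None, "customer_id": 3, "total": 75}
merged = merge_keys(unmatched, [("cust_id", "customer_id")])
```

Under these assumptions the merged row carries a single key attribute, `cust_id`, with the value 3, instead of two separate key attributes where one is null.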

Summary Tab

../_images/joinsummary.png

Summary Tab

The Summary tab lists each outgoing attribute, its source connector (Top or Bottom), its source attribute, and the outgoing attribute type. If you wish to change the names of the outgoing attributes, click on the [Rename] button (bottom center) and edit the text in the resulting popup. It is also possible to save the join configuration for later review; to do so click on the [Export] button and you’ll be prompted to enter a filename before downloading the spreadsheet.

Results

There are no results for the Join node. It is assumed the Join node will be connected to another node to further analyze the newly joined dataset.

Output Stream

The new joined dataset is ready for immediate use in various nodes located throughout Clario. The data can be linked to the Write File node where the Formatted File Preview is available for viewing.
