Bivariate Node icon
The Clario Bivariate node lets you create three types of bivariate tables to examine the relationships between attributes on an input data stream: a Two-Way Frequency Table, a Means Table, and a Correlation Matrix.
The Bivariate node can be connected to a variety of nodes (e.g., Read File, Aggregate, Append, Missing), but it requires a valid stream of data as input.
The Bivariate node has three configuration tabs: Two-Way Frequency Table, Means Table, and Correlation Matrix.
Two-Way Frequency Table Tab
Select at least two string attributes to produce this table. Begin by adding at least one Base Attribute by clicking the [+] on the bottom left. This attribute name will be used as the table name in the results. Then select one or more Compare Attributes by dragging them from the Available Attributes box into the Compare Attributes box. Click Save to save the table configuration or Cancel to exit without saving. To delete a previously defined table, click on the table name (in Base Attributes) and click the [-] on the bottom left.
Note
The only attributes available for the Two-Way Frequency Table are string attributes.
Means Table Tab
To create the Means Table, click [+] on the bottom left and select a Base Attribute. After a Base Attribute is selected, the Available Attributes list is filtered by data type: if a string attribute is selected, only numeric attributes are listed in the Available Attributes box, and vice versa. Select one or more Compare Attributes by dragging them from the Available Attributes box into the Compare Attributes box. Click Save to save the table configuration or Cancel to exit without saving. To delete a previously defined table, click on the table name (in Base Attributes) and click the [-] on the bottom left.
Correlation Matrix Tab
To create the Correlation Matrix, select one or more attributes by dragging them from the Available Attributes box into the Selected Attributes box. All numeric attributes will be available in the matrix results. See tips on Finding and Selecting Attributes.
The Bivariate node result set has up to three tabs. Each tab is populated only if the corresponding table was requested in the node configuration.
Bivariate Results
Click on the Two-Way Frequency tab. Select a Base Attribute (table name) from the top left drop-down box, and a Compare Attribute from the drop-down box beneath it. Each cell shows its count along with its percentage of the overall total. Overall statistics also appear in the upper right corner. To export this table to a spreadsheet, click the Export to Spreadsheet button on the Toolbar. Note that only the counts (not the % values) are exported to the spreadsheet.
Two-Way Frequency Tab
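The Two-Way Frequency results described above (a count per cell plus its overall %) can be illustrated outside the tool. The following is only a minimal sketch in standard-library Python, not Clario's implementation; the sample rows, the attribute names region and plan, and the function two_way_frequency are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; the attribute names are illustrative only.
rows = [
    {"region": "East", "plan": "Basic"},
    {"region": "East", "plan": "Premium"},
    {"region": "West", "plan": "Basic"},
    {"region": "East", "plan": "Basic"},
]


def two_way_frequency(rows, base, compare):
    """Count each (base, compare) value pair and its share of all rows."""
    counts = Counter((r[base], r[compare]) for r in rows)
    total = sum(counts.values())
    return {
        pair: {"count": n, "overall_pct": 100.0 * n / total}
        for pair, n in counts.items()
    }


table = two_way_frequency(rows, "region", "plan")
for (base_value, compare_value), cell in sorted(table.items()):
    print(f"{base_value:5} {compare_value:8} "
          f"{cell['count']:3d} {cell['overall_pct']:6.1f}%")
```

Each key of the returned dictionary corresponds to one cell of the table: the first element is the Base Attribute value and the second is the Compare Attribute value.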
Means Table Tab
Click on the Means Table tab. Select a Base Attribute (table name) from the top left drop-down box, and a Compare Attribute from the drop-down box beneath it. The means table will appear, with the values of the string attribute defining the rows and the statistics for the numeric attribute shown across the columns. The statistics displayed are N, Mean, Minimum, Maximum, Standard Deviation, and Sum. To export this table to a spreadsheet, click the Export to Spreadsheet button on the Toolbar.
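As a rough illustration of what a means table contains, the sketch below groups a numeric attribute by a string attribute and computes the six statistics listed above. It is standard-library Python, not Clario code; the sample data and the function means_table are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since this page does not say which form the node reports.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical sample rows; the attribute names are illustrative only.
rows = [
    {"region": "East", "sales": 10.0},
    {"region": "East", "sales": 20.0},
    {"region": "West", "sales": 5.0},
    {"region": "East", "sales": 30.0},
]


def means_table(rows, string_attr, numeric_attr):
    """Summarize numeric_attr for each distinct value of string_attr."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[string_attr]].append(r[numeric_attr])

    table = {}
    for key, values in groups.items():
        table[key] = {
            "N": len(values),
            "Mean": mean(values),
            "Minimum": min(values),
            "Maximum": max(values),
            # Assumption: sample standard deviation, undefined when N < 2.
            "Standard Deviation": stdev(values) if len(values) > 1 else None,
            "Sum": sum(values),
        }
    return table


summary = means_table(rows, "region", "sales")
for group, stats in sorted(summary.items()):
    print(group, stats)
```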
Click on the Correlation Matrix tab. By default, all attributes are selected and a symmetrical matrix is shown. To add or remove attributes displayed in the matrix, click the [Attributes...] button, then drag attribute names between the Available Attributes list and the Selected Attributes list. To view a symmetrical matrix, in which the selected attributes form both the rows and the columns and the diagonal values are all 1, check the Symmetrical Matrix box. If this box is not checked, Bivariate shows the correlations of the selected attributes with all numeric attributes on the incoming data stream. Click [Save] to view the matrix. To view the correlations between a single attribute and all other attributes (for example, between Age and every other attribute), select only Age and leave the Symmetrical Matrix box unchecked. Bivariate will then show the correlations between Age and all numeric attributes on the incoming data stream. A slider in the upper right corner highlights cells whose absolute value exceeds a chosen threshold. To export this correlation matrix to a spreadsheet, click the Export to Spreadsheet button on the Toolbar.
Correlation Results Tab
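The difference between the symmetrical and non-symmetrical views, and the effect of the highlight slider, can be sketched in a few lines of standard-library Python. This is an illustration only, not Clario's implementation: the Pearson coefficient is assumed because this page does not name the correlation measure, and the sample columns and the functions pearson, correlation_matrix, and highlighted are hypothetical.

```python
from math import sqrt

# Hypothetical numeric attributes on an incoming data stream.
columns = {
    "age": [25, 35, 45, 55],
    "income": [30, 42, 50, 61],
    "visits": [8, 6, 5, 2],
}


def pearson(xs, ys):
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns, selected, symmetrical=True):
    """Rows are the selected attributes. Columns are the selected
    attributes (symmetrical) or every numeric attribute (otherwise)."""
    targets = list(selected) if symmetrical else list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in targets}
        for a in selected
    }


def highlighted(matrix, threshold):
    """Cells whose absolute correlation exceeds the threshold."""
    return [
        (a, b)
        for a, row in matrix.items()
        for b, r in row.items()
        if abs(r) > threshold
    ]


full = correlation_matrix(columns, ["age", "income", "visits"])
age_only = correlation_matrix(columns, ["age"], symmetrical=False)
print(age_only)
print(highlighted(full, 0.99))
```

With only age selected and symmetrical set to False, the result is a single row holding the correlation of age with every numeric attribute, which mirrors the Age example in the text above.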
There is no data file output from the Bivariate node, as it is a terminal node. However, the Bivariate node results tables can be exported into Excel by clicking on the Export to Spreadsheet button on the Toolbar.