Bivariate

../_images/bivariatenodeicon.png

Bivariate Node icon

The Clario Bivariate node allows you to create three types of bivariate tables to examine the relationships between attributes on an input data stream:

  • A two-way frequency table or cross tabulation (often abbreviated as cross tab) for string attributes
  • A means table, for string vs. numeric attributes
  • A correlation matrix, for numeric attributes

Examples:

  • Compare the counts or percentages of males vs. females who order from a marketing campaign
  • Summarize mean age of residents by state
  • Look at the relationship between age and income

The Bivariate node can be connected to a variety of nodes (e.g., Read File, Aggregate, Append, Missing), but it requires a valid stream of data as input.

Configuration

The Bivariate node has three configuration tabs: Two-Way Frequency Table, Means Table, and Correlation Matrix.

Two-Way Frequency Table Tab

../_images/bivariatetwowayfreq.png

Two-Way Frequency Table Tab

Producing this table requires at least two string attributes. Begin by adding at least one Base Attribute by clicking [+] on the bottom left. The name of this attribute is used as the table name in the results. Then select one or more Compare Attributes by dragging them from the Available Attributes box into the Compare Attributes box. Click Save to save the table configuration, or Cancel to exit without saving. To delete a previously defined table, click the table name (in Base Attributes) and then click [-] on the bottom left.

Note

The only attributes available for the Two-Way Frequency Table are string attributes.

Means Table Tab

../_images/bivariatemeanstable.png

Means Table Tab

To create the Means Table, click [+] on the bottom left and select a Base Attribute. Once a Base Attribute is selected, the Available Attributes list is filtered by data type: if the Base Attribute is a string attribute, only numeric attributes are listed in the Available Attributes box, and vice versa. Select one or more Compare Attributes by dragging them from the Available Attributes box into the Compare Attributes box. Click Save to save the table configuration, or Cancel to exit without saving. To delete a previously defined table, click the table name (in Base Attributes) and then click [-] on the bottom left.

Correlation Matrix Tab

../_images/bivariatecorrelationmatrix.png

Correlation Matrix Tab

To create the Correlation Matrix, select one or more attributes by dragging them from the Available Attributes box into the Selected Attributes box. All numeric attributes will be available in the matrix results. See tips on Finding and Selecting Attributes.

Results

The Bivariate node result set has up to three tabs. Each tab is populated only if the corresponding table was requested in the node configuration.

../_images/bivariateresults.png

Bivariate Results

Two-Way Frequency Table Tab

Click on the Two-Way Frequency tab. Select a Base Attribute (table name) from the top left drop-down box, and a Compare Attribute from the drop-down box beneath it. Each cell shows its count along with the overall percentage. Overall statistics also appear in the upper right corner. To export this table to a spreadsheet, click the Export to Spreadsheet button on the Toolbar. Note that only the counts (not the percentage values) are exported.

../_images/bivariaterestwo.png

Two-Way Frequency Tab
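To make the contents of a two-way frequency table concrete, the following is a minimal Python sketch, not Clario's implementation. The sample records, the attribute names gender and ordered, and the helper two_way_frequency are all hypothetical. It also assumes that the overall percentage of a cell is its count divided by the total number of records.

```python
from collections import Counter

# Hypothetical sample records; attribute names are illustrative only.
records = [
    {"gender": "F", "ordered": "Yes"},
    {"gender": "F", "ordered": "No"},
    {"gender": "M", "ordered": "Yes"},
    {"gender": "F", "ordered": "Yes"},
    {"gender": "M", "ordered": "No"},
]


def two_way_frequency(rows, base, compare):
    """Map each (base value, compare value) cell to (count, percent of total).

    Assumption: the "overall %" is the cell count over the grand total.
    """
    counts = Counter((row[base], row[compare]) for row in rows)
    total = sum(counts.values())
    return {cell: (n, 100.0 * n / total) for cell, n in counts.items()}


table = two_way_frequency(records, "gender", "ordered")

# Print one line per cell: base value, compare value, count, overall percent.
for (base_value, compare_value), (count, pct) in sorted(table.items()):
    print(f"{base_value:<3}{compare_value:<5}{count:>3}{pct:>7.1f}%")
```

In this sketch the Base Attribute and Compare Attribute play the same roles as the two drop-down selections described above.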

Means Table Tab

../_images/bivariateresmeans.png

Means Table Tab

Click on the Means Table tab. Select a Base Attribute (table name) from the top left drop-down box, and a Compare Attribute from the drop-down box beneath it. The means table will appear, with the string attribute defining the rows and the numeric attribute summarized across the columns. The statistics displayed are N, Mean, Minimum, Maximum, Standard Deviation, and Sum. To export this table to a spreadsheet, click the Export to Spreadsheet button on the Toolbar.
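The statistics in a means table can be illustrated with a short Python sketch. This is an illustration under stated assumptions rather than a description of how the node computes its results: the sample data, the attribute names state and age, and the helper means_table are hypothetical, and the sketch uses the sample standard deviation, which may differ from the formula the node uses.

```python
import statistics
from collections import defaultdict

# Hypothetical sample records; attribute names are illustrative only.
records = [
    {"state": "CA", "age": 30},
    {"state": "CA", "age": 40},
    {"state": "NY", "age": 25},
    {"state": "NY", "age": 35},
    {"state": "NY", "age": 45},
]


def means_table(rows, string_attr, numeric_attr):
    """Summarize numeric_attr for each distinct value of string_attr."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[string_attr]].append(row[numeric_attr])

    table = {}
    for key, values in groups.items():
        table[key] = {
            "N": len(values),
            "Mean": statistics.fmean(values),
            "Minimum": min(values),
            "Maximum": max(values),
            # Assumption: sample standard deviation, which is undefined
            # for a group with a single value.
            "Standard Deviation": (
                statistics.stdev(values) if len(values) > 1 else None
            ),
            "Sum": sum(values),
        }
    return table


summary = means_table(records, "state", "age")
for state, stats in sorted(summary.items()):
    print(state, stats)
```

Each key of the returned dictionary corresponds to one row of the table, and each inner dictionary holds the columns listed above.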

Correlation Matrix Tab

Click on the Correlation Matrix tab. By default, all attributes are selected and a symmetrical matrix is shown. To add or remove attributes displayed in the matrix, click the [Attributes...] button, then drag attribute names between the Available Attributes list and the Selected Attributes list.

To view a symmetrical matrix, in which all diagonal values are 1, check the Symmetrical Matrix box. If this box is not checked, Bivariate shows the correlations of the selected attributes with all numeric attributes on the incoming data stream. Click [Save] to view the matrix.

To view the correlations between a single attribute and all other attributes (for example, between Age and all other attributes), select only Age and leave the Symmetrical Matrix box unchecked. Bivariate will then show the correlations between Age and all numeric attributes on the incoming data stream.

A slider in the upper right corner highlights cells whose absolute value is over a chosen threshold. To export the correlation matrix to a spreadsheet, click the Export to Spreadsheet button on the Toolbar.

../_images/bivariaterescor.png

Correlation Results Tab
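The difference between the symmetrical and non-symmetrical views, and the effect of the highlighting slider, can be sketched in Python as follows. The sketch assumes Pearson correlation, which this page does not state explicitly. The sample columns and the helpers pearson, correlation_matrix, and highlighted are hypothetical and are not part of the product.

```python
import math

# Hypothetical numeric attributes on an incoming data stream.
columns = {
    "age": [25, 35, 45, 55],
    "income": [30, 52, 68, 95],
    "visits": [8, 7, 3, 4],
}


def pearson(xs, ys):
    """Pearson correlation; assumes both columns have nonzero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(data, selected, symmetrical=True):
    """Correlate the selected attributes with each other (symmetrical)
    or with every numeric attribute in the data (not symmetrical)."""
    others = selected if symmetrical else list(data)
    return {
        a: {b: pearson(data[a], data[b]) for b in others} for a in selected
    }


def highlighted(matrix, threshold):
    """Cells whose absolute correlation is over the threshold."""
    return [
        (a, b)
        for a, row in matrix.items()
        for b, r in row.items()
        if abs(r) > threshold
    ]


# Symmetrical view of all attributes: diagonal values are 1.
full = correlation_matrix(columns, list(columns), symmetrical=True)

# One attribute against all numeric attributes on the stream.
age_only = correlation_matrix(columns, ["age"], symmetrical=False)

print(age_only)
print(highlighted(full, 0.9))
```

Selecting only age with symmetrical=False mirrors the Age example above: a single row with one correlation per numeric attribute.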

Output Stream

Because the Bivariate node is a terminal node, it produces no output data file. However, its results tables can be exported to Excel by clicking the Export to Spreadsheet button on the Toolbar.