Friday, March 8, 2013

[TABLEAU] Cumulative Distributions

Cumulative Distributions

Cumulative distributions are frequently used to analyze trends across several dimensions. Making strong use of the Tableau "Bin" feature, the finished view below shows the accumulation of a bell curve combined with the bell curve itself. This can be useful when you want to see the biggest contributors to a data set as well as the contribution running up to 100% of total. 

Before beginning, make sure you have determined what you want to analyze. This example calculates the percentage of each record that falls within each $100 profit range. You start by creating a histogram, then add table calculations as necessary to compute the cumulative distribution.

Build a cumulative distribution view.

This example uses the Superstore data.

Step 1
If an instance of Profit (bin) already appears in the Dimensions pane, right-click it and select Delete. Then in the Measures pane, right-click Profit and select Create Bins.

Creating bins converts Profit from a measure to a dimension.

Step 2
In the Create Bins dialog box, in the Size of bins text box, type 100.

Step 3
Under Range of Values, click Load to load the values to include in the bins.

Step 4
When finished, click OK.

Profit (bin) appears as a new field in the Dimensions pane.

Step 5
From the Dimensions pane, drag Profit (bin) to the Columns shelf.

Step 6
From the Measures pane, drag Number of Records to the Rows shelf to create a histogram similar to the one shown below. Scroll right to view the center of the histogram.

Step 7
Now you need to change the Number of Records from a number to a percentage. On the Rows shelf, right-click SUM(Number of Records) and select Add Table Calculation.

Step 8
In the Table Calculation dialog box, in the Calculation Type list, select Running Total.

Step 9
In the Running along list, select Profit (bin).

Step 10
Select the check box for Perform a secondary calculation on the result.

Step 11
In the Secondary Type list, select Percent of Total.

Step 12
In the Summarize the values from list, select Profit (bin).

Step 13
When finished, click OK.

Step 14
On the Marks card, in the list, select Line.

Add the original values in the view.

After building a basic cumulative distribution, you can optionally add a second line or bar indicating the original values. Follow the steps below to add the original values to the view.

Step 1
Scroll right until the right edge of the view is visible, then drag the original measure, Number of Records, to the right side of the view to create a secondary axis.

Step 2
On the Marks card, in the list, select either Bar or Line.

Step 3
If you select bar marks, from the Dimensions pane, drag Measure Names to the Size shelf on the Marks card to further differentiate the two sets of bars.

Step 4
To help associate each axis with its respective bar marks, you can shade the axis to match the corresponding colors. To shade an axis, right-click it and select Format. In the Format window, on the Axis tab, under Default, select a color in the Shading list. Repeat this formatting for the other axis.

Step 5
(Optional) To edit the colors in the view, click the menu arrow on the color legend title bar to open the Edit Colors dialog box. To edit the size of the bars in the view, click the menu arrow on the size legend title bar to open the Edit Sizes dialog box.

The final cumulative distribution view is shown below:

1 comment: