duelingdata.blogspot


Friday, 1 September 2017

How To Gauge Chart in Tableau (UPDATED)

Posted on September 01, 2017 by kabir
A while back I wrote a post on how to create a gauge chart in Tableau. At the time I felt bad about writing it because I thought it was a bad chart. I have since come around on the gauge; see my previous post on bullets vs gauges for mobile dashboards. That post created some convo on the Twitters about the merits of the gauge, but I still think it's an unfairly maligned chart type.
So I think a gauge should be simple to create in Tableau. But the old approach I outlined required too much data re-shaping. So at last year's Tableau Conference I presented a new way of creating a gauge that requires no data re-shaping, just a bit of math. Outlined below are the updated steps for creating a gauge in Tableau.

For this example I used Superstore Sales, the best damn dummy data set in town. This gauge will show sales by Product Sub-Category relative to one another. So we start with [Sales] by Sub-Category.



Then we have to normalize sales to put it on a 0 to 1 scale. So I create a new calculated field called [Sales%] using the following formula:

This converts the scale from dollars to values normalized from 0 to 1.
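
As a rough illustration of this normalization step, here is a minimal Python sketch of min-max scaling, one common way to put values on a 0 to 1 scale. It is an assumption that [Sales%] uses exactly this form, and the sub-category names and sales figures are made up for the example.

```python
def min_max_normalize(values):
    """Rescale a {name: value} dict so the smallest value maps to 0 and the largest to 1."""
    lo = min(values.values())
    hi = max(values.values())
    if hi == lo:
        # All values are equal: avoid dividing by zero and put everything at 0.
        return {name: 0.0 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


# Hypothetical sales by sub-category (illustrative numbers only).
sales = {"Paper": 100.0, "Tables": 300.0, "Phones": 500.0}
sales_pct = min_max_normalize(sales)
print(sales_pct)  # {'Paper': 0.0, 'Tables': 0.5, 'Phones': 1.0}
```

The lowest-selling sub-category lands at 0 and the highest at 1, which is what the gauge needs in the next step.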



We will use this new field [Sales%] to create the X and Y coordinates that point the gauge needle to a spot on the circumference of the gauge. For this gauge I am only using 180 degrees, a semi-circle from 9 o'clock to 3 o'clock on a watch face. Gauge ranges vary; to adjust the angle you simply need to change the number 180 in the calculated field below. To create the X and Y calculated fields I use a parametric equation with the [Sales%] field above. For now I am calling these fields [X All] and [Y All].



[Sales%] x 180 essentially creates an angle such that the highest point is at 180 degrees and the lowest at 0 degrees. Now we have the following fields.
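
To make the parametric idea concrete, here is a minimal Python sketch that turns a normalized value into a point on the arc. The center of (1, 1) comes from the origin point used later in this post; the radius of 1 and the orientation (0 at 9 o'clock, sweeping clockwise to 3 o'clock) are assumptions, as is the function name.

```python
import math


def gauge_point(pct, sweep_degrees=180.0, center=(1.0, 1.0), radius=1.0):
    """Return (x, y) on the gauge arc for a normalized value pct in [0, 1].

    Assumption: pct = 0 sits at 9 o'clock and the needle sweeps clockwise,
    so pct = 1 lands at 3 o'clock when sweep_degrees is 180.
    """
    angle = math.radians(pct * sweep_degrees)
    x = center[0] - radius * math.cos(angle)
    y = center[1] + radius * math.sin(angle)
    return x, y


for pct in (0.0, 0.5, 1.0):
    x, y = gauge_point(pct)
    print(pct, round(x, 6), round(y, 6))
```

Changing `sweep_degrees` lengthens or shortens the arc, which mirrors adjusting the 180 in the calculated field; in this sketch the arc always starts at 9 o'clock.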


When you place [X All] on the columns shelf and [Y All] on the rows shelf, then set the calculated fields to compute using Sub-Category, we get something that looks like this. Beautiful, I know. All Sub-Categories are positioned from left to right, lowest to highest sales, along the circumference.


But we want a single point for each Product Sub-Category. To do this we will need a parameter to identify the specific point we want to show. So we create a parameter called Sub-Category with the same values as Product Sub-Category. The fastest way to do this is to right-click on the dimension Sub-Category and select Create > Parameter.

Then we need to tie our X and Y coordinates to this parameter to only show a single point. So we modify the [X] and [Y] measures slightly. In addition to only showing the coordinates of the selected Sub-Category, we also need a fixed origin point, in this case (1,1). This places all points not tied to the parameter at the origin point, which we will use to create the line of the gauge.
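
The selection logic can be sketched in a few lines of Python. This is only an illustration of the idea, not the Tableau calculation itself: the function name and the sample coordinates are hypothetical, and only the (1, 1) origin is taken from the post.

```python
ORIGIN = (1.0, 1.0)  # fixed origin point named in the post


def needle_points(coords, selected):
    """Keep the arc coordinate for the selected name; send every other name to ORIGIN."""
    return {name: (xy if name == selected else ORIGIN) for name, xy in coords.items()}


# Hypothetical arc coordinates per sub-category (illustrative values only).
coords = {"Paper": (0.0, 1.0), "Tables": (1.0, 2.0), "Phones": (2.0, 1.0)}
points = needle_points(coords, selected="Tables")
print(points)  # {'Paper': (1.0, 1.0), 'Tables': (1.0, 2.0), 'Phones': (1.0, 1.0)}
```

Because every unselected point collapses onto the origin, drawing a line through all the marks produces a single segment from the origin to the selected point, which is the needle.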


Now we have the measures we need. We place the new [X] and [Y] measures on the columns and rows shelves respectively. Then set the calculated fields to compute using Sub-Category and set the mark type to Line. The new point on the circumference is tied to the parameter and the rest of the points are placed at the origin point (1,1).



We are almost there. Add a background image with shaded reference points, setting the X field to X and the Y field to Y. The left is set at 0.5 and the right at 2.5, the bottom at 1 and the top at 2.5. I created this one in PowerPoint using pie charts. This unfortunately makes the context points for the chart static.



Finally, format the Tableau elements as you see fit. I thickened the bar, changed the color, and added a white border. I then added some labels and legends.



I hope this is helpful. Please let me know if you have any questions in the comments below.

Sunday, 27 August 2017

Most Popular Beatles Songs

Posted on August 27, 2017 by kabir
This data visualization shows the popularity of Beatles songs using the Spotify API. I then assigned each song to a songwriter using Wikipedia data from an old Beatles project.


This is a new type of visualization I am calling a Marimekko Slope Plot. It combines several chart types and communicates a lot of information. The bar width shows the sample size (number of songs) by a discrete category (songwriter). The dots show a distribution of points (songs) with the ability to compare across discrete categories (songwriter). Finally, it shows trends over time through the area chart slope (song popularity trend). I will post later about other applications of this chart and how it was created in Tableau.

Sunday, 6 August 2017

Triangle Maps in Tableau

Posted on August 06, 2017 by kabir
This triangle map was inspired by a UK election map in the Financial Times created by Billy Ehrenberg. This map below shows voting results in Wisconsin by county comparing 2012 to 2016 and where Clinton lost compared to Obama. If you are interested in a how-to post let me know in the comments below. Data source: Wikipedia.



Wednesday, 2 August 2017

Spiral of Unemployment

Posted on August 02, 2017 by kabir
This is a spiral heatmap of US unemployment rates from 1976 to 2017 by month. This viz was inspired by a similar viz by Tom Shanley created in D3. This version is created in Tableau.



Sunday, 16 July 2017

Game of Thrones Season 6 Recap

Posted on July 16, 2017 by kabir

Friday, 7 July 2017

Bullet vs Gauge

Posted on July 07, 2017 by kabir
The gauge chart is an oft-maligned chart type. It has a poor data-to-ink ratio and it's difficult to interpret accurately. But I think it has some merit. Firstly, it is a very familiar chart type that most people can interpret consistently. After all, it has been used on car dashboards since the Model T.

Gauges also have value on a mobile dashboard displaying a single metric. The gauge uses more vertical real estate and it is still difficult to interpret accurately. But I have found it is less confusing than a bullet chart in this instance. Bullet charts are hard to read because the origin point is confusing without multiple reference bars to compare. Furthermore, bullet charts are generally more confusing than gauges because they are new to most people, whereas most people have seen a gauge and can consistently interpret the bands and reference points. Also, bullet charts require more horizontal real estate, which is more valuable on a mobile device than a desktop since the aspect ratios are flipped.

Here is a side-by-side comparison of both chart types in a mobile dashboard in Tableau. See my review of the strengths and weaknesses below. Please let me know your thoughts below.

In my opinion both options have their strengths and weaknesses.

Bullet Chart Dashboard
   + Bullet chart is more accurate because you are comparing distance not angle
   + Requires less vertical real estate because the gauge chart is taller
   – Bullet charts are unfamiliar to most people and the bands can be confusing
   – A single metric looks odd and it's not obvious you read the bar left-to-right
   – Requires more horizontal real estate because bullet is longer

Gauge Chart Dashboard
   + Gauge chart is more common meaning consistent interpretation and fewer questions
   + When only displaying one metric the visual weight of the gauge focuses user's attention on that metric
   – Gauge chart is less accurate because you are comparing angle not distance
   – Gauge axes are not labeled so inaccurate interpretation is more likely
   – Requires more vertical real estate because bullet chart is shorter

This example is inspired by a real-life dashboard my team and I built at Deloitte. The audience was executives, and each dashboard only showed one metric relative to target with metric performance over time. We used a bullet chart and the audience was confused and frustrated. I think the gauge approach is more familiar, albeit not best practice.

Friday, 23 June 2017

Trump Lies

Posted on June 23, 2017 by kabir
The New York Times recently cataloged all of President Trump's lies since he took office. Here is a breakdown of the topics of those lies over time.

Thursday, 1 June 2017

200 Songs of Springsteen

Posted on June 01, 2017 by kabir
Bruce Springsteen is awesome. His music is personally very important to me. So I wanted to make a viz to celebrate his amazing career. This analysis looks at patterns in his music by song including song loudness, valence, lyric sentiment, energy, acoustic levels, and popularity. You can also find similar songs to the ones you like at the bottom. Hope you enjoy.


Thursday, 27 April 2017

Predicting QB Success in the NFL

Posted on April 27, 2017 by kabir
Last year I wrote and submitted a paper for the MIT Sloan Sports Analytics Conference. While my abstract was accepted my paper was not. The title of my paper was Reducing Risk in the NFL Draft: Using Machine Learning Algorithms to Predict Success in the NFL. You can read the full paper here. 

In it I describe a decision tree model that predicts a college QB's success in the NFL. To train the model I used over 40 variables including college stats, school competitiveness, combine performance, and text mining of pro scouting reports. Ultimately, the final model used 4 variables: college win %, body mass index (BMI), college games started per season, and age. The final model was 88% accurate in predicting whether a college player would be a success or a bust in the NFL. This model can be used to predict whether the top prospects in this year's draft will be successful in the NFL.

Below is an interactive version of that final QB model.


Saturday, 1 April 2017

NFL Combine & Triangle/Ternary Plot

Posted on April 01, 2017 by kabir
Triangle/Ternary plots are a good way of displaying the relative positioning of points across three variables. This can be used to cluster and classify points based on these three variables. After the break I have outlined a way of creating an interactive version of a triangle/ternary plot in Tableau.

Here is an example of a triangle/ternary plot on the performance of collegiate players at the 2017 NFL combine across three variables: size (BMI), speed (40 time) and strength (bench press).


How to make it using R & Tableau
Creating this using R and Tableau is relatively easy. First, you need a file with just three variables scaled appropriately. For example, in the above chart, a high 40 time is not good or representative of high speed. So I normalized all of the variables using min-max normalization:
You can do this in R, Alteryx, Excel or whatever. I then inverted the value for 40 times:
So I have a table of 3 normalized variables (40 time, BMI, and bench press) by player. I also add 3 anchor points to determine the corners of my triangle plot. These are three extra rows, each with a value of 1 for one variable and 0 for the other two.
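
The preparation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the player numbers are made up, and inverting the 40 time as 1 minus the normalized value is one plausible reading of "inverted", not necessarily the exact formula used.

```python
def min_max(values):
    """Min-max normalize a list of numbers to the 0 to 1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical combine results for three players (illustrative numbers only).
forty_time = [4.4, 4.6, 5.0]  # seconds, lower is faster
bmi = [26.0, 30.0, 34.0]
bench = [10, 20, 30]          # bench press reps

# Invert the 40 time so that fast times score high (assumed form: 1 - normalized).
speed = [1.0 - v for v in min_max(forty_time)]
size = min_max(bmi)
strength = min_max(bench)

rows = list(zip(speed, size, strength))
# Anchor rows: 1 for one variable and 0 for the other two, one per corner.
anchors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rows_with_anchors = rows + anchors
print(rows_with_anchors)
```

The anchor rows sit exactly on the three corners of the triangle, which is what makes it possible to line up a background image later.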

I then save this as a csv file. I import these variables into R in order to create the triangle/ternary plot. Here is the R script below:

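# Install ade4 once; skip this line if it is already installed
install.packages("ade4")
library(ade4)
# Read the three normalized variables plus the anchor rows
tri_data <- read.csv('C:\\Users\\name\\Documents\\2017_Combine4.csv')
# triangle.plot() draws the plot and returns the plotted X/Y coordinates
x <- triangle.plot(tri_data, label = as.character(1:nrow(tri_data)))
# Save the coordinates so they can be joined back to the player data
write.csv(x, 'C:\\Users\\name\\Documents\\2017_Combine_Tri_w_anchors.csv')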


This will create X and Y coordinates associated with each point (player). I append this to my player data set and bring this into Tableau. To create the triangle plot in Tableau I place the X coordinates in the column shelf and the Y coordinates in the row shelf and convert the measures to continuous dimensions.


Finally, I add a triangle as a background image, fix the X and Y axes, and then remove anchor points (in the corners). I also do some formatting removing the gridlines, adding color, and increasing the transparency.



How to make it using just Tableau
The talented Mike Cisneros (@mikevizneros) recommended an approach that skips the R step and allows you to create this entirely in Tableau. Here is a link to Mike's version on Tableau Public. It only requires two simple calcs and is much more clever than mine.

Essentially, Mike imported the same data, normalized it in Tableau, and then created the X and Y coordinates in Tableau rather than R. In the calcs below, Benchn, Bmin and 40n are the normalized measure values for Strength/Bench, Size/BMI, and Speed/40 Time respectively. Here are Mike's calcs in Tableau:




See details on these calculations here on Wikipedia. You again place the X and Y values into the column and row shelves respectively and repeat the other steps above.
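
For reference, here is a minimal Python sketch of the standard ternary-plot coordinate conversion as described on Wikipedia. It is not a reproduction of Mike's Tableau calcs; the function name is hypothetical, and which of the three variables goes to which corner is an assumption.

```python
import math


def ternary_xy(a, b, c):
    """Map three non-negative values to 2D ternary-plot coordinates.

    Corners: all-a at (0, 0), all-b at (1, 0), all-c at (0.5, sqrt(3)/2).
    """
    total = a + b + c
    if total == 0:
        raise ValueError("at least one of a, b, c must be positive")
    x = 0.5 * (2 * b + c) / total
    y = (math.sqrt(3) / 2) * c / total
    return x, y


print(ternary_xy(1, 0, 0))  # corner for a: (0.0, 0.0)
print(ternary_xy(0, 1, 0))  # corner for b: (1.0, 0.0)
print(ternary_xy(1, 1, 1))  # equal mix lands in the middle of the triangle
```

Dividing by the total means the three inputs do not need to sum to 1 beforehand, so the normalized measures can be passed in directly.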

You can download my Tableau Public file above. It has both versions (R and Tableau) included. I hope this is helpful. If you have questions please leave them in the comments box below. Thank you.



Sunday, 12 February 2017

Spark Bar Chart

Posted on February 12, 2017 by kabir
A spark bar chart, at least that is what I am calling it for now, combines a sparkline and a bar chart into one chart. The length of the bar is a value corresponding to the end of the sparkline which represents the last period or current value. In the example below the bar represents sales in December 2015 and the sparkline is sales by month for the last 4 years.


The height of the end point on the sparkline visualizes current value relative to height of the bar which represents the range of values across previous months. See an annotated version below.



Strengths
  • Has a high data-to-ink ratio
  • Provides historical context or trend relative to a current point
  • The primary visual weight is still a bar chart which is easy to interpret
  • It is fun to say: spark bar chart, spark bar chart... do it, it's fun
Weaknesses
  • If the bar is too short a sparkline would be impossible to see
  • It is visually dense and difficult to understand for some audiences


Here are steps for creating this chart in Tableau. For this example I used the Superstore Sales data set. First, add your dimensions and sparkline measure to the row shelf. 


Drag a combined field to the Details shelf if you are using multiple dimensions such as Category and Region above. I called my combined dimension CR. Also drag Order Date to the Details shelf. You will need these for a table calculation later. 

Then, create your x-axis measure, which I called Sales X. This is the sum of sales for the last period (Current Sales) plus 2 times the date value (Order Date). The multiplier of 2 is not required; the right value depends on the measure you use. This creates the sparkline points starting at the current value. See the calculation for Current Sales further down.


Drag Sales X to columns shelf and then set the Table Calculation values. Select Specified Dimensions: your combined dimension (CR) and the sparkline date value (Order Date). Select at the Deepest level and Restarting at the combined dimension (CR).


Make sure the mark type is set to Line. Edit the y-axis (Sales) to use independent axis ranges for each row or column. Then edit the x-axis (Sales X) to not start at zero. You should have something that looks like this.


Next drag Current Sales to the left of Sales X in the column shelf. Current Sales is just the sales value for the most current period. See calculation below.
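
As a rough illustration of what Current Sales represents, here is a minimal Python sketch that picks the value for the most recent period. It is not the Tableau calculation; the function name and the monthly figures are hypothetical.

```python
def current_value(series):
    """Return the value for the most recent period in a {period: value} mapping."""
    latest_period = max(series)  # ISO "YYYY-MM" strings sort chronologically
    return series[latest_period]


# Hypothetical monthly sales for one Category/Region combination.
monthly_sales = {"2015-10": 120.0, "2015-11": 90.0, "2015-12": 150.0}
current_sales = current_value(monthly_sales)
print(current_sales)  # 150.0
```

In the chart, this single value per row is what sets the length of the bar.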


Set the mark type to Circle. Format the circle so it matches the line color (white) with a border. Add a reference line to the Current Sales axis. Set the scope to Per Cell and set the value to Current Sales with Average. Set both Label and Line to None. Then add a Fill Below color. This creates the bar in the spark bar chart.


Finally, set the two axes to dual axis and you should have something that looks like this.


Let me know if you have any questions in the comments below. Thanks.

Monday, 9 January 2017

Art & Political Entrenchment

Posted on January 09, 2017 by kabir
I recently visited the Phillips Collection gallery here in DC and saw the work of one of my favorite artists: Camille Pissarro. In one of his paintings, The Seine Valley at Les Damps, he uses an impasto technique in the clouds with bold, hatched brushstrokes. I wanted to try to re-create this hatch effect in a viz.

This viz shows how every state voted in each presidential election since 1964. Each mark is a state, where the angle is the degree to which it voted Democrat (left) or Republican (right). The sharper the angle, the more heavily it voted for one party. The thickness of the mark is how many people voted and the color is which party won the state. I didn't quite achieve the effect I wanted but am happy with the result nonetheless. See findings below.

As you can see, party shifts were much more common in the past. In 1964, 45 states voted for Johnson (Democrat) and in 1972, 49 states voted for Nixon (Republican). However, since 2000 party shifts have become increasingly rare. In the past five elections only six states have voted for each party more than once: Colorado, Florida, Iowa, Nevada, Ohio, and Virginia. Seemingly, political division and entrenchment are up.

Tuesday, 6 December 2016

Fan Gauge

Posted on December 06, 2016 by kabir
The second biggest disaster on election night might have been the New York Times' jittery gauge for their live presidential forecast. Some people took issue with the random jitter effect used to display uncertainty, even calling it "irresponsible". Gregor Aisch, who works at the NYT and co-created the viz, explained their rationale. I appreciate their desire to explain uncertainty and generally love the work of the NYT Graphics Department, but agree the randomness effect was confusing.

Displaying uncertainty is tricky but sometimes very important to a data visualization. Below is my proposed alternative: a fan gauge. This approach is like a typical gauge in that it displays a single point relative to a range of points or targets. The addition is the uncertainty displayed by the "fan" around the single point. The angle of the fan is the degree of uncertainty. The fan is like the jitter effect but static and easier to interpret. 
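
The geometry of the fan can be sketched in Python. This is only an illustration under stated assumptions: the estimate and its margin are on a 0 to 1 scale, the gauge sweeps 180 degrees, and the function name is hypothetical.

```python
def fan_angles(estimate, margin, sweep_degrees=180.0):
    """Return (low, center, high) needle angles in degrees for a fan gauge.

    estimate and margin are on a 0 to 1 scale; the fan spans
    estimate +/- margin, clipped to the ends of the gauge.
    """
    low = max(0.0, estimate - margin)
    high = min(1.0, estimate + margin)
    return low * sweep_degrees, estimate * sweep_degrees, high * sweep_degrees


low, center, high = fan_angles(estimate=0.5, margin=0.25)
print(low, center, high)                    # 45.0 90.0 135.0
print("fan width in degrees:", high - low)  # fan width in degrees: 90.0
```

A wider margin gives a wider fan, so the visual spread of the fan reads directly as the amount of uncertainty around the needle.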




Update: Here is another version in a bullet, non-gauge format. This has a better data-to-ink ratio but still displays a point relative to a range of targets as well as the degree of uncertainty. The point is the circle and the length of the "pill" is the degree of uncertainty. Let me know what you think.
