1. Advanced Tableau Visualization

1.1. Clustering in Tableau
- From the previous lecture, open the scatter plot sheet: All Opioid Rx vs. All Opioid Deaths (varied by County name)
- Remove the trend line
- Analytics: drag Cluster to the plot and select 4 clusters
- Manually select the 3 points (Horry, Greenville, Charleston)
- Right-click and choose Create Set
- Enter the name: "Cty High Opioids with Deaths/Rx"
- The new set will appear in the Data tab
- Drag the newly created set to Filters
- Rename the sheet to Cluster

1.2. Trend for both data sets
- Drag Data Year to Columns
- Drag All Opioid Deaths and All Opioid Rx to Rows
- Because the two y-axes are on different scales, put them on a common scale: click the down arrow on each field in Rows and select "Quick Table Calculation\Percent Difference"
- Notice that both measures now share the same unit (percent) and that both data sets show a reduction for the year 2016

1.3. Segmentation
- Open a new sheet and name it Segmentation
- Drag All Opioid Deaths and All Opioid Rx to Columns
- Drag County name to Rows
- Drag County name to Color
- Sort by All Opioid Deaths

1.4. Baseline
- Download Global_T_SST_MSL.csv and open it
- Data\New Data Source to open a new data set
- Drag Dates to Columns, click the down arrow next to Date and change from Discrete to Continuous
- Analytics: drag Reference Band to the sheet, then choose Table
- Data tab: drag Median to Rows and Date to Detail (click the down arrow and change to Month)
- Drag Median to Color
- Change the color palette to Sunrise-Sunset Diverging
- Rename the sheet to "Global Temperature"

1.5. Forecast
- Drag Dates to Columns, click the down arrow next to Date and change from Discrete to Continuous
- Drag MSL to Rows
- Analytics: drag Forecast to the sheet
- Rename the sheet to Forecast

1.6. Heatmap

1.7. Plot in 2 axes

2. Configure R to operate in Tableau

Download R:
- For Windows: https://cran.r-project.org/bin/windows/base/
- For macOS: https://cran.r-project.org/bin/macosx/

Download RStudio: https://rstudio.com/products/rstudio/download/

Install the Rserve package:
- Open RStudio
- Tools\Install Packages: type in Rserve

Load the Rserve package in R and start the server:

    library(Rserve)
    Rserve()

Open Tableau:
- Help\Settings and Performance\Manage External Service Connection
- External Service: Rserve
- Server: localhost
- Port: 6311
- Click Test Connection
- If the connection is successful, press OK
- You are ready to run R in Tableau

3. Data mining in Tableau using R

3.1. Load input data and work with Script()
- Download `mtcars.csv` from the link below
- Open Tableau and load the text file `mtcars.csv`
- This is a sample data set with specifications for different cars: type, fuel consumption, make, weight, etc. In our data mining example, we will model the miles-per-gallon variable (mpg) from different input variables
- Now let's create a simple script
- Go to the sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the field: "SimpleR"
- Select SCRIPT_REAL from the right window and type the input:

    SCRIPT_REAL('output <- .arg1 + .arg2 + .arg3', AVG([Mpg]), AVG([Cyl]), AVG([Disp]))

- Press OK to go back to the sheet
- Drag SimpleR to Rows
- Drag Mpg to Columns
- Drag Car to Color
- Observe the change

3.2. Linear Modeling with R and Tableau

We will create the linear model in R and visualize it in Tableau.

Open RStudio
- Tools\Install Packages, type in caret to install
- Type the following into an R script:

    library(caret)
    data(mtcars)
    set.seed(123)
    # Use 60% of the rows for training
    indT <- createDataPartition(y=mtcars$mpg, p=0.6, list=FALSE)
    training <- mtcars[indT,]
    modLM <- train(mpg ~ cyl + wt + hp, data = training, method="lm")
    save(modLM, file="c:/CLEMSON/Workshop/LMmodel.rda")

- LMmodel.rda is now saved on your local computer; next we will load this model into Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the field: "LinearMod"
- Select SCRIPT_REAL from the right window and type the input:

    SCRIPT_REAL('
    mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3, hp=.arg4)
    load("c:/CLEMSON/Workshop/LMmodel.rda")
    output <- predict(modLM, newdata = mydata)
    ', AVG([Mpg]), AVG([Cyl]), AVG([Wt]), AVG([Hp]))

- Press OK to go back to the sheet
- Drag LinearMod to Rows
- Drag Mpg to Columns
- Drag TrainTest to Color & Shape
- Go to the Analytics tab and drag Trend Line to the plot (drop it on Linear)
- Observe the change in correlation for the training and testing sets
- Save the sheet with the name: "Linear Modeling"

3.3. Random Forest with R and Tableau

We will create the Random Forest model in R and visualize it in Tableau.

Open RStudio
- Type the following into an R script:

    library(caret)
    data(mtcars)
    set.seed(123)
    # Use 60% of the rows for training
    indT <- createDataPartition(y=mtcars$mpg, p=0.6, list=FALSE)
    training <- mtcars[indT,]
    modRF <- train(mpg ~ cyl + wt + hp, data = training, method="rf")
    save(modRF, file="c:/CLEMSON/Workshop/RFmodel.rda")

- RFmodel.rda is now saved on your local computer; next we will load this model into Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the field: "RandomForest"
- Select SCRIPT_REAL from the right window and type the input:

    SCRIPT_REAL('
    mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3, hp=.arg4)
    load("c:/CLEMSON/Workshop/RFmodel.rda")
    output <- predict(modRF, newdata = mydata)
    ', AVG([Mpg]), AVG([Cyl]), AVG([Wt]), AVG([Hp]))

- Press OK to go back to the sheet
- Drag RandomForest to Rows
- Drag Mpg to Columns
- Drag TrainTest to Color & Shape
- Go to the Analytics tab and drag Trend Line to the plot (drop it on Linear)
- Observe the change in correlation for the training and testing sets
- Save the sheet with the name: "Random Forest"

3.4. Principal Component Analysis (PCA) with R and Tableau

We will create the PCA model in R and visualize it in Tableau.

Open RStudio
- Type the following into an R script:

    data(mtcars)
    # Ignore vs & am (PCA works well with numeric data)
    datain <- mtcars[,c(1:7,10:11)]
    mtcars.pca <- prcomp(datain, center=TRUE, scale.=TRUE)
    save(mtcars.pca, file="c:/CLEMSON/Workshop/PCAmodel.rda")

- PCAmodel.rda is now saved on your local computer; next we will load this model into Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the field: "PCA1"
- Select SCRIPT_REAL from the right window and type the input:

    SCRIPT_REAL('
    load("c:/CLEMSON/Workshop/PCAmodel.rda")
    PCA1 <- mtcars.pca$x[,1]
    ', ATTR([Car]))

- Similarly, create a new field named "PCA2":

    SCRIPT_REAL('
    load("c:/CLEMSON/Workshop/PCAmodel.rda")
    PCA2 <- mtcars.pca$x[,2]
    ', ATTR([Car]))

- Press OK to go back to the sheet
- Drag PCA1 to Columns
- Drag PCA2 to Rows
- Drag Wt to Color
- Drag Car to Tooltip, click on Tooltip, change to Label
- Change the color bar
- Observe how the cars spread along the first two principal components
- Save the sheet with the name: "PCA"

3.5. Kmeans clustering with R and Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the field: "kmeans"
- Select SCRIPT_REAL from the right window and type the input:

    SCRIPT_REAL('
    mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3, hp=.arg4)
    set.seed(123)
    km <- kmeans(mydata,3) #Split into 3 clusters
    km$cluster
    ', AVG([Mpg]), AVG([Cyl]), AVG([Wt]), AVG([Hp]))

- Press OK to go back to the sheet
- Drag Wt to Columns
- Drag Mpg to Rows
- Drag kmeans to Color & Shape
- Drag Car to Tooltip, click on Tooltip, change to Label
- Change the color bar
- Observe how the cars are grouped into the 3 clusters
- Save the sheet with the name: "k-means"

3.6. Fuzzy C-means clustering with R and Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the field: "Fuzzy C-Means"
- Select SCRIPT_REAL from the right window and type the input:

    SCRIPT_REAL('
    library(ppclust) # You will need to install this package first using R
    mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3, hp=.arg4)
    set.seed(123)
    res.fcm <- fcm(mydata, centers=3)
    res.fcm$cluster
    ', AVG([Mpg]), AVG([Cyl]), AVG([Wt]), AVG([Hp]))

- Press OK to go back to the sheet
- Drag Wt to Columns
- Drag Mpg to Rows
- Drag Fuzzy C-Means to Color & Shape
- Drag Car to Tooltip, click on Tooltip, change to Label
- Change the color bar
- Observe how the cars are grouped into the 3 clusters
- Save the sheet with the name: "Fuzzy C-means"

==================================================

You can find the Tableau file for R attached.
File | Size | Revision | Time | User
---|---|---|---|---
Download | 119k | v. 1 | Oct 25, 2019, 7:51 AM | Tue Vu
Download | 72k | v. 1 | Oct 25, 2019, 8:38 AM | Tue Vu
Download | 27k | v. 1 | Oct 25, 2019, 7:32 AM | Tue Vu
Download | 2k | v. 1 | Oct 25, 2019, 5:27 AM | Tue Vu
Download | 4k | v. 1 | Oct 25, 2019, 7:32 AM | Tue Vu
Download | 60k | v. 1 | Oct 25, 2019, 7:32 AM | Tue Vu
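
When a SCRIPT_REAL field from sections 3.1 to 3.6 fails in Tableau, it can be easier to debug the R body in RStudio first. The sketch below does this for the "kmeans" field from section 3.5, using base R only. The `.arg1` to `.arg4` vectors are stand-ins defined by hand here, on the assumption that the view has one mark per car (so each AVG returns that car's own value); `result` is a name introduced only for this illustration.

```r
# Stand-ins for the vectors Tableau passes to the script: one value per mark.
# Assumption: one mark per car, so AVG([Mpg]) is simply that car's mpg.
data(mtcars)
.arg1 <- mtcars$mpg
.arg2 <- mtcars$cyl
.arg3 <- mtcars$wt
.arg4 <- mtcars$hp

# Body of the "kmeans" calculated field from section 3.5, unchanged.
mydata <- data.frame(mpg = .arg1, cyl = .arg2, wt = .arg3, hp = .arg4)
set.seed(123)
km <- kmeans(mydata, 3)   # split into 3 clusters
result <- km$cluster      # hypothetical name, used only for the checks below

# Tableau needs one result per mark (or a single value), so check the length.
stopifnot(length(result) == length(.arg1))

# Cluster sizes, for a quick look before switching back to Tableau.
table(result)
```

The same pattern should apply to the other fields: define the `.arg` vectors from `mtcars`, paste the script body, and confirm that the last expression returns one numeric value per row.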