R Session

1. Advanced Tableau Visualization
1.1. Clustering in Tableau
- From the previous lecture, use the scatter plot sheet of All Opioid Rx vs. All Opioid Deaths (varied by County name)
- Remove the trend line
- Analytics\Cluster: drag Cluster onto the plot and set the number of clusters to 4
- Manually select the 3 points (Horry, Greenville, Charleston)
- Right-click and choose Create Set
- Enter the name: “Cty HighOpioids withDeaths/Rx”
- The new set will appear in the Data tab
- Drag the newly created set to Filters
- Rename the sheet to Cluster



1.2. Trend for both data sets
       
- Drag Data Year to Columns
- Drag All Opioid Deaths and All Opioid Rx to Rows
- Since the two y-axes are on different scales, we need to bring them to a common scale: click the down arrow on each Rows field and select “Quick Table Calculation”, then “Percent Difference”
- Notice that the unit is now the same, and that there is a reduction in the number of deaths for the year 2016 in both data sets
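
To make the table calculation concrete, here is a minimal R sketch of a percent difference from the previous year. The helper name `pct_diff` and the yearly counts are made up for illustration; they are not taken from the workshop data.

```r
# Percent difference from the previous value: 100 * (x[t] - x[t-1]) / x[t-1].
# The first element has no predecessor, so it is NA.
pct_diff <- function(x) {
  x <- as.numeric(x)
  c(NA, 100 * diff(x) / head(x, -1))
}

# Hypothetical yearly counts (illustrative only)
deaths <- c(100, 110, 99)
rx     <- c(2000, 2100, 1995)

# Both series are now in the same unit (percent), whatever their raw scale
data.frame(deaths_pct = pct_diff(deaths), rx_pct = pct_diff(rx))
```

Because both columns are expressed in percent, they can share one axis even though the raw counts differ by an order of magnitude.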


1.3. Segmentation

- Open a new sheet and name it Segmentation
- Drag All Opioid Deaths and All Opioid Rx to Columns
- Drag County name to Rows
- Drag County name to Color
- Sort by All Opioid Deaths


1.4. Baseline

- Download Global_T_SST_MSL.csv and open it
- Data\New Data Source to open the new data set
- Drag Date to Columns, click the down arrow next to Date and change it from Discrete to Continuous
- Analytics: drag Reference Band to the sheet, then choose Table
- Data tab: drag Median to Rows and Date to Detail (click the down arrow and change it to Month)
- Drag Median to Color
- Change the color palette to Sunrise-Sunset Diverging
- Change the sheet name to “GlobalTemperature”


1.5. Forecast


- Drag Date to Columns, click the down arrow next to Date and change it from Discrete to Continuous
- Drag MSL to Rows
- Analytics: drag Forecast to the sheet
- Change the sheet name to Forecast
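
Tableau's Forecast is based on exponential smoothing. As a rough illustration of that idea (not a reproduction of the model Tableau selects), the sketch below fits a level-plus-trend exponential smoothing model with base R's `HoltWinters`. The series is synthetic and its start date is an assumption; it only stands in for the MSL column.

```r
set.seed(1)

# Synthetic monthly series standing in for MSL (values and start date are made up)
msl <- ts(cumsum(rnorm(120, mean = 0.3, sd = 1)),
          start = c(1993, 1), frequency = 12)

# Exponential smoothing with level and trend, no seasonal component
fit <- HoltWinters(msl, gamma = FALSE)

# Forecast the next 12 months
fc <- predict(fit, n.ahead = 12)
print(fc)
```

Replacing the synthetic `msl` with the real column read from Global_T_SST_MSL.csv would give a forecast to compare against the one Tableau draws.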



1.6. Heatmap


- Open a new sheet
- Drag Date to Columns (Year)
- Drag Date to Rows (change to Month)
- Drag Median to Color
- Change the color scheme
- Change the view to “Fit Width”
- Set the color bar range to run from -1 to +1
- Change the sheet name to Heatmap of global temperature

    

1.7. Plot in 2 axes


- Open a new sheet
- Drag Date to Columns
- Drag Median to Rows
- Drag MSL to the right axis
- Drag Date to Detail and change it to Month
- Notice the change
- Change the sheet name to Global Temp and MSL




2. Configure R to operate in Tableau

Download R:
- For Windows: download here: https://cran.r-project.org/bin/windows/base/
- For macOS: download here: https://cran.r-project.org/bin/macosx/


Install Rserve package:
- Open R (the Tools menu below is the one in RStudio)
- Tools\Install Packages: type in Rserve

Load Rserve package in R:
    > library(Rserve)
    > Rserve()

Open Tableau:
- Help\Settings and Performance\Manage External Service Connection
- External Service: Rserve
- Server: localhost
- Port: 6311
- Click Test Connection
- If the connection is successful, press OK
- You are ready to run R in Tableau
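
If the connection test fails, it can help to check from R that the package is installed before starting the server. The helper `start_rserve` below is a hypothetical convenience wrapper, not part of Rserve; it only calls `requireNamespace` and `Rserve::Rserve`, and the `--no-save` argument is an optional choice.

```r
# Hypothetical helper: start Rserve on the port Tableau expects (6311 by default).
# Returns FALSE (invisibly) if the package is missing, TRUE after starting the server.
start_rserve <- function(port = 6311) {
  if (!requireNamespace("Rserve", quietly = TRUE)) {
    message("Rserve is not installed; run install.packages(\"Rserve\") first.")
    return(invisible(FALSE))
  }
  # --no-save: do not save the workspace when the server session ends
  Rserve::Rserve(port = port, args = "--no-save")
  invisible(TRUE)
}

# Usage (run once per R session, then test the connection in Tableau):
# start_rserve()
```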

3. Data mining in Tableau using R
3.1. Load input data and working with Script()
- Download the `mtcars.csv` from the link below
- Open Tableau and load the text file `mtcars.csv`
- This is a sample data set with specifications for different cars: car type, fuel consumption, make, weight, etc. In our data mining example, we will model the miles-per-gallon variable (`mpg`) from different input fields
- Now let's create a simple script.
- Go to Sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the Field: "SimpleR"
- Select SCRIPT_REAL from the right window and start typing the input
        SCRIPT_REAL('output <- .arg1 + .arg2+.arg3',AVG([Mpg]),AVG([Cyl]),AVG([Disp]))
- Press OK to go back to the sheet
- Drag SimpleR to Rows
- Drag Mpg to Columns
- Drag Car to Color
- Observe the change
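
To see what the script does outside Tableau, the sketch below mimics the call in plain R. It assumes Tableau passes each aggregated field as a numeric vector (`.arg1`, `.arg2`, `.arg3`), one element per mark, and uses R's built-in `mtcars` in place of `mtcars.csv`.

```r
data(mtcars)

# Stand-ins for what Tableau would send: one value per car
.arg1 <- mtcars$mpg    # AVG([Mpg])
.arg2 <- mtcars$cyl    # AVG([Cyl])
.arg3 <- mtcars$disp   # AVG([Disp])

# The body of the SCRIPT_REAL call: an element-wise sum
output <- .arg1 + .arg2 + .arg3

# One result per input row, which is what SCRIPT_REAL hands back to Tableau
head(data.frame(Car = rownames(mtcars), SimpleR = output))
```

The result has the same length as the inputs, which is the shape Tableau expects back from a script.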


3.2. Linear Modeling with R and Tableau
We will create a linear model in R and visualize it in Tableau.
Open RStudio
- Tools\Install Packages, type in caret to install
- Type the following into an R script:

        library(caret)
        data(mtcars)
        set.seed(123)
        indT <- createDataPartition(y=mtcars$mpg,p=0.6,list=FALSE)
        training <- mtcars[indT,]
        modLM <- train(mpg ~ cyl + wt + hp, data = training,method="lm")
        save(modLM,file="c:/CLEMSON/Workshop/LMmodel.rda") 

- LMmodel.rda has now been saved to your local computer. Next we will load this model into Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the Field: "LinearMod"
- Select SCRIPT_REAL from the right window and start typing the input
    
        SCRIPT_REAL('
        mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3,hp=.arg4)
        load("c:/CLEMSON/Workshop/LMmodel.rda")
        output <- predict(modLM, newdata = mydata)
        ',
        AVG([Mpg]),
        AVG([Cyl]),
        AVG([Wt]),
        AVG([Hp]))

- Press OK to go back to the sheet
- Drag LinearMod to Rows
- Drag Mpg to Columns
- Drag TrainTest to Color & Shape
- Go to the Analytics tab and drag Trend Line onto the plot (drop it on Linear)
- Observe the difference in correlation between the training and testing sets
- Save the sheet with name: "Linear Modeling"
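
The train/test comparison can also be checked directly in R. The sketch below is a simplified stand-in: it uses base R `lm` and a plain random 60% split from `sample` instead of caret's `train` and `createDataPartition`, so the rows in each group will not match the workshop model. The `TrainTest` column built here is an assumption about what that field marks; how the workshop file defines it is not shown above.

```r
data(mtcars)
set.seed(123)

n <- nrow(mtcars)
# Simple random 60% split (not caret's stratified partition)
train_idx <- sample(n, size = round(0.6 * n))
mtcars$TrainTest <- ifelse(seq_len(n) %in% train_idx, "Train", "Test")

# Fit on the training rows only, then predict for every car
fit <- lm(mpg ~ cyl + wt + hp, data = mtcars[train_idx, ])
mtcars$pred <- predict(fit, newdata = mtcars)

# Correlation between observed and predicted mpg, separately per group
cors <- sapply(split(mtcars, mtcars$TrainTest),
               function(d) cor(d$mpg, d$pred))
print(cors)
```

Each correlation corresponds to one of the two trend lines drawn in the Tableau sheet.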


3.3. Random Forest with R and Tableau
We will create a Random Forest model in R and visualize it in Tableau.
Open RStudio
- Type the following into an R script:

        library(caret)
        data(mtcars)
        set.seed(123)
        indT <- createDataPartition(y=mtcars$mpg,p=0.6,list=FALSE)
        training <- mtcars[indT,]
        modRF <- train(mpg ~ cyl + wt + hp, data = training,method="rf")
        save(modRF,file="c:/CLEMSON/Workshop/RFmodel.rda")

- RFmodel.rda has now been saved to your local computer. Next we will load this model into Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the Field: "RandomForest"
- Select SCRIPT_REAL from the right window and start typing the input
    
        SCRIPT_REAL('
        mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3,hp=.arg4)
        load("c:/CLEMSON/Workshop/RFmodel.rda")
        output <- predict(modRF, newdata = mydata)
        ',
        AVG([Mpg]),
        AVG([Cyl]),
        AVG([Wt]),
        AVG([Hp]))

- Press OK to go back to the sheet
- Drag RandomForest to Rows
- Drag Mpg to Columns
- Drag TrainTest to Color & Shape
- Go to the Analytics tab and drag Trend Line onto the plot (drop it on Linear)
- Observe the difference in correlation between the training and testing sets
- Save the sheet with name: "Random Forest"


3.4. Principal Component Analysis (PCA) with R and Tableau
We will create a PCA model in R and visualize it in Tableau.
Open RStudio
- Type the following into an R script:

        data(mtcars)
        # Drop vs and am (columns 8 and 9): PCA works best with continuous numeric data
        datain <- mtcars[,c(1:7,10:11)]
        # scale. is the full argument name in prcomp
        mtcars.pca <- prcomp(datain,center=TRUE,scale.=TRUE)
        save(mtcars.pca,file="c:/CLEMSON/Workshop/PCAmodel.rda")

- PCAmodel.rda has now been saved to your local computer. Next we will load this model into Tableau

Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the Field: "PCA1"
- Select SCRIPT_REAL from the right window and start typing the input
    
        SCRIPT_REAL('
        load("c:/CLEMSON/Workshop/PCAmodel.rda")
        PCA1 <- mtcars.pca$x[,1]
        ',ATTR([Car]))

- Similarly, create a new field named "PCA2":

        SCRIPT_REAL('
        load("c:/CLEMSON/Workshop/PCAmodel.rda")
        PCA2 <- mtcars.pca$x[,2]
        ' ,ATTR([Car]))

- Press OK to go back to the sheet
- Drag PCA1 to Columns
- Drag PCA2 to Rows
- Drag Wt to Color
- Drag Car to Tooltip, click on Tooltip, change to Label
- Change the color palette
- Observe how the cars spread along the first two principal components
- Save the sheet with name: "PCA"
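
Before plotting, it is worth checking in R how much of the variance the first two components capture. The sketch below repeats the `prcomp` call on the same columns and builds the score table that the two calculated fields return. The names `pca`, `var_explained` and `scores` are illustrative. Note that the Tableau fields return the saved scores in `mtcars` row order, so they assume Tableau lists the cars in that same order.

```r
data(mtcars)

# Same columns as above: everything except vs and am
datain <- mtcars[, c(1:7, 10:11)]
pca <- prcomp(datain, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained[1:2], 3))

# Scores for the first two components, one row per car
scores <- data.frame(Car  = rownames(mtcars),
                     PCA1 = pca$x[, 1],
                     PCA2 = pca$x[, 2])
head(scores)
```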


3.5. Kmeans clustering with R and Tableau
Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the Field: "kmeans"
- Select SCRIPT_REAL from the right window and start typing the input
    
        SCRIPT_REAL(' 
        mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3,hp=.arg4)
        set.seed(123)
        km <- kmeans(mydata,3) #Split into 3 clusters
        km$cluster
        ',
        AVG([Mpg]),
        AVG([Cyl]),
        AVG([Wt]),
        AVG([Hp]))

- Press OK to go back to the sheet
- Drag Wt to Columns
- Drag Mpg to Rows
- Drag kmeans to Color & Shape
- Drag Car to Tooltip, click on Tooltip, change to Label
- Change the color palette
- Observe how the cars are grouped into the 3 clusters
- Save the sheet with name: "k-means"
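
The same clustering can be run in plain R to inspect the result. The sketch below uses the built-in `mtcars` and, as an optional variation not used in the Tableau script above, also clusters the standardized columns: `hp` has a much larger numeric range than `wt`, so it dominates the distances when the data are left unscaled.

```r
data(mtcars)
mydata <- mtcars[, c("mpg", "cyl", "wt", "hp")]

# As in the Tableau script: k-means on the raw columns
set.seed(123)
km_raw <- kmeans(mydata, centers = 3)

# Variation: standardize first so every column contributes comparably
set.seed(123)
km_scaled <- kmeans(scale(mydata), centers = 3, nstart = 10)

# Cross-tabulate the two assignments to see where they differ
print(table(raw = km_raw$cluster, scaled = km_scaled$cluster))
```

Cluster labels are arbitrary, so compare which cars are grouped together rather than the label numbers.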


3.6. Fuzzy C-means clustering with R and Tableau
Open Tableau
- Open a new sheet
- Analysis\Create Calculated Field...
- A new window appears. Enter the name for the Field: "Fuzzy C-Means"
- Select SCRIPT_REAL from the right window and start typing the input
    
        SCRIPT_REAL('
        library(ppclust) # You will need to install this package first using R
        mydata <- data.frame(mpg=.arg1, cyl=.arg2, wt=.arg3,hp=.arg4)
        set.seed(123)
        res.fcm <- fcm(mydata, centers=3)
        res.fcm$cluster
        ',
        AVG([Mpg]),
        AVG([Cyl]),
        AVG([Wt]),
        AVG([Hp]))

- Press OK to go back to the sheet
- Drag Wt to Columns
- Drag Mpg to Rows
- Drag Fuzzy C-Means to Color & Shape
- Drag Car to Tooltip, click on Tooltip, change to Label
- Change the color palette
- Observe how the cars are grouped into the 3 clusters and compare with the k-means sheet
- Save the sheet with name: "Fuzzy C-means"
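
Unlike k-means, fuzzy c-means gives every car a degree of membership in each cluster. The sketch below is a minimal base-R version of the standard update rules, written for illustration only; it is not the `ppclust` implementation, and the function name `fcm_sketch` and its defaults (fuzzifier `m = 2`, random initial memberships, standardized input) are assumptions.

```r
# Minimal fuzzy c-means sketch (illustrative, not ppclust::fcm)
fcm_sketch <- function(x, k = 3, m = 2, iters = 100, tol = 1e-6) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Random initial memberships; each row sums to 1
  u <- matrix(runif(n * k), n, k)
  u <- u / rowSums(u)
  for (it in seq_len(iters)) {
    w <- u^m
    # Centers: membership-weighted means (k x p matrix)
    centers <- t(w) %*% x / colSums(w)
    # Squared Euclidean distance from every point to every center (n x k)
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    d2 <- pmax(d2, 1e-12)  # guard against division by zero
    # Membership update: u_ij proportional to d_ij^(-2 / (m - 1))
    inv <- d2^(-1 / (m - 1))
    u_new <- inv / rowSums(inv)
    converged <- max(abs(u_new - u)) < tol
    u <- u_new
    if (converged) break
  }
  # Hard label = cluster with the highest membership
  list(membership = u, centers = centers,
       cluster = max.col(u, ties.method = "first"))
}

data(mtcars)
set.seed(123)
res <- fcm_sketch(scale(mtcars[, c("mpg", "cyl", "wt", "hp")]), k = 3)
head(round(res$membership, 2))
```

The hard labels in `res$cluster` play the same role as `res.fcm$cluster` in the Tableau script, while the membership matrix shows how strongly each car belongs to each cluster.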




==================================================
You can find the Tableau file for R attached (six attachments, uploaded Oct 25, 2019 by Tue Vu).