# Your turn

19. The data.frame `midwest` contains demographic information about 437 counties in the Midwest. The list of variables is displayed below.

**(a)** Calculate Pearson's r to measure the correlation between `percollege` (the percent of people in the county with a college education) and `percbelowpoverty` (the percent of people below the poverty line).

**(b)** Calculate Spearman's rho for the same two variables.

**(c)** Calculate Kendall's tau for those two variables.

**(d)** Calculate a distance correlation between those variables.

**(e)** Conduct a randomization-based test of Pearson's r and state your conclusions.

**(f)** Construct a bootstrap confidence interval for Pearson's r.

**(g)** Conduct a t-test for Pearson's r.

```{r}
# Load data
data(midwest)

# Display variable names and types
str(midwest)

# Display scatterplot
ggplot(data = midwest, aes(x = percollege, y = percbelowpoverty)) +
  geom_point(alpha = .5, color = "steelblue", fill = "lightblue", shape = 20, size = 5) +
  theme_grey()
```

**(a)** Calculate Pearson's r to measure the correlation between `percollege` (the percent of people in the county with a college education) and `percbelowpoverty` (the percent of people below the poverty line).

```{r}
# Calculate Pearson's r.
# The variables are percollege and percbelowpoverty

```

**(b)** Calculate Spearman's rho for the same two variables.

```{r}
# Spearman's rho

```

**(c)** Calculate Kendall's tau for those two variables.

```{r}
# Kendall's tau

```

**(d)** Calculate a distance correlation between those variables.

```{r}
# Distance correlation

# Load energy package. If you do not have it installed,
# use the command:
# install.packages("energy")
library(energy)

# Calculate the distance correlation
# You'll have to specify the variables as:
# midwest$percollege and midwest$percbelowpoverty

```
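For intuition about what a distance correlation measures, here is a minimal base-R sketch of the sample statistic on simulated data. The helper names `double_center` and `dcor_sketch` are hypothetical and written only for illustration. This is not the `energy` package's implementation, although it is meant to follow the same general definition based on double-centered pairwise distance matrices.

```r
# Hypothetical helper: pairwise distance matrix with row means and
# column means removed and the grand mean added back (double centering)
double_center <- function(v) {
  d <- as.matrix(dist(v))
  sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
}

# Hypothetical helper: sample distance correlation of two numeric vectors
dcor_sketch <- function(x, y) {
  A <- double_center(x)
  B <- double_center(y)
  dcov2 <- max(mean(A * B), 0)  # squared distance covariance
  dvar_x <- mean(A * A)         # squared distance variance of x
  dvar_y <- mean(B * B)         # squared distance variance of y
  sqrt(dcov2 / sqrt(dvar_x * dvar_y))
}

# Simulated example (not the midwest data): a symmetric, nonlinear relationship
x <- seq(-1, 1, length.out = 101)
y <- x^2

pearson_r <- cor(x, y)          # linear association
dist_corr <- dcor_sketch(x, y)  # any kind of dependence
```

In this simulated example Pearson's r is essentially zero because the relationship is symmetric rather than linear, while the distance correlation is clearly positive.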

**(e)** Conduct a randomization-based test of Pearson's r and state your conclusions.

```{r}
# All the code has been provided for you.

# Store our test statistic
test.corr <- cor(percbelowpoverty ~ percollege, data = midwest)

# Randomize our data and calculate correlations
randomized_correlations <- do(10000) * cor(shuffle(percbelowpoverty) ~ percollege, data = midwest)

# Plot distribution of randomized correlations
histogram(~cor, data = randomized_correlations,
          xlab = "Possible correlations assuming null hypothesis is true",
          groups = cor <= test.corr) # Highlight values <= test statistic

# Estimate p-value
prop(~cor <= test.corr, data = randomized_correlations)
```

**TYPE YOUR CONCLUSION HERE**

**(f)** Construct a bootstrap confidence interval for Pearson's r.

```{r}
# All code has been provided for you.

# Generate 10000 bootstrap samples and calculate the test statistic for each
boot.correlations <- do(10000) * cor(percollege ~ percbelowpoverty, data = resample(midwest))

# Create a plot of the bootstrap estimates of our test statistic
densityplot(~cor, data = boot.correlations, plot.points = FALSE, col = "steelblue", lwd = 4)

# Calculate confidence interval
# (Depending on your mosaic version, the formula form may be needed instead:
#  cdata(~cor, 0.95, data = boot.correlations))
cdata(0.95, cor, data = boot.correlations)
```

**(g)** Conduct a t-test for Pearson's r.

```{r}
# t-test for correlation
# Replace XXXXX with the appropriate alternative hypothesis
cor.test(midwest$percollege, midwest$percbelowpoverty,
         alternative = c("XXXXX"), method = c("pearson"))
```
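To see what the t-test for Pearson's r computes, the sketch below reproduces the test statistic by hand, using t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. It uses R's built-in `mtcars` data and a two-sided alternative purely as an illustration; both choices are assumptions made for this example and are not part of the exercise.

```r
# Illustration on built-in data (mtcars), not the midwest data
r <- cor(mtcars$mpg, mtcars$wt)
n <- nrow(mtcars)

# Test statistic and two-sided p-value computed by hand
t_by_hand <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_by_hand <- 2 * pt(-abs(t_by_hand), df = n - 2)

# The same test via cor.test()
result <- cor.test(mtcars$mpg, mtcars$wt,
                   alternative = "two.sided", method = "pearson")

# Compare the two approaches
c(by_hand = t_by_hand, cor_test = unname(result$statistic))
c(by_hand = p_by_hand, cor_test = result$p.value)
```

The two approaches should agree, which shows that `cor.test()` with `method = "pearson"` is a t-test on n - 2 degrees of freedom.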

19. Calculate Kendall's tau for the NFL malevolence data we've used in this lesson. Then, conduct a randomization-based test for Kendall's tau. State your conclusion.

```{r}
# Use syntax similar to what was provided in the previous question.
# The variables are: malevolence and penalty
# The data.frame is NFL

```

**TYPE CONCLUSION HERE**
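As a rough illustration of the logic behind a randomization-based test for Kendall's tau, here is a minimal base-R sketch on simulated data. The variables `x` and `y`, the sample size, the number of shuffles, and the two-sided p-value are all assumptions chosen for this example. It does not use the NFL data or the `mosaic` functions from the previous question.

```r
set.seed(1)

# Simulated data (an assumption for illustration only)
n <- 30
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

# Observed test statistic
obs_tau <- cor(x, y, method = "kendall")

# Shuffle y to break any association with x, then recompute tau each time
perm_tau <- replicate(2000, cor(x, sample(y), method = "kendall"))

# Two-sided p-value: proportion of shuffled taus at least as extreme as observed
p_value <- mean(abs(perm_tau) >= abs(obs_tau))

obs_tau
p_value
```

Shuffling one variable simulates the null hypothesis of no association, so the p-value estimates how often a tau this extreme would arise by chance alone.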