require(mosaic)

#### Examples from Activity #20

##### 1-9) Suppose a company is willing to try flextime for one year to see if it reduces absenteeism. Based on historical data, this company knows its employees have averaged 6.3 days absent. The company chose 100 employees at random to try flextime for one year. During that year, those 100 employees averaged 5.5 days absent with a standard deviation of 2.9007 days.

In the activity, we first sketched the sampling distribution we’d get from repeatedly taking samples of size n=100 and calculating the mean of each sample. Under our null hypothesis (that mu=6.3), we know the mean of our sample means would be 6.3 and the standard error would be 0.29:
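That 0.29 standard error is just the sample standard deviation divided by the square root of the sample size:

```r
# Standard error of the mean: s / sqrt(n)
se <- 2.9007 / sqrt(100)
se
## [1] 0.29007
```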

## Plot the sampling distribution
# Generate a t-distribution with a mean of 6.3, a std. error of 0.29007, and df=99
tdist <- 0.29007*rt(100000, 99) + 6.3
# Plot the distribution
densityplot(~tdist, plot.points = FALSE, lwd=3, col="steelblue")
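As an alternative to simulating 100,000 draws, we could plot the exact density of this shifted and scaled t-distribution with base R's curve() (my sketch, not part of the activity; dividing by 0.29007 rescales the density after the change of variable):

```r
# Exact density of a t(99) distribution centered at 6.3 with SE = 0.29007
curve(dt((x - 6.3)/0.29007, df = 99)/0.29007, from = 5.2, to = 7.4,
      lwd = 3, col = "steelblue", xlab = "sample mean", ylab = "density")
```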

It looks like the likelihood of observing a value of 5.5 or less would be somewhat small. Let’s calculate it (our p-value):

# Convert our observed mean of 5.5 to a t-score
tObserved <- (5.5 - 6.3) / 0.29007
# Find the probability of observing a t as or more extreme
pt(tObserved, df=99, lower.tail = TRUE, log.p = FALSE)
## [1] 0.003464
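As a preview of the two-tailed case, doubling this one-tailed area gives:

```r
# Two-tailed p-value: double the one-tailed area
tObserved <- (5.5 - 6.3) / 0.29007
2 * pt(tObserved, df=99)
## [1] 0.006928
```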

Our p-value is 0.003464. If we had a two-tailed alternate hypothesis, we would double that p-value (to approximately 0.007). We could have calculated this p-value directly using the t.test() command. First, let’s load our data:

daysAbsent <- c(0,0,0,0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,10,10,10,11)
favstats(~daysAbsent)
##  min Q1 median Q3 max mean    sd   n missing
##    0  4      6  8  11  5.5 2.901 100       0
histogram(~daysAbsent, col="grey", width=1)

The normality assumption is a bit of a stretch, but our sample size of n=100 may be enough for the CLT to kick in. We’ll worry about the normality assumption a bit later. For now, let’s run a one-sample t-test.

The t.test() command compares our observed mean against a null hypothesis of mu=0. Our null hypothesis is mu=6.3. To use this command, we can subtract 6.3 from each of our observations.

daysAbsent2 <- daysAbsent-6.3
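(A side note, in case you see it elsewhere: t.test() also accepts a mu argument, so the subtraction isn't strictly necessary. This call gives the same results as testing the shifted data against 0.)

```r
# Equivalent one-sample t-test specifying the null mean directly (same data as above)
daysAbsent <- c(0,0,0,0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,10,10,10,11)
t.test(daysAbsent, mu = 6.3, alternative = "less")
```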

Now we can use the t.test() command:

# One-sample t-test; alternate hypothesis is less than
t.test( ~ daysAbsent2, alternative="less", conf.level=0.95)
##
##  One Sample t-test
##
## data:  data$daysAbsent2
## t = -2.758, df = 99, p-value = 0.003464
## alternative hypothesis: true mean is less than 0
## 95 percent confidence interval:
##      -Inf -0.3184
## sample estimates:
## mean of x
##      -0.8

Once again, we see the p-value of 0.003464. If we wanted to run a two-tailed t-test, we’d simply delete the alternative="less" option:

# One-sample t-test; alternate hypothesis is not equal to
t.test( ~ daysAbsent2, conf.level=0.95)
##
##  One Sample t-test
##
## data:  data$daysAbsent2
## t = -2.758, df = 99, p-value = 0.006928
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -1.3756 -0.2244
## sample estimates:
## mean of x
##      -0.8

The p-value has doubled, as expected, and we’re given a 95% confidence interval for the population mean days absent under flextime.
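Remember that this interval describes the shifted data (daysAbsent - 6.3). Adding 6.3 back to each endpoint gives the interval for the mean days absent itself:

```r
# Shift the interval endpoints back to the original scale
c(-1.3756, -0.2244) + 6.3
## [1] 4.9244 6.0756
```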

With R, we can easily estimate the power of this t-test. To do so, we need to make an assumption about the actual difference between our observed and hypothesized means. Since we observed a mean 0.8 lower than what we hypothesized (5.5-6.3 = 0.8), let’s assume that’s the actual difference in means:

power.t.test(n=100, delta=0.8, sd=2.9, sig.level = 0.05, alternative="one.sided", type="one.sample")
##
##      One-sample t test power calculation
##
##               n = 100
##           delta = 0.8
##              sd = 2.9
##       sig.level = 0.05
##           power = 0.8632
##     alternative = one.sided

Our power is estimated to be 0.8632. That’s not too bad. What would happen to power if we increased our sample size (and kept everything else constant)? Let’s see…

power.t.test(n=150, delta=0.8, sd=2.9, sig.level = 0.05, alternative="one.sided", type="one.sample")$power
## [1] 0.9571

As the sample size increased, so did power. Let’s put the sample size back to n=100 and decrease the hypothesized difference in means (delta). Suppose the true difference was only 0.5 (instead of 0.8). How would that impact power?

power.t.test(n=100, delta=0.5, sd=2.9, sig.level = 0.05, alternative="one.sided", type="one.sample")$power
## [1] 0.5269
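We can sanity-check these power values by simulation (my addition, not part of the activity): repeatedly draw samples of n=100 from a normal distribution whose mean really is 0.5 below the null, run the one-sided test on each sample, and see how often we reject:

```r
# Monte Carlo check of power for delta=0.5, sd=2.9, n=100, alpha=0.05
set.seed(123)
rejects <- replicate(5000, {
  x <- rnorm(100, mean = -0.5, sd = 2.9)  # true mean is 0.5 below the null of 0
  t.test(x, alternative = "less")$p.value < 0.05
})
mean(rejects)  # should land near the 0.5269 reported by power.t.test()
```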

As the true difference in means decreased, so did power. Finally, what would happen if we increased our alpha level?

power.t.test(n=100, delta=0.8, sd=2.9, sig.level = 0.10, alternative="one.sided", type="one.sample")$power
## [1] 0.9286

As alpha increased, power increased.

#### Another example

##### Let’s look at the Miami Heat free-throw example from Assignment #20b.

##### The Miami Heat averaged 27.902 free throws per game in 2010-11. The rest of the league averaged around 25. Let’s load our dataset and take a look:

freeThrows <- c(25,25,23,17,18,28,30,39,24,31,27,18,20,20,28,20,28,14,27,12,25,34,29,29,23,34,34,38,21,37,26,23,19,22,31,29,24,47,33,26,18,39,24,29,38,19,22,37,30,30,29,35,41,37,32,11,24,32,47,42,18,32,28,27,28,21,41,21,32,16,36,45,26,35,36,21,25,30,19,22,29,26)
favstats(~freeThrows)
##  min Q1 median    Q3 max mean    sd  n missing
##   11 22     28 32.75  47 27.9 7.881 82       0
histogram(~freeThrows, col="grey", width=4)

We’ll use the t.test() command to test our null hypothesis that mu=25. Our alternate hypothesis will be one-sided (that mu > 25).

# Convert data from our null hypothesis (25) to zero
freeThrows2 <- freeThrows-25
# One-sample t-test; alternate hypothesis is greater than
t.test( ~ freeThrows2, alternative="greater", conf.level=0.95)
##
##  One Sample t-test
##
## data:  data$freeThrows2
## t = 3.335, df = 81, p-value = 0.0006447
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
##  1.454   Inf
## sample estimates:
## mean of x
##     2.902

Our p-value of 0.0006447 gives us strong evidence that our null hypothesis is not true (but did we really even think there was a chance that it was true?). Let’s investigate the power of this test, assuming the true difference in means was 2.5:

power.t.test(n=82, delta=2.5, sd=7.881, sig.level = 0.05, alternative="one.sided", type="one.sample")
##
##      One-sample t test power calculation
##
##               n = 82
##           delta = 2.5
##              sd = 7.881
##       sig.level = 0.05
##           power = 0.8856
##     alternative = one.sided

Let’s construct a 90% confidence interval for the mean. That will match our one-sided t-test (with 0.05 in the one tail):

# Convert data into data.frame
miamiHeat <- data.frame(freeThrows)
# Construct parametric confidence interval
confint(t.test(~freeThrows, data=miamiHeat, conf.level=0.90))
## mean of x     lower     upper     level
##     27.90     26.45     29.35      0.90

We get interval bounds of 26.454 and 29.351.

What interval would we get with the bootstrap method (discussed in the syntax for Activity #18)? Let’s find out:

# Calculate 10,000 sample means from bootstrap samples:
meanFT <- do(10000) * mean(~freeThrows, data = resample(miamiHeat))
## Loading required package: parallel
# Examine the distribution
densityplot(~result, data=meanFT, plot.points = FALSE, col="darkblue", lwd=4)
plotDist("norm", mean(meanFT$result), sd(meanFT$result), col="red", add=TRUE)

That dark blue distribution is the distribution of our bootstrap means. The red distribution is a normal curve. They’re close enough that we probably could have made our normality assumption.

Let’s get the confidence interval from our bootstrap samples:

confint(meanFT, level = 0.9, method = "quantile")
##     name lower upper level   method
## 1 result 26.49 29.34   0.9 quantile
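The quantile method simply takes the 5th and 95th percentiles of the bootstrap means. Here's the same idea in base R, using sample() with replacement in place of mosaic's resample() (a sketch; the exact endpoints will vary slightly from run to run):

```r
# Percentile bootstrap CI for the mean, using only base R
freeThrows <- c(25,25,23,17,18,28,30,39,24,31,27,18,20,20,28,20,28,14,27,12,25,34,29,29,23,34,34,38,21,37,26,23,19,22,31,29,24,47,33,26,18,39,24,29,38,19,22,37,30,30,29,35,41,37,32,11,24,32,47,42,18,32,28,27,28,21,41,21,32,16,36,45,26,35,36,21,25,30,19,22,29,26)
set.seed(123)
bootMeans <- replicate(10000, mean(sample(freeThrows, replace = TRUE)))
quantile(bootMeans, probs = c(0.05, 0.95))  # close to (26.5, 29.3)
```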

**Bootstrap Confidence Interval:** (26.49, 29.34), with a width of 2.85 and centered at 27.92

**Confidence Interval with Normality Assumption:** (26.45, 29.35), with a width of 2.90 and centered at 27.90
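Those widths and centers are just arithmetic on the interval endpoints:

```r
# Width and center of each 90% interval
diff(c(26.49, 29.34))  # bootstrap width: 2.85
mean(c(26.49, 29.34))  # bootstrap center: 27.915
diff(c(26.45, 29.35))  # parametric width: 2.90
mean(c(26.45, 29.35))  # parametric center: 27.90
```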

#### Since I didn’t print the output from the last example in the last activity, I’ll do an example here. Here’s the output from a Bayesian estimation of our true population mean (free throws attempted by the Miami Heat):

# Load the Bayesian package
require(BEST)
## Loading required package: BEST
## Loaded modules: basemod,bugs
# Run an analysis - it took my laptop a little more than 2 minutes
BESTout1 <- BESTmcmc(freeThrows)
## Setting up the JAGS model...
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
##    Graph Size: 97
##
## Initializing model
##
## Burning in the MCMC chain...
## Sampling final MCMC chain...
# Get the output
BESTout1
## MCMC fit results for BEST analysis:
## 100002 simulations saved.
##         mean      sd median  HDIlo   HDIup Rhat n.eff
## mu    27.816  0.8960 27.818 26.066  29.576    1 59794
## nu    40.168 31.1665 31.513  3.334 102.131    1 19543
## sigma  7.736  0.6856  7.702  6.434   9.117    1 46530
##
## 'HDIlo' and 'HDIup' are the limits of a 95% HDI credible interval.
## 'Rhat' is the potential scale reduction factor (at convergence, Rhat=1).
## 'n.eff' is a crude measure of effective sample size.
plot(BESTout1)

From this output, we can say we’re 95% sure the population mean is between 26.066 and 29.576. Likewise, we’re 95% sure the population standard deviation is between 6.434 and 9.117. I’ll explain this method if you take STAT 301 or 305.