Scenario: Arachnophobia

Are you scared of spiders? [drawing of a spider]

If you have arachnophobia, you might even be scared of that picture! Suppose I’m interested in seeing whether arachnophobia is specific to real spiders or whether photos of spiders can evoke similar anxiety levels.

To study this, I find 24 arachnophobe volunteers. I randomly split them into two groups and send them into separate rooms. In the center of one room is a real spider in a cage; the other room only has a photo of that same spider. After 5 minutes, I measure the anxiety of each subject in this study (using some errorless measure of anxiety that doesn’t really exist).

Our question of interest is: Do the subjects in the room with the real spider experience more anxiety than subjects in the other room? To answer this question, we’ve decided to compare the mean anxiety level of each group.



Data

Since this study only dealt with 24 people, I’ll enter the data manually:

# Create a vector to contain group membership (photo vs real groups)
group <- c(rep("photo", 12), rep("real", 12))

# Create a vector of the 24 anxiety scores
anxiety <- c(30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50, 
             40, 35, 50, 55, 65, 55, 50, 35, 30, 50, 60, 39)

# Combine the vectors into a data.frame called "spider"
spider <- data.frame(group, anxiety)

# Look at the first several rows of our data
head(spider)
##   group anxiety
## 1 photo      30
## 2 photo      35
## 3 photo      45
## 4 photo      40
## 5 photo      50
## 6 photo      35


In this study, we’re going to model anxiety as a function of the group (real spider vs photo): anxiety ~ group.

Before we begin testing our model, we’ll want to explore our data with summary statistics and visualizations. First, let’s calculate some favorite summary statistics:

# favstats() comes from the mosaic package; load it first with library(mosaic)
favstats(anxiety ~ group, data=spider)
##   group min    Q1 median    Q3 max mean        sd  n missing
## 1 photo  25 33.75     40 46.25  55   40  9.293204 12       0
## 2  real  30 38.00     50 55.00  65   47 11.028888 12       0

Based on this, I notice:   \(\overline{X}_{1}-\overline{X}_{2}=\) 47 – 40 = 7


Now let’s visualize our data:

# Stacked density plots (layout = 1 column x 2 rows)
densityplot(~anxiety | group, data=spider, lwd=4, bw=4, col="steelblue", layout=c(1,2), cex=1)

# Strip plot
stripplot(anxiety ~ group, data=spider, jitter.data = TRUE, 
          ylab = "anxiety", cex=1.3, col="steelblue", aspect=1, pch=20)

# Boxplot
bwplot(anxiety ~ group, data=spider, col="steelblue", xlab="Group", ylab="scores",
       par.settings=list(box.rectangle=list(lwd=3), box.umbrella=list(lwd=3)))


  1. Describe the distribution of anxiety levels for each group. Ultimately, do you think we’re going to conclude the mean anxiety of the real spider group is larger than the mean for the photo group?









Model

  2. State the null and alternative hypotheses for this study.







  3. Think about the subjects in this study. What factors could have influenced each subject’s anxiety?







Let’s construct a model that could have generated the data in this study. First, we’ll define:

\(y_{ig}=\) the anxiety of subject i in group g

\(\mu=\) a constant, overall mean anxiety level all humans possess

\(\alpha _{g}=(\mu _{g}-\mu )=\) the change in anxiety associated with being assigned to group g

\(\epsilon_{ig}=(y_{ig}-\mu _{g})=\) errors or deviations in anxiety among individuals in a group (due to other factors not analyzed in this study).


With this notation, we can write our full model: \(y_{ig}=\mu +\alpha _{g}+\epsilon_{ig}\).


  4. Substitute terms and simplify our model to show it’s a tautology.









  5. Write out the simplified null model we’d expect if our null hypothesis were true.







Our goal is to determine which model (the null or full) best fits our data. Additionally, we want to estimate \(\alpha _{g}=(\mu _{g}-\mu )\): the effect of the treatment (real spider vs. photo) on anxiety.



Methods

We’re already familiar with several methods we can use to compare the means of two groups. We can classify these methods into two general frameworks:


Randomization-based framework

   • Randomization-based null hypothesis significance tests

   • Bootstrap confidence intervals


Theory-based framework

   • t-tests (with or without assuming equal variances)

   • Confidence intervals

   • Effect sizes


Regardless of which framework we choose, we will proceed by:

• Hypothesizing the two population means are equal:   \(H_{0}: \mu _{1}-\mu _{2}=0\)

• Calculating a test statistic of interest:   \(\overline{X}_{1}-\overline{X}_{2}=\) 47 – 40 = 7

• Estimating the sampling distribution of our test-statistic under our null hypothesis.



Randomization-based framework

Randomization-based null hypothesis significance testing

Recall that in our study, subjects were randomly assigned to groups. Under our null hypothesis, the real spider and photo evoke identical anxiety levels. In other words, the “real” and “photo” groups are only labels – they do not have any effect on anxiety.

  6. The first subject in this study, assigned to the “photo” group, had an anxiety level of 30. Under our null hypothesis, what would this subject’s anxiety level be if he or she had been randomly assigned to the “real spider” group?





Using the actual data from this study, our test statistic is:   \(\overline{X}_{1}-\overline{X}_{2}=7\)

We want to know the distribution of test statistics we’d get if we could go back in time and repeatedly assign our subjects at random to the two groups. To do this, we’ll:

• Randomly shuffle the groups assigned to the 24 subjects in this study

• Calculate our test statistic, \(\overline{X}_{1}-\overline{X}_{2}\)

• Repeat


This is easy to do in R. First, let’s define our test statistic as test.stat:

# Define the test statistic as a difference in means
test.stat <- diffmean(anxiety ~ group, data=spider)
test.stat
## diffmean 
##        7

Let’s randomize our data by randomly assigning 12 subjects to the “real” group and 12 subjects to the “photo” group. Since we’re hypothesizing that the treatment does not affect anxiety, a subject’s score won’t change if he or she is moved from one group to the other.

Let’s try 3 randomizations of our data to see what test statistics we get:

# Randomize our data 3 times and calculate the difference in means each time
do(3) * diffmean(anxiety ~ shuffle(group), data=spider)
##    diffmean
## 1 -7.000000
## 2  1.333333
## 3 -2.000000

Due to the randomization, we get a different value of the test statistic each time. If we calculate our test statistic for 10,000 randomizations of our data, we’ll have a good sense of the sampling distribution of \(\overline{X}_{1}-\overline{X}_{2}\).

# Randomize our data 10,000 times and calculate the test statistic for each
randomizedMEANdiffs <- do(10000) * diffmean(anxiety ~ shuffle(group), data=spider)

# Histogram of the randomized sampling distribution
histogram(~diffmean, data=randomizedMEANdiffs,
          xlim=c(-20,20), # Set the x-axis range
          width=2.5,      # Set the width of each bar
          xlab="Possible mean differences assuming null hypothesis is true", 
          groups=diffmean >= test.stat)   # Highlight values >= test statistic
ladd(panel.abline(v=test.stat))   # Add vertical line at test statistic


The histogram shows the randomized distribution of our test statistic (under a true null hypothesis). The vertical line shows the value of our observed test statistic.

We can estimate the likelihood of observing a test statistic as or more extreme than our observed test statistic:

p-value = proportion of randomizations resulting in a test statistic ≥ our observed test statistic

# We want the proportion of our 10,000 values in which the mean difference
# is greater than or equal to our observed test.stat
randomizedPVALUE <- prop(~ (diffmean >= test.stat), data=randomizedMEANdiffs)
randomizedPVALUE
##   TRUE 
## 0.0628


  7. Based on the randomized distribution of our test statistic and the p-value, what conclusions could you make regarding the difference in means between the “real” and “photo” groups?









  8. Flashback question: In this example, we randomized our data 10,000 times. If we wanted to list out all possible randomizations, how many would there be? Remember, we’re randomizing our data by randomly assigning 24 subjects to 2 evenly-sized groups.








Bootstrap Confidence Intervals

Instead of calculating the likelihood of our test statistic under the null hypothesis, we may be interested in a range of plausible values for the difference between the population means.

To construct a bootstrap confidence interval, we need to repeatedly resample our data with replacement and calculate our test statistic. In effect, we’re treating our data as the population and taking random samples from it.

Let’s resample our data 3 times and see what values of the test statistic we get:

# Generate 3 bootstrap samples and calculate the test statistic for each
do(3) * diffmean(anxiety ~ group, data=resample(spider))
##    diffmean
## 1  3.728571
## 2 10.168067
## 3 13.279720

Notice we got a different value of our test statistic each time. That’s due to our resampling – each time we sample with replacement, we may choose one subject multiple times and ignore other subjects.

Now let’s generate 10,000 bootstrap samples and calculate the test statistic for each one.

# Generate 10000 bootstrap samples and calculate the test statistic for each
bootMEANdiffs <- do(10000) * diffmean(anxiety ~ group, data=resample(spider))

# Create a plot of the bootstrap estimates of our test statistic
densityplot(~diffmean, data=bootMEANdiffs, plot.points = FALSE, col="steelblue", lwd=4)

  9. Briefly explain what that density plot represents.








To get our bootstrap confidence interval, we simply need to find the values that cut off the top and bottom 2.5% of the bootstrap distribution:

bootstrapCI <- confint(bootMEANdiffs, level = 0.95, method = "quantile")
bootstrapCI
##       name      lower    upper level   method estimate
## 1 diffmean -0.8951049 14.91667  0.95 quantile        7

Our 95% confidence interval is: (-0.895, 14.917).
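
Equivalently, we could pull those two cutoffs directly from the bootstrap distribution ourselves (a quick sketch; since the bootstrap is random, your endpoints will differ slightly from run to run):

# 2.5th and 97.5th percentiles of the 10,000 bootstrap differences
quantile(bootMEANdiffs$diffmean, probs = c(0.025, 0.975))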

  10. How do you interpret this interval? What can you conclude from the interval?











Theory-based framework

t-test

Remember, we’ve defined our test statistic (\(\overline{X}_{1}-\overline{X}_{2}\)) as test.stat. We want to estimate the distribution of test statistics we’d get if we were to repeatedly take samples from our populations (and calculate the test statistic for each sample).

In the previous sections, we estimated this sampling distribution via resampling (randomization or bootstrap samples).

If you look at the randomized and bootstrap distributions we constructed, you’ll notice they have the same unimodal, symmetric shape. Rather than take the time to resample our data, we may be able to use theory-based methods to approximate these unimodal, symmetric sampling distributions. That could speed up our calculations (and let us estimate p-values and confidence intervals without the use of a computer).


Assumptions

Recall the statistical model we constructed: \(y_{ig}=\mu +\alpha _{g}+\epsilon_{ig}\).

Using randomization-based methods, we did not make any assumptions about the distribution of those model components. If we’re willing to make some assumptions, we can use theory-based methods.

Our assumptions can be summarized as: \(y_{ig}\sim\mathrm{N}(\mu _{g},\sigma ^{2})\).

This implies that while each treatment group may have its own mean, all the data are independent and normally distributed with a constant variance.

Another way to write this model would be to add an extra statement to our original model:

\(y_{ig}=\mu +\alpha _{g}+\epsilon_{ig}\), with \(\epsilon_{ig}\sim\mathrm{N(0,\sigma ^{2})}\)

This implies our errors follow a normal distribution centered at zero.
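
To make this model concrete, here’s a minimal sketch that simulates one dataset from it. The parameter values (\(\mu\), the \(\alpha _{g}\) values, and the error standard deviation) are made up purely for illustration:

# Simulate 24 anxiety scores from y = mu + alpha_g + error
# (all parameter values below are hypothetical, chosen just for illustration)
mu    <- 43.5                          # overall mean anxiety
alpha <- c(photo = -3.5, real = 3.5)   # group effects (mu_g - mu)
sim.anxiety <- mu + alpha[group] + rnorm(24, mean = 0, sd = 10)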

  11. What did those errors represent in our study?







Let’s restate our assumptions one more time. In order to use these theory-based methods, we’re going to assume:

• a null hypothesis stating \(H_{0}: \mu _{1}-\mu _{2}=0\)

• anxiety scores within and across our treatment groups are independent

• the sampling distribution of our test-statistic approximates a normal distribution


Let’s see how these assumptions allow us to estimate the sampling distribution of \(\overline{X}_{1}-\overline{X}_{2}\).


• If we assume \(\mu _{1}-\mu _{2}=0\), then: \(E[\overline{X}_{1}-\overline{X}_{2}]=E[\overline{X}_{1}]-E[\overline{X}_{2}]=\mu _{1}-\mu _{2}=0\).
This tells us our sampling distribution will be centered at zero.


• If we assume normality, we know the shape of the sampling distribution.


• Now that we know the center and shape of our sampling distribution, we only need to figure out its standard error.


When you first learned about the Central Limit Theorem, you learned that the sampling distribution of the sample mean has a standard error of \(\sigma _{\overline{x}}=\frac{\sigma _{x}}{\sqrt{n}}\).


In MATH 300, we assumed independence to derive the standard error of our test statistic: \(SE_{(\overline{x}_{1}-\overline{x}_{2})}=\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}\). This is called the Satterthwaite approximation to the standard error. It does not assume the variances of our two groups are equal.


If we’re willing to assume the population variances of our two groups are equal, we can calculate the standard error with a different formula. This formula pools (takes a weighted average of) the two group variances:

\(SE_{pooled}=s_{pooled}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}=\sqrt{\frac{(n_{1}-1)s_{1}^{2}+(n_{2}-1)s_{2}^{2}}{n_{1}+n_{2}-2}}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}\)


Checking assumptions

Before we choose a method (Satterthwaite or SEpooled) and calculate our standard error, we should evaluate the assumptions we’ll need to make.

Regardless of the method we choose, we need to assume our errors will follow a normal distribution. Later on in this course, we’ll learn how to check our residuals. For now, we’ll check for normality by examining plots of the sample data (shown at the beginning of this example), by looking at Q-Q plots, or by conducting a normality test.

  12. Below, you can see Q-Q plots for each group. What do these plots represent? Does normality appear to be a reasonable assumption based on these Q-Q plots?








# Q-Q plot for real spider group
qqnorm(spider$anxiety[group=="real"], ylab = "Anxiety scores", main="Real spider"); qqline(spider$anxiety[group=="real"])

# Q-Q plot for photo group
qqnorm(spider$anxiety[group=="photo"], ylab = "Anxiety scores", main="Photo"); qqline(spider$anxiety[group=="photo"])


We could also conduct a test of normality, like the Shapiro-Wilk test.

# Shapiro-Wilk test
shapiro.test(spider$anxiety[group=="real"])
## 
##  Shapiro-Wilk normality test
## 
## data:  spider$anxiety[group == "real"]
## W = 0.94887, p-value = 0.6206
shapiro.test(spider$anxiety[group=="photo"])
## 
##  Shapiro-Wilk normality test
## 
## data:  spider$anxiety[group == "photo"]
## W = 0.96502, p-value = 0.8523
  13. You can look up the details on the Shapiro-Wilk test online. For now, just know it tests the null hypothesis that the data come from a normal distribution. Based on the p-values, does it appear as though our normality assumption is reasonable?








How can we check the reasonableness of our equal variances assumption? We’ll figure that out in the next lesson. For now, let’s calculate the standard error using both the Satterthwaite approximation and the pooled estimate (assuming equal variances):

Satterthwaite: \(\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}=\sqrt{\frac{11.0289^{2}}{12}+\frac{9.2932^{2}}{12}}=4.163\)

SEpooled: \(s_{pooled}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}=\sqrt{\frac{(12-1)11.0289^{2}+(12-1)9.2932^{2}}{12+12-2}}\sqrt{\frac{1}{12}+\frac{1}{12}}=4.163\)

In this case, both methods gave identical estimates (the two formulas always agree when the group sizes are equal). Here’s the code I used to calculate these standard errors:

# First let's define the terms in the formulas
# Calculate the standard deviation of each group
SD1 <- sd(spider$anxiety[spider$group=="real"])
SD2 <- sd(spider$anxiety[spider$group=="photo"])

# Calculate the sample size of each group
n1 <- length(spider$anxiety[spider$group=="real"])
n2 <- length(spider$anxiety[spider$group=="photo"])

# Now to get Satterthwaite's approximation
Satt <- sqrt( ((SD1^2)/(n1)) + ((SD2^2)/(n2)) )
Satt
## [1] 4.163332
# SEpooled
Spooled <- sqrt( (((n1 - 1) * (SD1^2)) + ((n2 - 1) * (SD2^2))) / (n1 + n2 - 2) )
SEpooled <- sqrt( (1/n1) + (1/n2) ) * Spooled
SEpooled
## [1] 4.163332


  14. Below, I’ve sketched the sampling distribution based on our assumptions. Label the x-axis, mean, and standard error of this distribution. Finally, briefly explain what this distribution represents.




Now that we have a sampling distribution, we can calculate a p-value. Remember, a p-value estimates the answer to the following question:

How likely were we to observe a test statistic of at least 7 if this sampling distribution (based on the null hypothesis) were true?

We can estimate this by seeing where our observed test statistic falls on this distribution.

  15. On the sampling distribution displayed in question #14, locate the observed mean difference of 7. Then, roughly estimate the p-value from this study.






If we want a more accurate estimate of this p-value, we need to understand even more about the sampling distribution. Since you learned this method in a previous statistics class, you know we’re conducting an independent samples t-test and the sampling distribution follows a t-distribution with \(n_{1}+n_{2}-2\) degrees-of-freedom.

To more formally conduct this test, we first convert our observed test statistic into a t-statistic:

\(\mathrm{t_{observed}} = \frac{\overline{X}_{1}-\overline{X}_{2}}{\mathrm{SE_{pooled}}}\).

Since our Satterthwaite and SEpooled methods yielded identical values for the standard error, we can calculate our t-statistic as:

\(\mathrm{t_{observed}} = \frac{7}{4.163}=1.6813\)
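
As a quick arithmetic check in R (reusing the test.stat and SEpooled objects we created earlier):

# Convert our observed difference in means to a t-statistic
t.obs <- as.numeric(test.stat) / SEpooled
t.obs
## [1] 1.681346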


Now that we have an observed t-statistic, we can see where it falls on a t-distribution. Because we have 24 observations (and we’re comparing 2 group means), we have 22 degrees of freedom:

# Let's first plot a t-distribution with df = 22
plotDist("t", df=22, lwd=5, col="steelblue")

# xpt() plots this distribution and returns the lower-tail probability
# P(t <= 1.6813); our p-value is the upper tail: 1 - 0.9466 = 0.0534
xpt(c(1.6813), df=(n1+n2-2))

## [1] 0.9465759
  16. From this, what’s our p-value? What conclusions can we make from this study?








Thankfully, we don’t have to go through all these steps every time we conduct a t-test. Instead, we can use the built-in t.test function. Let’s go ahead and do that to verify the p-value we just estimated:

# t-test with no equal variance assumption
# Welch-Satterthwaite method
# Notice the weird degrees of freedom in the output
t.test(anxiety ~ group, data=spider, alternative = c("less"), var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  anxiety by group
## t = -1.6813, df = 21.385, p-value = 0.05362
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##       -Inf 0.1580825
## sample estimates:
## mean in group photo  mean in group real 
##                  40                  47
# t-test with equal variance assumption
t.test(anxiety ~ group, data=spider, alternative = c("less"), var.equal = TRUE, conf.level = 0.95)
## 
##  Two Sample t-test
## 
## data:  anxiety by group
## t = -1.6813, df = 22, p-value = 0.05342
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##       -Inf 0.1490421
## sample estimates:
## mean in group photo  mean in group real 
##                  40                  47

If you look at the p-values from the output of those commands, you’ll find they replicate what we calculated earlier.


Theory-based confidence intervals

Having already calculated the standard error of our test statistic, constructing a 95% confidence interval is simple:

\(95\%\ \mathrm{ CI}: (\overline{X}_{1}-\overline{X}_{2})\pm t_{n_{1}+n_{2}-2}^{\alpha /2}(\mathrm{SE})\)

We can plug either the Satterthwaite approximation or the pooled standard error into that formula. One caution when reading the output: R computes the difference as photo - real (its alphabetical factor-level order), so the signs below are flipped relative to our test statistic \(\overline{X}_{1}-\overline{X}_{2}=7\):

# Satterthwaite - no equal variance assumption
confint(t.test(anxiety ~ group, data=spider, conf.level = 0.95))
## mean in group photo  mean in group real               lower 
##           40.000000           47.000000          -15.648641 
##               upper               level 
##            1.648641            0.950000
# SEpooled with equal variance assumption
confint(t.test(anxiety ~ group, data=spider, conf.level = 0.95, var.equal=TRUE))
## mean in group photo  mean in group real               lower 
##           40.000000           47.000000          -15.634222 
##               upper               level 
##            1.634222            0.950000
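
We can also verify the equal-variance interval by hand from the formula above, oriented as \(\overline{X}_{1}-\overline{X}_{2}=7\) (reusing SEpooled from earlier):

# By-hand 95% CI for (mean real - mean photo), equal variances assumed
7 + c(-1, 1) * qt(0.975, df = 22) * SEpooled
## [1] -1.634222 15.634222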


Effect size

Instead of conducting a null hypothesis significance test or constructing a confidence interval, we may be interested in estimating the magnitude of the difference between the group means. One such effect size could be estimated with Cohen’s d: \(\mathrm{Effect \ size}=\frac{\overline{X}_{1}-\overline{X}_{2}}{\sigma}\), where \(\sigma\) is a standard deviation representative of one or both groups.

We already know the difference in means is 7. When we calculated SEpooled, we first computed Spooled (the pooled standard deviation) to be 10.198. Our effect size is, then:

\(\mathrm{Effect \ size}=\frac{\overline{X}_{1}-\overline{X}_{2}}{\sigma }=\frac{7}{10.198}=0.686\).
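
In R, we can compute this by reusing the Spooled object from our standard error code:

# Cohen's d: difference in means divided by the pooled standard deviation
7 / Spooled
## [1] 0.6864065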


  17. Interpret this effect size.









Summary

Let’s summarize the p-values and confidence intervals from the randomization- and theory-based methods:

Randomized p-value: 0.063

t-test (equal variances assumed) p-value: 0.053

t-test (equal variances not assumed) p-value: 0.054


Bootstrap 95% confidence interval: (-0.895 , 14.917)

Theory-based 95% confidence interval (equal variances assumed): (-1.634, 15.634)

Theory-based 95% confidence interval (equal variances not assumed): (-1.649, 15.649)


Cohen’s d (effect size): 0.686


The results from the randomization- and theory-based methods were similar in this example. This is because the assumptions we make in the theory-based methods (the normality assumption, especially) were fairly reasonable assumptions to make in this situation.




Dependent samples

Suppose that instead of finding 24 volunteers for the arachnophobia study, I only find 12. With such a small sample size, I decide to have each subject go into both the room with the real spider and the room with the photo. I flip a coin to randomly determine which room each subject enters first.

Further suppose that miraculously, we get the exact same data we had before.

# Create a vector of the 12 anxiety scores from the real spider room
real <- c(40, 35, 50, 55, 65, 55, 50, 35, 30, 50, 60, 39)

# Create a vector of the 12 anxiety scores from the photo room
photo <- c(30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50)

# Combine the vectors into a data.frame called "spideranxiety"
spideranxiety <- data.frame(real, photo)

# Look at the data
spideranxiety
##    real photo
## 1    40    30
## 2    35    35
## 3    50    45
## 4    55    40
## 5    65    50
## 6    55    35
## 7    50    55
## 8    35    25
## 9    30    30
## 10   50    45
## 11   60    40
## 12   39    50

The only difference is that we no longer have 24 independent anxiety scores. Instead, we have a pair of anxiety scores for each of 12 subjects. Higher-anxiety subjects are probably high-anxiety with both the real spider and the photo (and lower-anxiety subjects probably have lower anxiety in both rooms), so our data are now matched (or dependent).

What we’re going to do is calculate the difference in anxiety levels for each subject (real - photo).

# Calculate the difference and store it in a variable called "difference"
spideranxiety$difference <- spideranxiety$real - spideranxiety$photo

# Examine the first several rows of data
head(spideranxiety)
##   real photo difference
## 1   40    30         10
## 2   35    35          0
## 3   50    45          5
## 4   55    40         15
## 5   65    50         15
## 6   55    35         20
  18. We’re going to analyze the difference scores. Write out a null hypothesis for the population mean difference score.







Before we run any test, remember we should explore our data. Here are some descriptive statistics and a density plot:

# You can see the mean of our difference scores = 7
favstats(~difference, data=spideranxiety)
##  min Q1 median Q3 max mean       sd  n missing
##  -11  0    7.5 15  20    7 9.807233 12       0
# Density plot
densityplot(~difference, data=spideranxiety, lwd=4, bw=4, col="steelblue", cex=1.1)


Randomization-based method

Consider the first subject in this study, who had anxiety scores of 40 and 30. The randomization null hypothesis is that the real spider and the photo of the spider are completely equivalent. This means the first subject would have had scores of 40 and 30 no matter what order he or she visited the rooms with the real spider or the photo.

Thus, under the randomization null hypothesis, we had an equal chance of observing a difference of +10 or -10 for this subject.

This is the key to our randomization-based method. We’re simply going to randomize whether the difference score for each subject is positive or negative. The magnitudes of the differences won’t change.

To do this, I’ll have R randomly multiply each subject’s difference score by +1 or -1. Then, I’ll find the mean of all the difference scores. I’ll repeat this many times.

# Create a column of signs: six -1s and six +1s (c(-1, 1) is recycled across the 12 rows)
spideranxiety$posneg <- c(-1,1)

# Shuffle those 12 signs (a random permutation of six +1s and six -1s).
# Multiply the shuffled signs by the difference scores for each subject.
# Find the mean of those randomized difference scores.
# Do this 10,000 times
# Store the result as depRand
depRand <- do(10000) * mean( sample(spideranxiety$posneg,12) * spideranxiety$difference )

# Histogram of the randomized mean difference scores
histogram(~mean, data=depRand,
          xlim=c(-11,11), # Set the x-axis range
          xlab="Possible mean differences assuming null hypothesis is true", 
          groups=mean >= test.stat)   # Highlight values >= test statistic
ladd(panel.abline(v=test.stat))   # Add vertical line at test statistic

# Calculate p-value
# Proportion of 10,000 randomizations yielding a mean difference >= 7
prop(~ (mean >= test.stat), data=depRand)
##   TRUE 
## 0.0058
  19. From this, what conclusions can we make?










Theory-based method (dependent samples t-test)

In a previous statistics course, you learned the formula for a dependent samples t-test:

\(t_{\mathrm{n-1}}=\frac{\overline{X}_{\mathrm{diff}}-\mu _{0}}{\frac{s}{\sqrt{n}}}\)

Our data yields a t-statistic of:

\(t_{\mathrm{11}}=\frac{{7}}{\frac{9.807}{\sqrt{12}}}=2.47\)
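
Here’s a quick by-hand check in R, using the difference scores we stored earlier:

# Dependent-samples t-statistic: mean difference over its standard error
mean(spideranxiety$difference) / ( sd(spideranxiety$difference) / sqrt(12) )
## [1] 2.472533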

We can visualize this t-statistic within its sampling distribution:

# Plot a t-distribution with 11 degrees of freedom; xpt() returns the
# lower-tail probability, so our p-value is 1 - 0.9845 = 0.0155
xpt(c(2.4725), df=(11))

## [1] 0.9845082


We can also conduct a dependent samples t-test in a single command (by telling R that we have paired data):

# Dependent samples t-test
# (We pass the two vectors directly; t.test's default method has no data= argument)
t.test(real, photo, paired=TRUE, alternative="greater")
## 
##  Paired t-test
## 
## data:  real and photo
## t = 2.4725, df = 11, p-value = 0.01549
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  1.915663      Inf
## sample estimates:
## mean of the differences 
##                       7
  20. This p-value does not exactly match the p-value from the randomization-based method. Explain why. What assumptions did we make in conducting a dependent samples t-test?










Your turn

  1. Use randomization-based methods to compare the medians of the real spider vs. photo groups. To calculate the difference in medians, you can use a diffmedian() function I wrote for this lesson.

  2. Let’s pretend as though our spider dataset simply contains anxiety scores for 24 individuals. We’ll forget the fact that the individuals were randomly assigned to two groups. Calculate the standard deviation of the anxiety scores with the command: sd(~anxiety, data=spider). Then, use bootstrap methods to construct a 95% confidence interval for the population standard deviation.

  3. Let’s compare results from all our methods to compare two group means. Given a dataset called d with scores for groups A and B, do the following:

     (a) Calculate summary statistics using the favstats command
     (b) Create a visualization of the group distributions
     (c) Use shapiro.test to test for normal distributions
     (d) Run t.test with and without the equal-variance assumption
     (e) Run a randomization-based test to compare the means and report a p-value
     (f) Calculate an effect size for the difference in mean scores between groups A and B
     (g) Write out any conclusion(s) you can make about the difference in means (be sure to identify the results you’re using to make these conclusions)




Publishing your solutions

When you’re ready to try the Your Turn exercises:


Spider dataset: Field, Andy (2012). Discovering Statistics Using R. ISBN: 978-1446200469

This document is released under a Creative Commons Attribution-ShareAlike 3.0 Unported license.