![](dolphins.png)

The researcher tested the dolphins' ability to communicate by randomly deciding whether the light would shine steadily or flash. **Out of 16 attempts, the dolphins pushed the correct button 15 times.**

1. Introduce yourself to another student in the class. Together, decide if the results of this experiment suggest dolphins **can** communicate abstract concepts. Explain your reasoning.

![](Transparent.gif)

2. Explain why you would **not** be impressed if the dolphins pushed the correct button 8 or 9 times.
![](Transparent.gif)

3. How many successful trials (out of 16) would it take for you to conclude that dolphins can, perhaps, communicate?
![](Transparent.gif)

*****
## Physical simulation
4. Assuming dolphins **cannot** communicate, how likely was Buzz to push the correct button 15 out of 16 trials? Estimate this likelihood through a simulation and record your results with a dotplot.
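```{r 'dotplot-blank', message=FALSE, fig.height=1.4, echo=FALSE}
# Blank number line (0 to 16) for recording simulation results as a dotplot
library(ggplot2)
x <- data.frame(correct = 0:16)
ggplot(data = x, aes(x = correct)) +
  geom_blank() +
  scale_x_continuous(breaks=seq(0, 16, 1), minor_breaks=NULL) +
  labs(
    x = "number correct"
  )
```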
![](Transparent.gif)

### Null hypothesis significance testing
This is the logic behind a type of statistical inference called *null hypothesis significance testing*. We:
• **Express a prior belief (or assumption) that nothing special happened**. Dolphins cannot communicate; Buzz is randomly pushing buttons.
• **Collect data**. Buzz pushed the correct button 15 out of 16 trials in this experiment.
• **Estimate the likelihood of observing this data (or something even more extreme) under our prior belief/assumption**.
• **Draw a conclusion** or update our beliefs.

*****
## Computer simulation
We can use online applets or R.

### Online Applet
Open [http://www.rossmanchance.com/applets/OneProp/OneProp.htm](http://www.rossmanchance.com/applets/OneProp/OneProp.htm)

To use this applet, we must input:

`probability of heads` = __________

`number of tosses` = __________

`number of repetitions` = __________

`number of heads as extreme as`: _________________________.

Click **Draw Samples** and you should see a histogram similar to this one:

![](applet.png)

5. What does this distribution represent? What conclusions can we make from it? Could Buzz have pushed the correct button 15 times by chance?

![](Transparent.gif)

6. If dolphins cannot communicate, how likely was Buzz to push the correct button 15 out of 16 times? To estimate this **p-value** in the applet, click the `count` button.
**p-value** = ____________________

Note: If you check the *Exact Binomial* box, you'll see **p = 0.0003**. We'll learn how to calculate this using the *Binomial Distribution*.
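
As a preview of that calculation, here is a sketch that assumes each of the 16 trials is an independent 50/50 guess. The symbol $X$ (the number of correct pushes) is introduced here only for illustration. The exact probability counts the outcomes with 15 or 16 correct pushes among all $2^{16}$ equally likely sequences:

$$P(X \ge 15) = \frac{\binom{16}{15} + \binom{16}{16}}{2^{16}} = \frac{16 + 1}{65536} = \frac{17}{65536} \approx 0.00026$$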

### R
We can load the **Mosaic package** to quickly run a simulation.

```{r 'mosaic', message=FALSE, warning=FALSE}
# If you need to install the Mosaic package, use:
# install.packages("mosaic")

# Load the mosaic package
library(mosaic)
```

The `rflip()` and `nflip()` functions simulate coin tosses. The default inputs for the functions are: `nflip(n = 1, prob = 0.5)`, where:

`n` = number of coins to toss

`prob` = the probability of obtaining heads on any toss.

To simulate multiple replications, we'll use the `do()` function. Let's run 10,000 replications of our simulation and plot the results.

```{r 'nflip'}
# Flip 16 coins 10,000 times
# In other words...
# do 10,000 replications of flipping 16 coins
# We'll store the results in a data frame called "sims"
sims <- do(10000) * nflip(n = 16, prob = 0.5)

# Look at some of the results
sims

# This is the simple code I would use to create
# a histogram of the results
# The simulations are stored in "sims"
# The variable we want to plot is "nflip"
# We want to draw a vertical line at x = 15
# I set the width of the bars equal to 1:
# histogram(~nflip, data=sims, v=15, width=1 )

# To make it look nice, I'll use ggplot2
# The syntax looks more complicated
sims %>%
  ggplot(aes(x = nflip)) +
  geom_histogram(binwidth = 1, fill="lightblue", color="white", alpha = 0.8) +
  annotate("segment", x = 15, xend = 15, y = 0, yend = 1500, color = "red") +
  labs(
    title = "10,000 replications of 16 coin tosses",
    x = "number of heads"
  ) +
  scale_x_continuous(breaks=seq(0, 16, 1), minor_breaks=NULL) +
  theme(
    axis.text.x = element_text(size = 11, color="grey10"),
    legend.position = "none",
    panel.grid.major.y = element_line(colour = "white"),
    panel.grid.major.x = element_line(colour = "white", size=.15),
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "grey93")
  )
```

From this, we can calculate the proportion of replications that resulted in 15 or more heads (the proportion to the right of the red line):

```{r 'scientific-notation', echo=FALSE}
# Get rid of the scientific notation for small values
options(scipen=999)
```

**p-value** = `prop(~nflip >= 15, data=sims)` = `r prop(~nflip >= 15, data=sims)`.

```{r 'pvalue'}
# Here's the code to calculate this proportion
# Using mosaic package
# We want the proportion
# of nflip values >= 15
# in the sims dataset
prop(~nflip >= 15, data=sims)

# Alternatively, we can calculate this using dplyr
sims %>%
  summarize(proportion = sum(nflip >= 15) / n() )
```
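
If the Mosaic package is not available, the same idea can be sketched in base R. This is only an illustration, not the activity's own method: `rbinom()` stands in for `do() * nflip()`, the names `heads`, `p_sim`, and `p_exact` are made up for this sketch, and the seed is an arbitrary choice for reproducibility. It also compares the simulated proportion with the exact binomial probability.

```r
# Base-R sketch of the coin-toss simulation (an assumption-laden stand-in
# for the mosaic code; variable names here are hypothetical)
set.seed(123)  # arbitrary seed, used only so the run is reproducible

# 10,000 replications: number of heads in 16 fair coin tosses
heads <- rbinom(n = 10000, size = 16, prob = 0.5)

# Simulated p-value: proportion of replications with 15 or more heads
p_sim <- mean(heads >= 15)

# Exact binomial p-value: P(X >= 15) is the same as P(X > 14)
p_exact <- pbinom(14, size = 16, prob = 0.5, lower.tail = FALSE)

p_sim
p_exact
```

With only 10,000 replications, the simulated value will bounce around the exact one from run to run, which is why the activity calls it an estimate.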

## Check for understanding
7. Suppose Buzz had only pushed the correct button **10** times out of 16. Estimate the p-value and explain what it represents.

![](Transparent.gif)

8. This time, suppose Buzz had pushed the correct button **100** times out of **160** trials. Predict which of the following statements is true. Then, estimate the p-value.

**(a)** P(100+ correct out of 160) is **<** P(10+ correct out of 16).

**(b)** P(100+ correct out of 160) is **=** P(10+ correct out of 16).

**(c)** P(100+ correct out of 160) is **>** P(10+ correct out of 16).

**p-value** = P(100+ correct out of 160) = _________________________________________.

*****
### Paul the Octopus
[Paul the Octopus](https://en.wikipedia.org/wiki/Paul_the_Octopus), in choosing which of two boxes of food to eat first, successfully predicted the winner in 12 of 14 World Cup matches.

![](pto.jpg)

9. Simulate coin flips in R to estimate the likelihood of Paul getting 12 out of 14 correct. What assumptions are you making in calculating this p-value?
Hint: Fill in the blanks to run the simulation, plot a histogram, and estimate the p-value.

`_______________ <- do(__________) * nflip(n = __________, prob = __________)`

`histogram(~nflip, data=__________, v=__________, width=1)`

`prop(~nflip >= __________, data = _______________)`

**p-value** = _________________________________

10. Based on this p-value, would you conclude that Paul actually had the ability to predict the winner of World Cup matches? Explain.

![](Transparent.gif)

*****
# Bank supervisors
In 1972, personnel files were given to **48** male bank supervisors. The supervisors were asked if the applicant described in the file should be promoted to a branch manager position.
The personnel files given to each supervisor contained identical information except for the gender of the applicant. Files for 24 of the supervisors indicated the applicant was **female**; the other 24 files indicated the applicant was **male**.
Ultimately, 35 of the 48 supervisors recommended the applicant be promoted.
![](supervisor.png)
11. **Experimental** studies involve random assignment. Explain why we would want to randomly assign bank supervisors to receive male or female personnel files. Why couldn't we simply assign subjects based on some characteristic (e.g., giving female personnel files to the oldest 24 supervisors)?

![](Transparent.gif)

12. Let's conduct a null hypothesis significance test. First, state our assumption that nothing special happens.
**Null hypothesis** = $H_{0}$:

13. Assuming this null hypothesis is true, what results should we **expect** from this study? Remember, 35 files were selected for promotion. Complete the table.

![](expect.png)

The actual (observed) results from this study were:

![](observe.png)

From this, we can calculate:

• 21/24 = 87.5% of the male applicants were recommended for promotion

• 14/24 = 58.3% of the female applicants were recommended for promotion

**If gender had no influence on supervisor decisions, how likely were we to observe these results?**

Soon, you'll be able to quickly calculate this likelihood using the *hypergeometric distribution*. For now, let's simulate the random assignment of supervisors to receive male or female files.
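
For reference, here is a minimal base-R sketch of that hypergeometric calculation. It assumes the null hypothesis (gender has no influence), so the 35 promotions behave like a random draw from 24 male and 24 female files. The variable names are hypothetical and chosen only for this illustration; `phyper()` and `dhyper()` are the standard base-R hypergeometric functions.

```r
# Hypergeometric sketch (assumes gender has no influence on promotion):
# 48 files (24 male, 24 female), 35 of which end up promoted
n_male     <- 24
n_female   <- 24
n_promoted <- 35

# P(21 or more promoted males) is the same as P(X > 20)
p_exact <- phyper(20, m = n_male, n = n_female, k = n_promoted,
                  lower.tail = FALSE)

# Same probability, adding up the terms for 21, 22, 23, and 24 males
p_sum <- sum(dhyper(21:24, m = n_male, n = n_female, k = n_promoted))

p_exact
```

The simulation that follows estimates this same probability by shuffling instead of by formula.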

## Physical simulation
14. Suppose I gave you 48 blank index cards. Explain how you could use these cards to simulate this experiment. How would you estimate the likelihood of observing the results we observed in this study (or results that are even more extreme)?

![](Transparent.gif)

## Computer simulation
### Applet
Use the applet at [http://www.rossmanchance.com/applets/ChisqShuffle.htm](http://www.rossmanchance.com/applets/ChisqShuffle.htm).
Check the **Sample Data (2x2)** box and input the results we observed in this study.
Choose a "success statistic" and conduct at least 5,000 replications.
15. Estimate the p-value from this study.
**p-value** = _________________________

### R
We'll once again use the **Mosaic package** to conduct our simulation. Click `code` to see the code.

```{r 'bank-mosaic', message=FALSE, warning=FALSE}
# We already loaded the Mosaic package
# library(mosaic)

# Let's input the data
# We need 21 promoted males, 3 non-promoted males
# 14 promoted females, and 10 non-promoted females
supervisors <- tibble(
  gender = c(rep("male", 24), rep("female", 24)),
  promote = c(rep("yes", 21), rep("no", 3), rep("yes", 14), rep("no", 10)))

# Display the data as a table
tally(~promote + gender, data = supervisors, margins = TRUE)

# Create a mosaic plot
mosaicplot(gender ~ promote, data = supervisors, color=TRUE)
```

That's a mosaic plot of our observed results. Look how easy it is to see the different promotion rates for males and females.

Now, let's randomly assign the male and female personnel files to the supervisors (35 of whom will promote the applicant) and see one set of possible results:

```{r 'every-day-Im-shuffling', message=FALSE, warning=FALSE}
# Shuffle gender in the plot
mosaicplot(shuffle(gender) ~ promote, data = supervisors, color=TRUE)
```

That's one possible set of results (assuming gender had no impact on promotion decisions). Let's replicate this process 10,000 times to get a representative sample of possible results:

```{r 'shuffling-shuffling', message=FALSE, warning=FALSE}
# We want to do something 10,000 times. What do we want to do?
# We want to calculate the number of promoted males
# if we were to randomly assign (shuffle) the gender
# of the files to each of the 48 supervisors
banksims <- do(10000) * tally(~(shuffle(gender) == "male" & promote == "yes"), data=supervisors)

# A faster way to do this is to use the length() function
# to count the number of results
# banksims <- do(10000) * length(which(shuffle(supervisors$gender) == "male" & supervisors$promote == "yes"))

# Now let's create a histogram of the results
# (The number of promoted males in each of our 10,000 reps)
# I'll draw a vertical line at 21.
# I'll make each bar have a width equal to 1
# I'll also change the color when 21+ males receive promotion
# histogram(~TRUE., data=banksims, v=21, width=1, groups=TRUE.>=21)
# I'll use ggplot2 (below) to generate a more attractive histogram

# Now I'll calculate the proportion of results that show
# 21 or greater promoted males
prop(~TRUE.>=21, data=banksims)
```

```{r 'histogram-supervisors', message=FALSE, warning=FALSE}
# Here's the syntax to create the histogram that appears in the activity
banksims %>%
  ggplot(aes(x = TRUE.)) +
  geom_histogram(binwidth = 1, fill="lightblue", color="white", alpha = 0.9) +
  annotate("segment", x = 21, xend = 21, y = 0, yend = 1500, color = "red") +
  labs(
    title = "10,000 replications of gender shuffles",
    x = "number of promoted males"
  ) +
  scale_x_continuous(breaks=seq(0, 35, 1), minor_breaks=NULL) +
  theme(
    axis.text.x = element_text(size = 11, color="grey10"),
    legend.position = "none",
    panel.grid.major.y = element_line(colour = "white"),
    panel.grid.major.x = element_line(colour = "white", size=.15),
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "grey93")
  )
```

From this, we can calculate the proportion of replications that resulted in 21 or more promoted males:

```{r 'bank-pvalue'}
# Here's the code to calculate this proportion
# Using mosaic package
# We want the proportion
# of TRUE. values >= 21
# in the banksims dataset
prop(~TRUE. >= 21, data=banksims)

# Alternatively, we can calculate this using dplyr
banksims %>%
  summarize(proportion = sum(TRUE. >= 21) / n() )
```

**p-value** = `prop(~TRUE. >= 21, data=banksims)` = `r prop(~TRUE. >= 21, data=banksims)`.
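
To see what the shuffling does under the hood, here is a base-R sketch of the same permutation idea. It is an illustration under stated assumptions, not the activity's own code: `sample()` plays the role of mosaic's `shuffle()`, `replicate()` plays the role of `do()`, the helper `count_promoted_males()` and the other names are hypothetical, and the seed is arbitrary.

```r
# Base-R sketch of the shuffle simulation (hypothetical names throughout)
set.seed(2024)  # arbitrary seed, used only for reproducibility

# Observed data: 24 male files, then 24 female files
gender  <- rep(c("male", "female"), each = 24)
promote <- c(rep("yes", 21), rep("no", 3), rep("yes", 14), rep("no", 10))

# One replication: shuffle the gender labels across the 48 supervisors,
# then count how many "male" files landed on a promoting supervisor
count_promoted_males <- function() {
  shuffled <- sample(gender)
  sum(shuffled == "male" & promote == "yes")
}

# 10,000 replications of the shuffle
promoted_males <- replicate(10000, count_promoted_males())

# Estimated p-value: proportion of shuffles with 21 or more promoted males
p_sim <- mean(promoted_males >= 21)
p_sim
```

Keeping `promote` fixed and shuffling only `gender` mirrors the null hypothesis: the 35 promotion decisions would have happened regardless of which file each supervisor received.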

16. Interpret this p-value.

![](Transparent.gif)

*****
## Dolphins, revisited
Can swimming with dolphins be therapeutic for patients suffering from clinical depression?
To investigate this, researchers recruited 30 subjects aged 18-65 with a clinical diagnosis of mild-to-moderate depression. The subjects, who were required to discontinue use of any antidepressant drugs or psychotherapy 4 weeks prior to the experiment, were sent to an island off the coast of Honduras, where they were randomly assigned to one of two treatment groups.
Both groups engaged in the same amount of swimming and snorkeling each day, but one group did so in the presence of bottlenose dolphins and the other group did not. At the end of two weeks, each subject's level of depression was evaluated, as it had been at the beginning of the study, and it was determined whether they showed substantial improvement (reducing their level of depression) by the end of the study.
**Does the presence of dolphins reduce depression levels?**
17. TRUE or FALSE:
• The subjects in this study were **randomly selected** from a population

• The subjects were **randomly assigned** to groups

18. 10 of 15 subjects in the dolphin therapy group (compared to only 3 of 15 in the control group) showed improvement. Organize these results in the following 2x2 table:

![](dolphtable.png)

19. Calculate the following proportions:

• Proportion in the dolphin group showing improvement = ________________

• Proportion in the control group showing improvement = ________________

![](Transparent.gif)

20. Assuming dolphin therapy has no effect on depression, estimate the likelihood of observing 10 or more subjects in the dolphin therapy group who show improvement. Use the [applet](http://www.rossmanchance.com/applets/ChisqShuffle.htm) or R to conduct a simulation and record your p-value.
**p-value** = _________________________

![](Transparent.gif)

**Sources:**
• Bastian, J. (1967). The transmission of arbitrary environmental information between bottlenose dolphins. In Animal sonar systems: Biology and bionics, ed. R.-G. Busnel, pp. 807-873. Jouy-en-Josas, France: Laboratoire de Physiologie Acoustique.
• Tintle, N., et al. (2015). Introduction to Statistical Investigations, Preliminary edition.
• Rosen, B. & Jerdee, T. (1974). Influence of sex role stereotypes on personnel decisions. Journal of Applied Psychology, 59: 9-14.
[Creative Commons Attribution-ShareAlike 3.0 Unported](http://creativecommons.org/licenses/by-sa/3.0) license.