# Load the tidyverse packages
library(tidyverse)

Can dolphins communicate?

In 1964, a researcher trained two dolphins (Buzz and Doris) to push one of two buttons in a pool in reaction to a light. If the light flashed, the dolphins pushed the button on the left to get a fish. If the light was constant, the dolphins needed to push the button on the right.

Once they learned this behavior, the dolphins were separated by a wall (so only Doris could see the light and only Buzz could push the buttons). To get a fish, Doris would need to communicate with Buzz.

The researcher tested the dolphins’ ability to communicate, randomly deciding on each attempt whether the light would shine constantly or flash. Out of 16 attempts, the dolphins pushed the correct button 15 times.

Introduce yourself to another student in the class. Together, decide if the results of this experiment suggest dolphins can communicate abstract concepts. Explain your reasoning.

Explain why you would not be impressed if the dolphins pushed the correct button 8 or 9 times.

How many successful trials (out of 16) would it take for you to conclude that dolphins can, perhaps, communicate?

Physical simulation

Assuming dolphins cannot communicate, how likely was Buzz to push the correct button 15 out of 16 trials? Estimate this likelihood through a simulation and record your results with a dotplot.

Null hypothesis significance testing

This is the logic behind a type of statistical inference called null hypothesis significance testing. We:

• Express a prior belief (or assumption) that nothing special happened: dolphins cannot communicate, and Buzz is randomly pushing buttons.

• Collect data. Buzz pushed the correct button 15 out of 16 trials in this experiment.

• Estimate the likelihood of observing this data (or something even more extreme) under our prior belief/assumption.

• Draw a conclusion or update our beliefs.

Computer simulation

We can use online applets or R.

Online Applet

Open http://www.rossmanchance.com/applets/OneProp/OneProp.htm

To use this applet, we must input:

probability of heads = __________

number of tosses = __________

number of repetitions = __________

number of heads as extreme as: _________________________.

Click Draw Samples and you should see a histogram similar to this one:

What does this distribution represent? What conclusions can we make from it? Could Buzz have pushed the correct button 15 times by chance?

If dolphins cannot communicate, how likely was Buzz to push the correct button 15 out of 16 times? To estimate this p-value in the applet, click the count button.

p-value = ____________________

Note: If you check the Exact Binomial box, you’ll see p ≈ 0.0003. We’ll learn how to calculate this using the binomial distribution.
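As a preview of that calculation, here is a minimal base R sketch (it is not part of the applet). It assumes, as above, that Buzz guesses at random with probability 0.5 on each of the 16 trials. The variable names are only illustrative.

```r
# Exact binomial tail probability: P(X >= 15) when X ~ Binomial(16, 0.5)
# pbinom(14, ...) gives P(X <= 14); lower.tail = FALSE flips it to P(X > 14)
p_exact <- pbinom(14, size = 16, prob = 0.5, lower.tail = FALSE)

# The same value, summing the probabilities of exactly 15 and exactly 16 correct
p_check <- sum(dbinom(15:16, size = 16, prob = 0.5))

p_exact   # 17/65536, roughly 0.00026
```

Only 17 of the 65,536 equally likely sequences of 16 guesses contain 15 or more correct pushes (16 sequences with one miss, plus 1 with none), which is where 17/65536 comes from.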

R

We can load the mosaic package to quickly run a simulation.

# If you need to install the Mosaic package, use:
# install.packages("mosaic")

# Load the mosaic package
library(mosaic)

The rflip() and nflip() functions simulate coin tosses. The default inputs for the functions are:

nflip(n = 1, prob = 0.5), where:

n = number of coins to toss

prob = the probability of obtaining heads on any toss.

To simulate multiple replications, we’ll use the do() function.
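To see what these functions are doing, here is a rough base R analogue. It is a sketch of the idea rather than how mosaic is implemented: flip_coins() is a made-up helper name, and set.seed() is included only so the results can be reproduced.

```r
set.seed(1964)  # arbitrary seed, only for reproducibility

# Hypothetical helper: toss n coins and count the heads
# (similar in spirit to nflip)
flip_coins <- function(n = 1, prob = 0.5) {
  tosses <- sample(c("H", "T"), size = n, replace = TRUE,
                   prob = c(prob, 1 - prob))
  sum(tosses == "H")
}

# One replication: the number of heads in 16 tosses
flip_coins(n = 16)

# Five replications of the same 16-toss experiment
# (similar in spirit to do(5) * nflip(n = 16))
replicate(5, flip_coins(n = 16))
```

Each replication stands in for one complete run of the experiment: 16 button pushes by a dolphin that is only guessing.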

Let’s run 10,000 replications of our simulation and plot the results:

# Flip 16 coins 10,000 times
# In other words...
#    do 10,000 replications of flipping 16 coins
# We'll store the results in a data frame called "sims"
sims <- do(10000) * nflip(n = 16, prob = 0.5)

# Look at some of the results
sims
#    # A tibble: 10,000 × 1
#       nflip
#       <dbl>
#    1      7
#    2      5
#    3      8
#    4      8
#    5     10
#    6      7
#    7      6
#    8     10
#    9      8
#    10     8
#    # ... with 9,990 more rows

# This is the simple code I would use to create
# a histogram of the results
# The simulations are stored in "sims"
# The variable we want to plot is "nflip"
# We want to draw a vertical line at x = 15
# I set the width of the bars equal to 1:

# histogram(~nflip, data=sims, v=15, width=1  )

# To make it look nice, I'll use ggplot2
# The syntax looks more complicated
sims %>%
  ggplot(aes(x = nflip)) +
  geom_histogram(binwidth = 1, fill="lightblue", color="white", alpha = 0.8) +
  annotate("segment", x = 15, xend = 15, y = 0, yend = 1500, color = "red") +
  labs(
      title = "10,000 replications of 16 coin tosses",
      x = "number of heads"
      ) +
  scale_x_continuous(breaks=seq(0, 16, 1), minor_breaks=NULL) +
  theme(
    axis.text.x = element_text(size = 11, color = "grey10"),
    legend.position = "none",
    panel.grid.major.y = element_line(colour = "white"),
    # linewidth replaces the size argument, deprecated for lines in ggplot2 3.4.0
    panel.grid.major.x = element_line(colour = "white", linewidth = 0.15),
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "grey93")
  )

From this, we can calculate the proportion of replications that resulted in 15 or more heads (the proportion to the right of the red line):

p-value = prop(~nflip >= 15, data=sims) = 0.0001.

# Here's the code to calculate this proportion

# Using mosaic package
# We want the proportion
# of nflip values >= 15
# in the sims dataset
prop(~nflip >= 15, data=sims)
#      TRUE 
#    0.0001

# Alternatively, we can calculate this using dplyr
sims %>%
  summarize(proportion = sum(nflip >= 15) / n() )
#    # A tibble: 1 × 1
#      proportion
#           <dbl>
#    1     0.0001
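Your simulated p-value will change a little each time you run the code, because so few replications land at 15 or more heads. The base R sketch below shows why. It uses rbinom() in place of the mosaic functions, and the seed and variable names are arbitrary choices for illustration.

```r
set.seed(16)  # arbitrary seed; change it to see the estimate move

n_reps <- 10000

# Each value is the number of heads in 16 fair coin tosses
heads <- rbinom(n_reps, size = 16, prob = 0.5)

# Simulated p-value: proportion of replications with 15 or more heads
p_sim <- mean(heads >= 15)

# Exact tail probability, and the number of replications we expect
# to reach 15 or more heads out of 10,000
p_exact <- pbinom(14, size = 16, prob = 0.5, lower.tail = FALSE)
expected_hits <- n_reps * p_exact   # about 2.6

c(simulated = p_sim, exact = p_exact)
```

With only two or three such replications expected in 10,000, a single extra or missing one shifts the estimate by 0.0001. Running more replications gives a steadier estimate.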

Check for understanding

Suppose Buzz had only pushed the correct button 10 times out of 16. Estimate the p-value and explain what it represents.

This time, suppose Buzz had pushed the correct button 100 times out of 160 trials. Predict which of the following statements is true. Then, estimate the p-value.

(a) P(100+ correct out of 160) is < P(10+ correct out of 16).

(b) P(100+ correct out of 160) is = P(10+ correct out of 16).

(c) P(100+ correct out of 160) is > P(10+ correct out of 16).

p-value = P(100+ correct out of 160) = _________________________________________.

Paul the Octopus

Paul the Octopus — in choosing which of two boxes of food to eat first — successfully predicted the winner in 12 of 14 World Cup matches.

Simulate coin flips in R to estimate the likelihood of Paul getting 12 out of 14 correct. What assumptions are you making in calculating this p-value?

Hint: Fill-in-the-blanks to run the simulation, plot a histogram, and estimate the p-value.

_______________ <- do(__________) * nflip(n = __________, prob = __________)

histogram(~nflip, data=__________, v=__________, width=1)

prop(~nflip >= __________, data = _______________)

p-value = _________________________________

Based on this p-value, would you conclude that Paul actually had the ability to predict the winner of World Cup matches? Explain.

Bank supervisors

In 1972, personnel files were given to 48 male bank supervisors. The supervisors were asked if the applicant described in the file should be promoted to a branch manager position.

The personnel files given to each supervisor contained identical information except for the gender of the applicant. Files for 24 of the supervisors indicated the applicant was female; the other 24 files indicated the applicant was male.

Ultimately, 35 of the 48 supervisors recommended the applicant be promoted.

Experimental studies involve random assignment. Explain why we would want to randomly assign bank supervisors to receive male or female personnel files. Why couldn’t we simply assign subjects based on some characteristic (e.g., giving female personnel files to the oldest 24 supervisors)?

Let’s conduct a null hypothesis significance test. First, state our assumption that nothing special happens.

Null hypothesis = \(H_{0}\):

Assuming this null hypothesis is true, what results should we expect from this study? Remember, 35 files were selected for promotion. Complete the table.

The actual (observed) results from this study were:

                 Male file   Female file   Total
Promoted             21           14         35
Not promoted          3           10         13
Total                24           24         48

From this, we can calculate:

21/24 = 87.5% of the male applicants were recommended for promotion

14/24 = 58.3% of the female applicants were recommended for promotion

If gender had no influence on supervisor decisions, how likely were we to observe these results?

Soon, you’ll be able to quickly calculate this likelihood using the hypergeometric distribution.
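As a preview, here is a minimal sketch of that calculation in base R. It treats the 35 promotion recommendations as draws without replacement from the 48 files (24 male, 24 female) and asks how often 21 or more of them would be male files. This framing and the variable names are assumptions made for illustration.

```r
# X = number of male files among the 35 promoted, if gender plays no role
#   m = 24 male files, n = 24 female files, k = 35 files promoted
# phyper(20, ...) gives P(X <= 20); lower.tail = FALSE flips it to P(X >= 21)
p_hyper <- phyper(20, m = 24, n = 24, k = 35, lower.tail = FALSE)

# The same value, summing the probabilities of exactly 21, 22, 23, and 24
p_check <- sum(dhyper(21:24, m = 24, n = 24, k = 35))

p_hyper   # roughly 0.0245
```

A simulation estimates this same probability, so the simulated p-value should land close to this value.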

For now, let’s simulate the random assignment of supervisors to receive male or female files.

Physical simulation

Suppose I gave you 48 blank index cards. Explain how you could use these cards to simulate this experiment. How would you estimate the likelihood of observing the results we observed in this study (or results that are even more extreme)?

Computer simulation

Applet

Use the applet at http://www.rossmanchance.com/applets/ChisqShuffle.htm.

Check the Sample Data (2x2) box and input the results we observed in this study.

Choose a “success statistic” and conduct at least 5,000 replications.

Estimate the p-value from this study.

p-value = _________________________

R

We’ll once again use the mosaic package to conduct our simulation.

# We already loaded the Mosaic package
# library(mosaic)

# Let's input the data
# We need 21 promoted males, 3 non-promoted males
# 14 promoted females, and 10 non-promoted females
supervisors <- tibble(
  gender = c(rep("male", 24), rep("female", 24)),
  promote = c(rep("yes", 21), rep("no", 3),
              rep("yes", 14), rep("no", 10)))

# Display the data as a table
tally(~promote + gender, data = supervisors, margins = TRUE)
#           gender
#    promote female male Total
#      no        10    3    13
#      yes       14   21    35
#      Total     24   24    48

# Create a mosaic plot
mosaicplot(gender ~ promote, data = supervisors, color=TRUE)

That’s a mosaic plot of our observed results. Look how easy it is to see the different promotion rates for males and females.

Now, let’s randomly assign the male and female personnel files to the supervisors (35 of whom will promote the applicant) and see one set of possible results:

# Shuffle gender in the plot
mosaicplot(shuffle(gender) ~ promote, data = supervisors, color=TRUE)

That’s one possible set of results (assuming gender had no impact on promotion decisions). Let’s replicate this process 10,000 times to get a representative sample of possible results:

# We want to do something 10,000 times.  What do we want to do?
# We want to calculate the number of promoted males
# if we were to randomly assign (shuffle) the gender
# of the files to each of the 48 supervisors
banksims <- do(10000) * tally(~(shuffle(gender) == "male" & promote == "yes"), data=supervisors)

# A faster way to do this is to use the length() function
# to count the number of results
# banksims <- do(10000) * length(which(shuffle(supervisors$gender) == "male" & supervisors$promote == "yes"))

# Now let's create a histogram of the results
# (The number of promoted males in each of our 10,000 reps)
# I'll draw a vertical line at 21.
# I'll make each bar have a width equal to 1
# I'll also change the color when 21+ males receive promotion
# histogram(~TRUE., data=banksims, v=21, width=1, groups=TRUE.>=21)

# I'll use ggplot2 (below) to generate a more attractive histogram

# Now I'll calculate the proportion of results that show 
# 21 or greater promoted males
prop(~TRUE.>=21, data=banksims)
#      TRUE 
#    0.0258


# Here's the syntax to create the histogram that appears in the activity
banksims %>%
  ggplot(aes(x = TRUE.)) +
  geom_histogram(binwidth = 1, fill="lightblue", color="white", alpha = 0.9) +
  annotate("segment", x = 21, xend = 21, y = 0, yend = 1500, color = "red") +
  labs(
      title = "10,000 replications of gender shuffles",
      x = "number of promoted males"
      ) +
  scale_x_continuous(breaks=seq(0, 35, 1), minor_breaks=NULL) +
  theme(
    axis.text.x = element_text(size = 11, color = "grey10"),
    legend.position = "none",
    panel.grid.major.y = element_line(colour = "white"),
    # linewidth replaces the size argument, deprecated for lines in ggplot2 3.4.0
    panel.grid.major.x = element_line(colour = "white", linewidth = 0.15),
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "grey93")
  )

From this, we can calculate the proportion of replications that resulted in 21 or more promoted males:

# Here's the code to calculate this proportion

# Using mosaic package
# We want the proportion
# of TRUE. values >= 21
# in the banksims dataset
prop(~TRUE. >= 21, data=banksims)
#      TRUE 
#    0.0258

# Alternatively, we can calculate this using dplyr
banksims %>%
  summarize(proportion = sum(TRUE. >= 21) / n() )
#    # A tibble: 1 × 1
#      proportion
#           <dbl>
#    1     0.0258

p-value = prop(~TRUE. >= 21, data=banksims) = 0.0258.

Interpret this p-value.

Dolphins, revisited

Can swimming with dolphins be therapeutic for patients suffering from clinical depression?

To investigate this, researchers recruited 30 subjects aged 18-65 with a clinical diagnosis of mild-to-moderate depression. The subjects (who were required to discontinue use of any antidepressant drugs or psychotherapy 4 weeks prior to the experiment) were sent to an island off the coast of Honduras, where they were randomly assigned to one of two treatment groups.

Both groups engaged in the same amount of swimming and snorkeling each day, but one group did so in the presence of bottlenose dolphins and the other group did not. At the end of two weeks, each subject’s level of depression was evaluated, as it had been at the beginning of the study, and it was determined whether they showed substantial improvement (reducing their level of depression) by the end of the study.

Does the presence of dolphins reduce depression levels?

TRUE or FALSE:

• The subjects in this study were randomly selected from a population

• The subjects were randomly assigned to groups

10 of 15 subjects in the dolphin therapy group (compared to only 3 of 15 in the control group) showed improvement. Organize these results in the following 2x2 table:

Calculate the following proportions:

• Proportion in the dolphin group showing improvement = ________________

• Proportion in the control group showing improvement = ________________

Assuming dolphin therapy has no effect on depression, estimate the likelihood of observing 10 or more subjects in the dolphin therapy group who show improvement. Use the applet or R to conduct a simulation and record your p-value.

p-value = _________________________

Sources:

• Bastian, J. (1967). The transmission of arbitrary environmental information between bottlenose dolphins. In Animal sonar systems: Biology and bionics, ed. R.-G. Busnel, pp. 807-873. Jouy-en-Josas, France: Laboratoire de Physiologie Acoustique.

• Tintle, N., et al. (2015). Introduction to Statistical Investigations, Preliminary edition.

• Rosen, B. & Jerdee, T. (1974). Influence of sex role stereotypes on personnel decisions. Journal of Applied Psychology, 59: 9-14.

Creative Commons Attribution-ShareAlike 3.0 Unported license.

Intro to probability & inference
