Assignment 2, Problem 2#

Load the email data from the openintro package into your R session and run the following code:

library(openintro)
data(email)
quarter <- seq(from = {{ params.variant }}, to = 3921, by = 4)
tab <- table(email$spam[quarter], email$format[quarter])

This will produce a contingency table (without row/column totals).
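As a rough illustration of what "without row/column totals" means, the sketch below cross-tabulates two small made-up vectors and then appends the totals with addmargins(). The vectors and the names toy_tab and toy_tab_totals are hypothetical and are not part of the assignment or the email data.

```r
# Toy data (hypothetical values, NOT the email data set)
spam   <- c(0, 0, 1, 0, 1, 0, 0, 1)
format <- c(1, 0, 0, 1, 1, 1, 0, 0)

# Cross-tabulate: the first argument gives the rows, the second the columns
toy_tab <- table(spam, format)

# Append row and column totals (labelled "Sum") to the table of counts
toy_tab_totals <- addmargins(toy_tab)
print(toy_tab_totals)
```

In the assignment code the same layout applies: rows correspond to email$spam and columns to email$format.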

Suppose that an email is randomly selected from the sample emails used in this question, and define events A = {spam = 1} and B = {format = 1}.

Part 1#

Access the help file for this data set to determine what format = 0 represents. Select the correct answer from below.

Answer Section#

  • The email was sent from a known contact

  • The email was sent from an unknown contact

  • The email was written using HTML

  • The email was not written using HTML

Part 2#

\(P(A\cap B)\) in words:

Answer Section#

  • the probability that a spam email is written using HTML

  • the probability that an email written using HTML is spam

  • the probability that the email is written using HTML and is spam

  • the probability that the email is written using HTML or is spam

Part 3#

\(P(B \mid A)\) in words:

Answer Section#

  • the probability that a spam email is written using HTML

  • the probability that an email written using HTML is spam

  • the probability that the email is written using HTML and is spam

  • the probability that the email is written using HTML or is spam

Part 4#

Please create a single R object named x containing the following, rounded to 3 decimal places:

  • a numeric pa with value \(P(A)\)

  • a numeric pab with value \(P(A\cap B)\)

  • a numeric pamb with value \(P(B \mid A) = \dfrac{P(A\cap B)}{P(A)}\)
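One way to organise such a calculation is sketched below on a made-up 2x2 table of counts. The counts, the object names toy and toy_x, and the choice of a named list as the container are all assumptions made for illustration; the assignment itself only asks for a single object named x.

```r
# Hypothetical counts (NOT from the email data): rows = spam, columns = format
# matrix() fills column by column, so column "0" is (30, 10) and column "1" is (50, 10)
toy <- matrix(c(30, 10, 50, 10), nrow = 2,
              dimnames = list(spam = c("0", "1"), format = c("0", "1")))

n <- sum(toy)                       # total number of emails in the toy table

p_a  <- sum(toy["1", ]) / n         # P(A): row total for spam = 1 over n
p_ab <- toy["1", "1"] / n           # P(A and B): joint cell over n
p_b_given_a <- p_ab / p_a           # P(B | A) = P(A and B) / P(A)

# Collect the rounded values in one named list (container type is an assumption)
toy_x <- list(pa   = round(p_a, 3),
              pab  = round(p_ab, 3),
              pamb = round(p_b_given_a, 3))
print(toy_x)
```

The same three steps would apply to tab from the assignment code, with the row and column labels "0" and "1" used to pick out the relevant cells.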

Answer Section#

Attribution#

Problem is from the OpenIntro Statistics textbook, licensed under the CC-BY 4.0 license.