Let’s finish up the examples we were working through about the Empirical Rule (following the order in our slides), then we’ll revisit the others we hadn’t yet gotten to in class.

In going through this part of the content, use it in whatever way best suits your practice and learning. For example, you might skip certain items, practice more intently with others, etc.

First up, let’s take a second to remember what you know about the Empirical Rule. For example, when does it apply? What percentages fit in each “section” of the graph?

Now, grab your statistical tools (calculator, paper, pen, brain) and let’s work through some examples. (This also works out nicely as a bit of review for the upcoming Quiz on Wednesday!)

# Empirical Rule Examples

**1. GPA of graduating seniors must fall between 2.0 and 4.0. Consider the possible values of standard deviation {-10, 0, 0.4, 1.5, 6}, and assume this variable shows an approximately normal shape. Which value is impossible? Why? Which is the most realistic value? Why?**
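If you want to check your reasoning on this one, here is a minimal sketch in Python. It leans on the Empirical Rule idea that about 99.7% of values fall within 3 standard deviations of the mean, so the full range is roughly 6 standard deviations wide. The variable names are just for illustration, and "closest to range/6" is a rough rule of thumb, not an exact answer.

```python
# Rough rule of thumb: for an approximately normal variable, nearly all values
# sit within 3 SDs of the mean, so the range is about 6 SDs wide.
low, high = 2.0, 4.0
candidates = [-10, 0, 0.4, 1.5, 6]

rough_sd = (high - low) / 6  # about 0.33

# A standard deviation can never be negative.
impossible = [s for s in candidates if s < 0]

# Among the positive candidates, find the one closest to the rough estimate.
positive = [s for s in candidates if s > 0]
most_realistic = min(positive, key=lambda s: abs(s - rough_sd))

print(f"Rough SD from range / 6: {rough_sd:.2f}")
print("Impossible:", impossible)
print("Closest to the rough estimate:", most_realistic)
```

Notice that a value of 0 is not impossible, but it would mean every single senior has exactly the same GPA, which is not realistic.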

**2. When calculated using ultrasound in the second trimester, the average pregnancy duration is estimated to be 280.6 days (with a standard deviation of 9.7 days). What is the approximate likelihood that a randomly selected pregnancy will last 300 days or more?**
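Here is one way to sketch the arithmetic in Python, assuming pregnancy duration is approximately normal so the Empirical Rule applies. The lookup table holds the usual Empirical Rule upper-tail percentages (half of what falls outside 68%, 95%, and 99.7%); the variable names are made up for this example.

```python
mean, sd = 280.6, 9.7
value = 300

# How many standard deviations above the mean is 300 days?
z = (value - mean) / sd

# Empirical Rule: approximate % of values ABOVE (mean + k SDs).
upper_tail_pct = {1: 16.0, 2: 2.5, 3: 0.15}

k = round(z)  # this shortcut only works when the value lands on a whole SD
print(f"z = {z:.2f}, so 300 days is about {k} SDs above the mean")
print(f"Approximate chance of 300 days or more: {upper_tail_pct[k]}%")
```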

Alright, now let’s put our newfound pregnancy duration stats knowledge to some serious use…

**3. A husband and wife have a baby who is born not long after the husband returns from working overseas for many months. The baby is born 312 days after the husband left the country for work.**

**As we saw in the previous problem, when calculated using ultrasound in the second trimester, the average pregnancy duration is estimated to be 280.6 days (with a standard deviation of 9.7 days).**

**What is the approximate likelihood that the husband is the father of the baby?**

Let’s walk through what we need to solve for and the answer together.
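As a quick sketch of the numbers in Python: this assumes the question boils down to "what is the chance a pregnancy lasts 312 days or more?" and that duration is approximately normal. The variable names are only for illustration.

```python
mean, sd = 280.6, 9.7
days = 312

z = (days - mean) / sd            # SDs above the mean
three_sd_cutoff = mean + 3 * sd   # upper edge of the 99.7% band

print(f"z = {z:.2f}")
print(f"3 SDs above the mean = {three_sd_cutoff:.1f} days")

if days > three_sd_cutoff:
    # Only about 0.15% of values fall above mean + 3 SDs.
    print("312 days is more than 3 SDs above the mean,")
    print("so the chance is smaller than about 0.15%.")
```

Since 312 days falls beyond the last Empirical Rule cutoff, the rule can only tell us the chance is below roughly 0.15%, not its exact value.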

Whew, so now that you’re convinced that statistics gives us all the answers we need in life… let’s switch back to something lighter for our last bit of Empirical Rule practice.

**4. Suppose fans of Cincinnati’s NFL team, the Bengals, attend an average of 3 football games per year (s = 0.9), while fans of the Kansas City Chiefs attend an average of 6 games per year (with s = 1.7). Assume this variable is normally distributed.**

**What is the approximate chance that a randomly selected Bengals fan attended 5 or more games?**

**What is the approximate chance that a randomly selected Chiefs fan attended 5 or more games?**

**And if it were NOT normally distributed, what shape do you think this variable’s distribution could reasonably take, and why?**

So, what about that reasonable, *non-normal* shape for the variable Number of Games Attended Per Year?

You could have picked any shape here, so long as you have a *reasonable* (for example, logical, realistic) and *matching* (that is, the explanation you provide matches the shape you selected) justification for it.

For example, we could have reasoned that most fans only get to attend 1 game per year, with fewer fans attending more games. That would mean the distribution **could reasonably be right-skewed**, with higher frequencies (more fans) “scoring” lower (attending lower numbers of games).

**Or, a bimodal shape might be reasonable too** – you could have the same situation as we just described, but have a second “peak” on the upper end, with a group of super-fans attending every game. That would give you the “rare” attendees on the left, then the “every game” attendees on the right, making a bimodal shape a possibility.

I hope you’re feeling good by now about the Empirical Rule, and the cool applications and uses of it for answering many different types of questions! And, know that you have extra practice options with your problem sets as well as in a separate “extra practice” PDF posted on Moodle. Do get in touch with me if you’ve got questions.

Let’s rewind now and hit the examples that we’ve passed over previously in class. Again, these are probably especially helpful to review now with our Quiz coming up.

# Examples from Earlier Material

## Sampling

Let’s start by remembering what **simple random sampling** is. I also want to show you how this can actually be carried out.

Want another example of that simple random sampling process?
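One more way to carry out a simple random sample is to let a random number generator do the picking. Here is a minimal sketch in Python, assuming a made-up population of 500 people who have each been assigned an ID number (the population size, sample size, and seed are all hypothetical).

```python
import random

# HYPOTHETICAL sampling frame: ID numbers 1 through 500.
population_ids = list(range(1, 501))

# A fixed seed just makes the draw repeatable for this demo.
rng = random.Random(2024)

# Draw 25 IDs without replacement; every ID has the same chance of selection.
sample_ids = rng.sample(population_ids, k=25)

print(sorted(sample_ids))
```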

And let’s revisit **stratified** and **cluster random sampling**. First, think back to what these are…

And let’s also talk briefly about how these work in practice.

For a **stratified sample** (sampling only *some* people from *each* group), you’ll want to have groups in mind… let’s use the fraternities on campus as our groups for this example. The table below gives us numbers of members from Fall 2019.

We’d first figure out the percentage of the population (all fraternity members) that each group represents. For example, the first group represents 3.69% of the whole population.

Then, we apply that percentage to our desired sample size. Let’s say we want *n* of 100 as an example. That would mean we want to sample 3.69… well, we need to round to whole numbers!… 4 people from the first group.

Then we repeat that same calculation (% of population, corresponding # to sample) for every other group to determine how many members to randomly select within each fraternity.
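The steps above can be sketched in Python as follows. Note that the group names and sizes here are HYPOTHETICAL stand-ins, not the actual Fall 2019 fraternity numbers, and the seed is arbitrary.

```python
import random

# HYPOTHETICAL group sizes (not the real fraternity membership counts).
group_sizes = {"Group A": 23, "Group B": 88, "Group C": 139}
n = 100  # desired total sample size

population = sum(group_sizes.values())
rng = random.Random(1)  # fixed seed so the demo is repeatable

# Step 1 and 2: each group's share of the population, applied to n and rounded.
allocation = {}
for name, size in group_sizes.items():
    share = size / population
    allocation[name] = round(share * n)
    print(f"{name}: {share:.2%} of population -> sample {allocation[name]}")

# Step 3: randomly select that many members (numbered 1..size) within each group.
stratified_sample = {
    name: rng.sample(range(1, size + 1), k=allocation[name])
    for name, size in group_sizes.items()
}
```

Because of rounding, the group counts will not always add up to exactly *n*, so it is worth checking the total and adjusting by a person or two if needed.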

For a **cluster random sample** (sampling *everyone* from a given group, but only sampling *some* of the groups), you’ll need to randomly select the groups themselves to be sampled. Then, sample all of the members, but only from those selected groups.
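Here is a small Python sketch of that cluster process. Again, the group names, sizes, number of clusters, and seed are all HYPOTHETICAL choices for the sake of the example.

```python
import random

# HYPOTHETICAL group sizes (not real membership counts).
group_sizes = {"Group A": 23, "Group B": 88, "Group C": 139, "Group D": 41}

rng = random.Random(7)  # fixed seed so the demo is repeatable

# Step 1: randomly select which groups (clusters) get sampled.
chosen_groups = rng.sample(sorted(group_sizes), k=2)

# Step 2: take EVERY member (numbered 1..size) of the chosen groups.
cluster_sample = {
    name: list(range(1, group_sizes[name] + 1)) for name in chosen_groups
}

for name, members in cluster_sample.items():
    print(f"{name}: all {len(members)} members sampled")
```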

## Research Design

Alright, what were our **three types of research designs**?

Let’s think through some examples of these designs.

First, let’s talk through the case of examining the IV or *X* variable of sunscreen use and the DV or *Y* variable of skin cancer.

With those examples in mind, think through how we could design another study, this time on a popular response when I asked y’all about studying tricks… chewing gum while you study, then chewing gum again when you take a test. (Apparently, it might trigger your memory by association with the gum chewing!)

How could we design an **experiment** to test the effectiveness of this gum-chewing technique for studying? What about a **quasi-experiment**? And an **observational** study?

## Tables and Graphs

Moving right along to descriptive statistics… First up, fill in the blanks in this **frequency distribution**. The variable here is rideshare drivers’ agreement with the survey item, “Most days I am enthusiastic about my job.”

| Item Response | # | % | Cum. % |
|---|---|---|---|
| Strongly disagree | 275 | | |
| Disagree | 141 | | |
| Neutral | 89 | | |
| Agree | 188 | | |
| Strongly agree | 231 | | |
| Total | 924 | | |

**Some follow-up Qs based on this table:** What’s the mode? What about the median? How would you interpret both of those? What’s the shape of this variable’s distribution?
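Once you have tried it by hand, you can check your work with a short Python sketch like the one below. It uses the counts from the table; the approach for the median (the first category whose cumulative % reaches 50%) is the usual shortcut for an ordinal variable, and the variable names are just for illustration.

```python
responses = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
counts = [275, 141, 89, 188, 231]

total = sum(counts)
running = 0
rows = []

for label, count in zip(responses, counts):
    running += count                  # cumulative count so far
    pct = 100 * count / total         # % column
    cum_pct = 100 * running / total   # Cum. % column
    rows.append((label, count, pct, cum_pct))
    print(f"{label:<18} {count:>4} {pct:>6.2f}% {cum_pct:>7.2f}%")

# Mode: the category with the highest frequency.
mode = responses[counts.index(max(counts))]

# Median: the first category whose cumulative % reaches at least 50%.
median = next(label for label, _, _, cum in rows if cum >= 50)

print("Mode:", mode)
print("Median:", median)
```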

Last little bit, folks…

Let’s think back over when it is appropriate to use the **mean**, **median**, and/or the **mode**.

For each of the following variables, state which measure or measures of central tendency listed above should be used.

1. Temperature (in Fahrenheit)

2. Number of intramural games won in a semester

3. Favorite music type (*country, pop, rap, alternative*, or *R&B*)

4. Productivity level (*very productive*, *somewhat productive*, or *unproductive*)